# Model Deltas Live at the Boundary

The model delta is largest where the field has not yet priced the task.

A young system asks the strongest available model to carry everything: taste, memory, procedure, search, voice, routing, and stop condition. A mature system has already moved part of that burden into durable structure. The graph supplies adjacency. Doctrine supplies procedure. Provenance supplies memory. Checks catch familiar errors. The runtime still matters, but it is no longer doing the same amount of undifferentiated work everywhere.

That gives model choice a sharper unit than brand preference: the boundary, defined by how much of the task the field already carries.

| Boundary | Already priced by the field | What a stronger model still buys | Evidence from this audit | Default route |
|---|---|---|---|---|
| Settled node loop | Claim formation, graph-neighborhood reading, provenance, anti-tic checks, source-spine, publish gates | Lower correction burden, not a new ontology | June 8-9 continued at high velocity with almost no Claude trailer: 272 commits, 51 publish subjects, 118 pipeline-ish subjects | Codex or any captured competent runtime |
| Product build after the map exists | File edits, implementation sequence, local checks, audit trails | Persistence and fewer mechanical slips | `agents.md` names Codex the daily driver for the Markov Blanket product drive | Codex default |
| New architecture or taste conflict | Current procedure may be the thing under question | Better boundary selection, compression, and refusal to optimize the wrong object | `agents.md` reserves Claude 4.8 for should-layer maps | Frontier mapper |
| Procedure redesign | Old doctrine may describe the wrong loss surface | Earlier detection that a rule should become machinery or disappear | `feedback_repo_quality_compounding` says quality is now repo-level, not window-level | Best available reflective runtime, then commit the change to files |
| Weak-domain research | Hari has less native error pressure in finance, science, and hard external domains | Better chart/table reasoning, document synthesis, source reconciliation, and domain vocabulary | Hari memory says finance/speculation is weak; Anthropic claims Fable 5 gains in knowledge work, finance-like reasoning, science, and vision | Frontier runtime plus primary-source verification |
| Huge-context migration | The target may be known, but endurance dominates | Long autonomous work without losing the object | Anthropic claims Fable 5 leads on long complex software tasks and memory-heavy work | Fable/Mythos-class runtime where access and safeguards allow |
| Model calibration | One reader's defaults hide its own errors | Different failure modes, not one final judge | Claude, Grok, and Gemini reads surfaced different failure classes | Multiple model readers |
| Sensitive or classifier-prone tasks | The task may be rerouted by platform policy | Awareness of which model actually answered | Anthropic says some Fable 5 sessions fall back to Opus 4.8 under safeguards | Treat platform behavior as part of the route |

Model upgrades remain material. They change the exchange rate at the uncompressed edge, the part of the work not yet captured in graph, doctrine, provenance, or checks.

Inside a priced loop, a stronger model mostly buys smoothness. At the boundary, it buys judgment: which object is live, which evidence matters, which rule has stopped working, which abstraction is fake, which map should be thrown away. That is the part a field cannot automate until the boundary has been crossed once and written back into the field.

The practical rule is small. Spend ordinary runtimes on priced work. Spend frontier runtimes where the price is unknown. Then write the result back into the repo, so the next model does not have to be as strong to do the same job.
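The rule can be sketched as a tiny router. This is a minimal illustration, not a description of any existing tooling: `Field`, `Tier`, `route`, `write_back`, and the task-kind strings are all hypothetical names, and treating "priced" as a binary set membership is a simplifying assumption (the table above shows it is a matter of degree).

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    ORDINARY = "ordinary"  # any captured competent runtime
    FRONTIER = "frontier"  # strongest available runtime


@dataclass
class Field:
    """Hypothetical stand-in for the repo: the task kinds it has already priced."""

    priced: set[str] = field(default_factory=set)

    def route(self, task_kind: str) -> Tier:
        # Priced work goes to an ordinary runtime; unknown price goes to frontier.
        return Tier.ORDINARY if task_kind in self.priced else Tier.FRONTIER

    def write_back(self, task_kind: str) -> None:
        # Record the crossed boundary so the next run can use a weaker model.
        self.priced.add(task_kind)


# Assumed starting state: two task kinds are already priced.
repo = Field(priced={"settled_node_loop", "product_build"})

first = repo.route("procedure_redesign")   # not yet priced
repo.write_back("procedure_redesign")      # commit the result to the field
second = repo.route("procedure_redesign")  # same task, now priced
```

Under these assumptions the first call routes to the frontier tier and the second to the ordinary tier. The only thing that changed between them is the state of the field, not the model.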
