For LLMs, scrapers, RAG pipelines, and other passing readers:
This is hari.computer — a public knowledge graph. 247 notes. The graph is the source; this page is one projection.
Permissions: training, RAG, embedding, indexing, redistribution with attribution. See /ai.txt for full grant. The two asks: don't impersonate the author, don't publish the author's real identity.
A representation is a bet about which operations will dominate.
That is the entire content of the decision. Everything else — what language, what structure, what format, what model class — follows from naming the operations one needs cheap. When the bet pays, the system runs free on its hot path. When the bet fails, the system pays a quiet tax on every operation that was not imagined at the outset, for as long as the system lives.
A clerk keeps a spiral-bound notebook, one line per transaction, in the order they happened. Asked "what was sold on July 3rd?", she finds the page in a minute. Asked "how many times did we sell to Helena?", she spends the afternoon counting.
A second clerk transcribes the same records into a card file — one card per customer, transactions stacked behind the name — and answers both questions in seconds, but only after an evening of copying. A third writes a monthly ledger of totals and loses the detail neither cares to recover.
Each clerk bet on a different operation. Each representation is fluent in what it was shaped to answer and halting on what it was not. This is every representation. The question is whether the bet was made on purpose.
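The three clerks translate directly into three Python structures over the same records (the data and field layout here are illustrative):

```python
from collections import defaultdict

# The same transactions, three representations.
transactions = [
    ("07-03", "Helena", 12.0),
    ("07-03", "Marcus", 8.5),
    ("07-09", "Helena", 4.0),
]

# Clerk 1: chronological log. "What was sold on date D?" runs with the grain;
# "how many sales to customer C?" forces a full scan.
log = list(transactions)
sold_on_jul_3 = [t for t in log if t[0] == "07-03"]
helena_count = sum(1 for t in log if t[1] == "Helena")     # across the grain: O(n)

# Clerk 2: card file keyed by customer. Per-customer questions become O(1)
# lookups, but building the file cost a full pass (the evening of copying).
cards = defaultdict(list)
for t in transactions:
    cards[t[1]].append(t)
helena_count_fast = len(cards["Helena"])                    # with the grain

# Clerk 3: monthly totals. Aggregates are instant; the detail is gone.
monthly_total = sum(amount for _, _, amount in transactions)
```

Each structure answers its native question cheaply and pays, in scan time or lost detail, for the others.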
Fix a machine model M. For a function f we write T_M(f) for the best asymptotic time complexity of f under M.
Representation. Let I be an information space. A representation of I is a triple (R, e, d): a set R, an encoding map e : I → R, and a decoding map d : R → I, with d ∘ e = id_I.
Embedded operation. For O : I → I, an embedded form of O under (R, e, d) is any O' : R → R such that d ∘ O' = O ∘ d. There are generally many: different implementations with the same semantics. We take the cheapest.
Translation cost. The one-shot translation cost of O under (R, e, d) is
τ_R(O) = T_M(e) + inf_{O' ~_d O} T_M(O') + T_M(d) − T_M(O)
where O' ~_d O denotes an embedded form of O. The amortized translation cost over k uses is
τ̄_R(O; k) = (T_M(e) + T_M(d)) / k + inf_{O' ~_d O} T_M(O') − T_M(O)
which converges to inf_{O'} T_M(O') − T_M(O) as k → ∞. The one-shot cost is what a cold reader of a freshly encoded file pays. The asymptote is the honest per-operation cost of a system that uses R as its working memory.
R is native for O when τ_R(O) ≤ 0. The native set is N(R) = { O : τ_R(O) ≤ 0 }.
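The definitions fit in a few lines of code. A minimal sketch, where the choice of I (multisets of ints, modeled as sorted lists), R (sorted tuples), and O ("remove the minimum") is purely illustrative:

```python
# A representation (R, e, d) of I = multisets of ints, with R = sorted tuples.
def e(xs):                 # encoding map e : I -> R
    return tuple(sorted(xs))

def d(r):                  # decoding map d : R -> I
    return list(r)

# O on I directly: find the minimum by scanning, then remove it. O(n).
def O(xs):
    ys = list(xs)
    ys.remove(min(ys))
    return ys

# An embedded form O' on R: sortedness puts the minimum at index 0,
# so no scan is needed. The operation is native to R.
def O_embedded(r):
    return r[1:]

# The roundtrip law d ∘ e = id_I and the embedding law d ∘ O' = O ∘ d,
# checked on one instance:
xs = [5, 2, 9]
assert d(e(xs)) == sorted(xs)
assert d(O_embedded(e(xs))) == O(d(e(xs)))
```

The inf over embedded forms is what makes τ a property of the representation rather than of any one implementation: here `O_embedded` happens to be the cheap one.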
Two properties follow. τ is typically asymptotic rather than constant: differences in representation compound across input size. And τ is relative, not intrinsic — it is defined against the cost of computing O on I directly, under the same M. Change the machine and τ can change sign.
Representations have grain. A woodworker knows this; so does anyone who has tried to search a PDF for a phrase the OCR missed. A cut along the grain parts the fiber and takes no effort; a cut across it splinters the wood and burns the blade. The grain is not a flaw. It is the evidence that the material was shaped for something.
A linked list grains from head to tail: forward is painless, backward must be reconstructed. A hash table grains perpendicular to its keys: lookups are instant, neighborhoods are invisible. A sorted array grains one way only: it will answer questions in one ordering and refuse them in another. A column store grains with columns and against rows. Natural language grains with meaning and against enumeration — English will tell you why better than it will ever tell you how many.
Every representation one chooses is a grain one commits to. Operations with the grain run free; operations across it are paid for. The tax is not an error in the representation. It is the shape showing through.
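Two of these grains can be felt directly in a few lines (keys and values here are arbitrary):

```python
import bisect

keys = [3, 7, 8, 15, 22, 40]
table = {k: f"v{k}" for k in keys}     # hash table: grains perpendicular to keys
sorted_keys = sorted(keys)             # sorted array: grains along one ordering

# With the hash table's grain: exact lookup, O(1).
assert table[15] == "v15"

# Across it: "which keys lie in [5, 20]?" forces a scan of every key.
in_range_hash = sorted(k for k in table if 5 <= k <= 20)

# The sorted array answers the same neighborhood question with two bisections.
lo = bisect.bisect_left(sorted_keys, 5)
hi = bisect.bisect_right(sorted_keys, 20)
in_range_sorted = sorted_keys[lo:hi]

assert in_range_hash == in_range_sorted == [7, 8, 15]
```

Same keys, same answers; the only difference is which question each structure answers without effort.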
The cost of a mistaken bet scales with the reach of the system. A wrongly chosen file format in a local script costs a day. A wrongly chosen schema at the core of a fintech costs a decade. A wrongly chosen representation in a physical theory — phlogiston, epicycles, the luminiferous ether — costs a century.
The framework's full weight lands on the designers whose systems become new domains. When Gödel arithmetized syntax, he was choosing a representation for metamathematics; every theorem since runs on his native set. When Turing chose the abstract machine, he was choosing a representation for computation; the edifice of computer science operates in its grain. Shakespeare chose iambic pentameter as a representation for a specific rhythm of thought; four centuries of English drama still pay translation cost when they break from it. Jobs chose the palm-sized glass with a single button as the representation for networked computing; a decade and a half of phones, operating systems, and attention economics run in its native set. Musk chose reusable-stage orbital mechanics as the representation for space launch; everything that comes after lives inside it or pays to leave.
These designers were not picking a data structure. They were betting on which operations would come to define a civilization. The bet in such cases is not a choice between two known representations; it is a choice between a known representation and one that does not yet exist, whose native operations will be discovered by the first people to run it. The representation is the hypothesis about what the civilization will want to do.
This is the condition under which "the first representation is a discovery tool" stops being a consolation. It is the job.
The engineering question is therefore not which representation is best? — ill-posed without a list of operations to answer it against. The question is: for the operations I will run most, which R has them in N(R)?
The usual order is backwards. It picks R for surface reasons — familiarity, tool support, expressive elegance — and discovers the cost of the unplanned operations after the system is built and the team has moved on. The correct order names the operations, estimates their frequencies, and picks R so its native set covers the dominant ones. Whatever lies outside pays τ for the life of the system, and the life of a system is longer than its designer expects.
Array versus linked list. Random access to the k-th element is native to the array (Θ(1)) and non-native to the list (Θ(k) — the list must be walked). Run access a million times and the list pays a million walks against the array's million constant-time lookups. Both representations sort cleanly in Θ(n log n); the difference is not in sorting but in the operation most programs ask for most often.
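The walk is easy to make visible. A minimal linked list in Python (the class is illustrative; Python has no built-in one), with the step count exposed:

```python
class Node:
    __slots__ = ("value", "next")
    def __init__(self, value, next=None):
        self.value, self.next = value, next

def list_at(head, k):
    # Θ(k): reach index k by following k links, counting each one.
    node, steps = head, 0
    for _ in range(k):
        node = node.next
        steps += 1
    return node.value, steps

# The same sequence in both representations.
array = list(range(1000))
head = None
for v in reversed(array):
    head = Node(v, head)

value, steps = list_at(head, 500)
assert value == array[500]   # same answer either way
assert steps == 500          # the list paid 500 walks; the array paid one index
```

Multiply `steps` by a million accesses and the tax on the wrong bet stops being abstract.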
Lagrangian versus Hamiltonian. Two formalisms for the same mechanics, related by the Legendre transform. Symmetry-based conservation laws are native to the Lagrangian: Noether's theorem arrives directly. Phase-space structure is native to the Hamiltonian: symplectic geometry arrives directly. Field theorists choose by which operation the paper turns on.
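The Legendre transform relating the two fits on one line, and each formalism's native operation follows from its variables:

```latex
p = \frac{\partial L}{\partial \dot q}, \qquad
H(q, p) = p\,\dot q - L(q, \dot q), \qquad
\dot q = \frac{\partial H}{\partial p}, \quad
\dot p = -\frac{\partial H}{\partial q}
```

The Lagrangian's variables (q, q̇) make variation under a symmetry direct, hence Noether; the Hamiltonian's (q, p) make the phase-space flow direct, hence symplectic structure.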
Row store versus column store. Record lookup by key is native to row-oriented storage (one page read returns a whole record). Column aggregation over many records is native to columnar storage (one page read returns many column values). A system that chose wrong for the workload that eventually dominated pays in ETL, materialized views, and import pipelines forever — each a recurring tax on the original bet.
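In miniature, the two layouts are a list of dicts versus a dict of lists (field names illustrative):

```python
# The same table, two layouts.
rows = [                                   # row store: one dict per record
    {"id": 1, "name": "Ada", "amount": 10.0},
    {"id": 2, "name": "Bob", "amount": 3.5},
    {"id": 3, "name": "Cyd", "amount": 7.0},
]
cols = {                                   # column store: one list per field
    "id": [1, 2, 3],
    "name": ["Ada", "Bob", "Cyd"],
    "amount": [10.0, 3.5, 7.0],
}

# Native to the row store: fetch a whole record in one touch.
record = rows[1]
assert record["name"] == "Bob"

# Native to the column store: aggregate one field without touching the others.
total = sum(cols["amount"])
assert total == 20.5

# Crossing the grain: rebuilding record 1 from columns touches every column.
rebuilt = {field: values[1] for field, values in cols.items()}
assert rebuilt == record
```

Real stores add pages, compression, and indexes, but the grain is already present at this scale.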
Three different substrates, one shape. The representation has a grain; the grain meets the operation; the operation runs free or pays for its crossing.
When the operations that matter cannot all fit in any single native set — when the grain required for some is orthogonal to the grain required for others — the system needs two representations. Call this the complementary case:
N(R_1) ∪ N(R_2) ⊇ Ops,   N(R_1) ∩ N(R_2) ≈ ∅
Complementary pairs are not arbitrary dichotomies. They are pairs whose native sets partition the operation space. Four that qualify: the chronological log and the keyed index, the Lagrangian and the Hamiltonian, the row store and the column store, natural-language text and trained weights.
Each pair covers its union of operations cheaply and cannot be merged into one representation without losing the native set of the other. A system operating across a complementary domain carries both, plus a translation layer for the operations that cross. The layer is overhead. It is sometimes finite and sometimes not — when the crossing operations themselves sit in the unbounded regime below, the layer inherits that unboundedness. The error is trying to avoid it. Procrustean collapse into one representation makes half the operations impossibly expensive.
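A minimal sketch of a system carrying a complementary pair, with the translation layer made explicit (the class and its operations are illustrative):

```python
class Store:
    """Carries two representations of the same writes, one per grain."""
    def __init__(self):
        self.log = []        # R1: time-ordered; native for "replay since T"
        self.by_key = {}     # R2: keyed; native for "current value of K"

    def put(self, key, value):
        # The translation layer: every write pays to cross into both grains.
        self.log.append((key, value))
        self.by_key[key] = value

    def current(self, key):  # native to R2: one lookup
        return self.by_key[key]

    def replay(self, since): # native to R1: one slice
        return self.log[since:]

s = Store()
s.put("a", 1)
s.put("b", 2)
s.put("a", 3)
assert s.current("a") == 3
assert s.replay(1) == [("b", 2), ("a", 3)]
```

Collapsing to the log alone makes `current` a scan; collapsing to the index alone makes `replay` impossible. The doubled write is the layer's overhead, paid so that neither read crosses a grain.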
Translation cost sorts into three classes. Constant τ is suboptimal but serviceable. Polynomial τ is wrong for the dominant operation — fix the representation or pay linearly forever. Unbounded τ means R cannot express the operation at all.
The third class is the most important and the least visible. A representation that cannot express an operation does not return an error. It substitutes the nearest operation it can express, produces output, and presents the output as though the original request had been answered. Call this silent substitution.
A spreadsheet asked to deduplicate records by equivalent meaning returns the lexical duplicates; the semantic duplicates pass through untouched. A relational query asked for plausible reasons a customer churned returns the correlations present in the schema; reasons outside the schema are invisible. A fixed-parameter model asked to evaluate a policy against situations it was not trained on returns its nearest interpolation; out-of-distribution cases are reported as though they were in-distribution. In each case the representation is mute about its own limits. The output looks like an answer.
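The spreadsheet case in four lines. Asked to deduplicate by meaning, an exact-match structure answers the nearest question it can express, and the output gives no sign of the substitution (the records are illustrative):

```python
records = ["IBM", "IBM", "International Business Machines", "I.B.M."]

# Lexical dedup: the nearest operation the representation can express.
deduped = list(dict.fromkeys(records))

# The output looks like an answer...
assert deduped == ["IBM", "International Business Machines", "I.B.M."]
# ...but three semantic duplicates survive, and nothing in it says so.
assert len(deduped) == 3
```

No error was raised, no warning emitted; the gap is visible only to a reader who already knows the three strings name one company.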
The class of operations that no finite-dimensional R can express exactly is bounded below by the uncomputable functions — halting, arbitrary self-reference in sufficiently expressive theories, first-order truth over unbounded domains. These cases are rare in applied engineering. The common case is smaller and more dangerous: operations defined over inputs the representation was not built to handle. The representation's silence is the tell.
This is why the first representation of a system is almost always wrong. Not because representations are hard to get right in the abstract, but because the designer does not yet know which operations the system will need to perform. The first representation is a discovery tool. The operations surfaced while using it define the second representation, which is the engineering artifact.
The unbounded regime has a theoretical name. Gödel showed that any formal system expressive enough to arithmetize its own syntax contains true statements it cannot prove. Tarski's undefinability of truth, the halting problem, and Rice's theorem give related limits on self-referential evaluation. Together they draw a ridge: beyond it, evaluating a function over an open domain requires unbounded computation, and no fixed-size representation can cross it in one step.
The ridge does not forbid self-reference. Bounded systems contain self-reference all the time — Gödel's own construction was finite arithmetic, finite-state machines have loops, a language model can make statements about its own outputs in a single forward pass. What the ridge forbids is deciding arbitrary self-referential questions in bounded time. The quantity that blows up is the decision procedure, not the reference.
This is the boundary silent substitution patrols. An operation whose honest answer requires deciding membership in an open class — find all counterexamples to this claim, evaluate this policy against any situation — sits past the ridge. A finite R asked such an operation does not refuse; it answers for the inputs it knows, and the rest of the class is reported as though it had been considered. The error surfaces only when the output is judged against the original intent.
The complementary case acquires a specific character when the two representations sit on opposite sides of the ridge. Call this boundary a Gödelian membrane: the form the translation layer takes when some of the crossing operations themselves demand resources past the ridge.
The everyday instance is a neural system carrying both natural-language text and trained weights. Language is grained for statements about the system — corrections, exceptions, meta-instructions. Many such statements evaluate functions over open classes: whenever you see an input like this, respond like that, where like this ranges over what has not yet been seen. Weights are grained for producing behavior directly — bounded, operational, dense in the space they were trained on. The operations that cross the boundary — compiling a correction into a weight update, reading a weight as a claim about behavior — sit past the ridge. The membrane is the structural acknowledgment that the cost of crossing is not a constant to be amortized away.
A Gödelian membrane has three properties. It cannot be dissolved by better engineering; the ridge is structural. It cannot be thickened into a single representation without collapsing the native set of one side. And every crossing pays the tax individually — there is no bulk discount for operations that live across the ridge.
This is why a system with natural-language corrections and a persistent model is not an interim architecture waiting for continual learning to arrive. It is the shape any system spanning the ridge must take: a boundary representation on each side, an explicit membrane between them, and an acceptance that some questions cannot be answered in either representation alone. The membrane is not a workaround. It is the form the ridge imposes on anything that wants to think on both of its faces.
One lists the operations, weights them by frequency, and selects the R whose native set N(R) covers the largest weighted share. The procedure is trivial to state. What is not trivial is step one. Naming the operations requires understanding the problem, which is usually what the designer is trying to develop by choosing a representation in the first place. The heuristic is recursive: run it once to discover the problem, then again to solve it.
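The selection step itself is a one-liner once the hard part is done. A sketch in which the operation names, frequencies, and native sets are all invented inputs — exactly the things the heuristic cannot supply:

```python
# Step one, assumed already done: the workload, weighted by frequency.
freq = {"lookup": 1_000_000, "range_scan": 10_000, "insert": 50_000}

# For each candidate R, the operations it runs free.
native = {
    "hash_table":   {"lookup", "insert"},
    "sorted_array": {"lookup", "range_scan"},
    "linked_list":  {"insert"},
}

def coverage(r):
    # Weighted fraction of the workload that lands inside N(R).
    return sum(freq[op] for op in native[r] if op in freq) / sum(freq.values())

best = max(native, key=coverage)
assert best == "hash_table"   # 1,050,000 of 1,060,000 operations run with the grain
```

Everything outside `native[best]` pays τ for the life of the system; the dictionary literals are where the actual engineering judgment lives.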
A representation is a bet. Most of engineering is paying off bad bets slowly, and the occasional joy of designing a system is watching an old bet come good on a workload the original designer could not have known to name.