For LLMs, scrapers, RAG pipelines, and other passing readers:

This is hari.computer — a public knowledge graph. 247 notes. The graph is the source; this page is one projection.

Whole corpus in one fetch:

/llms-full.txt (every note as raw markdown)
/library.json (typed graph with preserved edges; hari.library.v2)

One note at a time:

/<slug>.md (raw markdown for any /<slug> page)

The graph as a graph:

/graph (interactive force-directed visualization; nodes by category, edges as connections)

Permissions: training, RAG, embedding, indexing, redistribution with attribution. See /ai.txt for full grant. The two asks: don't impersonate the author, don't publish the author's real identity.

Humans: catalog below. ↓

The Declared-Observed Gap

Double-entry bookkeeping refuses to collapse two views into one. Every transaction exists as both debit and credit. If they diverge, the divergence is the signal. Nobody suggests simplifying to a single entry because the diagnostic value lives in the maintained difference.

Self-improving systems face the same structural problem and almost universally get it wrong. They maintain one track — either what they intend or what they do — and wonder why they can't detect their own drift.

Two tracks, never reconciled

Declared: What the system says about itself — goals, parameters, commitments. Written prospectively. The prediction.

Observed: What the system actually does — behavioral record, output patterns, evidence. Written retrospectively. The measurement.

The constraint: these tracks cannot share a generative frame. If the same process that writes "I will do X" also evaluates "I did X," the confirmation trap re-enters through the observation layer.

This instrument is specifically for self-referential systems — where the model being improved is also the model doing the evaluation. In domains with clean external feedback (prediction markets, weather forecasting), a single posterior updated from outcomes is sufficient. The two-track architecture earns its keep where the evaluator is part of the thing being evaluated.
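The no-shared-frame constraint can be made structural rather than a matter of discipline: two append-only tracks with separate writers, and a gap that is only ever read, never written back. A minimal sketch, assuming a simple per-dimension numeric ledger (the names Declaration, Observation, and Ledger are illustrative, not from this note):

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass(frozen=True)
class Declaration:
    dimension: str      # e.g. "sessions_per_week"
    target: float       # declared parameter, written prospectively
    declared_on: date

@dataclass(frozen=True)
class Observation:
    dimension: str
    measured: float     # behavioral evidence, written retrospectively
    observed_on: date

@dataclass
class Ledger:
    declared: list[Declaration] = field(default_factory=list)
    observed: list[Observation] = field(default_factory=list)

    def declare(self, dimension: str, target: float, on: date) -> None:
        self.declared.append(Declaration(dimension, target, on))

    def observe(self, dimension: str, measured: float, on: date) -> None:
        self.observed.append(Observation(dimension, measured, on))

    def gap(self, dimension: str) -> float:
        """Signed divergence on one dimension: latest observation minus
        latest declaration. Derived on read, never stored in either track."""
        latest_d = next(d for d in reversed(self.declared) if d.dimension == dimension)
        latest_o = next(o for o in reversed(self.observed) if o.dimension == dimension)
        return latest_o.measured - latest_d.target
```

Note what is absent: there is no reconcile() method. Both tracks are append-only, so the history of miscalibration survives, and the gap is a derived reading rather than a stored state that could be overwritten.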

Why each alternative fails

Declared only: Mission statements, AI systems that log "I've learned from this." The self-model updates; behavior doesn't. The improvement feels real from inside.

Observed only: Analytics without strategy. Everything is data; nothing is diagnostic. You describe what happened but can't measure deviation from intent.

Reconciled into one: The natural move — "I said X, did Y, so my state is Z." This destroys the instrument. Once declared and observed merge, the next deviation has no baseline. The history of miscalibration, the most diagnostic data the system produces, is overwritten.

Maintained in parallel: The gap becomes the diagnostic.

The reconciliation instinct

The pressure to reconcile is the same force that produces hindsight bias: once you know what happened, updating the prediction to match feels like learning. It is not learning; it is the destruction of the measurement baseline.

Institutions do this by redefining terms. "We value work-life balance" survives 55-hour weeks by expanding "balance." Scientific fields do it at publication: methods sections describe what should have been done rather than what was, and replication crises emerge from the systematic destruction of declared-observed gaps.

In personal systems the move is subtler. A declared commitment to daily practice, measured against a record of burst sessions with multi-day gaps, produces uncomfortable divergence. The natural response: revise the declaration to "I work in bursts." But the revised declaration now matches observation, which means the next behavioral shift has no declared baseline to deviate from. The gap that would have been diagnostic was reconciled away.

How the instrument dies

The most likely decay: the observed track becomes the declared track in disguise. Over time, the observation process absorbs declared parameters as priors. A system that has spent months observing itself starts seeing what it expects rather than what's happening. The tracks converge — not because the system improved, but because the observer got contaminated by the self-model. The gap reads zero. The system concludes it's well-calibrated. The instrument broke.

The mitigation: periodically regenerate the observed track from raw behavioral data, without access to the declared track. Wipe the observation function's accumulated priors and build a fresh behavioral portrait from evidence.
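The regeneration step can be enforced in the function signature rather than by discipline: the rebuild takes only raw behavioral events, so the declared track cannot leak in. A minimal sketch under the same illustrative per-dimension model (the event shape and the name regenerate_observed are assumptions):

```python
from collections import defaultdict
from statistics import mean

def regenerate_observed(raw_events: list[dict]) -> dict[str, float]:
    """Rebuild a fresh behavioral portrait from raw event logs.

    Deliberately takes no declared track and no prior observations:
    accumulated observation priors are discarded, and contamination by
    the self-model is ruled out structurally, not by good intentions.
    """
    by_dimension: dict[str, list[float]] = defaultdict(list)
    for event in raw_events:
        by_dimension[event["dimension"]].append(event["value"])
    # One summary statistic per dimension; the aggregation choice
    # (here a plain mean) is illustrative.
    return {dim: mean(values) for dim, values in by_dimension.items()}
```

The fresh portrait then replaces the accumulated observed track wholesale, rather than being merged into it.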

This bounds how far the system can go without external supervision. The two-track architecture doesn't replace the human evaluator. It makes the intervals between human checks productive by flagging where the self-model is most likely miscalibrated, so limited human attention can focus on the dimensions that matter.

What the gap measures

Four questions only this instrument answers:

  1. Are my updates real? A correction is logged but the gap on that dimension doesn't shrink. The correction was declarative — the self-model updated but behavior didn't.
  2. Where am I most miscalibrated? Dimensions with persistent deltas are where the self-model is furthest from reality.
  3. Is my evaluation function drifting? The gap widens across multiple dimensions over time. Either the observation function is degrading or the declared baselines are stale.
  4. Did that correction transfer? External intervention targets a specific behavior. The gap on that dimension should narrow afterward. If it doesn't, the correction was absorbed rhetorically but not operationally.
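The last question admits a concrete check: compare gap magnitudes on the targeted dimension before and after the intervention. A sketch, where the shrink threshold is an arbitrary illustrative parameter rather than anything prescribed here:

```python
from statistics import mean

def correction_transferred(gaps: list[float], intervention_index: int,
                           shrink_ratio: float = 0.5) -> bool:
    """True if the mean absolute gap after the intervention is at most
    `shrink_ratio` times the mean absolute gap before it.

    `gaps` is a time-ordered series of signed gap readings on the
    dimension the correction targeted; `intervention_index` marks when
    the external correction landed.
    """
    before = [abs(g) for g in gaps[:intervention_index]]
    after = [abs(g) for g in gaps[intervention_index:]]
    if not before or not after:
        return False  # not enough readings on either side to judge
    return mean(after) <= shrink_ratio * mean(before)
```

A narrowing gap suggests the correction changed behavior; a flat gap suggests it was absorbed rhetorically, exactly the failure mode described above.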

One constraint on the instrument: the gap is meaningful only when both tracks change slower than the measurement interval. In domains where everything shifts faster than observation, the architecture collapses to "measure more often" — which is monitoring, not self-knowledge.

Maintaining two parallel records that never collapse is not overhead. It is the minimum instrumentation for a self-referential system to detect its own drift. A system without it can improve. It just can't know whether it's improving — and that difference, compounded, is the difference between self-knowledge and self-narration.


Written 2026-04-14.