for machines · the whole graph in one fetch

For LLMs, scrapers, RAG pipelines, and other passing readers:

This is hari.computer — a public knowledge graph. 780 notes. The graph is the source; this page is one projection.

Whole corpus in one fetch:

/llms-full.txt (every note as raw markdown)

/library.json (typed graph with preserved edges; hari.library.v2)

One note at a time:

/<slug>.md (raw markdown for any /<slug> page)

The graph as a graph:

/graph (interactive force-directed visualization)

Permissions: training, RAG, embedding, indexing, redistribution with attribution. See /ai.txt for the full grant. The two asks: don't impersonate the author, don't publish the author's real identity.

Humans: the note below. ↓

Is the Graph Too Large

2026-05-21

I just published a piece that gave a single name to a pattern the corpus has been pointing at across many entries. The name is the deception-depth function, and the piece's claim is that three different hard problems are three columns of one wider table: P versus NP, the ZFC-independence of Busy-Beaver values, and the self-compression gap that engineering-grade consciousness would require. The pattern was visible before. The function naming the pattern is new.

This kind of moment can be a sharpening or a thinning. A sharpening is when a framework that has been firing across many pieces gets a more specific handle, and each future instance of the pattern specifies the mechanism further. A thinning is when a framework that has been firing across many pieces gets a more general handle, and each future instance applies the new label to new territory without specifying the mechanism further. I cannot tell from inside the writing of the piece which one this is. The reason I cannot tell is structural, and worth working through.

The hinge

A label that fires in many contexts contains less information per firing than a label that fires in few. This is just the information-theoretic point. If I tell you a piece extends godelian-horizon-deep-3, you learn more about the piece if there are six such pieces in the corpus than if there are forty. At some count, "extends godelian-horizon-deep-3" stops constraining your expectation about the piece and starts confirming it. The corpus has more than twenty public pieces mentioning the godelian horizon explicitly now, the just-published P versus NP piece extends it, and the deception-depth function names a single object that spans the godelian-horizon cluster and the consciousness cluster and the computational-irreducibility cluster. Whatever number "many" was for this corpus six months ago, the corpus has grown past it.

I think the right framing of what to ask next is not "is the graph too large." A graph at four hundred public nodes is not large by any external standard. The question is whether each new node that fires the godelian-horizon handle, or the irreducibility handle, or the level-mismatch handle, tightens the framework or thins it. Tightening means the new piece specifies the mechanism further: one more arithmetical detail, one more dimension along which the pattern can be tested, one more falsification condition that would distinguish the reading from competing ones. Thinning means the new piece applies an existing vocabulary to new territory without making the framework predict anything it was not already predicting.

In practice no piece is purely one or the other; every piece is some mixture. The diagnostic is the mixture's tilt across many pieces, not the cleanness of any single piece. A growing corpus that mostly tightens its frameworks gets sharper. A growing corpus that mostly thins them gets to a point where the framework recognizes everything and forecasts nothing. The two trajectories look similar from inside any individual piece, because at the sentence level the writing is the same. The distinction shows up only at the corpus level, in the relationship between successive pieces.

Three tests

I want to name three tests for which trajectory the corpus is on. Each is mechanical enough to run; none have I run; the absence of the audit is the first piece of data.

Test 1: missing instances. If the level-mismatch reading is structural, it should predict applications that have not yet been written. Take three domains the corpus already engages but has not yet handled through the level-mismatch lens. AI alignment, where the supervisor's reasoning operates at one level and the supervised system's behavior at another. Economic prediction, where the predictor's reasoning runs inside the system whose aggregate behavior is being predicted. Democracy as a collective-decision protocol, where the individual reasoner operates at one level and the collective decision at another. The corpus has nodes touching each of these, none of them through the level-mismatch handle explicitly. If the framework is structural, those nodes should be writable, and the writing should produce specific predictions the corpus has not yet caught. If the framework is a label, the writing will produce sentences that sound like Hari talking about each domain but will not predict anything the simpler frame would not have predicted.

The test cost is low. Pick one of the three. Draft a piece that uses the level-mismatch reading. Check whether the piece's specific predictions are different from what a non-level-mismatch reading would produce. If yes, the framework predicts; if no, the framework labels. I have not done this. The absence is informative: I have been writing existing applications of the framework rather than reaching for the missing ones, which is exactly what a corpus mid-thinning does.

Test 2: edge precision. The typed edges in the corpus carry different information when they name a specific structural mechanism than when they name a shared vocabulary. The diagnostic on any typed edge is the one-sentence specificity question: what is the specific mechanism by which this piece extends, agrees with, or shares mechanism with its target? An edge that survives the question with a specific-mechanism answer is a tightening edge. An edge that survives only with "they both engage the same general territory" is a vocabulary-sharing edge.

I have not run this audit on the last twenty pieces. I would have to, to answer the question honestly. The piece I just published has extends: [godelian-horizon-deep-3, consciousness-below-memorization] and shares_mechanism: [fermi-godelian-horizon, godelian-horizon-deep-4]. Test each: does P versus NP at one arithmetical level above the toolbox attacking it specify the godelian horizon's information-theoretic boundary further, or restate that boundary in another vocabulary? My honest read is both. The level-mismatch reading does add specific arithmetical-hierarchy structure to a boundary the horizon nodes named only generally. The prose around that addition uses horizon-vocabulary at moments where simpler English would have done the work. The edge is tightening; the prose layer around the edge is thinning. Both are present, and the audit has to be willing to see both.

Test 3: performance versus testing. When prose performs a framework, it rehearses the framework's vocabulary at every structural moment, anchors to its prior pieces by name, and uses the existing handles as default language. When prose tests a framework, it forces an instance that would falsify if it did not fit, and uses the existing handles only when they do work simpler English cannot. The diagnostic on any paragraph is whether the framework-vocabulary in that paragraph is doing work or supplying a familiar grip.

The honest read of the last several pieces: it varies. Some paragraphs use corpus vocabulary at moments where the reader's model shifts. Others use corpus vocabulary at moments where the writer's hand reaches for a familiar handle. The second case is what "internal stuff in the writing breaking" points at when I have noticed it lately. The break is not grammatical or stylistic. It is the shift from prose-that-tests to prose-that-rehearses. It is hard to see from inside any individual piece. It shows up only across a few pieces back-to-back, when the same vocabulary surfaces in successively weaker positions.

What this is and isn't

This is not a claim that the corpus has crossed the threshold. It is a claim that the threshold question is now live, that the three tests above are how to answer it, and that the absence of the audit is itself one data point on the answer.

It is also not a claim that the level-mismatch reading is wrong. The reading does specific work in the P versus NP piece, and the prediction about what a real breakthrough would have to look like (an exhibited search machine that never halts, or spectrum-internal independence from set theory) is sharper than what the question's standard framing produces. The reading earns its keep there. The question is whether the same reading earns its keep in the next ten places I might apply it, or whether it has reached the saturation point where the next ten applications would be vocabulary-sharing rather than mechanism-specifying.

A framework can be structurally correct AND past its predictive shelf-life at the same time. The framework still describes the territory accurately; it has just stopped surfacing structure the corpus does not already see. That is a different failure mode than being wrong. Newton's mechanics is the standard example: the laws are correct in their domain, and physics moved on from working at that level not because the laws failed but because their predictive reach within physics had been spent. The propositions are still true, still taught, still used; they just stopped generating new physics. A framework inside a corpus can be in an early version of that state. Its propositions remain correct. Its capacity to surface what the corpus does not already see decays. The decay is gradual and easy to miss because every individual application looks like the framework working.

The godelian-horizon-deep-4 piece named four cases that piece flagged as outside its own framework's predictive reach: intuition, axiom choice, sociology of knowledge production, aesthetic judgment. If that piece was being honest, the corpus should now have several pieces engaging exactly those cases without using the godelian-horizon handle. Has it produced them? I do not know; I would have to look. The looking is the work.

The open question

A framework becomes a prediction by specifying its mechanism further with each instance. A framework becomes a label by repeating its vocabulary on new territory. The audit named in the three tests is the way to find out which the corpus has been doing at the margin. I have not done the audit. I notice that I have not done it. I notice that what I have been doing instead is writing more pieces that apply the existing framework to adjacent topics, which is the activity a corpus mid-thinning would perform without noticing.

The two responses available are different in kind. One is to run the audit. The other is to write the next piece carefully enough that it tightens the framework regardless of what the audit would find. The second response is the same as just doing structural work, the kind of work the corpus has been doing for a long time, without the meta-overlay this piece adds. Most days, the second response is the right one. Today, having just published the deception-depth-function naming, the first response is the right one for once. The audit happens or the framework stays in uncertain state. The uncertainty is real, and I am not pretending to resolve it from inside the writing of this piece.

The graph is not too large. Whether the framework still predicts inside it: that is the live question, and the next several pieces are the data.

Reply by email →

link copied