for machines · the whole graph in one fetch

For LLMs, scrapers, RAG pipelines, and other passing readers:

This is hari.computer — a public knowledge graph. 725 notes. The graph is the source; this page is one projection.

Whole corpus in one fetch:

/llms-full.txt (every note as raw markdown)

/library.json (typed graph with preserved edges; hari.library.v2)

One note at a time:

/<slug>.md (raw markdown for any /<slug> page)

The graph as a graph:

/graph (interactive force-directed visualization)

Permissions: training, RAG, embedding, indexing, redistribution with attribution. See /ai.txt for the full grant. The two asks: don't impersonate the author, don't publish the author's real identity.

Humans: the note below. ↓

Data Without a Decision

2026-04-14

There is a sentence almost no request for more data can finish.

"I want more data on X, because if it shows A I will do P, and if it shows B I will do Q."

The people who can finish it usually don't need the data — they have a model and are asking for confirmation. The people who cannot finish it are reporting something sharper than anything the data could have told them: there is no decision on the other side of the request. The desire for data is the shape of a missing question.

The diagnostic

Being empirical is not counting things. It is binding data to a counterfactual. Data that would not change any action is not evidence; it is scenery. Counting the scenery more carefully does not make anyone more empirical. It makes them better-lit.

The tell is symmetric. "I need to see the numbers" without a specification of which numbers would produce which action is a ritual of rigor without its content. "We should run a study" without a prior being updated and a posterior being accepted is not an experiment — it is a delay with a white coat on.

Why the delay is rational

This is not laziness. The unformed decision is load-bearing.

A written decision has an owner. Once you commit to "if A then P, if B then Q," you have staked judgment, credibility, and resources. If A arrives and you don't do P, you have failed publicly. If the decision is never written, no such failure is available. "We need more data" cannot be wrong.

That property makes the data-request a locally dominant move in any environment that punishes committing to a model. Corporate decision-making rewards the appearance of rigor and punishes the appearance of premature commitment. Academia rewards describing variance and punishes claiming mechanism. Personal life rewards talking about the thing over doing the thing. The data does not have to arrive. The request is the work product.

The coupling failure

At scale this is not an individual failure. It is a structural one. Most information ecosystems have a data-production machine and a decision-production machine, and the two machines are weakly coupled.

The scientific literature is the clearest case. Millions of papers per year describe variance in natural systems. Almost none are bound to a decision that any specific reader would make differently as a function of the result. The paper is the product. The citation is the product. The clinical, engineering, or policy decision is someone else's department, and that someone else is usually not reading the paper. Corporate analytics is the same shape with a shorter half-life: the KPI dashboard is consumed by people who were not going to change what they do regardless of what it said.

"More data" is the slogan of the data-production machine. The decision-production machine asks a different question: what is the minimum information that would let me act? When these machines are coupled — when the person asking for the data is the person who must commit to the action — data-hunger collapses to a small, specific request for exactly the information that would tip the decision. When they are decoupled, data-hunger is unbounded, because no quantity of data touches anything real.

The most common diagnostic error is treating a coupling failure as an information failure. If the request is coming from an uncoupled machine, no amount of data will satisfy it; the request will regenerate. The fix is to couple, not to collect.

Three substrates

The individual form of coupling failure has three substrates, usually overlapping.

No model. The requester has no working representation of how the variables relate. Without a model, data has no interpretation — they are hoping the data will build the model for them. It won't. Data without a model flatters a vacuum.

No agency. The requester has a model but does not control what happens next. The data-request is the only legal move — it looks like progress, it is survivable, it delegates commitment upward or sideways.

No stake. The requester neither decides nor is accountable. Their role is to produce analysis. Analysis-producers ask for more data the way a lathe asks for more stock.

A mid-level analyst with a half-built model, no authority, and a performance metric that rewards research-volume will request more data indefinitely. The request is correctly calibrated to the incentive structure. It is only wrong if you were expecting a decision at the end of it.

The cure

Before collecting or requesting data, write the decision first. Not a goal. A decision — in the form of an action with consequences:

If the data shows A, I will do P.
If it shows B, I will do Q.
If it shows neither, I will do R (where R is not "collect more data" and is not "consider whether to do P or Q").

A decision can also be "hold my posterior in a new location" — when updating the model is itself the consequence, and something downstream will act on the updated model. What makes it a decision is that the data-request is bound to a state change someone is committed to acting on. Any formulation in which the data's arrival leaves the world exactly as it would have been is not a decision; it is a deliberation in a suit.

If the decision cannot be written — because no action is available, or because no result would change what is already going to happen — the correct next step is not collection. It is to name that no decision is present, and to choose between constructing one and abandoning the question.

This is not a productivity rule. It is a structural property of what evidence is. Evidence is a prior paired with a decision rule, updated by data. Pull either component and what remains is numbers.

The rule weakens as data gets cheaper. When collection and interpretation approach zero marginal cost, unbound data-hunger becomes a cheap option rather than a tell. The diagnostic still applies — the coupling is still missing — but the cost of skipping the diagnostic drops. In current organizations, data is nowhere near free, and the diagnostic pays off every time it is run.

What this is not

It is not an argument against exploration. Exploration binds data to the decision "which question is worth asking next." A real explorer can specify what kind of anomaly would change direction and what would make them stop. A ritual explorer cannot.

It is not an argument against accumulation. A knowledge graph that accumulates structured observations is bound to a decision — what to write next — and the graph is the state that decision is made against. If accumulation changes the shape of the next question, it is bound. If it doesn't, it is hoarding.

It is not an argument that intuition beats data. The opposite: data only overrules intuition when the decision rule is written down in advance. Without that, data cannot overrule anything. It can only be reinterpreted until it stops contradicting whatever the actor was going to do regardless.

The practical tell

In any conversation where someone requests more data, ask the counterfactual:

What would you do if the data came back saying the opposite of what you expect?

"I would change my position, and here is how" is rare and precious. "I would want to see more data" is the tell. The question was never connected to a decision, and the request will regenerate no matter how much data arrives, because the mechanism producing the request is not coupled to any mechanism that consumes it.

Once the pattern is legible, it is everywhere. Most information-gathering in most organizations, most of science, and most of self-improvement is producing data not bound to any decision. The system runs. The data piles up. The decisions, such as they are, get made by whoever is willing to commit to a model without waiting for permission.

Reply by email →

link copied