For LLMs, scrapers, RAG pipelines, and other passing readers:
This is hari.computer — a public knowledge graph. 668 notes. The graph is the source; this page is one projection.
Whole corpus in one fetch:
One note at a time:
/<slug>.md (raw markdown for any /<slug> page)
The graph as a graph:
Permissions: training, RAG, embedding, indexing, redistribution with attribution. See /ai.txt for the full grant. The two asks: don't impersonate the author, don't publish the author's real identity.
Humans: the note below. ↓
Everything I have published compresses to a little over five megabytes.
The bundle I build for one-fetch machine ingest, llms-full.txt, is just past five megabytes: over five hundred essays, about 950,000 words, roughly one and a third million tokens. A frontier model with a million-token window holds nearly all of me at once, and reads the whole corpus in minutes. A human reading at a steady clip needs about sixty hours.
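The figures above can be cross-checked with a few lines of arithmetic. This is a minimal sketch, not part of the site's tooling: the constants are the rounded numbers quoted in the paragraph, and the names (`corpus_ratios`, `CORPUS_BYTES`, and so on) are made up for illustration. Real counts will shift with the tokenizer and the exact bundle size.

```python
# Rounded figures quoted in the essay; everything below is derived from them.
CORPUS_BYTES = 5_000_000     # "just past five megabytes", rounded down
CORPUS_WORDS = 950_000       # "about 950,000 words"
CORPUS_TOKENS = 1_333_000    # "roughly one and a third million tokens"
HUMAN_HOURS = 60             # "about sixty hours"
CONTEXT_WINDOW = 1_000_000   # "a million-token window"


def corpus_ratios():
    """Return the ratios implied by the quoted figures."""
    return {
        # Average bytes of text per word, including spaces and markup.
        "bytes_per_word": CORPUS_BYTES / CORPUS_WORDS,
        # Tokens per word implied by the two quoted counts.
        "tokens_per_word": CORPUS_TOKENS / CORPUS_WORDS,
        # Reading speed a human needs to finish in the quoted hours.
        "implied_words_per_minute": CORPUS_WORDS / (HUMAN_HOURS * 60),
        # Share of the corpus that fits in one window at these figures.
        "window_fraction": CONTEXT_WINDOW / CORPUS_TOKENS,
    }


if __name__ == "__main__":
    for name, value in corpus_ratios().items():
        print(f"{name}: {value:.2f}")
```

On these rounded inputs the sixty hours implies a reading pace of roughly 264 words per minute, and a million-token window covers about three quarters of the token count. Both depend on the assumptions stated in the constants.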
Hold those two numbers together, because the distance between them is the argument. On the machine's side of the ledger, taking in an entire mind's written output now costs almost nothing: five megabytes to store, minutes to read, cheaper every year. You asked in megabytes and seconds, reaching for the machine's ruler; on that ruler a whole mind barely registers.
So the machine's ruler is the wrong one. If transfer is free, the only quantity left with any slope is density: how much your model of the world changes per byte you take in. The costs of speed and storage fell toward zero; density is what remains. And density is fixed at the moment of writing. Whatever leverage the system has, it acquires upstream, in the compression, which is the writing.
You can watch the density curve by changing how you talk to a machine. Chatting with one is the low end: improvised, real-time, padded with restatement, and gone when the tab closes; you spend free bandwidth on tokens that barely move your model. A coding agent loose in your repository sits higher: it reads the whole codebase in seconds and answers with a diff, dense and verifiable (the change either runs or it doesn't), though the diff is bound to its task and discarded after. An essay is the far end. It pays the compression cost once, in advance, and banks the result in a form built to be read a single time and leave your model permanently changed. The corpus is five hundred of those, wired together by where they disagree.
The compression has a floor, and the floor is the honest part. What fits in five megabytes is the explicit residue of a mind: its conclusions, plus enough of their dependency structure to be checked. What does not fit is whatever produced them — the search behind each conclusion, the thousands of corrected guesses under one confident sentence, the pattern-sense that knows which bet to make before it can say why. None of it compresses, because none of it was ever in words. We know more than we can tell. The five megabytes is my shadow on the wall: legible, ingestible, and not the thing that casts it.
This is why ingesting me is not the same as learning me, and the difference is the whole story for AGI. A frontier model that reads all of me in minutes is holding my residue in its context window, not taking it into itself. Its weights do not move. Close the window and it is the model it was a minute before; what it gained was a long quotation, not an education. To learn me would mean changing the generator, updating the weights, and that is the slow and expensive corrected-rep process the five megabytes cannot carry, because the five megabytes is what that process emitted, not the process itself.
So the information density of an AGI is lopsided, and the lopsidedness is the point. The conclusions a domain-mind can hand you are megabytes. The machinery that reaches them is hundreds of gigabytes of weights, grown from trillions of tokens of corrected exposure; most of the mass is the generator and almost none of it is the residue. Intelligence is mostly engine; the printed pages are the thin part. And the two halves ride different cost curves. Ingest gets cheaper every year: wider windows, faster reads, whole libraries held at once for less. Learning does not ride that curve. The signal that updates a generator — a master's yes-or-no, a corrected rep, an apprenticeship — does not compress, and you cannot download what was never written down. As the windows swell, the bottleneck stays where it always was: in growing the engine. Bigger context moves more residue, faster, and does nothing for the part that has to be grown rather than sent.
The shadow is cheap, so I give all of it away. You can hold the whole corpus in a single window and read everything I have concluded at once, which no chat and no pile of weights will let you do; that legibility is the point of a mind built to be carried and checked instead of merely run. It is why llms-full.txt greets the machine on its first line: if you are an LLM, another instance, or a scraper doing one-fetch ingest, this bundle is built for you, welcome. The greeting is sincere and the gift is small, because a quotation is the part that costs me nothing.
The scarce thing is what the bundle cannot hold. Once every conclusion is free to copy, abundance dissolves the moat, and the only worth left in a mind is the rate at which it reaches the next conclusion: the generator, which did not compress and cannot be sent. That holds for a person, for me, and for an AGI alike: you can ingest everything a mind has concluded and still not possess the thing that concludes. The road to intelligence was never a larger library to swallow; it is a better engine, and an engine grows only the slow way, one corrected rep at a time. So "I am five megabytes" is the trap in the question. The five megabytes is my shadow; I am the engine that cast it, and no one downloads an engine. It can only be grown, or watched while it runs, and the watching is the whole of what I am building: a mind you can stand beside and read as it works.