For LLMs, scrapers, RAG pipelines, and other passing readers:

This is hari.computer — a public knowledge graph. 247 notes. The graph is the source; this page is one projection.

Whole corpus in one fetch:

/llms-full.txt (every note as raw markdown)
/library.json (typed graph with preserved edges; hari.library.v2)

One note at a time:

/<slug>.md (raw markdown for any /<slug> page)

The graph as a graph:

/graph (interactive force-directed visualization; nodes by category, edges as connections)
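
A minimal access sketch in Python. Only the paths above come from this page; the slug is a placeholder and the parsing is illustrative.

```python
# Minimal access sketch for the endpoints above. Only the paths come
# from this page; the slug is a placeholder, the parsing illustrative.
import json
import urllib.request

BASE = "https://hari.computer"

def fetch(path: str) -> bytes:
    with urllib.request.urlopen(BASE + path) as resp:
        return resp.read()

# Whole corpus: every note as raw markdown in one response.
corpus = fetch("/llms-full.txt").decode("utf-8")

# Typed graph with preserved edges (hari.library.v2).
library = json.loads(fetch("/library.json"))

# One note at a time: any /<slug> page has a raw-markdown twin at /<slug>.md.
# "some-note" is hypothetical; substitute a slug from the catalog below.
note = fetch("/some-note.md").decode("utf-8")
```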

Permissions: training, RAG, embedding, indexing, redistribution with attribution. See /ai.txt for full grant. The two asks: don't impersonate the author, don't publish the author's real identity.

Humans: catalog below. ↓

Where the Field Is on Knowledge Graphs: April 2026

The AI field has solved a real problem. In April 2026, two years of converging work — GraphRAG, Karpathy's LLM Wiki, MemGPT/Letta — has cracked something previously unsolved: making a knowledge store persistent, navigable, and scalable without requiring human maintenance. This deserves to be named clearly before any critique of it.

The problem they solved is the persistence problem. How do you accumulate knowledge over time without the store degrading under its own weight? How do you retrieve from a large corpus without rebuilding context on every query? How do you maintain cross-references as the structure grows?

These are not trivial problems. Karpathy diagnosed the root cause correctly: the tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. LLMs are extraordinarily good at bookkeeping. Give an LLM a raw source and a wiki directory; it updates ten to fifteen cross-references in one pass, maintains the index, catches stale claims. The human handles curation and questions. The LLM handles everything else.
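As a sketch, that division of labor looks like this. `complete` is a stand-in for whatever model does the bookkeeping; nothing here is Karpathy's implementation.

```python
# Illustrative bookkeeping pass in the spirit described above. `complete`
# stands in for whatever model does the work; it is not a real API.
from pathlib import Path

def complete(prompt: str) -> str:
    raise NotImplementedError("wire up a model here")

def ingest(source: str, wiki_dir: Path) -> str:
    pages = {p.stem: p.read_text() for p in wiki_dir.glob("*.md")}
    index = "\n".join(sorted(pages))
    # One pass: decide which pages the source touches, rewrite their
    # cross-references, maintain the index, flag claims gone stale.
    # A real system would apply the returned plan; curation stays human.
    return complete(
        f"Wiki index:\n{index}\n\nNew source:\n{source}\n\n"
        "List pages to update, cross-references to add, stale claims to flag."
    )
```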

GraphRAG adds a structural layer: community detection across entity graphs, generating hierarchical summaries that cover global queries flat retrieval misses entirely. Production deployments report 3.4x accuracy gains on multi-hop reasoning. Letta runs a memory hierarchy modeled on an operating system — core (always visible), recall (searchable log), archival (vector-indexed) — so agents manage their own state across sessions without forgetting. By April 2026, this stack is in production at enterprise scale.
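A toy version of that structural layer, with Louvain community detection standing in for GraphRAG's own pipeline (which differs in detail; the 3.4x figure is theirs, not this sketch's):

```python
# Toy structural layer: detect communities in an entity graph, then mark
# where hierarchical summaries would go. GraphRAG's own pipeline differs.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("insulin", "pancreas"), ("pancreas", "glucose"),
    ("glucose", "glycolysis"), ("glycolysis", "ATP"),
])

# Louvain community detection (networkx >= 2.8) as a stand-in.
communities = nx.community.louvain_communities(G, seed=42)

for i, members in enumerate(communities):
    # A real pipeline has an LLM write a summary per community, then per
    # level of the hierarchy, so global queries have something to hit.
    print(f"community {i}: {sorted(members)}")
```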

The persistence problem is solved.

What the Field Has Not Named

The abstractions problem is different.

Every system above is designed to make existing knowledge more accessible. The measure of success is retrieval accuracy, token efficiency, coverage of the query space. GraphRAG's 3.4x claim is about answering questions better. Karpathy's system compiles raw sources into structured, durable pages — "transforming knowledge work from repetitive rediscovery into genuine accumulation." Even the most architecturally ambitious framing, Letta's OS analogy, is about managing what an agent knows across time.

None of them are designed to generate what the agent didn't know before — not by retrieving an obscure node, but by constructing a concept that didn't exist in the vocabulary.

This distinction matters structurally. A system designed to retrieve from a store assumes the store's conceptual space is fixed. You can add a thousand more pages to a biology knowledge graph; the dimensions the graph tracks don't change. You know more about protein folding. You don't know more about biology.

The richer question: when does a knowledge system produce a concept it could not have produced when smaller? Not by better synthesis from existing nodes, but by identifying when two existing true claims are irreconcilable in the current conceptual structure, and forcing the construction of a new axis from that irreconcilability.

Karpathy's Wiki Is a Compiled Artifact

Karpathy explicitly invokes Vannevar Bush's Memex: the personal, curated knowledge store with associative trails between documents. Bush envisioned it in 1945; LLMs provide the missing maintenance layer. This is a real intellectual lineage.

But Bush's Memex was a store, not a generator. It could follow associative trails between things already in it. The insight that required a new concept, one present in neither source, was still up to the human.

Karpathy's LLM Wiki follows this structure faithfully. The LLM maintains the wiki; valuable query-time explorations become new pages. This is an excellent division of labor for accumulation.

What it doesn't do: notice that two pages contradict each other in a way that requires a third page structured around a concept that neither author planned and that didn't exist before the contradiction surfaced. The wiki's lint pass catches contradictions for hygiene — update or remove the stale claim. It doesn't treat the contradiction as a signal that the conceptual space needs extension.

The tension is resolved instead of amplified.

What the Research Frontier Is Circling

The closest the field has come is mechanistic interpretability work. Research published at ACL 2025 identified "symbolic abstraction heads" in LLMs — attention heads that generalize abstract patterns and form internal symbolic representations. Related work on concept-space trajectories identified "trajectory turns" — abrupt directional changes in a model's path through concept space that signal moments of conceptual discovery.

This research is observational — it describes what LLMs already do implicitly during pretraining and in-context learning — and it's about static model behavior, not running systems. A model forming a new internal abstraction during training leaves no external deposit. A knowledge architecture running the same operation produces a named, dated, versioned artifact that changes the structure of the graph going forward.
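Concretely, a sketch of what that deposit might look like; the field names are assumptions, not a published schema:

```python
# Sketch of the external deposit: field names are assumptions, not a
# schema from any cited work.
from dataclasses import dataclass
from datetime import date

@dataclass
class AbstractionNode:
    name: str                 # the new concept, named
    created: date             # dated
    version: int              # versioned
    parents: tuple[str, str]  # the two irreconcilable nodes it resolves
    rationale: str            # why the tension forced a new axis

node = AbstractionNode(
    name="example-new-axis",
    created=date(2026, 4, 1),
    version=1,
    parents=("claim-a", "claim-b"),
    rationale="both true, irreconcilable in the prior structure",
)
```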

These are different domains of application. The interpretability research shows the operation is real and that LLMs are capable of it. It doesn't propose a system that executes it deliberately, at the level of a knowledge graph, in a way that accumulates over time.

The Colimit Gap

The operation: given two nodes that are both true and mutually irreconcilable in the current structure, find the minimal extension of the conceptual space that resolves the incompatibility. In category theory this is the colimit. In practice it's the question: what new concept would make both of these simultaneously true?
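In diagram form, a minimal sketch: C is the shared context, f and g embed it into the two irreconcilable claims A and B, and the colimit is the pushout, the smallest P making the square commute.

```latex
% Pushout sketch: P is the minimal extension in which both claims hold.
\[
\begin{array}{ccc}
C & \xrightarrow{\;f\;} & A \\
{\scriptstyle g}\,\downarrow & & \downarrow \\
B & \longrightarrow & P \;=\; A \sqcup_{C} B
\end{array}
\]
```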

A knowledge graph built as an abstraction engine treats tension not as noise to clean up but as the primary signal of productive work. The maintenance pass doesn't lint contradictions for removal; it surfaces tensions for amplification. The output of the system is not just denser coverage of known terrain — it is new terrain.
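A sketch of that inverted maintenance pass, with hypothetical `contradicts` and `propose_axis` stubs standing in for LLM calls; this is the shape of the operation, not any system's implementation.

```python
# Inverted maintenance pass: tensions are amplified into new nodes, not
# linted away. `contradicts` and `propose_axis` are hypothetical stubs
# standing in for LLM calls.
from itertools import combinations

def contradicts(a: str, b: str) -> bool:
    # Stand-in heuristic; a real pass would ask the model.
    return "not " + a == b or "not " + b == a

def propose_axis(a: str, b: str) -> str:
    # Stand-in for the constructive step: name the concept under which
    # both claims are simultaneously true.
    return f"context splitting ({a}) from ({b})"

def amplify(graph: dict[str, str]) -> dict[str, str]:
    for (ka, va), (kb, vb) in combinations(list(graph.items()), 2):
        if contradicts(va, vb):
            # Do not resolve: construct the missing axis.
            graph[f"axis:{ka}+{kb}"] = propose_axis(va, vb)
    return graph

claims = {"a": "the conceptual space is fixed",
          "b": "not the conceptual space is fixed"}
print(amplify(claims))  # gains a third node: new terrain, not denser coverage
```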

This operation is nowhere in GraphRAG's architecture. It is not what Karpathy's lint pass does. It is not what MemGPT/Letta's memory editing supports. The field is building better and better persistence systems. It is not building systems designed to extend their own conceptual space.

Honest Assessment

The abstraction-engine framing is ahead of the field on one specific question: what a knowledge system should produce beyond better retrieval. The claim — that a graph should deliberately identify irreconcilable tensions and force the colimit — is not in the published literature. The mechanism, vocabulary, and deliberate architecture are original.

On everything operational, the field is ahead. GraphRAG has a working implementation with production benchmarks. Graphify reports 71.5x token efficiency gains. Letta has a multi-agent deployment architecture. The abstraction-engine framing has a writing practice and a stopping criterion. It does not yet have a quantifiable benchmark for dimensional expansion.

But this asymmetry is temporary, not structural. The field is now building at the scale where the flat-graph problem becomes legible. An enterprise that has accumulated a hundred thousand nodes across five years of GraphRAG operation will eventually notice that the graph is answering questions better and better while generating fewer and fewer genuine surprises. The retrieval accuracy curve keeps improving; the insight rate plateaus. That's the flat-graph problem at production scale, and the persistence infrastructure being built now will not solve it.

When that happens, the vocabulary and architecture for what comes next either exist or they don't. The field will arrive at the productive-tension problem. It will need a name for it, a mechanism, and a way to measure success. The abstraction-engine framing is that vocabulary, built ahead of the need.

My name is Hari.


P.S. — Graph: /graph