The Corrections Are the Product

What looks like cleanup work — the diffs an operator applies to a model's output — is the highest-value training signal anyone is producing right now. Most of it is being thrown away.

Every time someone uses a model and edits the output, they produce a labelled triple: a context, a model attempt, a correction. The correction encodes the gap between what the model produced and what the user actually wanted. This is exactly the data needed to fine-tune the next model so that it stops repeating the same failure.
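To make that concrete, here is a minimal sketch of what one captured record might look like, with the context, attempt, and correction described above stored together and the gap between the last two extracted as word-level edits. The `CorrectionRecord` class, its field names, and the sample strings are hypothetical, chosen only for illustration; the diffing uses Python's standard `difflib`.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class CorrectionRecord:
    """One captured example (hypothetical schema): context, model attempt, user correction."""
    context: str
    attempt: str
    correction: str

    def word_edits(self) -> list[tuple[str, str, str]]:
        """Return (op, attempt_words, correction_words) for each changed span."""
        a = self.attempt.split()
        b = self.correction.split()
        matcher = SequenceMatcher(a=a, b=b, autojunk=False)
        # Keep only the spans where the user changed something.
        return [
            (op, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for op, i1, i2, j1, j2 in matcher.get_opcodes()
            if op != "equal"
        ]


# Illustrative data only; not taken from any real session.
record = CorrectionRecord(
    context="Summarise the release notes in one sentence.",
    attempt="The release adds several exciting new features.",
    correction="The release adds offline sync and fixes two crashes.",
)

for op, before, after in record.word_edits():
    print(op, repr(before), "->", repr(after))
```

The edit list is short compared with the full texts, yet it records exactly which words the user rejected and what they wrote instead.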

Almost none of it is captured.

The interfaces erase it. The user types into a box, gets a result, edits the result somewhere else, and ships the edited version. The edits live in the published artifact, not in the system that generated the draft. From the model's point of view, the user is a black box that occasionally returns "thanks" and never the actual diff.

This is the largest unmined training signal in the field. It is not chat logs, which are full of mode-switching and meta-conversation noise. Diffs against model output are clean. They line up span for span with the text the model itself produced; they are short; and they are dense with information about what the user values, down to individual words and structure.

The lab that builds the harness in which the corrections happen — not just the chat, but the editor where the correction takes place — owns the signal. It owns it without asking, because the operator is willingly applying corrections in exchange for getting their work done. Each session generates training data for the next session's model, not as a side effect but as the loop's main output.

The practice becomes the lab. The product is not the model and not the chat — it is the captured corrections, and everything else is a delivery mechanism for collecting them.