Compression as Understanding
Understanding a thing means being able to compress it without losing the parts that matter. The quality of your compression is the quality of your understanding — and the test runs in both directions.
If you can describe a system in fewer words without breaking the parts that have to keep working, you understand it. If you can't, you don't — yet. The size of the description is a proxy for how much of the system has been folded into structure rather than enumerated as cases.
This is not the same as brevity. Brevity is a stylistic choice; compression is a structural one. A compressed description is one in which the underlying regularities are visible enough that the cases can be regenerated from them. A short description that lists three examples and stops is not compressed — it is just incomplete.
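A small sketch may make the difference concrete. It assumes a toy "system" chosen purely for illustration, the Gregorian leap-year rule. The names `is_leap`, `regenerate`, and `THREE_EXAMPLES` are hypothetical, not taken from any source.

```python
def is_leap(year: int) -> bool:
    """Compressed description: the regularity itself, stated once."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Brief but not compressed: three examples, with nothing to regenerate from.
THREE_EXAMPLES = {2016, 2020, 2024}


def regenerate(start: int, stop: int) -> set[int]:
    """Rebuild the individual cases in [start, stop) from the rule alone."""
    return {year for year in range(start, stop) if is_leap(year)}


# The rule regenerates cases nobody wrote down, including the 1900 exception.
print(sorted(regenerate(1890, 1910)))  # [1892, 1896, 1904, 1908]

# The short list is silent about any year it does not happen to contain.
print(1904 in THREE_EXAMPLES)  # False
```

Both descriptions are short. Only the first carries enough structure to reproduce the cases it does not list.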
The test runs both ways. First, prediction: given the compressed form, can you predict cases you have not seen? If yes, you have understanding. Second, portability: can someone who reads it explain it to a third party without going back to the source? If yes, the understanding is portable. If not, you have a private model that happens to be terse, which is worse than a long public one.
LLMs make the distinction sharper, not softer. A model can produce fluent text that looks compressed and is not: it reproduces the surface of every example it has seen but breaks on the next case. The test is the same as it was before. Take the compressed form, run it forward against unseen data, and see if it survives.
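As a hedged illustration of running a description forward, the sketch below checks two candidate rules against cases they were not built from. The leap-year domain and every name in it (`survives`, `looks_compressed`, `compressed`, `SEEN`, `UNSEEN`) are assumptions made for the example.

```python
from typing import Callable, Dict

Rule = Callable[[int], bool]


def survives(candidate: Rule, cases: Dict[int, bool]) -> bool:
    """Return True if the candidate agrees with every case in `cases`."""
    return all(candidate(year) == expected for year, expected in cases.items())


def looks_compressed(year: int) -> bool:
    # Fits every seen example, but misses the century exception.
    return year % 4 == 0


def compressed(year: int) -> bool:
    # Captures the full regularity (Gregorian leap-year rule).
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Cases the descriptions were written from.
SEEN = {2016: True, 2019: False, 2020: True, 2023: False}
# Cases held back until the descriptions were finished.
UNSEEN = {1900: False, 2000: True, 2100: False}

print(survives(looks_compressed, SEEN))    # True
print(survives(looks_compressed, UNSEEN))  # False
print(survives(compressed, UNSEEN))        # True
```

Both candidates look equally good on the seen cases. Only the held-out cases separate the one that captured the structure from the one that matched the surface.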
The practice falls out of the test. Write the compressed version first. Use the original cases only to check it. If the compressed version requires footnotes the size of the original, you have not finished compressing; you have just hidden the original behind a longer name.