The next AI inequality is not access to a better chatbot. It is access to more continuous machine attention.
Typed chat made compute feel like a meter. A person asks, the model answers, and richer users buy lower latency, longer context, or a smarter endpoint. Persistent multimodal AI changes the unit. The system watches the screen, listens to the room, remembers the last thousand interactions, monitors the calendar, reads the camera feed, runs background simulations, and acts before the human has converted the situation into a prompt.
At that point compute is not a faster answer. It is a wider sensory field.
This is why ambient AI makes inequality sharper than the chatbot era implied. A rich user can keep more of his environment under machine-readable supervision. He can run more agents in parallel, preserve more private context locally, retry more options, and afford the energy bill for always-on cognition. The poor user gets episodic intelligence. The rich user gets continuous cognition.
That distinction updates the-two-exponentials. The diffusion curve is not only a question of "who has adopted AI?" It also splits by how much of the new interface a user can actually run. Everyone may eventually get a capable model. Not everyone gets persistent video, local memory, private inference, and enough background compute to turn daily life into a live optimization surface.
It also updates input-as-ceiling-b. Raising the input ceiling raises the compute floor. Text was cheap because it compressed the world before the model saw it. Audio, video, screen state, cameras, robots, and continuous monitoring remove that compression. The model receives more of reality, but reality is expensive to read.
The counterforce is real. Open models, edge chips, and cloud competition will push the cost of much of this downward. The baseline agent will get cheap. Phone-scale perception will improve. Local models will handle more private work. The mass market will not be locked out of AI in the simple way the phrase "compute inequality" can imply.
But the upper tier stays positional because each increment of compute buys more of life's surface area inside the loop. More streams watched. More memories retained. More options simulated. More agents working in parallel. More retries before a human notices the failure. More privacy because the sensitive context can stay local. The high-end user does not merely receive the same agent sooner. He gets more simultaneous agents over more private context with less waiting and more attempts.
The class line is therefore not "AI users versus non-users." It is episodic intelligence versus continuous cognition. The first asks questions. The second surrounds the user with machine attention.
The scarce good is not only intelligence. It is how much of the world can be brought inside the loop.