The Rise of the High-Throughput Operator
When intelligence is no longer scarce, the real risk is not inefficiency, but underutilization.
For most of modern knowledge work, the defining anxiety has been simple and persistent: am I doing enough? Enough hours, enough output, enough visible effort to justify my role and my compensation. Performance was measured in activity, and productivity was largely a function of how effectively human effort could be applied to a problem. But what changes when effort is no longer the constraint? When intelligence itself becomes elastic, abundant, and on demand, the question shifts. The rise of the token economy is often treated as a technical or financial detail, but it is something more revealing. It is emerging as a new measure of productivity, not in terms of effort, but in terms of leverage.
The early signals are striking. Some of the most sophisticated practitioners now worry less about cost discipline and more about underutilization. Andrej Karpathy has described feeling “nervous” when he does not fully exhaust his AI token allocation, treating unused capacity as lost opportunity rather than efficiency. Nvidia CEO Jensen Huang is even more explicit: “If that $500,000 engineer did not consume at least $250,000 worth of tokens, I am going to be deeply alarmed.” Failing to deploy AI is no longer prudence. It is underperformance. The benchmark is shifting from how much work an individual completes to how much intelligence they can bring to bear.
This shift is best understood as a change in constraints. For decades, the bottleneck in knowledge work was human effort. Organizations were built to allocate tasks, coordinate people, and extract efficiency from limited time and attention. Generative AI introduces a different dynamic. Intelligence, once scarce and tightly coupled to individuals, becomes fluid and scalable. The limiting factor moves: it is no longer what the system can do, but how effectively humans can direct it. On the No Priors podcast, Karpathy pointed out that the primary constraint in engineering work is no longer compute capacity. “It’s not about flops… it’s about tokens. What is your token throughput and what token throughput do you command?” Performance is no longer defined by effort, but by the ability to direct large flows of machine intelligence toward meaningful outcomes. The implication is clear. If you cannot do this effectively, you become the constraint.
In practice, this is already reshaping how work gets done. Tasks that once defined expertise are increasingly delegated to AI systems, while humans focus on structuring problems, distributing work across multiple agents, and integrating results. The role begins to resemble orchestration more than execution. Instead of writing code, drafting documents, or performing analysis step by step, individuals manage flows of machine-generated output across several tools at once, intervening at key moments to guide direction and ensure coherence. Less like a worker, and more like a system designer.
This is the emergence of a new archetype of performance: the high-throughput operator. Not the person who knows the most or works the hardest, but the one who can effectively coordinate the largest volume of intelligence. Their advantage lies in how they frame problems, how they allocate tasks between human and machine, and how they maintain quality across an expanding surface area of output. They treat AI not as a tool to be occasionally consulted, but as always-on cognitive infrastructure. Their contribution is not measured in tasks completed, but in systems directed.
In this environment, expertise does not disappear, but it changes shape. Knowledge becomes a multiplier rather than a primary source of value. The critical skill is judgment: knowing how to break problems into machine-executable components, how to design workflows that produce useful results, and how to evaluate those results before errors compound. This is where cognitive leverage becomes the defining concept. Cognitive leverage is the ability to generate disproportionate value from a relatively small amount of human input. It is the difference between doing more and making more happen. A highly leveraged individual can take a complex objective, distribute the work across a network of AI systems, and recombine the outputs into something coherent and valuable. Tokens enable this process, but they do not determine its effectiveness. That depends on how well the system is designed and governed.
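To make that pattern concrete, here is a minimal sketch of the fan-out-and-evaluate loop described above, written in Python. It is illustrative only: the run_agent, passes_review, and orchestrate functions are hypothetical stand-ins for whatever model API, evaluation step, and integration logic a team actually uses, not a reference to any particular product or framework.

```python
import asyncio

# Hypothetical stand-in for a call to an AI agent or model endpoint.
# A real system would send a prompt to a model API here.
async def run_agent(subtask: str) -> str:
    await asyncio.sleep(0.1)  # simulate model latency
    return f"draft result for: {subtask}"

# Hypothetical quality gate: the point where human judgment (or an
# automated evaluator) catches errors before they compound downstream.
def passes_review(result: str) -> bool:
    return len(result) > 0

async def orchestrate(objective: str, subtasks: list[str]) -> str:
    # Fan out: distribute machine-executable components across agents in parallel.
    drafts = await asyncio.gather(*(run_agent(t) for t in subtasks))

    # Evaluate: intervene before flawed output propagates further.
    accepted = [d for d in drafts if passes_review(d)]
    rejected = len(drafts) - len(accepted)
    if rejected:
        print(f"{rejected} subtask(s) need rework or human attention")

    # Recombine: integrate accepted outputs into one coherent deliverable.
    return f"{objective}\n" + "\n".join(f"- {d}" for d in accepted)

if __name__ == "__main__":
    plan = ["survey prior art", "draft architecture options", "estimate costs"]
    print(asyncio.run(orchestrate("Proposal for migrating the billing system", plan)))
```

In this sketch, the human contribution lives almost entirely in the decomposition of the objective and in the review gate, not in producing the drafts themselves, which is precisely where the essay locates cognitive leverage.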
This introduces a familiar tension. Tokens are both a cost and a capability. The instinct to minimize usage is understandable, but it risks constraining the very resource that drives productivity. History suggests that organizations that expand into new forms of abundance outperform those that optimize too early. Electrification created advantage not because power was cheap, but because it enabled entirely new ways of organizing production. Cloud computing followed the same pattern. It won not on cost efficiency alone, but on the ability to experiment and scale. The same logic now applies to intelligence. The question is not how much is consumed, but how effectively it is deployed.
At the same time, the labor market is beginning to adjust. The routine, structured tasks that once defined entry-level roles are among the first to be automated, reducing demand for junior positions while increasing the premium on those who can operate at a higher level of abstraction. This creates a subtle but important shift. The pathway to expertise, historically built on repetition and incremental skill acquisition, is narrowing just as the need for high-quality judgment expands. Without a deliberate approach to talent development, organizations may find themselves with more intelligence than they can direct, but fewer people capable of directing it.
As models improve and costs decline, the constraint will move again. Access to tokens will matter less. The scarce resource will be judgment, the ability to ask better questions, structure problems, and intervene at the right moments. In that world, performance is no longer about what you produce, but what you can direct. Leverage becomes the defining metric.
The implication is stark. When intelligence is abundant, underutilization becomes the new form of inefficiency. Not using what is available is no longer a sign of discipline, but of misalignment. The organizations that struggle will not be those that lack access to AI, but those that fail to reorganize around it. And the individuals who fall behind will not be those who lack effort, but those who fail to expand their capacity to direct it.


