The Unaccounted-for Economics of Agentic AI

Uber burned through its entire 2026 AI budget in four months.

Not because the tools didn’t work. Because they worked exactly as they were supposed to, and nobody had worked out what that would actually cost.

By spring 2026, most of Uber’s engineers were using AI agents daily, and the budget meant to last the year was reportedly gone by April. Microsoft hit the same wall from the other direction - it opened AI coding agents to thousands of engineers, usage took off, and it pulled back once the bill caught up with the enthusiasm.

These aren’t stories about AI failing. They’re two organisations that made a big AI decision without a complete cost picture, and found out what was missing when the invoice arrived.

And what was missing wasn’t a number they’d got wrong. It was a kind of cost they’d never had to manage before.

Figure: Uber and Microsoft budget vs AI cost

A New Kind of Cost Discipline

If your organisation is moving toward agentic AI, you are going to have to think, cost-manage and forecast in a way you have not had to before - not because the underlying principle is new, but because you can no longer afford to leave it to the engineers.

The principle itself is old. Anyone who has optimised a heavy database query knows it: two queries can return the same answer, and one can cost many times more than the other, purely because of how it was built. Bad design costs money every time it runs. What’s changed is that most organisations could get away with ignoring that, because the absolute cost was small enough to absorb. It was a rounding error. It stayed an engineering concern and never reached the budget conversation.

Agentic AI removes that slack. It is metered, and it runs at volume, continuously. A chatbot answers a question and stops. An agent plans, calls tools, checks its own work, handles failures, retries, and loops until the task is done. Every step is a separate, billable call to the model, and each call typically resends the whole accumulated context.
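To see why resending the context matters, here is a minimal Python sketch of the arithmetic. The function name and every number in it (starting context size, tokens added per step, step count) are illustrative assumptions rather than figures from any vendor, and it ignores output tokens and any prompt caching a provider may offer.

```python
def agent_input_tokens(steps: int, base_context: int, tokens_added_per_step: int) -> int:
    """Total input tokens billed when every step resends the accumulated context."""
    total = 0
    context = base_context
    for _ in range(steps):
        total += context                  # the whole context so far is sent again
        context += tokens_added_per_step  # tool output and model reply are appended
    return total


# Assumed figures: a 2,000-token starting context that grows by 1,000 tokens per step.
single_call = agent_input_tokens(steps=1, base_context=2_000, tokens_added_per_step=1_000)
eight_steps = agent_input_tokens(steps=8, base_context=2_000, tokens_added_per_step=1_000)

print(f"single call: {single_call:,} input tokens")
print(f"eight steps: {eight_steps:,} input tokens ({eight_steps / single_call:.0f}x)")
```

Under these assumptions, eight steps bill 44,000 input tokens against 2,000 for a single call: 22 times as much. The total grows roughly with the square of the step count, not in a straight line.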

Gartner’s March 2026 analysis found agentic AI uses 5 to 30 times more tokens per task than a standard chatbot. EY puts it in money: a simple AI interaction cost around $0.04 in 2023; a complex orchestrated agentic one in 2026, about $1.20 - roughly thirty times more.

And the fast-falling price per token isn’t helping. Total bills are rising regardless, because consumption is growing faster than prices are dropping. The price on the tin goes down; the invoice goes up.
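The arithmetic behind that is simple multiplication. A minimal sketch, with a hypothetical starting bill and made-up multipliers chosen only to illustrate the effect:

```python
def new_bill(old_bill: float, price_multiplier: float, usage_multiplier: float) -> float:
    """Bill after the per-token price and the token volume both change."""
    return old_bill * price_multiplier * usage_multiplier


# Assumed figures: the per-token price halves while token consumption grows fivefold.
before = 100_000.0
after = new_bill(before, price_multiplier=0.5, usage_multiplier=5.0)

print(f"before: ${before:,.0f}  after: ${after:,.0f}")
```

With these assumed numbers, a 50% price cut still leaves the bill two and a half times larger, at $250,000 against $100,000. The bill only falls if price drops faster than usage grows.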

With AI, usage is the cost.

For decades, most technology has been a known quantity in budgeting terms - a licence or a per-seat fee, agreed in advance, where using it more cost nothing. That shaped what organisations told their people: use the tools you’ve paid for as much as possible. With metered AI, that instinct is precisely the wrong one. The discipline of caring how the thing is built, and how much it consumes to do its job, is no longer something leadership can delegate downward and forget.

Figure: A new kind of cost discipline

“Isn’t this just cloud cost management?”

None of this is new.

That’s the objection, and it’s partly right - it’s the right place to start. The discipline itself has felt the shift: in 2026 the FinOps Foundation reported that 98% of its practitioners now manage AI spend, up from 63% a year earlier, and changed its founding mission from advancing “the people who manage the value of cloud” to advancing those who manage “the value of technology.”

But treating token cost as just another line for the cloud-cost team under-reads it - because FinOps manages the bill, and it operates downstream of the thing generating the bill.

No amount of cost-tagging fixes a badly written query. Only rewriting it does.

That’s the whole point. The lever that matters most sits upstream, in how the thing was built - and that’s true of an agentic workflow exactly as it is of a database query. A cost-management function can tell you a workflow cost six figures last month and which team to charge. It cannot tell you it costs that much because the process underneath it has a dozen exception paths the agent loops through. Reducing that isn’t a cost-management exercise. It’s process and design work, a long way upstream of where cost management reaches.

Figure: Cloud cost management and FinOps vs AI process design

Agents Don’t Fix Broken Processes. They Run Them Faster, and Bill You for It.

Here’s what the demos never show: the actual process.

Not the clean version in the diagram. The real one, with the exception handled by email because the system doesn’t support it, the workaround two people know about and nobody wrote down, the data field filled in inconsistently for years because nobody agreed what it means, the approval step that exists because of something that happened in 2019.

An agent doesn’t see any of that as a problem to solve. It sees it as the process, and it runs it - faithfully, at machine speed, reasoning through every exception and interpreting every inconsistency. Every one of those steps is tokens. The mess doesn’t disappear. It gets executed, and the bill reflects every extra step.
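A rough way to put numbers on that is to count model calls for the process in the diagram and for the process as it really runs. The sketch below is hypothetical throughout: the function name, the four-call clean path, and the three exception paths with their frequencies are invented for illustration, not measured from any real workflow.

```python
def total_model_calls(runs: int, clean_path_calls: int,
                      exceptions: list[tuple[int, int]]) -> int:
    """Model calls for a batch of runs.

    `exceptions` holds (runs_affected, extra_calls_per_affected_run) pairs.
    """
    extra = sum(affected * extra_calls for affected, extra_calls in exceptions)
    return runs * clean_path_calls + extra


RUNS = 1_000
CLEAN_PATH_CALLS = 4  # assumed: the process as drawn in the diagram
EXCEPTIONS = [        # assumed: (runs affected per 1,000, extra calls each)
    (300, 6),         # e.g. the case handled by email
    (100, 12),        # e.g. the undocumented workaround
    (50, 20),         # e.g. the inconsistent data field
]

clean = total_model_calls(RUNS, CLEAN_PATH_CALLS, [])
real = total_model_calls(RUNS, CLEAN_PATH_CALLS, EXCEPTIONS)

print(f"clean process: {clean:,} calls")
print(f"real process:  {real:,} calls ({real / clean:.1f}x)")
```

With these made-up figures, the process as it actually runs needs 8,000 model calls per thousand runs against 4,000 for the clean version. Half the bill is the exceptions.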

These are the decisions deferred, the processes never redesigned, the workarounds quietly tolerated for years. We call it organisational debt. It accumulates precisely because dealing with it takes courage and rarely feels convenient - and automating over it doesn’t make it go away. It means an agent is now executing the answer nobody gave, at cost, continuously.

This is often the work we are brought in to do. Organisations come for the system, the integration, or now the agent - and frequently the harder problem sits underneath, in processes that were never quite designed so much as accumulated. An agent makes that choice less forgiving: automate a clean process and it scales; automate a broken one and you have paid to run the problem at speed.

Doing This Well Isn’t a Reason to Wait. It’s a Reason to Plan Properly.

None of this is an argument against agentic AI. The ambition is right, the technology is real, and there’s a genuine advantage in moving early and deliberately. The argument is about how you get there.

The encouraging part is that the upfront work pays for itself twice. A clean, well-understood process is cheaper to automate - fewer loops, fewer tokens burned on confusion - and cheaper to run, every day, for the life of the system.

Performant doesn’t have to mean expensive - but cheap-to-run is something you design in at the start, not something you discover later.

And more expenditure doesn’t necessarily buy a better result. Research on agentic systems has found accuracy often peaks at intermediate cost and then degrades - beyond a point, the extra tokens reflect unproductive exploration rather than deeper reasoning. It matches what we see structuring large data estates and optimising complex queries: the costliest approach is rarely the best one, and much of the expenditure turns out to be effort the system never needed to make.

The Cost You Don’t Control

There’s one more cost that doesn’t appear on any internal spreadsheet (or whatever more modern equivalent you’re using), because it isn’t set internally.

Most organisations aren’t buying tokens directly. They’re buying a SaaS product they already use, which has quietly (or not so quietly!) added AI features. That vendor licenses a model from a provider, which rents computing power from a big cloud company, which depends on a very small number of chip makers underneath. The price you pay for a convenient new feature in a familiar tool sits on top of a supply chain four or five layers deep, and the cost and reliability of every layer depend on the one below it.

Almost nobody maps their supply chains to that depth today - not for AI, and not for plenty of things more critical than AI. It’s been survivable until now. What changes the calculus is that this cost is variable, runs continuously, and is set several suppliers away in contracts you’ll never see. And right now that supply is contended: through 2026, demand for AI compute has run ahead of supply, with providers rationing capacity and prioritising some customers over others. Supply you assumed was limitless turns out to be fought over, several layers upstream of you.

You can’t control the AI supply chain. But you can’t control the weather either, and that has never been a reason not to check the forecast.

Not controlling something is not the same as not managing it. Awareness comes first - and there’s a real discipline to working out where your exposure sits, what your options are, and what to ask your vendors before you’re locked in.

Figure: The AI supply chain and dependencies

The Questions Worth Sitting With

Before committing, the questions worth sitting with:

Do we actually know what this will cost in production, day in and day out - rather than what it cost in the pilot? The two can differ by an order of magnitude, and often only the pilot figure is in the business case.
Is the process we’re about to automate one we genuinely understand and have had the courage to fix - or one we’ve learned to live with, worked around, and quietly deferred?
And who, in our organisation, actually owns the answer to all of this?

These aren’t questions a tool answers or a procurement exercise resolves. They’re questions of understanding - of having thought the problem through before committing to it. That’s the part the market risks skipping.

Plenty of firms can build you an agent. Far fewer will tell you what it will actually cost to run, which of your processes are ready for it, where your exposure sits, and who needs to own it - before a line of anything is built.

The technology is ready. The harder and more valuable question is whether the thinking around it is.

References

Fortson, D. (2026). Could AI rationing be the pin that pops the feared tech bubble? The Sunday Times. https://www.thetimes.com/business-money/technology/article/ai-rationing-tech-bubble
Hagey, K. & Jin, B. (2026). OpenAI Considers Drastic Price Cuts, Anticipating War for Users With Anthropic. The Wall Street Journal. https://www.wsj.com/tech/ai/openai-token-pricing-anthropic
Gartner. (2026). Gartner Predicts the Cost of Inference for Large Language Models Will Fall Significantly by 2030. Press release, 25 March 2026. https://www.gartner.com/en/newsroom
EY. (2026). The hidden economics of agentic AI: understanding token costs. https://www.ey.com/en_us/insights/ai/agentic-ai-token-costs
FinOps Foundation. (2026). The State of FinOps 2026. https://data.finops.org
Fortune. (2026). Microsoft’s AI cost problem: why tokens and agents are blowing past budgets. https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tokens-agents
Kapoor, S. et al. (2026). How Do AI Agents Spend Your Money? Cost-Accuracy Trade-offs in Agentic Systems. https://arxiv.org/abs/2604.22750