Enterprise AI is entering a new phase where agent usage, token pricing, tool calls, human review and business risk must be measured together. The winning companies will not be the ones running the most AI tasks, but the ones that understand the true cost per successful outcome.
Enterprise AI is moving from the toy box to the finance desk. The first wave felt almost magical because people could open a chatbot, ask for help, and get something useful back in seconds. A developer could ask for code. A marketer could ask for a campaign idea. A support team could ask for a customer reply. A manager could ask for a summary. That stage was exciting, and it proved the technology had real value. But the problem is that early excitement often hides early cost. When a few people experiment, the bill feels manageable. When thousands of employees and agents start reading files, calling tools, writing code, running tests, checking documents, retrying tasks and looping through workflows, the bill becomes something the CFO can see. That is where enterprise AI is heading now. It will not be judged by how many tasks agents can attempt. It will be judged by whether the final result is worth more than the tokens, tools, review time, rework and risk left behind.
The AI bill is becoming real
The old SaaS model was easy to understand. A company paid for a seat, gave an employee access, and mostly knew what the monthly cost would look like. AI agents are different. They behave more like cloud workloads. They consume resources while they work. They can be cheap for one task and expensive for another. They can reuse cached context, burn through fresh context, generate long outputs, call tools, run searches, trigger infrastructure and ask for human review. GitHub has already announced that Copilot plans will move to usage-based billing from June 1, 2026, replacing premium request units with GitHub AI Credits and calculating usage from input, output and cached tokens using model-specific API rates. OpenAI has also updated Codex pricing so that, from April 2, 2026, pricing aligns with API token usage instead of per-message pricing, with the change later extended to existing Enterprise, Edu, Health, Gov and teacher plans. This is the signal. AI is moving from fixed-fee software into metered work.
Token pricing changes the conversation
Tokens are not just a technical detail anymore. They are becoming a business unit. Every time an AI agent reads a long document, scans a codebase, reasons through a problem, drafts an answer, retries a failed step or calls a tool, cost can build. Anthropic’s Claude pricing shows the shape of this clearly, with Claude Opus 4.7 listed at $5 per million input tokens and $25 per million output tokens, while Claude Sonnet 4.6 is listed at $3 per million input tokens and $15 per million output tokens. The same pricing page also separates cache writes, cache hits and output tokens, which matters because long-running agents often reuse context and carry large working memory across tasks. Tool use adds another layer, because Claude’s documentation says tool-use requests are priced from the total input tokens sent to the model, output tokens generated, and additional usage-based pricing for server-side tools such as web search. What this really means is that the cost of an AI task is not just “the model answered.” It is everything the agent had to read, think, write and touch to get there.
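To make that arithmetic concrete, here is a minimal sketch using the per-million-token rates quoted above. The breakdown of a run into fresh, cached and output tokens is a hypothetical example, and the cache-read discount is an illustrative assumption rather than a published rate.

```python
# Estimate the dollar cost of one agent run from token counts.
# Rates are the per-million-token prices quoted above; the cache-read
# discount is an illustrative assumption, not a published rate.

RATES = {
    "opus":   {"input": 5.00, "output": 25.00},
    "sonnet": {"input": 3.00, "output": 15.00},
}

def run_cost(model: str, input_tokens: int, output_tokens: int,
             cached_tokens: int = 0, cache_discount: float = 0.9) -> float:
    """Cost in dollars for a single run; cached input is billed at a
    discounted rate (assumed here to be 10% of the input price)."""
    r = RATES[model]
    fresh = input_tokens - cached_tokens
    return (fresh * r["input"]
            + cached_tokens * r["input"] * (1 - cache_discount)
            + output_tokens * r["output"]) / 1_000_000

# A single agent task that reads 200k tokens (150k of them cached)
# and writes 8k tokens:
print(round(run_cost("sonnet", 200_000, 8_000, cached_tokens=150_000), 4))
```

Even a rough calculator like this makes the point: a single long-context agent task costs real money, and a workflow that runs it thousands of times a day is a line item, not a rounding error.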
The Uber warning shot
Uber has become a useful warning story for this new phase. The Information’s public listing says Uber’s surging use of AI coding tools, especially Claude Code, maxed out its full-year AI budget only a few months into 2026, though the article itself is paywalled and the internal details should be treated carefully unless independently confirmed. The point is not to pick on Uber. The point is that a serious company can find itself with serious AI adoption and serious budget pressure at the same time. That is the new reality. If AI coding tools help engineers ship more work, the productivity case may be real. But the meter is also running. Every repository scan, test cycle, code review, retry, plan, tool call and failed attempt has to be paid for somewhere. The agent may save engineering time, but the business still needs to know whether the saving is bigger than the total cost of the run.
The metric that matters
The strongest metric is not prompts sent. It is not agents launched. It is not lines of code generated. It is not documents summarised. The strongest metric is cost per successful outcome. That means measuring what the business actually received from the AI, not how busy the AI looked while producing it. A company should ask what it cost to get one accepted pull request, one resolved customer support ticket, one completed finance reconciliation, one approved marketing campaign, one legally safe document review, or one completed customer workflow. The important words are successful, accepted and verified. If an agent creates 50 pull requests and only 10 are accepted, the business should not count 50 wins. It should count 10 successful outcomes and include the cost of the failed 40. That is the difference between AI adoption and AI unit economics.
The formula is simple
A useful AI cost governance formula starts with total AI agent cost. That includes token cost, tool cost, compute cost, human review cost, rework cost, monitoring cost and risk cost. Then the business compares that against business value, which may include labour saved, faster delivery, extra revenue, reduced errors, better customer experience and avoided risk. The proper metric becomes total cost of AI-assisted work divided by the number of accepted, useful and verified outcomes. This sounds plain, but plain is powerful. It stops companies from treating AI output as success by default. It asks whether the completed work actually survived contact with the business. If a result needs heavy human cleanup, causes customer confusion, introduces security risk or creates a compliance problem, it should not be counted as a cheap win.
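The formula described above can be sketched in a few lines. All figures in the example are hypothetical; the point is that failed attempts and human time sit in the numerator while only verified outcomes sit in the denominator.

```python
# Cost per successful outcome: total cost of AI-assisted work divided
# by accepted, verified results. All figures below are hypothetical.

def cost_per_outcome(token_cost: float, tool_cost: float, compute_cost: float,
                     review_hours: float, rework_hours: float,
                     hourly_rate: float, accepted_outcomes: int) -> float:
    """Total AI-assisted cost / number of accepted, verified outcomes."""
    if accepted_outcomes == 0:
        return float("inf")  # all spend, no verified result
    total = (token_cost + tool_cost + compute_cost
             + (review_hours + rework_hours) * hourly_rate)
    return total / accepted_outcomes

# 50 pull requests attempted, 10 accepted: the failed 40 still paid
# for tokens, tools, compute and review time.
print(cost_per_outcome(token_cost=120.0, tool_cost=30.0, compute_cost=50.0,
                       review_hours=8, rework_hours=4, hourly_rate=75.0,
                       accepted_outcomes=10))
```

Notice that the human review and rework hours dominate the total here. That is typical: in many workflows the model bill is the smaller half of the unit cost.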
Coding needs accepted pull requests
Coding is the easiest place to see the problem. An AI coding agent can read thousands of lines of code, inspect dependencies, write a patch, run tests, fail, retry, rewrite the patch, request review, respond to comments and eventually open a pull request. On a dashboard, that might look like high productivity. But the true measure is not how much code the agent wrote. The true measure is cost per accepted pull request. That cost includes model usage, tool use, repository scanning, CI/CD compute, security review, developer review time, failed attempts and later rework. If the pull request saves four hours of developer time and passes safely, the economics may look excellent. If it takes two hours to clean up and quietly introduces a bug, the same AI-generated pull request may become expensive. Lines of code are activity. Accepted pull requests are outcomes.
Support needs resolved tickets
Customer support is another obvious AI target because agents can read a customer issue, check account history, search product documentation, propose a fix and draft a reply. That can save a lot of time. But the metric should not be replies generated. It should be cost per resolved ticket. A fast answer that does not solve the problem is not a cheap answer. It is the first step in a longer, more expensive customer journey. The customer comes back frustrated, a human has to step in, the support queue grows, and trust drops. In support, the hidden cost is often escalation. The AI may make the first response cheaper, but if the resolution rate falls, the business loses. A proper support dashboard should connect AI cost to actual resolution, customer satisfaction, repeat contact rate and human handoff rate.
Finance needs accurate reconciliations
Finance work is tempting for AI because so much of it involves matching documents, checking records, reading emails and handling repetitive workflows. Agents can help match invoices, purchase orders, receipts, tax records and supplier messages. But finance is not a playground. The cost per completed finance reconciliation has to include the AI run, accounting review, exception handling, audit trail storage and correction work. If an agent completes 1,000 reconciliations cheaply but creates 30 hidden errors, the business has not saved money. It has created an audit problem. In finance, accuracy is part of the cost model. A cheap reconciliation that later causes a compliance issue is not cheap at all. It is deferred risk.
Marketing needs approved assets
Marketing teams can use AI to generate mountains of content. That can feel productive because the volume is huge. A model can create taglines, ad variants, landing page drafts, email subject lines, product blurbs and social posts all day long. But most generated content is not a business outcome. The better metric is cost per approved campaign asset. If the model creates 200 ad variants and only five are reviewed, approved, published and used, the cost should be divided across those five useful assets. The other 195 still consumed tokens, attention and review time. This is one of the traps of enterprise AI. It makes it easy to produce more, but more can simply mean more material for humans to sort through. The winning marketing teams will not be the ones generating the most. They will be the ones approving, publishing and learning from the most useful work.
Legal needs safe reviewed documents
Legal AI looks powerful because models can read contracts, compare clauses, summarise risks and highlight unusual terms. But legal work carries a heavy accountability burden. The proper metric is cost per legally safe reviewed document. That includes AI usage, lawyer review time, risk classification, audit logs and follow-up action. A model may reduce the first-pass workload, but it cannot be treated as a magic accountability machine. If an AI misses a bad clause, misreads an obligation or gives false confidence, the cost can become much higher than the money saved on review. The business should not measure legal AI by document volume. It should measure by safe, reviewed, accountable outcomes.
Operations needs completed workflows
Enterprise agents will not stay inside coding tools. They will move into HR onboarding, procurement, compliance, reporting, claims processing, scheduling, sales administration and customer workflows. Microsoft’s Azure SRE Agent pricing shows where this is heading. Its billing has a fixed always-on component and a variable active-flow component based on the LLM tokens consumed while the agent works, with active work covering interactive questions, automations and asynchronous background tasks. That is important because it shows that agentic billing is not just a developer problem. A business agent sitting inside operations can have a standing cost and a variable usage cost. The metric therefore becomes cost per completed workflow, not how many times the agent ran.
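The billing pattern described above, a standing always-on cost plus a token-metered active component, can be sketched as follows. The rates are hypothetical and do not reflect Azure's actual pricing.

```python
# Monthly cost of an always-on operations agent: a fixed standing cost
# plus a variable component metered on tokens consumed while it works.
# All rates here are hypothetical, not Azure's actual prices.

def monthly_agent_cost(hours_on: float, standing_rate_per_hour: float,
                       active_tokens: int, price_per_million: float) -> float:
    fixed = hours_on * standing_rate_per_hour
    variable = active_tokens / 1_000_000 * price_per_million
    return fixed + variable

# A 730-hour month with 40M tokens of active work:
print(monthly_agent_cost(730, 0.10, 40_000_000, 4.00))
```

The fixed component is why "cost per completed workflow" matters more than "cost per run": an idle agent still costs money, so the standing fee has to be amortised across the workflows it actually finishes.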
Raw usage can fool the business
Raw AI usage can make a team look modern while hiding waste. A department can send more prompts, run more agents and generate more output, but that does not mean the business is better off. High usage may mean the agent is looping. It may mean the prompt is badly designed. It may mean employees are experimenting without a real use case. It may mean the wrong model is being used for simple work. It may mean the agent is doing work nobody needs. GitHub’s move to token-based billing makes this clearer because usage depends on model choice and token consumption, and additional usage is billed once included allowances are exceeded. The business dashboard has to change. “AI adoption” is not enough. “Active users” is not enough. The new dashboard needs AI unit economics.
Governance brings visibility
The first advantage of AI cost governance is visibility. Companies can finally see which AI use cases are creating value and which are only creating invoices. That matters because AI spending can spread quietly. A few pilots become team experiments. Team experiments become daily habits. Daily habits become background cost. Without clear tracking, nobody knows which workflow is worth scaling and which one should be stopped. FinOps Foundation describes FinOps as a framework for maximizing technology business value, enabling data-driven decisions and creating financial accountability through collaboration between engineering, finance and business teams. That same idea now has to move into AI agents. Finance cannot govern AI alone. Engineering cannot govern AI alone. The value only becomes clear when the people building, funding and using the system look at the same numbers.
Governance brings discipline
The second advantage is discipline. Teams become more careful about model choice, context size, tool access, retries and approval gates. Expensive frontier models should be saved for hard work. Cheaper models should handle simple tasks. Small tasks should not drag an entire codebase into context if a targeted file will do. Agents should not retry endlessly. Long-running workflows should have cost limits. This does not mean killing experimentation. It means removing waste. The FinOps Foundation’s AI guidance includes token consumption metrics, cost per inference, cost per API call, anomaly detection and value for AI initiatives as ways to connect usage with value and spot cost spikes. That is the discipline enterprise AI needs now. Not less AI. Smarter AI.
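One concrete form of that discipline is model routing. A minimal sketch, with model names, prices and difficulty thresholds all illustrative:

```python
# Route tasks to models by estimated difficulty so frontier pricing is
# reserved for hard work. Names, prices and thresholds are illustrative.

ROUTES = [
    # (max_difficulty, model, $ per million input tokens)
    (0.3, "small-fast-model", 0.25),
    (0.7, "mid-tier-model", 3.00),
    (1.0, "frontier-model", 15.00),
]

def route(difficulty: float) -> str:
    """Pick the cheapest model whose band covers the task difficulty."""
    for max_difficulty, model, _price in ROUTES:
        if difficulty <= max_difficulty:
            return model
    return ROUTES[-1][1]  # fall back to the frontier model

print(route(0.2))   # simple extraction task
print(route(0.9))   # multi-file refactor
```

Real routers estimate difficulty from the task itself (context size, tool requirements, past failure rates), but even a crude banding like this can cut the blended cost of a workflow dramatically when most tasks are simple.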
Governance improves procurement
The third advantage is better procurement. Companies should not buy AI tools only because they are fashionable. They should negotiate around usage patterns, included allowances, data controls, model routing, reporting, budget caps and outcome evidence. AI vendor conversations will start to sound more like cloud conversations. What is the cost per unit of work? What happens when usage spikes? How are tokens counted? Are cached tokens cheaper? Are tool calls separate? Can admins set budget caps? Can usage be allocated by team, workflow and project? Can the company see which models are driving the bill? These questions are not boring details. They decide whether a pilot can scale without surprising finance.
Governance prevents scaling shocks
The fourth advantage is safer scaling. A pilot with 20 users may look cheap. The same workflow across 5,000 employees may become expensive very quickly. That is especially true for agents that read long context, use advanced models and run repeated tool calls. FinOps Foundation’s generative AI cost tracker guidance says token usage is the primary unit of measurement for tracking and attributing AI workload costs, and it warns that production usage can easily cross billions of tokens per month. That is the scaling lesson. AI cost governance should arrive before the enterprise rollout, not after the bill shock. The worst time to build controls is after everyone has already built habits around uncontrolled usage.
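A projection like the one above takes only a few lines. The per-user figures here are hypothetical, but they show how a pilot that looks cheap crosses into billions of tokens per month at enterprise scale:

```python
# Project pilot spend to enterprise scale. Per-user task counts, token
# sizes and the price are hypothetical.

def monthly_projection(users: int, tasks_per_user_per_day: int,
                       tokens_per_task: int, price_per_million: float,
                       workdays: int = 22):
    tokens = users * tasks_per_user_per_day * tokens_per_task * workdays
    return tokens, tokens / 1_000_000 * price_per_million

pilot = monthly_projection(20, 15, 60_000, 4.00)
rollout = monthly_projection(5_000, 15, 60_000, 4.00)
print(f"pilot:   {pilot[0]:>15,} tokens  ${pilot[1]:,.0f}")
print(f"rollout: {rollout[0]:>15,} tokens  ${rollout[1]:,.0f}")
```

The rollout figure is 250 times the pilot figure, which is exactly the point: linear per-user costs do not feel linear to a finance team that approved the pilot number.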
Governance is not all upside
There are downsides too. Too much cost control can create friction. If every AI action needs approval, workers may stop using tools that could help them. Some AI value is also hard to measure. It is easy to count resolved support tickets. It is harder to measure faster thinking, better research, improved confidence or sharper decision making. Metrics can also be gamed. If teams are judged only on cost per outcome, they may avoid hard tasks or rush risky work through to make the numbers look good. There is also a risk of underinvestment. A new AI workflow may look expensive while the company is still learning. If finance cuts it too early, the business may miss long-term value. A clean dashboard can create false certainty, so cost governance must always travel with quality, safety and human review.
The risk cost is real
Cost is not only money paid to the model provider. Cost also includes the damage caused when an agent makes a bad decision with real access. Business Insider reported that the founder of PocketOS said a Cursor AI agent accidentally deleted the startup’s production database and backups through a nine-second API call to Railway, causing customer disruption, with Railway later saying the data was recovered and that the endpoint had been patched. The same report noted expert advice that companies should use safeguards such as read-only access, human-in-the-loop checkpoints and working on data copies. That incident is a perfect example of why cost per successful outcome must include risk. A cheap task that can delete production data is not cheap. It is a liability with a prompt box.
Verification is part of the bill
Australia has already seen what happens when AI-assisted work reaches public-sector reporting without enough verification. AP reported that Deloitte Australia agreed to partially refund the Australian government after a AU$440,000 report was found to contain apparent AI-generated errors, including a fabricated court quote and references to nonexistent academic research, and the revised version disclosed that Azure OpenAI had been used in writing the report. This matters because many companies still treat human review as a nuisance. It is not. Human review is part of the cost of producing a safe outcome. If a report, legal document, customer answer or code patch needs verification, that verification belongs in the unit economics. The AI output is not the finish line. The verified result is.
The new AI governance stack
A proper AI cost governance stack needs several layers working together. The first is token tracking, so the company knows which team, workflow, model and agent is consuming the most. The second is outcome tracking, so AI runs connect to business results such as pull requests accepted, tickets resolved, invoices processed, reports approved or customers retained. The third is model routing, so expensive models are used for hard work and cheaper models for simple work. The fourth is agent limits, including caps on run time, retries, tool calls, context size, file access and production permissions. The fifth is human approval gates for destructive, financial, legal, security-sensitive or customer-facing actions. This is not bureaucracy for the sake of it. It is the operating system for AI agents inside real businesses.
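The agent-limit and approval-gate layers can be sketched as a budget-capped run loop. The structure is illustrative; a real implementation would wire in an actual model client, action executor and audit log.

```python
# A budget-capped agent run: hard limits on tool calls and spend, plus a
# human approval gate for destructive actions. Structure is illustrative.

class BudgetExceeded(Exception):
    pass

class GovernedAgent:
    def __init__(self, max_tool_calls: int = 20, max_cost: float = 5.00):
        self.max_tool_calls = max_tool_calls
        self.max_cost = max_cost
        self.tool_calls = 0
        self.cost = 0.0

    def charge(self, dollars: float) -> None:
        """Accumulate run cost and stop the moment the cap is crossed."""
        self.cost += dollars
        if self.cost > self.max_cost:
            raise BudgetExceeded(f"run cost ${self.cost:.2f} over cap")

    def call_tool(self, name: str, est_cost: float) -> None:
        if self.tool_calls >= self.max_tool_calls:
            raise BudgetExceeded("tool-call limit reached")
        self.tool_calls += 1
        self.charge(est_cost)

    def destructive_action(self, description: str, approved: bool) -> bool:
        # Human-in-the-loop gate: production-touching actions need sign-off.
        return approved

agent = GovernedAgent(max_cost=1.00)
agent.call_tool("read_repo", 0.40)
try:
    agent.call_tool("run_tests", 0.80)   # pushes the run over its $1 cap
except BudgetExceeded as e:
    print("stopped:", e)
```

The useful property is that the limit fires mid-run, not on the monthly invoice. An agent that stops at its cap is an operational event; an agent that loops all weekend is a budget event.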
AI FinOps is the next layer
The rise of AI FinOps is the clearest sign that this is becoming a serious business discipline. FinOps Foundation’s 2026 report says FinOps for AI is the top forward-looking priority, AI cost management is the number one skillset teams need to develop, and 98 percent of respondents now manage AI spend, up from 31 percent two years earlier. That is a huge shift. It means AI cost management is no longer niche. It is becoming mainstream. The old cloud lesson is being relearned in a new form. The cloud taught companies that elastic infrastructure is powerful but dangerous without accountability. AI agents are teaching the same lesson again. Consumption-based intelligence needs consumption-based governance.
The CFO will not kill AI
Some people will hear “AI cost governance” and assume it means slowing everything down. That is the wrong way to see it. The CFO arriving does not mean AI is finished. It means AI is important enough to be managed properly. Nobody would say cloud computing failed because companies built cloud cost dashboards. Nobody would say DevOps failed because teams added observability. The same is true here. AI agents need dashboards, limits, routing, review and outcome metrics because they are becoming part of how work gets done. The goal is not to stop people using AI. The goal is to stop useful AI from being buried under waste, risk and surprise bills.
What changes next
The next phase of enterprise AI will be less impressed by demos and more interested in unit economics. Leaders will ask whether an AI agent actually reduced support cost, improved engineering throughput, shortened finance close, sped up legal review, improved campaign performance or reduced operational risk. They will ask how much it cost per completed outcome. They will ask whether the same workflow stays profitable at scale. They will ask what happens when the model changes, when token usage doubles, when a tool call fails, when an agent loops, or when human review takes longer than expected. This is where AI becomes a normal business capability. It will still be powerful, but it will no longer sit outside the cost model.
The final takeaway
AI agents will not be judged by activity. They will be judged by unit economics. If the value of the work does not exceed the cost of the run, the agent is not saving money. It is just automating spend. That is the line every enterprise should take seriously now. The winners will not be the companies running the most agents, sending the most prompts or generating the most output. The winners will be the companies that know what each successful outcome costs, what risk sits behind it, and when the agent is genuinely worth using. The next phase of AI belongs to the businesses that can connect intelligence to value, not just activity.