AI Is Nearly Free. That’s why we’re making our most expensive mistakes right now

We are living through an exceptional era. The most expensive technology in human history has been handed to us for almost nothing, and that is becoming a problem. When the price tag is invisible, learning doesn’t happen. By the time it becomes visible, catching up will be costly.

Text by Martti Asikainen & Sami Masala, 20.5.2026 | Photo by Adobe Stock Photos

AI is currently like a hotel buffet you can stroll up to whenever you like and eat as much as you can manage. There’s no bill at the end, because you’ve already paid the subscription fee. Your plate groans under all manner of delicacies, some of which you’ll leave untouched, and none of which you’d ever order from an à la carte menu if you were paying by the dish.

With AI, the feeling of something being too good to be true is familiar. The reason is simple: it is too good to be true. Silicon Valley has for some time been talking about a phenomenon known as “tokenmaxxing”, in which managers evaluate their employees’ productivity by the number of AI tokens consumed (Roose 2026; Lorrimar & Smartt 2026). The more you consume, the better. Georgia Tech professor Mark Riedl mused in an interview with The Verge about whether the era of near-free AI is approaching its end (Tangermann 2026).

The answer is far from certain, but the signs are there. At present, using millions of tokens costs a couple of euros at most. Autonomous agents run around the clock at near-zero cost. Prompts can be imprecise, lengthy, and inconsistent — but it doesn’t matter, because costs remain low. This is a strategy that will come to an end. And when it does, the real advantage will most likely belong to those who have learnt to operate efficiently rather than simply consume.

Training a frontier AI model can cost hundreds of millions of dollars. Serving it at scale costs tens of millions per month. And yet using GPT-4-level intelligence costs a fraction of a cent per query, and many services are entirely free. You don’t need to be Paul Krugman or Joel Mokyr to understand that this makes no economic sense, unless you examine the structure of the industry.

Why AI is so cheap right now — and why that will change

AI laboratories are not making a profit at current prices. They operate on a continuous funding-round model, in which each round of venture capital buys the next year of below-cost access, growing the user base and justifying the next round at an even higher valuation. The structure is familiar. Amazon did it with e-commerce. Uber did it with taxis. Spotify did it with music. AI laboratories are doing it with intelligence itself.

The process involves four phases that recur in every major technological disruption. The first is the acquisition phase, the one we are living through now. Laboratories price tokens below their true cost and flood the market with free tiers and cheap APIs. The goal is for developers to build on top of their platform (e.g. AI agents), for businesses to integrate their API, and for ecosystems to form around their model. Lock-in accumulates quietly.

The second is the consolidation phase, in which smaller players are acquired or go bankrupt. Those who survive face institutional investor pressure to demonstrate profitability, which leads to the third phase: repricing. Prices anchor to real costs plus margin, and as a result, free tiers shrink or disappear entirely.

Companies that have integrated AI workflows, or that have built their business around selling AI-assisted services, will suddenly face real cost items they never budgeted for. As token prices multiply, they will be forced to charge their customers considerably more — or abandon the business as unviable.

The fourth is the fundamentals phase, when teams are no longer judged by how quickly they build with AI, but by how cost-efficiently they do so. At this point, token efficiency becomes the competitive advantage that is currently imagined to come from simply using AI and integrating it into existing workflows.

The first signs of this shift are already visible. In April 2026, GitHub announced it would be moving its Copilot development tool to usage-based billing from June onwards, and acknowledged at the same time that a single agentic session can already cost more than an entire month’s subscription fee (GitHub 2026).

At this point, many will want to push back and argue that the trajectory can be contested: hardware costs are falling, model efficiency is improving, and competition may keep prices low for longer than we expect. All of this is true, but current pricing almost certainly does not reflect a stable market equilibrium. Venture-capital-backed below-cost access is a temporary market strategy, and the hardware depends on scarce metals whose supply cannot easily be increased.

Cheapness erodes the very skills it claims to strengthen

Cheap tokens are actively degrading foundational skills. This is an uncomfortable truth that none of the loudest voices in the AI hype machine wants to say aloud. When you can paste an error message into a chat window and receive a working fix in three seconds, you never build the debugging capacity that comes from two hours of tracing errors and staring at the screen.

If you can generate a data processing pipeline with a single prompt, you will rarely wrestle with the architectural trade-offs that make you a better systems designer. And when you receive an analysis in thirty seconds, you never develop the cognitive acuity that comes from working through a complex argument independently (e.g. Kosmyna et al. 2025; Lee et al. 2025; Gerlich 2025).

This isn’t just about programmers. The same phenomenon applies to analysts, specialists, and managers who outsource their thinking before they’ve had the chance to form their own view (Einhorn 2025). And when the tool is almost free, there is no price signal to slow that drift.

A concrete example illustrates the difference. A poorly worded prompt, a vague request with an entire dataset pasted in at once, can consume 15,000 tokens even for a modest task, whereas a well-crafted prompt, with the output precisely defined and data fed in stages, produces the same result in 800 tokens. Taken in isolation, the difference is practically meaningless at current prices, but when scaled to thousands of daily calls, the arithmetic of repricing can blow up into a budget crisis overnight.
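The arithmetic is easy to sketch. The token counts below come from the example above; the call volume, the price per million tokens, and the repricing factor are purely illustrative assumptions, not forecasts.

```python
def monthly_cost(tokens_per_call, calls_per_day, price_per_million, days=30):
    """Monthly spend in euros for a fixed number of tokens per call."""
    return tokens_per_call * calls_per_day * days * price_per_million / 1_000_000


CALLS_PER_DAY = 5_000      # assumption: "thousands of daily calls"
PRICE_TODAY = 2.0          # assumption: euros per million tokens
REPRICING_FACTOR = 10      # assumption: hypothetical price multiplier

for label, tokens in [("vague prompt", 15_000), ("precise prompt", 800)]:
    now = monthly_cost(tokens, CALLS_PER_DAY, PRICE_TODAY)
    later = monthly_cost(tokens, CALLS_PER_DAY, PRICE_TODAY * REPRICING_FACTOR)
    print(f"{label}: {now:,.0f} EUR/month now, {later:,.0f} EUR/month after repricing")
```

Whatever numbers you plug in, the ratio between the two prompts stays the same: the vague one costs almost nineteen times as much. Repricing only decides whether that ratio is a rounding error or a budget line.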

Act before the bill arrives

Of course, the answer isn’t to stop using AI. That would be like refusing to use a calculator because it makes arithmetic too easy. What matters more is that we use this cheap era for learning, not just for building. That is why we recommend treating every prompt as though it were costing you real money.

Rather than asking only what a model is capable of, it is worth taking a risk-based view of what it costs. You cannot optimise what you do not measure, so integrating a token counter into your applications is a sensible first step. With AI integrations, people rarely ask what a workflow will cost when scaled by a factor of one hundred or a thousand, yet that is a question all of us will face within a few years.
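A token counter does not have to be elaborate. The following is a minimal sketch, not a reference implementation: the `TokenMeter` class, the feature names, and the flat price are all hypothetical, and it assumes your AI provider reports input and output token counts for each call.

```python
from dataclasses import dataclass, field


@dataclass
class TokenMeter:
    """Hypothetical in-memory token counter for an application."""
    price_per_million: float                 # assumption: flat price in euros
    calls: list = field(default_factory=list)

    def record(self, feature, input_tokens, output_tokens):
        # Store the total tokens used by one call, tagged by feature.
        self.calls.append((feature, input_tokens + output_tokens))

    def total_tokens(self):
        return sum(tokens for _, tokens in self.calls)

    def by_feature(self):
        # Sum token use per feature to show where the spend goes.
        totals = {}
        for feature, tokens in self.calls:
            totals[feature] = totals.get(feature, 0) + tokens
        return totals

    def cost(self, scale=1):
        # Projected cost if the recorded usage were multiplied by `scale`.
        return self.total_tokens() * scale * self.price_per_million / 1_000_000


meter = TokenMeter(price_per_million=2.0)    # assumption: illustrative price
meter.record("summarise", input_tokens=1_200, output_tokens=300)
meter.record("classify", input_tokens=400, output_tokens=50)

for scale in (1, 100, 1_000):
    print(f"x{scale}: {meter.cost(scale):.4f} EUR")
print(meter.by_feature())
```

Recording usage per feature is a deliberate choice here: it shows not only how much you spend, but which part of the workflow would hurt most if prices changed.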

A good prompt is not just a cost saving, it is good thinking. When you force yourself to articulate the precise task, the precise format, and the precise constraints, you are doing essentially the same work as writing a clean function, whilst simultaneously building a skill that will never become obsolete. The same logic applies to architecture. A developer who understands what happens inside the abstractions designs a system that needs a fraction of the calls. This is not merely a difference of nuance: at the moment of repricing, it will also be a budgetary one.

Even in the age of AI, it is sometimes important to return to first principles. AI is good at concealing our weaknesses, but there will come a point when it becomes clear who has built understanding, and who has built dependency on external actors who can set the price tag on their business. The AI buffet will not last for ever, but what you learn while it does will stay with you.

Authors

Martti Asikainen

Communications Lead
Finnish AI Region
martti.asikainen@haaga-helia.fi

Sami Masala

CEO, Co-founder
AI Think
sami.masala@aithink.fi

References

Einhorn, C. S. 2025. When Working With AI, Act Like a Decision-Maker—Not a Tool-User. Published in Harvard Business Review on 31 October 2025. Harvard Business Publishing. Accessed 11 May 2026.

Gerlich, M. 2025. AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies, 15(1), 6. MDPI.

GitHub. 2026. GitHub Copilot is moving to usage-based billing. The GitHub Blog. Published April 2026. Accessed 5 May 2026.

Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X-H., Beresnitzky, A. V., Braunstein, I. & Maes, P. 2025. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task. arXiv.

Lee, H-P., Sarkar, A., Tankelvitch, L., Drosos, I., Rintel, S., Banks, R. & Wilson, N. 2025. The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25) (Art. 1121, pp. 1–22). ACM.

Lorrimar, V. & Smartt, T. 2026. Silicon Valley’s AI ‘tokenmaxxing’ obsession has a big problem – and philosophers saw it coming. Published in The Conversation on 10 May 2026. Accessed 11 May 2026.

Roose, K. 2026. More! More! More! Tech Workers Max Out Their A.I. Use. Published in The New York Times on 20 March 2026. Accessed 11 May 2026.

Tangermann, V. 2026. The Horrible Economics of AI Are Starting to Come Crashing Down. Futurism. Accessed 11 May 2026.
