The AI Pricing Trilemma: Usage, Credits, or Subscriptions
Revenue Systems Architecture · May 7, 2026 · 10 min read


Why AI companies can’t just copy the SaaS pricing playbook

The SaaS pricing playbook took about fifteen years to solidify. Companies figured out per-seat models, good-better-best tiering, annual contract incentives, and expansion revenue loops. It was messy and imperfect, but it was a known system. You could benchmark it. You could model it. Finance teams could forecast with reasonable confidence because the cost of serving user number 501 was roughly the same as serving user number 1.

AI broke that assumption. And with it, most of what we thought we knew about software pricing.

The cost of serving each AI customer is variable, unpredictable, and in some cases wildly asymmetric. One user sends a few chat prompts a day. Another user runs complex reasoning tasks, code generation pipelines, and deep research workflows that consume 50x the compute. Under a flat subscription, both pay the same. The first user is a cash cow. The second is a margin disaster. Sam Altman said publicly that OpenAI is losing money on ChatGPT Pro subscriptions at $200/month because “people use it much more than we expected.” When the CEO of the most well-funded AI company on earth admits he personally chose a price point that loses money, something structural has changed about how software gets monetized.

This is the pricing trilemma that every AI company now faces. Three models, each with real strengths, each with real failure modes, and no clean winner.

The subscription trap

Subscriptions feel safe. They produce the recurring revenue that investors drool over, that finance teams can forecast, and that public market analysts use to assign multiples. The entire SaaS valuation framework rests on ARR, and ARR rests on subscriptions.

But flat subscriptions create an ugly dynamic in AI products. The data on this is consistent: 70-80% of token consumption comes from just 10% of users. Those power users can become deeply unprofitable without guardrails. And AI gross margins are already under pressure. Replit, Cursor, and others have reported margins that make traditional SaaS companies wince. There is very little room for further erosion.

The math gets worse as AI models improve. Many people predicted that AI costs would drop 10x every year, which would have let companies sacrifice margin temporarily and win on distribution. That prediction has not held up. Older models get cheaper, yes. But nobody wants yesterday’s model. GPT-5 costs roughly $10 per million output tokens, which is not dramatically cheaper than where GPT-4o sat in early 2024. Meanwhile, token consumption keeps exploding as users trust AI with bigger, more complex tasks. We went from summarizing meeting notes to full code generation and multi-step agentic workflows. The ceiling on per-user cost is not converging. It is diverging.

The companies that have stuck with pure subscriptions are finding this out the hard way. They end up doing one of two things: either they introduce hidden usage caps (which creates customer frustration and trust issues) or they eat the margin hit (which makes the unit economics progressively worse as the product gets better). Neither path is sustainable.

One bright spot: subscription models do create strong upgrade pressure in the first month. We’re seeing AI companies with very high initial upgrade rates because users hit value fast, often within their first few prompts, and realize they need more capacity. But the downstream behavior is messy. Users upgrade when creativity strikes, downgrade when they hit a slow period, and the result is an upgrade-downgrade dance that makes churn metrics unreliable. The subscription model assumes consistent monthly usage. AI usage is inherently spiky.

Usage-based pricing and the anxiety problem

Pure usage-based pricing solves the margin problem. If a user consumes 50x the compute, they pay 50x more. Costs and revenue stay aligned. The economics are clean.

The GTM implications are anything but clean.

Usage-based pricing creates billing anxiety. When every API call, every token, every prompt has a visible cost, users start self-censoring. They hesitate before running that extra query. They second-guess whether the output is worth the credits. This is a real behavioral effect, and it works directly against the product’s goals. AI products want maximum engagement. Usage-based pricing penalizes it.

There is also a sales problem. How do you forecast revenue when you cannot predict customer usage? How does a rep build a pipeline model when the deal value is a range rather than a number? How does procurement approve a budget when costs are inherently unpredictable? Nothing scares procurement more than runaway costs with no visibility, and that fear often kills deals before they start.

The data shows this clearly. As of mid-2025, 54% of AI products are already monetized beyond traditional seat-based subscriptions. But the breakdown is telling: 25% are pure usage-based, 22% are hybrid, and only 7% use outcome-based pricing. The market is experimenting aggressively, but the pure usage model is not winning by default. Most companies that start with pure usage end up layering subscription elements back in, because the revenue volatility and customer anxiety are too severe.

There is a version of usage-based pricing that works better: charging for outputs rather than inputs. Instead of pricing per token consumed, you price per task completed, per research report generated, per code review run. This shifts the frame from “how much compute did I burn” to “how much work got done,” and customers find it much easier to evaluate. But it requires the product to be mature enough to define clear, countable units of value, which many early-stage AI companies cannot do yet.

The credit system: the least bad option

This brings us to credits, the model that has quietly become the default across the AI industry. Microsoft announced AI credits for Copilot in January 2025. Salesforce added flex credits in May. OpenAI replaced seat licenses with pooled credits for Enterprise. Cursor shifted to credit-based pricing (and faced real pushback). Adobe, Apollo, Asana, Atlassian, Clay, HubSpot, Google, Replit, Monday.com, and many others have adopted some version of the model.

Credits are a compromise, and they behave like one. A credit system gives customers a pool of usage that feels relatively straightforward: “You get 500 credits per month.” Vendors, meanwhile, can adjust the cost of different actions in credits without changing the sticker price. A simple chat prompt might cost 1 credit. A complex reasoning task might cost 10. A full code generation run might cost 50. This lets the vendor manage margin at the action level while keeping the customer-facing pricing stable.
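To make the action-level metering concrete, here is a toy sketch in Python. The actions, credit rates, and 500-credit pool mirror the hypothetical numbers above; nothing here is any vendor's real price list:

```python
# Toy sketch of action-level credit metering (all rates hypothetical).
# The vendor tunes CREDIT_COSTS against its own compute costs without
# ever touching the customer-facing sticker price.
CREDIT_COSTS = {
    "chat_prompt": 1,
    "reasoning_task": 10,
    "code_generation": 50,
}

def charge(balance: int, action: str) -> int:
    """Deduct the credit cost of an action; refuse if the pool is empty."""
    cost = CREDIT_COSTS[action]
    if cost > balance:
        raise ValueError("insufficient credits")
    return balance - cost

balance = 500  # "You get 500 credits per month."
for action in ["chat_prompt", "reasoning_task", "code_generation"]:
    balance = charge(balance, action)
print(balance)  # 500 - 1 - 10 - 50 = 439
```

The important design property is that the exchange rate lives in one table the vendor controls, while the customer only ever sees a single balance going down.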

The approach works because it addresses the worst failure modes of both subscriptions and pure usage. It preserves some revenue predictability (customers commit to a credit tier). It aligns costs with usage (heavy users spend more credits). And it gives customers a sense of control and budgetability that raw per-token pricing does not.

But credits have their own problem, and it is not small: cognitive overhead. Users have to learn a new unit of account. They have to understand what a credit is worth, how different actions consume credits at different rates, and how to estimate their monthly usage. This is friction. And in an industry where frictionless onboarding is the whole growth engine, adding a mental math layer to the product experience is a real cost.

The Cursor backlash is instructive. When they shifted to credit-based pricing, users revolted, not because credits are inherently bad, but because the mapping from actions to credit costs felt opaque. Users could not predict what their bill would be. The cognitive overhead exceeded their tolerance. Cursor had to issue a public apology and revise the model. The lesson: credits work when the exchange rate between actions and credits is simple and intuitive. They fail when the credit system becomes its own puzzle.

The hybrid convergence

The model that is actually winning, as of right now, is not any of the three in isolation. It is the hybrid. Subscription tiers that include a credit allowance, with the ability to purchase additional credits on demand.

Hybrid pricing has gone from 27% of the market to 41% in just the past twelve months, and it is still climbing. The pattern we’re seeing most often looks like this: a base subscription that unlocks features and includes a monthly credit allotment, with overage pricing or credit top-up packs for users who need more.

Clay’s model has become the template that everyone references. They offer tiered subscriptions (more features at higher tiers), each including a pool of credits for enrichment and data actions. Unused credits roll over up to 2x the monthly allotment. Instead of a large annual discount, they offer a modest 10% reduction and let customers get credits upfront. This design hits several goals at once: it creates predictable base revenue, it aligns heavy usage with incremental spend, and the rollover provision reduces credit-waste anxiety while creating lock-in.
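The rollover rule is simple enough to sketch in a few lines. The 2x cap matches the policy described above; the plan size is invented:

```python
def next_month_balance(unused: int, allotment: int) -> int:
    """Roll unused credits forward, capped at 2x the monthly allotment.
    (Cap per the rollover rule described above; numbers illustrative.)"""
    return min(unused + allotment, 2 * allotment)

# A user on a 1,000-credit plan who spent only 300 credits this month:
print(next_month_balance(700, 1000))   # 1700
# A very light user can never hoard more than two months' worth:
print(next_month_balance(1500, 1000))  # capped at 2000
```

The cap is what makes rollover safe for the vendor: it reduces credit-waste anxiety without letting dormant accounts accumulate an unbounded liability.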

There is an interesting behavioral insight in the top-up model. Companies that have tested it, Lovable being a public example, found something counterintuitive. Adding a pay-as-you-go option alongside subscriptions did not cannibalize recurring revenue. It actually improved paid retention by 7%. The reason: users who previously oscillated between upgrading and downgrading their subscription now had a release valve. When they needed burst capacity, they bought a credit pack instead of upgrading their plan. When the burst passed, they stayed on their current tier instead of downgrading. The subscription became stickier because the top-up absorbed the variability.

The pricing of those top-ups matters. In A/B testing, a 20% premium over subscription credit rates hit the sweet spot. It was enough to make “subscribe and save” feel like a genuine deal, but not so expensive that users felt punished for going over their allotment. At a 40% premium, adoption of top-ups dropped sharply and engagement did not improve. The framing is important here: this is the Amazon “Subscribe & Save” mental model applied to AI credits, and customers already understand it.
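The "subscribe and save" arithmetic looks roughly like this. The plan price and credit pool are invented; only the 20% premium comes from the testing described above:

```python
def topup_price(in_plan_rate: float, premium: float = 0.20) -> float:
    """Per-credit price for on-demand top-up packs, at a premium over
    the effective per-credit rate inside the subscription."""
    return in_plan_rate * (1 + premium)

# Hypothetical $50/month plan with 500 credits -> $0.10 per credit in-plan.
in_plan = 50 / 500
print(round(topup_price(in_plan), 2))  # 0.12 per credit on top-up
```

At a 20% premium the subscription still reads as the deal; at 40%, per the A/B results above, users read the top-up as a penalty and stop buying.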

What this means for GTM

The pricing model you choose has downstream effects on every part of your go-to-market, and most teams are not thinking about this carefully enough.

Sales compensation is the first domino. When revenue is split between subscription base and variable credit consumption, how do you comp reps? If you pay on contracted subscription value only, reps ignore expansion. If you pay on total consumption, deal values become unpredictable and comp plans are hard to model. The teams handling this well are splitting comp between initial contract value (weighted toward subscription) and a trailing expansion kicker based on credit consumption in the first 90-120 days.
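A minimal sketch of that comp structure, with illustrative rates rather than anyone's real plan:

```python
def rep_comp(contract_value: float, credit_expansion: float,
             base_rate: float = 0.10, kicker_rate: float = 0.05) -> float:
    """Split commission: weighted toward the initial subscription
    contract, plus a trailing kicker on credit consumption in the
    first ~90-120 days. Rates are hypothetical, not a benchmark."""
    return contract_value * base_rate + credit_expansion * kicker_rate

# $60k initial contract plus $20k of early credit consumption:
print(rep_comp(60_000, 20_000))  # 7000.0
```

The base rate keeps reps motivated to close, while the kicker makes them care whether the customer actually consumes, which is where hybrid revenue really comes from.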

Forecasting is the second challenge. In a hybrid model, your revenue has a fixed floor (subscriptions) and a variable ceiling (credit consumption). This is actually better for forecasting than pure usage, but it requires new modeling infrastructure. The traditional SaaS forecasting model, where you project ARR from new logos plus expansion minus churn, breaks down when a meaningful chunk of revenue comes from variable credit usage that does not fit neatly into “new” or “expansion” buckets.
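The floor-plus-variable structure is easy to express, which is part of why hybrid forecasts better than pure usage. All inputs here are hypothetical:

```python
def hybrid_forecast(subscribers: int, monthly_fee: float,
                    topup_attach_rate: float,
                    avg_topup_spend: float) -> tuple[float, float]:
    """Return (floor, expected) monthly revenue: a committed
    subscription floor plus an expected credit top-up layer."""
    floor = subscribers * monthly_fee
    variable = subscribers * topup_attach_rate * avg_topup_spend
    return floor, floor + variable

# 1,000 subscribers at $50/mo; 15% buy ~$30 of top-ups in a month.
print(hybrid_forecast(1000, 50.0, 0.15, 30.0))  # (50000.0, 54500.0)
```

The floor is what you can commit to the board; the gap between floor and expected is the piece that needs new instrumentation, because it behaves like usage revenue, not ARR.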

Then there is the question of how AI pricing interacts with the broader shift toward AI FTE equivalents. Some companies are sidestepping the credit complexity entirely by positioning their AI agents as virtual employees. The pitch: this agent does the work of an SDR at one-fifth the cost of hiring one. The pricing maps to output, the equivalent of what a ramped human would produce, rather than to tokens or credits or API calls. It translates complex pricing into something that feels predictable and value-based. And it lets sales reps target hiring budgets, which are larger and more accessible than technology budgets. We’re in very early innings of this approach, but it is the most elegant frame I have seen for selling AI without making the buyer do compute math.

The honest assessment

There is no perfect pricing model for AI right now. Anyone who tells you otherwise is selling a billing platform.

Subscriptions optimize for revenue predictability but create margin risk. Usage-based pricing optimizes for margin alignment but creates buyer anxiety and sales complexity. Credits are the compromise that threads the needle on cost alignment and moderate predictability but adds cognitive overhead. The hybrid of all three, subscription plus credits plus top-ups, is currently the least bad option, and it is where most of the market is converging.

What matters more than the model itself is whether you have the operational infrastructure to iterate on it. The companies I see winning are the ones that treat pricing as a live experiment, not a launch decision. They instrument credit consumption by cohort. They track the ratio of subscription revenue to overage revenue and watch for signals that their allotments are too generous (no one buys top-ups) or too stingy (everyone churns). They run A/B tests on credit packaging and top-up pricing with the same rigor they apply to product features.
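A toy version of that allotment health check, with made-up thresholds standing in for whatever a team calibrates from its own cohort data:

```python
def allotment_signal(overage_revenue: float, subscription_revenue: float,
                     low: float = 0.05, high: float = 0.30) -> str:
    """Crude health check on credit allotments (thresholds invented):
    almost no overage suggests allotments are too generous; heavy
    overage relative to the base suggests they are too stingy."""
    ratio = overage_revenue / subscription_revenue
    if ratio < low:
        return "too generous?"
    if ratio > high:
        return "too stingy?"
    return "in range"

print(allotment_signal(2_000, 100_000))   # 2% overage -> "too generous?"
print(allotment_signal(40_000, 100_000))  # 40% overage -> "too stingy?"
```

The specific thresholds matter less than watching the ratio move cohort by cohort as you change packaging.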

The single biggest mistake is treating pricing as a one-time decision made at launch and revisited annually. In the AI era, where model capabilities and costs shift quarterly, your pricing needs to move at the same cadence. The companies that build pricing iteration into their operating rhythm, with the cross-functional coordination to actually ship changes, will capture significantly more value than those still debating whether to charge per seat or per token. The model matters. The ability to evolve the model matters more.


Written by

Elom


GTM and Growth engineer with 12 years across Fortune 500s, fintech, and B2B startups. Building at the intersection of AI, data, and revenue.
