AI Agent Pricing Models Explained for Buyers
AI agent pricing models explained: per-seat, usage, outcome, and project-based. How each bills, who wins, and which traps a mid-market buyer should watch for.
AI agents are sold under five pricing models: per-seat, usage-based, outcome-based, project-plus-retainer, and hybrids that blend them. Per-seat charges a flat fee per user, usage-based bills per action or token, outcome-based bills per result delivered, and project-plus-retainer pairs a fixed build cost with a monthly run fee. The right model is the one whose costs stay predictable when the agent succeeds and whose incentives point the vendor at your result, not their revenue.
I've bought and built agents at a $250M manufacturer. I've sat on both sides of the table. Here's how each model actually bills, who it favors, and where the traps are for a mid-market ops buyer who needs the number to hold for two years.
Why pricing is so confusing right now
The market is young and the models are still moving. Salesforce shipped three different Agentforce pricing structures in roughly 18 months: $2 per conversation, then Flex Credits at $0.10 per action in May 2025, then per-user licenses. When the biggest vendor in software can't settle on a model, smaller vendors will improvise, and the confusion is profitable for them.
The other reason is technical. Agents run on LLM inference, and inference has a real per-token cost the vendor has to recover somehow. That cost is why the old per-seat SaaS mold keeps breaking.
The stakes are higher than a wrong line item. Gartner predicts over 40% of agentic AI projects will be canceled by the end of 2027, citing escalating costs and unclear ROI. A pricing model that hides its true cost is one of the fastest ways to land in that 40%.
The five models you'll actually see
| Model | How it bills | Best when | Watch out for |
|---|---|---|---|
| Per-seat | Fixed price per user/month | Stable team, predictable use | You pay for seats that barely use it |
| Usage-based | Per action, token, or run | Variable or spiky volume | Bills scale with success — costs explode when it works |
| Outcome-based | Per result delivered | Clearly measurable output | Defining "outcome" honestly; gaming the metric |
| Project + retainer | Fixed build, then monthly run | Custom agents on your systems | Open-ended run scope; thin maintenance |
| Hybrid | Base fee + usage or outcome | Mixed workloads, shared risk | Two meters to audit instead of one |
None is right or wrong. The fit depends on your volume, how measurable the work is, and how much risk you want to hold.
Per-seat pricing
The SaaS default. You pay a flat rate per user per month. Easy to budget, easy to compare on a spreadsheet.
Where it works
A stable team using the agent daily. If ten planners log in every morning and lean on the same tool, per-seat is honest and predictable. You know the number a year out.
The trap
Agents don't work like seats. One power user might drive 80% of the value while ten licensed users barely log in, and you pay for all eleven. The model also misaligns incentives: the vendor wants more seats, you want more output per seat. For a focused estimating or planning agent used by a small team, per-seat often means paying for capacity you never touch. Audit actual usage at every renewal and cut dead seats.
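A renewal audit like this can be sketched in a few lines. This is a minimal example, not a vendor tool: the `Seat` record, the 90-day login count, the 12-login threshold, and the $100 seat price are all assumptions made up for illustration. Swap in whatever usage export your vendor's admin console actually gives you.

```python
from dataclasses import dataclass


@dataclass
class Seat:
    user: str
    logins_last_90_days: int  # hypothetical metric from a usage export


def audit_seats(seats, price_per_seat_month, min_logins=12):
    """Split seats into active and dead, and price what the dead ones cost per year."""
    active = [s for s in seats if s.logins_last_90_days >= min_logins]
    dead = [s for s in seats if s.logins_last_90_days < min_logins]
    annual_waste = len(dead) * price_per_seat_month * 12
    # What you really pay per seat that gets used.
    effective_price = (
        len(seats) * price_per_seat_month / len(active) if active else float("inf")
    )
    return {
        "active": [s.user for s in active],
        "dead": [s.user for s in dead],
        "annual_waste": annual_waste,
        "effective_price_per_active_seat": effective_price,
    }


# Illustrative numbers only: eleven seats, one power user, ten near-idle users.
seats = [Seat("power_user", 85)] + [Seat(f"user_{i}", 2) for i in range(10)]
report = audit_seats(seats, price_per_seat_month=100)
print(report["annual_waste"])                     # 12000
print(report["effective_price_per_active_seat"])  # 1100.0
```

The number to bring to the renewal call is the effective price per active seat. In this made-up case the list price is $100, but the one seat doing the work costs $1,100 a month.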
Usage-based pricing
You pay per action, per token, or per run. Pure pay-for-what-you-use. Salesforce's Flex Credits model is the headline example: 20 credits per action, which works out to about $0.10 per action.
Where it works
Spiky or seasonal volume. A quote agent that handles 50 RFQs in a slow week and 400 in a busy one bills you for exactly that work each week: a small invoice for the slow week, a larger one for the busy week. You're not paying for idle capacity in January.
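Here is that arithmetic as a rough sketch. The $0.10 per action is the Flex Credits figure cited above; the six actions per RFQ is purely an assumption for illustration, so measure your own agent before trusting the output.

```python
PRICE_PER_ACTION = 0.10  # roughly the per-action rate cited above
ACTIONS_PER_RFQ = 6      # hypothetical: actions the agent runs to process one RFQ


def weekly_bill(rfqs, actions_per_rfq=ACTIONS_PER_RFQ, price_per_action=PRICE_PER_ACTION):
    """Usage-based bill for one week: you pay only for actions actually run."""
    return round(rfqs * actions_per_rfq * price_per_action, 2)


slow_week = weekly_bill(50)   # 50 RFQs
busy_week = weekly_bill(400)  # 400 RFQs
print(slow_week, busy_week)   # 30.0 240.0
```

The bill tracks the work eight-to-one, which is the point of the model. It is also the warning: nothing in this formula stops the number from climbing.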
The trap
This is the one that bites manufacturers. Usage cost scales with adoption, so the better the agent works, the bigger the bill. The driver underneath is token cost. OpenAI's published API pricing runs from cents to dollars per million tokens depending on the model, and a document-heavy agent reading long spec sheets burns tokens fast.
You can blunt this. Anthropic's prompt caching cuts the cost of repeated context to roughly 10% of the standard input price, so a vendor who caches well should pass real savings to you. Model the worst-case month before you sign, not the average. And demand a cost ceiling plus alerting, so a runaway loop doesn't hand you a five-figure surprise.
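To make "model the worst-case month" concrete, here is a minimal sketch under stated assumptions. The $3.00 per million input tokens is a placeholder, not a quote from any vendor's price list, and the run counts and token sizes are invented. The model also ignores output tokens, cache-write surcharges, and cache misses, so treat it as a floor on the estimate rather than a forecast. The `check_budget` helper is a hypothetical stand-in for whatever ceiling and alerting your contract specifies.

```python
def monthly_input_cost(runs, repeated_tokens, fresh_tokens, price_per_m_input,
                       cached_read_ratio=1.0):
    """Input-token cost for one month.

    repeated_tokens: context resent on every run (instructions, the same spec sheet)
    fresh_tokens: tokens unique to each run
    cached_read_ratio: cached-read price as a fraction of the input price;
        1.0 means no caching, 0.10 matches the roughly 10% figure above.
    """
    per_run_tokens = repeated_tokens * cached_read_ratio + fresh_tokens
    return round(runs * per_run_tokens * price_per_m_input / 1_000_000, 2)


def check_budget(spend_to_date, ceiling, alert_at=0.8):
    """Return 'halt' at the ceiling, 'alert' at 80% of it, otherwise 'ok'."""
    if spend_to_date >= ceiling:
        return "halt"
    if spend_to_date >= alert_at * ceiling:
        return "alert"
    return "ok"


PRICE = 3.00  # hypothetical dollars per million input tokens

average = monthly_input_cost(1_000, 20_000, 2_000, PRICE)
worst = monthly_input_cost(4_000, 20_000, 2_000, PRICE)
worst_cached = monthly_input_cost(4_000, 20_000, 2_000, PRICE, cached_read_ratio=0.10)

print(average, worst, worst_cached)           # 66.0 264.0 48.0
print(check_budget(worst, ceiling=200))       # halt
print(check_budget(170, ceiling=200))         # alert
print(check_budget(worst_cached, ceiling=200))  # ok
```

In this made-up case the average month looks harmless and the worst-case month blows through a $200 ceiling. With caching priced at 10% on the repeated context, the same worst-case month stays well under it. That gap is the savings you want to see passed through in the quote.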
Outcome-based pricing
You pay per result: per quote delivered, per invoice cleared, per shortage flagged. The model everyone says they want, and the direction the market is drifting.
Where it works
When the outcome is cleanly measurable and attributable to the agent. It aligns incentives better than anything else because the vendor only earns when you get value. Gartner projects 40% of enterprise apps will feature task-specific AI agents by the end of 2026, and outcome pricing is how many of those will bill.
The trap
Defining "outcome" honestly is brutally hard. Did the agent prevent that scrap event, or would the line lead have caught it? Vendors define outcomes generously toward themselves.
And any metric you pay against will get optimized. An agent paid per ticket closed will close tickets, not solve problems. This connects to a deeper finding: MIT's 2025 GenAI Divide study found 95% of enterprise AI pilots delivered no measurable return, and weak measurement is a big reason why. Outcome pricing is excellent when the metric is unambiguous and you control the measurement. It's a minefield when it isn't.
Project + retainer
Fixed build cost, then a monthly retainer for run and maintenance. The common model for custom agents built on your ERP and MES.
Where it works
Custom manufacturing agents that touch your specific systems. You get a fixed build number you can take to finance and a known monthly for upkeep. For a deeper look at the build side of this trade-off, see our guide on build vs buy AI agents for manufacturing.
The trap
Two of them. First, an open-ended retainer with vague scope: pin down exactly what the run fee covers (monitoring, fixes, token costs, model updates?). Second, a thin retainer that doesn't actually maintain the agent, so it drifts and degrades and you're paying for neglect.
Demand a defined maintenance SLA. Real upkeep means monitoring the agent in production, catching drift, and patching it when an upstream model or system changes. A retainer that doesn't fund that work is a retainer for nothing.
Hybrid models
A base fee plus a usage or outcome meter. For example, a platform charges a per-seat floor for access, then bills per action above a threshold. Or a vendor charges a fixed build fee, then a small per-result fee.
Where it works
Mixed workloads where some use is steady and some is spiky. The base covers the vendor's fixed cost so they're not betting the relationship on volume, and the meter shares upside and risk. Done well, it's the fairest model on the list.
The trap
You now have two meters to audit instead of one, and vendors can hide margin in the gap. Make them show you, in writing, the all-in cost at low, expected, and high volume. If the hybrid only ever looks cheap at the volume they assume, it isn't a hybrid, it's a usage model with a cover charge.
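The three-volume comparison is easy to run yourself before the vendor does it for you. This is a sketch with invented numbers: the $800 base fee, the 10,000 included actions, and the volume scenarios are assumptions, and the $0.10 rate is only borrowed from the per-action figure cited earlier.

```python
def hybrid_cost(actions, base_fee, included_actions, price_per_action):
    """Monthly all-in cost: flat base fee plus a per-action charge above the threshold."""
    overage = max(0, actions - included_actions)
    return round(base_fee + overage * price_per_action, 2)


def usage_cost(actions, price_per_action):
    """Monthly cost under a pure usage model, for comparison."""
    return round(actions * price_per_action, 2)


# Hypothetical monthly action counts at low, expected, and high volume.
scenarios = {"low": 2_000, "expected": 10_000, "high": 30_000}

comparison = {
    name: {
        "hybrid": hybrid_cost(actions, base_fee=800, included_actions=10_000,
                              price_per_action=0.10),
        "usage": usage_cost(actions, price_per_action=0.10),
    }
    for name, actions in scenarios.items()
}

for name, costs in comparison.items():
    print(f"{name}: hybrid ${costs['hybrid']:,.2f} vs pure usage ${costs['usage']:,.2f}")
```

With these assumed terms the hybrid costs $800 against $200 at low volume, and only comes out ahead once volume reaches the level the vendor assumed. If your real volume sits in the low scenario, you are paying the cover charge.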
How to choose for a mid-market plant
The model should match the work, not the vendor's preference:
- Custom agent on your systems → project + retainer, with a fixed build and a defined run scope.
- Variable-volume task → usage-based, with a cost ceiling and alerting.
- Cleanly measurable output you control → outcome-based, with you owning the measurement.
- Stable team, daily use, off-the-shelf → per-seat is fine, just audit actual usage at renewal.
- Mixed steady-plus-spiky work → hybrid, with all-in cost modeled at three volume levels.
Whichever you pick, tie it back to a real number. Work the math in our AI agent ROI calculation guide before you commit, and sanity-check the sticker against typical AI agent cost ranges for manufacturers. A model that can't survive your own ROI spreadsheet won't survive your CFO.
The questions that expose a bad deal
Before signing any AI agent pricing model, ask:
- What's my all-in cost at 3x current volume? If they can't answer, walk.
- What does the run or retainer fee actually cover? Get the line items.
- Who owns the agent and the data if I leave? Avoid lock-in to a black box.
- What's the worst-case monthly bill, and is there a ceiling? Especially for usage-based.
- How is success measured, and who measures it? Especially for outcome-based.
A vendor who answers these straight is a partner. One who dodges is selling you the model's blind spot. For the full version of this interrogation, our guide to AI vendor RFP questions for manufacturing gives you 30 to put in writing.
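The 3x-volume question is one you can also answer on your own from the quotes in hand. A rough sketch follows, assuming three hypothetical quotes; every price, seat count, and volume below is invented for illustration, and the dictionary layout is just one convenient way to hold the terms.

```python
def first_year_cost(quote, monthly_volume):
    """First-year all-in cost for one quote at a given monthly volume of actions."""
    kind = quote["kind"]
    if kind == "per_seat":
        return quote["seats"] * quote["price_per_seat_month"] * 12
    if kind == "usage":
        monthly = monthly_volume * quote["price_per_unit"]
        ceiling = quote.get("monthly_ceiling")
        if ceiling is not None:
            monthly = min(monthly, ceiling)  # a contractual cap limits the bill
        return round(monthly * 12, 2)
    if kind == "project_retainer":
        return quote["build_fee"] + quote["retainer_month"] * 12
    raise ValueError(f"unknown pricing model: {kind}")


def stress_test(quotes, current_volume, multiplier=3):
    """Cost of each quote at current volume and at a multiple of it."""
    return {
        name: (first_year_cost(q, current_volume),
               first_year_cost(q, current_volume * multiplier))
        for name, q in quotes.items()
    }


# Hypothetical quotes, not real vendor pricing.
quotes = {
    "per_seat": {"kind": "per_seat", "seats": 10, "price_per_seat_month": 150},
    "usage": {"kind": "usage", "price_per_unit": 0.10, "monthly_ceiling": None},
    "project_retainer": {"kind": "project_retainer", "build_fee": 40_000,
                         "retainer_month": 2_000},
}

results = stress_test(quotes, current_volume=20_000)
for name, (now, at_3x) in results.items():
    print(f"{name}: ${now:,.0f} now, ${at_3x:,.0f} at 3x")
```

In this invented comparison the usage quote is cheaper than the fixed build at today's volume and more expensive at 3x, while the other two do not move. Which of those you prefer depends on how likely you think the 3x month is.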
Pick the model, then pick the agent
The right AI agent pricing model aligns the vendor's incentive with your outcome and holds its number when the agent succeeds. McKinsey's 2025 State of AI report found that redesigning workflows, not buying tools, is what most separates companies seeing real EBIT impact from those that aren't. Pricing is part of that design.
For most mid-market manufacturers building custom agents, the fit is a fixed-build project with a tightly scoped retainer: predictable, owned by you, no surprise at 3x volume. Pick the model that survives your worst-case month, then go pick the agent.
Our free First 5 Agents teardown maps the five highest-value agents for your operation and recommends the right pricing model for each, with the worst-case cost spelled out. Book a call after and we'll scope your first agent to a fixed number with no open-ended billing, the deal we'd want if we were sitting in your chair.
Frequently asked questions
What is the most common AI agent pricing model in 2026?
There's no single dominant model yet, but the market is shifting from per-seat toward usage-based and outcome-based billing because agents run on per-token inference cost that doesn't fit the seat mold. Salesforce, for example, moved Agentforce to a consumption model at roughly $0.10 per action. For custom agents built on a manufacturer's own systems, fixed-build-plus-retainer remains the most common arrangement.
Is usage-based or outcome-based pricing better for manufacturers?
It depends on how measurable the work is. Usage-based is safer when you can't cleanly attribute results to the agent, but its cost scales with success, so cap it. Outcome-based aligns incentives best when the result is unambiguous and you control the measurement, and it's a trap when either of those isn't true.
How do I avoid a runaway AI agent bill?
Model your worst-case month, not your average, before you sign anything usage-based. Require a hard cost ceiling and real-time alerting in the contract so a runaway loop can't generate a five-figure surprise. Ask whether the vendor uses prompt caching, which can cut repeated-context token cost to about 10% of the standard rate.
What should a monthly retainer for an AI agent cover?
A real retainer funds active maintenance: production monitoring, fixing drift, covering token costs, and patching the agent when an upstream model or system changes. Get the line items in writing and demand a defined maintenance SLA. A thin retainer that doesn't fund this work means you're paying for an agent that quietly degrades.
Why do AI agent projects get canceled over pricing?
Gartner expects over 40% of agentic AI projects to be canceled by the end of 2027, largely from escalating costs and unclear ROI. A pricing model that hides its true cost at scale, or that pays a vendor regardless of your result, is a direct path into that statistic. Choosing a model whose number holds at 3x volume is one of the cheapest insurance policies you can buy.