The headline rate of a Copilot Credit — one cent at pay-as-you-go, eight tenths of a cent in a pack — tells you almost nothing about what an agent will cost. The number that matters is cost per task: how many credits a single unit of work consumes once you account for the agent’s design. Get that number right for each agent and a credit budget becomes a defensible forecast. Skip it and you are guessing at a meter that varies a hundred-fold between its cheapest and most expensive operations.
This guide shows how to derive cost per task from Microsoft’s published rate card, why the same agent can cost five cents or two dollars a conversation, and how to turn the figure into a monthly budget you can take into a procurement conversation. It is the working companion to our Copilot Credits economics pillar.
Start from the rate card, not the credit price
Every Copilot agent interaction is a stack of priced operations. Microsoft meters them separately, and they add together within a single turn. The operations that drive cost are: a classic, scripted answer at 1 credit; a generative answer at 2; an agent action such as a tool or step call at 5; content processing at 8 per page; tenant Graph grounding at 10 per response; agent flow actions at 13 per 100; and the AI-tools tiers, where basic is 1 per ten responses, standard is 15 per ten, and premium reasoning is 100 per ten. That last figure is the one that reshapes budgets: at 10 credits a response, premium reasoning costs five times a generative answer and a hundred times the basic tier.
| Operation | Credits | Cost at $0.008 (pack) |
|---|---|---|
| Classic answer | 1 | $0.008 |
| Generative answer | 2 | $0.016 |
| Agent action (tool/step) | 5 | $0.04 |
| Tenant Graph grounding | 10 | $0.08 |
| Premium reasoning (per 10 resp.) | 100 | $0.80 |
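
Because some meters are quoted per response and others per ten responses or per 100 actions, it helps to put them on one per-unit scale before adding anything up. The sketch below does that for the figures quoted above; the dictionary keys and the `unit_cost` helper are illustrative names, not part of any Microsoft API.

```python
# Credits per single unit, normalised from the rate card quoted in the text.
# "Per ten responses" and "per 100 actions" meters are divided down so that
# every entry is comparable on a per-unit basis.
RATE_CARD = {
    "classic_answer": 1.0,          # per response
    "generative_answer": 2.0,       # per response
    "agent_action": 5.0,            # per tool or step call
    "content_processing": 8.0,      # per page
    "graph_grounding": 10.0,        # per response
    "agent_flow_action": 13 / 100,  # 13 credits per 100 actions
    "ai_tools_basic": 1 / 10,       # 1 credit per ten responses
    "ai_tools_standard": 15 / 10,   # 15 credits per ten responses
    "ai_tools_premium": 100 / 10,   # 100 credits per ten responses
}

PACK_RATE = 0.008  # dollars per credit in a pack
PAYG_RATE = 0.01   # dollars per credit at pay-as-you-go


def unit_cost(operation: str, rate: float = PACK_RATE) -> float:
    """Dollar cost of one unit of the given operation at the given rate."""
    return RATE_CARD[operation] * rate


for name, credits in RATE_CARD.items():
    print(f"{name:20s} {credits:6.2f} credits  "
          f"${unit_cost(name):.4f} pack  ${unit_cost(name, PAYG_RATE):.4f} PAYG")
```

Normalising first keeps a "100 per ten responses" meter from being compared directly with a "2 per response" one.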
Build the per-task number
Cost per task is the sum of the operations a representative turn performs, multiplied by the number of turns in a typical conversation, multiplied by your credit rate. Take a support agent that grounds on the Graph (10), generates an answer (2), and calls one tool (5): that is 17 credits a turn. At roughly 1.5 turns a conversation, the agent runs about 25 credits, or 20 cents at pack pricing, per conversation. A scripted version of the same agent (one classic answer, no grounding) runs a single credit a turn, or about 1.2 cents a conversation. Same business purpose, a seventeen-fold cost gap, set entirely in the builder.
The expensive verbs are grounding, reasoning, and tools. If a turn does not need to look at your data, do not ground it. If it does not need to reason, do not route it to a reasoning model. Cost per task is mostly a question of which of these verbs fire on every turn versus only when needed.
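The per-task arithmetic above can be sketched in a few lines. This is a minimal illustration, not a billing tool: the function names and the `fire_rates` idea (the share of turns on which an operation fires) are assumptions introduced here, and the "selective" variant with grounding on 30% of turns is hypothetical.

```python
PACK_RATE = 0.008  # dollars per credit at pack pricing (from the text)

# Credits per operation, taken from the rate card quoted earlier.
CREDITS = {
    "classic_answer": 1,
    "generative_answer": 2,
    "agent_action": 5,
    "graph_grounding": 10,
}


def credits_per_turn(fire_rates: dict[str, float]) -> float:
    """Expected credits for one turn.

    fire_rates maps an operation to the share of turns on which it fires
    (1.0 means it fires on every turn).
    """
    return sum(CREDITS[op] * share for op, share in fire_rates.items())


def cost_per_conversation(fire_rates: dict[str, float], turns: float,
                          rate: float = PACK_RATE) -> float:
    """Dollar cost of one conversation: credits per turn x turns x rate."""
    return credits_per_turn(fire_rates) * turns * rate


# The support agent from the text: every verb fires on every turn.
always_on = {"graph_grounding": 1.0, "generative_answer": 1.0, "agent_action": 1.0}
# HYPOTHETICAL variant: grounding fires on only 30% of turns.
selective = {"graph_grounding": 0.3, "generative_answer": 1.0, "agent_action": 1.0}
# The scripted version from the text: one classic answer, nothing else.
scripted = {"classic_answer": 1.0}

for label, rates in [("always on", always_on), ("selective", selective), ("scripted", scripted)]:
    print(f"{label:10s} {credits_per_turn(rates):5.1f} credits/turn  "
          f"${cost_per_conversation(rates, turns=1.5):.3f}/conversation")
```

Expressing each verb as a firing share makes the design question explicit: the cost of a turn falls in proportion to how rarely the expensive operations run.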
Cost per task by agent archetype
The same method, applied to common archetypes, shows where the money concentrates. A scripted IT-helpdesk agent sits around 3 credits a conversation. An HR policy agent that generates and grounds sits near 12. An external customer-service agent lands around 8 to 25 depending on whether it grounds and calls tools. A sales-assist agent that reasons on every turn can exceed 50. Multiply each by monthly volume and the picture inverts the intuition that “more conversations cost more” — a low-volume reasoning agent can outspend a high-volume scripted one.
| Agent | Credits / conv. | Conv. / mo | Cost / mo (pack) |
|---|---|---|---|
| IT helpdesk (scripted) | ~3 | 5,000 | ~$120 |
| HR policy (gen + grounding) | ~12 | 3,000 | ~$290 |
| Customer service (external) | ~8–25 | 20,000 | ~$1,280–$4,000 |
| Sales assist (reasoning) | ~50 | 2,000 | ~$800 |
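
The monthly figures in the table are credits per conversation times volume times the pack rate. The short sketch below reproduces them from the table's own inputs; the tuple layout and the `monthly_cost` helper are illustrative choices.

```python
PACK_RATE = 0.008  # dollars per credit at pack pricing

# (agent, low credits/conv, high credits/conv, conversations/month),
# copied from the table above. Low and high match where no range is given.
ARCHETYPES = [
    ("IT helpdesk (scripted)", 3, 3, 5_000),
    ("HR policy (gen + grounding)", 12, 12, 3_000),
    ("Customer service (external)", 8, 25, 20_000),
    ("Sales assist (reasoning)", 50, 50, 2_000),
]


def monthly_cost(credits_per_conv: float, conversations: int,
                 rate: float = PACK_RATE) -> float:
    """Monthly dollar cost for one agent."""
    return credits_per_conv * conversations * rate


for name, low, high, volume in ARCHETYPES:
    low_cost = monthly_cost(low, volume)
    high_cost = monthly_cost(high, volume)
    if low == high:
        print(f"{name:30s} ${low_cost:,.0f}")
    else:
        print(f"{name:30s} ${low_cost:,.0f} to ${high_cost:,.0f}")
```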
Subtract the inclusion path
One adjustment can zero out a line entirely. Internal interactions by users who hold a Microsoft 365 Copilot licence do not consume credits in business-to-employee scenarios. So before you cost an internal agent, ask whether its audience is licensed staff. If it is, the metered cost per task is zero regardless of how the agent is built; credits apply only when the agent faces external users or unlicensed internal ones. Many organisations budget hundreds of dollars a month for an internal helpdesk bot that should cost nothing. Map audience and licence status first; it is the cheapest saving available.
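As a rough sketch of the inclusion adjustment, the function below removes licensed-user conversations before pricing the rest. The function name and the 4,000-of-5,000 split are hypothetical; the 3 credits and 5,000 conversations come from the IT helpdesk row in the archetype table.

```python
PACK_RATE = 0.008  # dollars per credit at pack pricing


def billable_credits(credits_per_conv: float, conversations: int,
                     licensed_conversations: int) -> float:
    """Credits left after removing conversations held by licensed users."""
    if not 0 <= licensed_conversations <= conversations:
        raise ValueError("licensed_conversations must be between 0 and conversations")
    return credits_per_conv * (conversations - licensed_conversations)


# Every conversation comes from licensed staff, so nothing is metered.
fully_licensed = billable_credits(3, 5_000, 5_000)

# HYPOTHETICAL split: 4,000 of the 5,000 conversations come from licensed staff.
partly_licensed = billable_credits(3, 5_000, 4_000)

print(f"fully licensed:  {fully_licensed:,.0f} credits, ${fully_licensed * PACK_RATE:,.2f}")
print(f"partly licensed: {partly_licensed:,.0f} credits, ${partly_licensed * PACK_RATE:,.2f}")
```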
From cost per task to a monthly forecast
A defensible forecast is built bottom-up: for each agent, multiply cost per task by expected monthly volume, subtract inclusion-eligible traffic, and sum across the estate. Add a variance band, because a single design change — switching an agent to a reasoning model, or grounding every turn — can multiply its line overnight. Then validate the model against two to three months of the Copilot Credits report before committing to any prepaid pack or pre-purchase plan. The report breaks consumption down per agent and per user, which is exactly the granularity a cost-per-task model needs to be checked against.
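One way to run the validation step is to compare each agent's forecast credits with what the report shows and flag anything outside the variance band. In the sketch below the forecast figures follow from the archetype table, while the 25% band and the observed figures are hypothetical placeholders; the real report's export format is not assumed here.

```python
PACK_RATE = 0.008  # dollars per credit at pack pricing
BAND = 0.25        # HYPOTHETICAL variance band of plus or minus 25%

# Forecast monthly credits per agent (credits per conversation x volume).
forecast_credits = {
    "IT helpdesk": 15_000,
    "HR policy": 36_000,
    "Sales assist": 100_000,
}

# HYPOTHETICAL observed credits, standing in for figures read off the report.
observed_credits = {
    "IT helpdesk": 14_200,
    "HR policy": 51_000,
    "Sales assist": 96_500,
}


def out_of_band(forecast: dict[str, int], observed: dict[str, int],
                band: float = BAND) -> dict[str, float]:
    """Return agents whose observed credits deviate from forecast by more than band."""
    flagged = {}
    for agent, expected in forecast.items():
        deviation = (observed[agent] - expected) / expected
        if abs(deviation) > band:
            flagged[agent] = deviation
    return flagged


total = sum(forecast_credits.values()) * PACK_RATE
print(f"forecast: ${total:,.0f} (band ${total * (1 - BAND):,.0f} to ${total * (1 + BAND):,.0f})")
for agent, deviation in out_of_band(forecast_credits, observed_credits).items():
    print(f"review {agent}: observed is {deviation:+.0%} against forecast")
```

An agent that lands outside the band is a prompt to revisit its cost-per-task assumptions (turn count, which operations fire) before any commitment is sized.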
This is the discipline that separates a budget from a hope. Forecasting top-down from a vendor estimate of “average conversations” ignores the hundred-fold spread that agent design introduces. Forecasting bottom-up from cost per task captures it, and gives you a number you can defend line by line when Microsoft proposes a commitment.
Where cost-per-task models go wrong
Most cost-per-task estimates fail in one of four predictable ways. The first is averaging away the spread: a model that assumes “about 8 credits a conversation” across an estate hides the reasoning agent burning 50 and the scripted one burning 1, and the average is wrong for every agent. Model each agent on its own design. The second is ignoring multi-turn conversations: a turn is metered, not a conversation, so a 17-credit turn across a three-turn exchange is 51 credits, not 17. Count turns, not just sessions.
The third is forgetting that operations stack. Builders reason about features in isolation — “grounding is 10” — and forget that a grounded, generative, tool-calling turn pays for all three at once. The fourth is treating the rate card as static. Microsoft has already changed the consumption model once, moving from messages to credits in September 2025, and it reprices meters more readily than seats. A model built on today’s rates needs a review cadence, and a commitment built on them needs rate protection in the contract.
A worked cost-per-task calculation
Take a customer-service agent handling 20,000 external conversations a month, averaging two turns each. A representative turn grounds on the Graph (10 credits), generates an answer (2), and calls one delivery-status tool (5): 17 credits a turn, 34 a conversation. That is 680,000 credits a month. At pack pricing of $0.008 it costs $5,440; at pay-as-you-go $0.01 it is $6,800. None of it qualifies for inclusion because the audience is external.
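The baseline above can be checked step by step. The variable names below are illustrative; every number comes from the paragraph.

```python
CONVERSATIONS = 20_000  # external conversations per month
TURNS = 2               # average turns per conversation
PACK_RATE = 0.008       # dollars per credit in a pack
PAYG_RATE = 0.01        # dollars per credit at pay-as-you-go

# Operations in a representative turn, with their credits.
turn_operations = {
    "graph_grounding": 10,
    "generative_answer": 2,
    "delivery_status_tool": 5,
}

credits_per_turn = sum(turn_operations.values())
credits_per_conversation = credits_per_turn * TURNS
monthly_credits = credits_per_conversation * CONVERSATIONS

pack_cost = monthly_credits * PACK_RATE
payg_cost = monthly_credits * PAYG_RATE

print(f"{credits_per_turn} credits/turn, {credits_per_conversation} credits/conversation")
print(f"{monthly_credits:,} credits/month")
print(f"pack: ${pack_cost:,.0f}   pay-as-you-go: ${payg_cost:,.0f}")
```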
Now apply two design changes. Route the 40% of conversations that are simple status checks to a scripted path (1 credit a turn instead of 17), and drop grounding on the remaining generative conversations whose answers do not need order data (7 credits a turn instead of 17). The blended cost per conversation falls from 34 credits to roughly 12, and the monthly bill drops from about $5,440 to near $1,920, a 65% reduction with no change to headcount, volume, or the customer experience. That is the practical payoff of costing per task: it tells you precisely which design levers move the bill, and by how much.
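The blended figure depends on how many of the non-scripted conversations can actually drop grounding, which the text does not state. The sketch below treats that share as a parameter. The 75% value is an assumption chosen for illustration, and under it the result lands close to the roughly 12 credits quoted above; a different share moves the figure accordingly.

```python
CONVERSATIONS = 20_000
TURNS = 2
PACK_RATE = 0.008
BASELINE_CREDITS_PER_CONVERSATION = 34

SCRIPTED_SHARE = 0.40    # simple status checks, from the text
UNGROUNDED_SHARE = 0.75  # ASSUMPTION: share of the rest that can skip grounding


def blended_credits_per_conversation(scripted_share: float,
                                     ungrounded_share: float,
                                     turns: int) -> float:
    """Weighted credits per conversation across the three routing paths."""
    rest = 1 - scripted_share
    per_turn = (
        scripted_share * 1                       # classic answer only
        + rest * ungrounded_share * 7            # generative answer + tool call
        + rest * (1 - ungrounded_share) * 17     # grounding + answer + tool call
    )
    return per_turn * turns


blended = blended_credits_per_conversation(SCRIPTED_SHARE, UNGROUNDED_SHARE, TURNS)
monthly_cost = blended * CONVERSATIONS * PACK_RATE
baseline_cost = BASELINE_CREDITS_PER_CONVERSATION * CONVERSATIONS * PACK_RATE
reduction = 1 - monthly_cost / baseline_cost

print(f"blended: {blended:.1f} credits/conversation")
print(f"monthly: ${monthly_cost:,.0f} against a baseline of ${baseline_cost:,.0f}")
print(f"reduction: {reduction:.0%}")
```

Running the function over a range of `UNGROUNDED_SHARE` values is a quick way to see how sensitive the saving is to that single design decision.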
The short version
- Cost per task = (credits per turn from the rate card) × (turns per conversation) × (your credit rate).
- Grounding, reasoning, and tool calls are the expensive verbs; fire them only when needed.
- Internal agents for licensed users cost zero — subtract them before forecasting.
- Build the budget bottom-up per agent and validate against the Copilot Credits report before committing.