Copilot Cowork Models: What Each One Actually Costs

Microsoft made Copilot Cowork generally available on 16 June 2026, and the Copilot Cowork models in the picker now hit your bill directly. Each run consumes Copilot Credits, and consumption shifts sharply depending on which model you reach for. I tested the same prompt across three of the four available options and watched the cost swing by nearly 3x. The picker labels Microsoft ships with describe what each model does. They don’t tell you what each one costs.

How Copilot Cowork Models Work in the Picker

You open Cowork to run a task. Four options stare back at you. Each one produces broadly similar output. Pricing diverges sharply.

Microsoft’s GA launch on 16 June 2026 brought Cowork out of the Frontier preview and onto a consumption-based billing model. Every task spends Copilot Credits, charged at $0.01 per credit on pay-as-you-go. The model selection is one of the four cost drivers that determine task pricing, alongside context retrieval, tool calls, and runtime. It’s the one you actively control.
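The credit-to-dollar arithmetic is simple enough to sketch. This is a minimal illustration, assuming only the $0.01 pay-as-you-go rate quoted above. The function name and the 250-credit example are hypothetical, not anything Cowork exposes.

```python
from decimal import Decimal

# Pay-as-you-go rate quoted in this article: one cent per Copilot Credit.
PAYG_RATE_USD = Decimal("0.01")


def credits_to_usd(credits: int) -> Decimal:
    """Convert a Copilot Credits total into pay-as-you-go dollars."""
    return credits * PAYG_RATE_USD


# Hypothetical example: a task that consumed 250 credits.
print(credits_to_usd(250))  # prints 2.50
```

Decimal is used here instead of float so that cent values stay exact when totals are added up.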

The four options Microsoft offers:

Auto picks the model best suited to your task. The default.
Claude Sonnet 4.6 is labelled “efficient for everyday tasks”.
Claude Opus 4.8 is labelled for “complex, high-stakes work”.
GPT 5.5 is labelled “versatile across task types”.

The official guidance per Microsoft’s Choose a model for Copilot Cowork documentation is to leave the picker on Auto for most day-to-day work. That’s reasonable advice. It’s also where the conversation usually stops.

The Same-Prompt Test Across Three Models

I ran the identical Cowork task three times on three different models. Same outputs requested each run. The cheapest landed at $3.98. The dearest hit $10.69.

The work was a CSV of 2025 sales figures, with three artefacts to produce: a multi-tabbed Excel workbook, a board-ready PowerPoint deck, and a fully interactive HTML dashboard. After each run I used the /cost command to read credit consumption.

The results:

Model	Credits	Cost (pay-as-you-go, USD)
Claude Sonnet 4.6	398	$3.98
Claude Opus 4.8	477	$4.77
GPT 5.5	1,069	$10.69

GPT 5.5 was 2.7x the price of Sonnet for equivalent work. Opus sat surprisingly close to Sonnet, with roughly an $0.80 premium for deeper reasoning. The model labelled “versatile across task types” was the outlier on cost by a wide margin.
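For anyone who wants to reproduce the comparison with their own /cost readings, here is a rough sketch. It uses the three credit totals from my test. The function name and the tuple layout are my own illustration, and the one-cent rate is the pay-as-you-go figure quoted earlier.

```python
# Credit totals read from /cost after each run (figures from this article's test).
RUNS = {
    "Claude Sonnet 4.6": 398,
    "Claude Opus 4.8": 477,
    "GPT 5.5": 1069,
}
CREDITS_PER_DOLLAR = 100  # one cent per credit on pay-as-you-go


def compare_to_baseline(runs: dict[str, int], baseline: str) -> list[tuple[str, float, float, float]]:
    """Return (model, cost_usd, multiple_of_baseline, premium_usd) for each run."""
    base = runs[baseline]
    rows = []
    for model, credits in runs.items():
        cost = credits / CREDITS_PER_DOLLAR
        multiple = credits / base
        premium = (credits - base) / CREDITS_PER_DOLLAR
        rows.append((model, cost, multiple, premium))
    return rows


for model, cost, multiple, premium in compare_to_baseline(RUNS, "Claude Sonnet 4.6"):
    print(f"{model:<18} ${cost:>6.2f}  {multiple:.2f}x  +${premium:.2f}")
```

On these numbers GPT 5.5 comes out at about 2.69x Sonnet, and Opus at about 1.20x with a $0.79 premium. Swap in your own readings to see whether the gap holds for your workload.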

Output quality was equivalent across runs. I use a custom Skill called interactive-dashboard that holds the output UI constant across models, which lets me attribute the credit delta to the model itself, not to differences in the work.

What Microsoft’s Labels Don’t Tell You

Read the picker labels again. “Efficient.” “Complex, high-stakes.” “Versatile.” If you had to pick the safe middle option from those three, you’d reach for “versatile”.

In practice, GPT 5.5 cost more than running Sonnet and Opus combined for the same task: $10.69 against $8.75. Microsoft’s own Copilot Credits Guide doesn’t break down credit consumption by model, so the picker label does most of the framing work.

The picker is a behaviour design surface. The labels describe what the models are good at. They say nothing about what they cost.

If you treat “versatile” as the default, you risk spending 2-3x as much (2.7x in my test) on tasks Sonnet would have handled identically. Auto is the safer default if you’d rather not choose. Sonnet is the safer default if you want predictable everyday cost.

This is the kind of pricing-versus-licensing distinction that catches teams out. I covered the related licensing trap in Microsoft 365 Copilot vs Copilot Chat. The same dynamic applies here, one layer deeper.

Picking the Right Model for the Task

Reach for Sonnet on the routine work. Drafting, summarising, restructuring data, producing standard artefacts from a brief. The work that fits Microsoft’s “efficient for everyday tasks” framing. In my test it was the cheapest of the three runs at 398 credits ($3.98).

Save the Opus runs for the complex high-stakes work that actually warrants them. Multi-source analysis, nuanced research synthesis, anything where a reasoning slip costs more than the model premium. In my test the difference between Sonnet and Opus was under a dollar, though one task is not a benchmark. Output quality is sometimes meaningfully better, often not. Test before you default.

Pick Auto when you don’t want to choose. Auto orchestrates per-task and tends to skew toward the cheaper end. Microsoft’s GA announcement positions it as the right default for mixed workloads, especially when paired with a custom Skill that constrains the output shape.

GPT 5.5 has a place when the use case specifically calls for it. Long-form writing, citation-heavy outputs, large context windows. Treat it as the specialist, not the default.

Where this lands

The Copilot Credits meter is the new discipline. Model choice is the lever that decides what the meter reads. Microsoft’s picker labels describe what each model does, not what each one costs. Run an identical prompt on two models and use /cost after each. The picker is one click. Across a month, the difference compounds.
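To see how that compounds, here is a back-of-envelope projection. It is a sketch under loud assumptions: the 40 tasks per month is a made-up volume, and it pretends every task costs the same as my single test run, which real workloads will not. Only the per-task credit totals and the one-cent rate come from this article.

```python
CREDITS_PER_DOLLAR = 100  # one cent per credit on pay-as-you-go
TASKS_PER_MONTH = 40      # ASSUMPTION: hypothetical volume, not a measured figure

# Per-task credit totals from the single test in this article.
CREDITS_PER_TASK = {
    "Claude Sonnet 4.6": 398,
    "GPT 5.5": 1069,
}


def monthly_cost_usd(credits_per_task: int, tasks_per_month: int) -> float:
    """Project a monthly bill, assuming every task costs the same number of credits."""
    return credits_per_task * tasks_per_month / CREDITS_PER_DOLLAR


for model, credits in CREDITS_PER_TASK.items():
    print(f"{model}: ${monthly_cost_usd(credits, TASKS_PER_MONTH):.2f} per month")

# Work out the gap in whole credits first, then convert once to dollars.
gap_credits = (CREDITS_PER_TASK["GPT 5.5"] - CREDITS_PER_TASK["Claude Sonnet 4.6"]) * TASKS_PER_MONTH
print(f"Difference: ${gap_credits / CREDITS_PER_DOLLAR:.2f} per month")
```

At that assumed volume the projection is $159.20 against $427.60, a gap of $268.40 a month. Replace the task count and credit totals with your own /cost readings before drawing conclusions.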

Copilot Cowork Models FAQs

What are the Copilot Cowork models?

Copilot Cowork ships with four model options in the picker: Auto (the default orchestrator), Claude Sonnet 4.6 (efficient for everyday tasks), Claude Opus 4.8 (complex high-stakes work), and GPT 5.5 (versatile across task types). Sonnet + Opus Advisor is also available as a paired mode in some tenants. The visible options depend on what your organisation allows.

How much does each Copilot Cowork model cost?

There is no fixed per-task price. Cowork bills usage in Copilot Credits at one cent per credit on pay-as-you-go. In my three-run test across the same brief, Sonnet 4.6 ran at 398 credits ($3.98), Opus 4.8 at 477 ($4.77), and GPT 5.5 at 1,069 ($10.69). Your prompt complexity, context retrieval, tool calls, and runtime all shift the credit total.

Which Copilot Cowork model should I use by default?

Pick Auto if you’d rather not think about it. Sonnet 4.6 is the safe default for predictable everyday cost. Reach for Opus 4.8 on nuanced reasoning or high-stakes decisions. Use GPT 5.5 selectively for long-form writing or citation-heavy outputs. See Picking the Right Model for the Task above for the full breakdown.

How do I check what a Cowork task costs?

Type /cost in the task window after a run completes. Cowork returns the exact number of Copilot Credits the conversation has used so far. Use it after each model trial to compare like for like. It’s the easiest way to build a feel for what a typical task costs before the month-end bill arrives.
