One number sums up the launch: 80.3%. That is Claude Fable 5 on SWE-Bench Pro, the harder, contamination-resistant successor to SWE-Bench Verified. Opus 4.8, released just twelve days earlier, scores 69.2%. GPT-5.5 sits at 58.6%. An 11-point jump over your own flagship within two weeks is not a refresh. It is a different bracket.
The headline numbers
- SWE-Bench Pro: Fable 5 at 80.3%, Opus 4.8 at 69.2%, GPT-5.5 at 58.6%.
- Hex analytical benchmark: Fable 5 is the first model to clear 90% on this suite of long, multi-stage analytical tasks.
- FrontierCode: highest score among frontier models, per Cognition.
- Hebbia Finance Benchmark: highest score of any model on senior-level financial reasoning.
- Speed: on spreadsheet task suites, Fable 5 beats Opus 4.8 at every effort level and finishes runs 25 to 30% faster, using fewer turns.
Sources for the launch-day figures are Anthropic's announcement and partner-published evals; treat third-party numbers as provisional until full eval cards land.
Context: the gap was visible in April
The architecture behind Fable 5 has been measurable for months under another name. Claude Mythos Preview, the restricted sibling available to Project Glasswing partners, posted 77.8% on SWE-Bench Pro back in April against 53.4% for the then-current Opus 4.6. The public release actually lands above that preview number. Whatever tuning happened between April and June, capability did not get traded away for safety. The safeguards were layered on top instead, a mechanism we explain in the Fable 5 story from leak to launch.
The price-per-point math
Fable 5 costs $10 per million input tokens and $50 per million output tokens. That is exactly 2x Opus 4.8, which runs $5 and $25. Is double the price worth 11 points? Here is the arithmetic nobody publishes:
- Opus 4.8: $25 per million output tokens buys 69.2 SWE-Bench Pro points, or about $0.36 per point per million output tokens.
- Fable 5: $50 per million output tokens buys 80.3 points, or about $0.62 per point per million output tokens.
On raw cost per benchmark point, you pay roughly 72% more. But that framing hides two things. First, the marginal points are not equal: the tasks between 69% and 80% are the ones Opus 4.8 fails outright, so for that work the comparison is not 2x the price, it is finished versus not finished. Second, Fable 5 completes runs in fewer turns. On the spreadsheet suites, runs finish 25 to 30% faster, and fewer turns claw back a real share of the per-token premium on agentic workloads, where conversation overhead multiplies.
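The arithmetic above is easy to reproduce. The sketch below recomputes cost per benchmark point from the prices and scores quoted in this article, then estimates how much of the 2x list-price gap survives if a run uses fewer tokens. The second part rests on an assumption the figures here do not establish: that the 25 to 30% speedup translates one-for-one into fewer billed tokens. The function names are illustrative, not part of any API.

```python
# Output-token prices (USD per million tokens) and SWE-Bench Pro scores,
# as quoted in this article.
OPUS_PRICE, OPUS_SCORE = 25.0, 69.2
FABLE_PRICE, FABLE_SCORE = 50.0, 80.3


def cost_per_point(output_price: float, score: float) -> float:
    """Dollars per benchmark point, per million output tokens."""
    return output_price / score


def effective_price_ratio(list_price_ratio: float, token_reduction: float) -> float:
    """Price ratio after a fractional cut in billed tokens.

    ASSUMPTION: token usage shrinks by `token_reduction` (0.25 means 25%).
    The article reports faster runs with fewer turns, not token counts.
    """
    return list_price_ratio * (1.0 - token_reduction)


opus_cpp = cost_per_point(OPUS_PRICE, OPUS_SCORE)
fable_cpp = cost_per_point(FABLE_PRICE, FABLE_SCORE)
premium = fable_cpp / opus_cpp - 1.0

print(f"Opus 4.8: ${opus_cpp:.2f} per point")   # $0.36
print(f"Fable 5:  ${fable_cpp:.2f} per point")  # $0.62
print(f"Premium per point: {premium:.0%}")      # 72%

# Under the stated assumption, the 2x list price shrinks to 1.4x-1.5x.
for cut in (0.25, 0.30):
    print(f"{cut:.0%} fewer tokens -> {effective_price_ratio(2.0, cut):.2f}x")
```

Under that assumption the effective premium falls from 2x to somewhere between 1.4x and 1.5x. Treat it as an upper bound on the savings, since faster runs do not necessarily mean proportionally fewer tokens.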
Where Opus 4.8 still makes sense
Opus 4.8 does not become obsolete today. At $5/$25 it remains the value pick for work that does not need the extra horizon: shorter coding tasks, drafting, review passes. It is also the model that answers when Fable 5's safety classifiers decline a query, so for cybersecurity-adjacent or biology-adjacent topics you may prefer to start there and skip the reroute. And below both, Sonnet 4.6 at $3/$15 still covers high-volume production work.
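To see how the three price tiers play out on a concrete job, here is a minimal cost estimator built on the per-million-token prices quoted in this article. The workload size (2 million input tokens, 500,000 output tokens) is a hypothetical chosen for illustration, and the sketch assumes every model would consume the same number of tokens, which will not hold in practice.

```python
# (input, output) prices in USD per million tokens, as quoted in this article.
PRICES = {
    "Fable 5": (10.0, 50.0),
    "Opus 4.8": (5.0, 25.0),
    "Sonnet 4.6": (3.0, 15.0),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one job at list prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload, identical token counts for every model.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

costs = {m: job_cost(m, INPUT_TOKENS, OUTPUT_TOKENS) for m in PRICES}
for model, cost in costs.items():
    print(f"{model}: ${cost:.2f}")
# Fable 5: $45.00
# Opus 4.8: $22.50
# Sonnet 4.6: $13.50
```

Swapping in your own token counts per model, once you have measured them, turns this into a like-for-like comparison.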
Bottom line
If your workload involves long autonomous runs, hard migrations, or analysis that spans hours, the math favors Fable 5 despite the premium. If you mostly do short, interactive tasks, Opus 4.8 keeps most of the capability at half the price. For what the new model is and how the two-tier release works, start with our full Fable 5 explainer; for access routes and the June 22 subscription window, see the pricing and access guide.