Anthropic doesn't do attention-grabbing version bumps. Opus 4.7 became generally available on April 16, 2026[2], with a release note that read like a routine point release: better coding, better vision, same pricing. Over the last three weeks the rollout has landed on Bedrock[5] and Microsoft Foundry, and the Claude Security beta has been switched on for Enterprise customers. After a few weeks of use across real workloads, the actual upgrade is more substantial than the marketing copy suggests.
Let me break down what changed.
The headline benchmark
*Source: anthropic.com / vellum.ai*
SWE-bench Verified — Anthropic's preferred coding benchmark — climbs from 80.8% on Opus 4.6 to 87.6% on Opus 4.7[4], a 6.8-point gain. That is the biggest single-version jump in the Opus 4 family, bigger than 4.5 → 4.6 (4.4pp) and bigger than 4.4 → 4.5 (3.9pp). The 4.7 release narrows the gap between "shippable for routine tasks" and "shippable for hard ones" more than any prior point release.
The benchmark is not the story, though. The story is consistency over long sessions.
Long-running task quality is the real upgrade
Anthropic's own framing is that Opus 4.7 "handles complex, long-running tasks with rigor and consistency"[1]. That is engineering-marketing for "doesn't go off the rails after 30 minutes of context". In practice, I have seen this hold up across:
- Multi-hour agent runs that previously degraded into looping or hallucinating mid-session
- Long debugging sessions where 4.6 would lose the thread after ~80 messages
- Document-analysis runs over 500-page PDFs where 4.6 would start confusing sections
The model is not magically smarter on first-message tasks. It is meaningfully better on the 11th message of a 12-message thread.
1M context becomes the standard tier
The other significant API change: Opus 4.7 ships with 1 million tokens of context at standard pricing[3], not as a beta. Previously the 1M tier required a separate access flag and beta pricing. Now it is the default.
Practical implication: workloads that previously had to chunk and embed-and-retrieve can just dump everything into context. This is meaningful for codebase-analysis tasks, long-form contract review, and long agent loops. 1M tokens covers about 750k words — roughly 7 average novels.
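What "just dump everything into context" looks like in practice can be sketched with a minimal repo-packing helper. The 4-characters-per-token estimate and the `.py`/`.md` filter are my assumptions for illustration; use a real tokenizer in production.

```python
from pathlib import Path

CONTEXT_BUDGET = 1_000_000   # Opus 4.7's standard-tier context window, in tokens
CHARS_PER_TOKEN = 4          # rough heuristic; swap in a real tokenizer for accuracy

def pack_codebase(root: str, exts: tuple[str, ...] = (".py", ".md")) -> str:
    """Concatenate a repo into one prompt string, stopping before the token budget."""
    parts: list[str] = []
    used = 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in exts:
            continue
        text = path.read_text(errors="replace")
        est_tokens = len(text) // CHARS_PER_TOKEN + 1
        if used + est_tokens > CONTEXT_BUDGET:
            break  # with a 1M window, this rarely triggers on mid-size repos
        parts.append(f"### {path}\n{text}")
        used += est_tokens
    return "\n\n".join(parts)
```

The packed string goes into a single user message: no chunking, no vector store, no retrieval pipeline to maintain.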
Vision finally usable for screenshots
| Spec | 4.6 | 4.7 |
|---|---|---|
| Max image resolution | 1568px / 1.5MP | 2576px / 3.75MP |
| SWE-bench Verified | 80.8% | 87.6% |
| Context window | 200k (1M beta) | 1M standard |
| Security scanning | ❌ | ✅ (Claude Security public beta) |
| Pricing | $5 / $25 | $5 / $25 (unchanged) |
The vision upgrade is the headline I care most about. Max image resolution went from 1568px (1.5MP) to 2576px (3.75MP)[3]. That is the difference between "can read a screenshot of a webpage" and "can read the small print in a screenshot of a webpage".
Concretely:
- 4.6: paste in a Figma export at full res, model reduces it before OCR, gets things wrong
- 4.7: model can read 14px text in a screenshot reliably
This makes Opus 4.7 the first Claude version genuinely usable for UI debugging — paste a screenshot, ask "what is broken about this layout", get a specific answer.
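Taking advantage of the new ceiling mostly means not pre-shrinking screenshots before upload. The sketch below builds a Messages-API base64 image content block and warns when a PNG exceeds the 3.75MP budget; the exact downscaling behavior at the cap is my assumption, and the dimension check reads the PNG IHDR header directly.

```python
import base64
import struct

MAX_PIXELS = 3_750_000  # Opus 4.7's vision cap per the release notes (3.75MP)

def png_dimensions(data: bytes) -> tuple[int, int]:
    # PNG stores width and height as big-endian uint32s at byte offsets 16 and
    # 20, inside the IHDR chunk that follows the 8-byte file signature.
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height

def image_block(data: bytes) -> dict:
    """Build a Messages-API image content block from raw PNG bytes,
    warning if the screenshot exceeds the model's pixel budget."""
    w, h = png_dimensions(data)
    if w * h > MAX_PIXELS:
        print(f"warning: {w}x{h} = {w * h / 1e6:.1f}MP exceeds cap; "
              "expect downscaling before OCR")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": "image/png",
            "data": base64.b64encode(data).decode("ascii"),
        },
    }
```

A 2560×1440 screenshot is ~3.7MP, so full-resolution desktop captures now fit under the cap where 4.6 would have downscaled them.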
Claude Security in public beta
Bundled with the 4.7 rollout was Claude Security — code vulnerability scanning with proposed fixes[1]. This is Anthropic's first explicit play in the dev-security space. Available to Claude Enterprise customers in public beta.
Two things stand out:
- It is real scanning, not a model wrapper. Anthropic has clearly invested in the static-analysis side and is using Opus 4.7 to triage and explain findings rather than to find them.
- The proposed-fix flow is novel. Most security scanners surface findings; Claude Security surfaces a finding plus a Claude-generated patch you can apply directly.
For shops already on Claude Enterprise, this is a free upgrade.
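Claude Security's actual response format is not documented in the release notes; the sketch below uses a hypothetical finding shape purely to illustrate what a triage step in front of the proposed-fix flow might look like.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """Hypothetical finding shape — not Claude Security's real schema."""
    file: str
    line: int
    severity: str        # "low" | "medium" | "high" | "critical"
    summary: str
    proposed_patch: str  # model-generated diff, per the proposed-fix flow

def triage(findings: list[Finding], min_severity: str = "high") -> list[Finding]:
    """Keep only findings at or above a severity floor, worst first."""
    rank = {"low": 0, "medium": 1, "high": 2, "critical": 3}
    kept = [f for f in findings if rank[f.severity] >= rank[min_severity]]
    return sorted(kept, key=lambda f: rank[f.severity], reverse=True)
```

The point of the gate is workflow, not cleverness: only findings that survive triage get their proposed patch surfaced for human review.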
Pricing — still $5/$25
*Source: vendor docs*
Pricing held at $5 per million input tokens, $25 per million output tokens — same as Opus 4.5 and 4.6[2]. That is still premium pricing, but no longer the clear top of the market: GPT-5.5 Instant is at $5 input / $30 output (same on input, more expensive on output), and Gemini 4 Pro is at $3/$15 (cheapest of the three).
The simple read: Opus 4.7 is no longer the obvious premium-priced option. It is priced in line with GPT-5.5 Instant on input and slightly cheaper on output. For coding-heavy workloads where output dominates, Opus 4.7 is now the same price as GPT-5.5 Instant or cheaper.
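The output-dominance point is easy to verify with arithmetic, using the published per-million rates and a hypothetical output-heavy session shape of my choosing:

```python
def cost_usd(input_tokens: int, output_tokens: int,
             in_rate: float, out_rate: float) -> float:
    """Per-request cost in dollars, given per-million-token rates."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A hypothetical coding session: 100k tokens in, 400k tokens out.
opus = cost_usd(100_000, 400_000, 5, 25)   # $0.50 + $10.00 = $10.50
gpt55 = cost_usd(100_000, 400_000, 5, 30)  # $0.50 + $12.00 = $12.50
```

The more output-skewed the workload, the wider that gap gets, since the two models only differ on the output rate.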
For mixed workloads, the right answer is a router: use a multi-provider gateway to send routine queries to Gemini 4 Pro and hard ones to Opus 4.7. That is the pattern most production AI shops have settled on by mid-2026.
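A minimal version of that routing pattern might look like this; the hardness heuristics and the model ID strings are purely illustrative (the actual 4.7 model string is not given in the release notes).

```python
CHEAP = "gemini-4-pro"       # hypothetical model ID, for illustration
PREMIUM = "claude-opus-4-7"  # hypothetical model ID, for illustration

# Crude keyword heuristics; a production router would use a classifier.
HARD_MARKERS = ("stack trace", "refactor", "debug", "prove", "vulnerability")

def route(prompt: str, expected_output_tokens: int = 500) -> str:
    """Send long or hard-looking prompts to the premium model, the rest to the cheap one."""
    looks_hard = any(marker in prompt.lower() for marker in HARD_MARKERS)
    if looks_hard or expected_output_tokens > 2_000:
        return PREMIUM
    return CHEAP
```

So `route("summarize this changelog")` goes to the cheap model, while `route("debug this stack trace")` goes to Opus 4.7.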
What this means if you build with Claude
Three concrete suggestions:
- Default to 4.7 for agents. If you have a Claude-based agent in production on 4.6, upgrade. The long-context consistency and the 1M-standard tier are both worth the deploy.
- Use vision for the first time. If you previously gave up on Claude vision because the resolution was insufficient, try again. 3.75MP is enough for most real screenshots.
- Pilot Claude Security if you are on Enterprise. The proposed-fix flow is genuinely different from existing SAST tools.
Opus 4.7 is the kind of upgrade that does not change what is possible but moves the needle on production reliability — and reliability is what determines whether AI features ship.