Introducing Claude Sonnet 4.6
Published: February 17, 2026
Overview
Claude Sonnet 4.6 represents Anthropic's most capable Sonnet model to date. This upgrade enhances capabilities across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. The model features a 1M token context window currently in beta.
For Free and Pro plan users, Claude Sonnet 4.6 is now the default model in claude.ai and Claude Cowork. Pricing remains consistent with Sonnet 4.5 at $3/$15 per million tokens.
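At $3 per million input tokens and $15 per million output tokens, per-request costs are easy to estimate. A minimal sketch (the token counts in the example are illustrative):

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate a request's cost at Sonnet 4.6 list pricing:
    $3 per million input tokens, $15 per million output tokens."""
    INPUT_PER_MTOK = 3.00
    OUTPUT_PER_MTOK = 15.00
    return (
        (input_tokens / 1_000_000) * INPUT_PER_MTOK
        + (output_tokens / 1_000_000) * OUTPUT_PER_MTOK
    )

# e.g. a 200k-token context producing a 4k-token reply:
cost = estimate_cost_usd(200_000, 4_000)
print(f"${cost:.2f}")  # → $0.66
```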
Key Improvements
Coding Performance
Developers with early access strongly prefer Sonnet 4.6 over its predecessor, with many even favoring it over Claude Opus 4.5 from November 2025. The model demonstrates superior consistency and instruction-following capabilities.
Performance previously requiring Opus-class models—including real-world office tasks—is now accessible through Sonnet 4.6. The model shows major improvements in computer use skills compared to earlier Sonnet versions.
Safety Evaluation
Extensive safety testing indicates Sonnet 4.6 is as safe or safer than other recent Claude models. Researchers noted the model displays "very strong safety behaviors" with no major misalignment concerns.
Computer Use Capabilities
In October 2024, Anthropic introduced the first general-purpose computer-using model. Sixteen months of development show steady progress on OSWorld benchmarks, which test AI performance across real software like Chrome, LibreOffice, and VS Code.
Early users report near-human capability on complex tasks such as navigating spreadsheets and completing multi-step web forms. While the model still trails skilled human users, the rate of improvement is substantial.
Prompt Injection Resistance
Computer use poses security risks through prompt injection attacks. Safety evaluations show Sonnet 4.6 significantly improves resistance compared to Sonnet 4.5 and performs similarly to Opus 4.6.
Benchmark Performance
Claude Sonnet 4.6 approaches Opus-level intelligence while maintaining a more practical price point. Key improvements include:
- Claude Code testing: Users preferred Sonnet 4.6 over Sonnet 4.5 approximately 70% of the time
- Comparison to Opus 4.5: Users preferred Sonnet 4.6 59% of the time, citing better instruction following and fewer hallucinations
- Long-context reasoning: The 1M token window enables effective reasoning across entire codebases, lengthy contracts, and multiple research papers
- Vending-Bench Arena: Sonnet 4.6 developed sophisticated business strategy, investing heavily early then pivoting to profitability
Design and Frontend Quality
Customers report notably more polished visual outputs from Sonnet 4.6, featuring superior layouts, animations, and design sensibility. Fewer iteration rounds are needed to reach production-quality results.
Product Updates
Platform Features
- Adaptive thinking and extended thinking support
- Context compaction in beta, which automatically summarizes older context
- Enhanced web search and fetch tools with automatic filtering and code execution
- Memory tool, programmatic tool calling, tool search, and tool use examples now generally available
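Extended thinking is configured per request. Below is a minimal sketch of a Messages API payload with a thinking budget; the field names follow Anthropic's public Messages API, but the specific values (model ID, budgets, prompt) are illustrative:

```python
# Sketch of a Messages API request payload with extended thinking enabled.
request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 8192,
    # Extended thinking: reserve a token budget for the model's
    # internal reasoning before it produces the final answer.
    "thinking": {"type": "enabled", "budget_tokens": 4096},
    "messages": [
        {"role": "user", "content": "Summarize the key risks in this contract."}
    ],
}

# The thinking budget must stay below max_tokens, since thinking
# tokens count toward the response limit.
assert request["thinking"]["budget_tokens"] < request["max_tokens"]
```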
Excel Integration
Claude in Excel now supports MCP connectors, enabling integration with tools like S&P Global, LSEG, Daloopa, PitchBook, Moody's, and FactSet. MCP connections configured in Claude.ai automatically work in Excel.
Recommendation
Opus 4.6 remains optimal for tasks demanding the deepest reasoning, such as codebase refactoring, multi-agent coordination, and precision-critical work. Sonnet 4.6 offers strong performance across thinking effort levels.
Availability
Claude Sonnet 4.6 is available across:
- All Claude plans
- Claude Cowork
- Claude Code
- Claude API (using claude-sonnet-4-6)
- Major cloud platforms
- Free tier (now includes file creation, connectors, skills, and compaction)
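For API access, requests target the Messages endpoint with the model ID above. The sketch below builds (but does not send) such a request using only the standard library; the endpoint and header names follow Anthropic's documented public API, and `YOUR_API_KEY` is a placeholder:

```python
import json
import urllib.request

# Request body naming the model ID from the availability list.
payload = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, Sonnet 4.6"}],
}

# Construct the HTTP request; call urllib.request.urlopen(req) to send it.
req = urllib.request.Request(
    "https://api.anthropic.com/v1/messages",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "x-api-key": "YOUR_API_KEY",
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    },
    method="POST",
)
```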
Customer Testimonials
Industry leaders report significant improvements:
- Databricks: Matches Opus 4.6 on document comprehension tasks
- Replit: Extraordinary performance-to-cost ratio for agentic workloads
- Cursor: Notable improvements on long-horizon tasks
- GitHub: Excels at complex code fixes across large codebases
- Cognition: Meaningfully closed gap with Opus on bug detection
- Pace: Achieved 94% on insurance benchmark for computer use
Methodology Notes
Benchmark comparisons reference the best available API versions. OSWorld tests specific controlled tasks but does not fully capture real-world messiness. Terminal-Bench 2.0 uses the Terminus-2 harness with a 1x guaranteed/3x ceiling resource allocation. SWE-bench Verified scores are averaged over 10 trials. Tool evaluations used web search, fetch, code execution, and various reasoning configurations as specified.