Model              Provider · Details          SWE-bench Verified  LiveCodeBench v6  Terminal-Bench 2.0  Avg Score  Position
Claude Sonnet 4.5  Anthropic · 200K ctx        77.2%               71.4%             48.0%               65.5%      #4
Claude Opus 4.5    Anthropic · 200K ctx        80.9%               87.1%             59.3%               75.8%      #1
Qwen 3.5 397B      Alibaba · MoE 17B active    76.4%               83.6%             52.5%               70.8%      #2
Qwen 3.6 35B       Alibaba · MoE 3B active     73.4%               80.4%             51.5%               68.4%      #3
[Figure: timeline with axis labels Sept 2025, Nov 2025, Feb 2026, April 2026; legend: Anthropic Models, Qwen Models]
Source: BenchLM, HuggingFace, llm-stats.com · 2026
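
The source does not say how Avg Score and Position are derived. The figures are consistent with Avg Score being the unweighted mean of the three benchmark scores, rounded to one decimal, and Position being the rank by that mean. The sketch below recomputes both under that assumption. The names `SCORES`, `average_scores`, and `rank_models` are illustrative and do not come from the source.

```python
# Scores copied from the comparison above, in the order:
# (SWE-bench Verified, LiveCodeBench v6, Terminal-Bench 2.0)
SCORES = {
    "Claude Sonnet 4.5": (77.2, 71.4, 48.0),
    "Claude Opus 4.5": (80.9, 87.1, 59.3),
    "Qwen 3.5 397B": (76.4, 83.6, 52.5),
    "Qwen 3.6 35B": (73.4, 80.4, 51.5),
}


def average_scores(scores):
    """Unweighted mean per model, rounded to one decimal (assumed method)."""
    return {
        model: round(sum(values) / len(values), 1)
        for model, values in scores.items()
    }


def rank_models(averages):
    """Model names ordered from highest to lowest average."""
    return sorted(averages, key=averages.get, reverse=True)


averages = average_scores(SCORES)
ranking = rank_models(averages)

for position, model in enumerate(ranking, start=1):
    print(f"#{position} {model}: {averages[model]:.1f}%")
```

Under this assumption the recomputed averages and ordering match the listed Avg Score and Position values. A weighted scheme would need its weights stated by the source.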