Comparison of how well various AI tools perform web programming tasks in the code space
| Rank | Rank Spread (Upper-Lower) | Model | Score | 95% CI (±) | Votes | Organization | License |
|---|---|---|---|---|---|---|---|
| 1 | 1◄─►1 | claude-opus-4-5-20251101-thinking-32k | 1510 | +10/-10 | 6,717 | Anthropic | Proprietary |
| 2 | 2◄─►4 | claude-opus-4-5-20251101 | 1478 | +10/-10 | 6,326 | Anthropic | Proprietary |
| 3 | 2◄─►4 | gpt-5.2-high | 1477 | +16/-16 | 1,691 | OpenAI | Proprietary |
| 4 | 2◄─►5 | gemini-3-pro | 1467 | +8/-8 | 13,138 | Google | Proprietary |
| 5 | 4◄─►6 | gemini-3-flash | 1450 | +9/-9 | 6,563 | Google | Proprietary |
| 6 | 5◄─►6 | glm-4.7 | 1447 | +10/-10 | 4,833 | Z.ai | MIT |
| 7 | 7◄─►9 | minimax-m2.1-preview | 1422 | +9/-9 | 6,387 | MiniMax | MIT |
| 8 | 7◄─►10 | gemini-3-flash (thinking-minimal) | 1416 | +10/-10 | 4,649 | Google | Proprietary |
| 9 | 7◄─►14 | gpt-5.2 | 1401 | +15/-15 | 1,628 | OpenAI | Proprietary |
| 10 | 8◄─►14 | gpt-5-medium | 1398 | +12/-12 | 3,928 | OpenAI | Proprietary |
| 11 | 9◄─►15 | gpt-5.1-medium | 1393 | +9/-9 | 6,587 | OpenAI | Proprietary |
| 12 | 9◄─►14 | claude-sonnet-4-5-20250929-thinking-32k | 1393 | +8/-8 | 10,271 | Anthropic | Proprietary |
| 13 | 9◄─►15 | claude-opus-4-1-20250805 | 1391 | +8/-8 | 9,118 | Anthropic | Proprietary |
| 14 | 9◄─►15 | claude-sonnet-4-5-20250929 | 1386 | +8/-8 | 11,837 | Anthropic | Proprietary |
| 15 | 12◄─►17 | deepseek-v3.2-thinking | 1373 | +12/-12 | 2,996 | DeepSeek | MIT |
| 16 | 15◄─►18 | glm-4.6 | 1361 | +8/-8 | 8,883 | Z.ai | MIT |
| 17 | 15◄─►18 | gpt-5.1 | 1356 | +8/-8 | 9,179 | OpenAI | Proprietary |
| 17 | 14◄─►19 | mimo-v2-flash | 1337 | +18/-18 | 1,039 | Xiaomi | MIT |
| 18 | 16◄─►20 | mimo-v2-flash (non-thinking) | 1343 | +11/-11 | 3,215 | Xiaomi | MIT |
| 19 | 18◄─►20 | kimi-k2-thinking-turbo | 1337 | +8/-8 | 8,901 | Moonshot | Modified MIT |
| 20 | 18◄─►21 | gpt-5.1-codex | 1335 | +9/-9 | 6,659 | OpenAI | Proprietary |
| 21 | 20◄─►21 | minimax-m2 | 1318 | +8/-8 | 8,990 | MiniMax | Apache 2.0 |
| 22 | 22◄─►25 | claude-haiku-4-5-20251001 | 1297 | +8/-8 | 10,012 | Anthropic | Proprietary |
| 23 | 22◄─►25 | deepseek-v3.2 | 1295 | +11/-11 | 3,932 | DeepSeek | MIT |
| 24 | 22◄─►25 | deepseek-v3.2-exp | 1291 | +10/-10 | 5,127 | DeepSeek | MIT |
| 25 | 22◄─►26 | qwen3-coder-480b-a35b-instruct | 1286 | +8/-8 | 9,832 | Alibaba | Apache 2.0 |
| 26 | 25◄─►27 | KAT-Coder-Pro-V1 | 1264 | +15/-15 | 1,956 | KwaiKAT | Proprietary |
| 27 | 26◄─►29 | gpt-5.1-codex-mini | 1248 | +17/-17 | 1,538 | OpenAI | Proprietary |
| 28 | 27◄─►30 | grok-4-1-fast-reasoning | 1235 | +12/-12 | 4,424 | xAI | Proprietary |
| 29 | 27◄─►31 | mistral-large-3 | 1226 | +20/-20 | 1,038 | Mistral | Apache 2.0 |
| 30 | 29◄─►31 | gemini-2.5-pro | 1210 | +13/-13 | 3,454 | Google | Proprietary |
| 31 | 28◄─►31 | grok-4.1-thinking | 1209 | +19/-19 | 1,265 | xAI | Proprietary |
| 32 | 32◄─►33 | grok-4-fast-reasoning | 1157 | +22/-22 | 970 | xAI | Proprietary |
| 33 | 32◄─►34 | grok-code-fast-1 | 1144 | +21/-21 | 1,015 | xAI | Proprietary |
| 34 | 33◄─►34 | devstral-medium-2507 | 1102 | +22/-22 | 1,020 | Mistral | Proprietary |
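
The table does not say how the Rank Spread column is computed. One plausible reading, offered here only as an assumption, is that it comes from overlapping confidence intervals: a model's best possible rank counts only the models whose entire interval lies above its own, and its worst possible rank counts every model whose interval reaches above its lower bound. The sketch below applies that rule to the top six rows of the table; the names `Entry` and `rank_spread` are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    model: str
    score: float
    ci: float  # symmetric half-width of the 95% interval, e.g. 10 for +10/-10


def rank_spread(entries: list[Entry]) -> dict[str, tuple[int, int]]:
    """Return {model: (best_rank, worst_rank)} under the interval-overlap assumption."""
    spreads = {}
    for e in entries:
        low, high = e.score - e.ci, e.score + e.ci
        others = [o for o in entries if o is not e]
        # Best case: only models whose whole interval sits above ours outrank us.
        best = 1 + sum(1 for o in others if o.score - o.ci > high)
        # Worst case: any model whose interval reaches above our lower bound may outrank us.
        worst = 1 + sum(1 for o in others if o.score + o.ci > low)
        spreads[e.model] = (best, worst)
    return spreads


# Top six rows of the table (score and CI half-width as displayed).
top_six = [
    Entry("claude-opus-4-5-20251101-thinking-32k", 1510, 10),
    Entry("claude-opus-4-5-20251101", 1478, 10),
    Entry("gpt-5.2-high", 1477, 16),
    Entry("gemini-3-pro", 1467, 8),
    Entry("gemini-3-flash", 1450, 9),
    Entry("glm-4.7", 1447, 10),
]

for model, (best, worst) in rank_spread(top_six).items():
    print(f"{model}: {best}-{worst}")
```

On these six rows the rule reproduces the listed spreads for every model except gemini-3-pro, where it gives 2 to 4 rather than the 2◄─►5 shown. The displayed intervals of gemini-3-pro and gemini-3-flash meet exactly at 1459, so the difference may come from rounding in the displayed scores, or the leaderboard may use a different method altogether.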