A comparison of how well different AI tools perform web development tasks in the coding domain.
| Rank | Rank Spread (Upper-Lower) | Model | Score | 95% CI (±) | Votes | Organization | License |
|---|---|---|---|---|---|---|---|
| 1 | 1◄─►1 | claude-opus-4-5-20251101-thinking-32k | 1504 | +10/-10 | 7,543 | Anthropic | Proprietary |
| 2 | 2◄─►5 | gpt-5.2-high | 1475 | +16/-16 | 1,691 | OpenAI | Proprietary |
| 3 | 2◄─►5 | claude-opus-4-5-20251101 | 1467 | +9/-9 | 7,900 | Anthropic | Proprietary |
| 4 | 2◄─►6 | gemini-3-pro | 1462 | +8/-8 | 14,043 | Google | Proprietary |
| 5 | 2◄─►6 | gemini-3-flash | 1454 | +9/-9 | 8,389 | Google | Proprietary |
| 6 | 4◄─►6 | glm-4.7 | 1445 | +10/-10 | 5,650 | Z.ai | MIT |
| 7 | 7◄─►10 | minimax-m2.1-preview | 1414 | +9/-9 | 7,201 | MiniMax | MIT |
| 8 | 7◄─►10 | gemini-3-flash (thinking-minimal) | 1412 | +10/-10 | 5,430 | Google | Proprietary |
| 9 | 7◄─►15 | gpt-5.2 | 1399 | +15/-15 | 1,632 | OpenAI | Proprietary |
| 10 | 7◄─►15 | gpt-5-medium | 1397 | +12/-12 | 3,929 | OpenAI | Proprietary |
| 11 | 9◄─►15 | gpt-5.1-medium | 1392 | +9/-9 | 6,594 | OpenAI | Proprietary |
| 12 | 9◄─►15 | claude-opus-4-1-20250805 | 1392 | +8/-8 | 9,124 | Anthropic | Proprietary |
| 13 | 9◄─►15 | claude-sonnet-4-5-20250929-thinking-32k | 1390 | +8/-8 | 11,001 | Anthropic | Proprietary |
| 14 | 9◄─►15 | claude-sonnet-4-5-20250929 | 1386 | +8/-8 | 12,662 | Anthropic | Proprietary |
| 15 | 9◄─►16 | deepseek-v3.2-thinking | 1377 | +11/-11 | 3,552 | DeepSeek | MIT |
| 16 | 15◄─►19 | glm-4.6 | 1358 | +8/-8 | 8,890 | Z.ai | MIT |
| 17 | 14◄─►19 | mimo-v2-flash | 1337 | +18/-18 | 1,039 | Xiaomi | MIT |
| 17 | 16◄─►19 | gpt-5.1 | 1355 | +8/-8 | 9,917 | OpenAI | Proprietary |
| 18 | 16◄─►20 | mimo-v2-flash (non-thinking) | 1351 | +10/-10 | 3,943 | Xiaomi | MIT |
| 19 | 16◄─►21 | gpt-5.2-codex | 1344 | +13/-13 | 2,500 | OpenAI | Proprietary |
| 20 | 18◄─►21 | gpt-5.1-codex | 1334 | +9/-9 | 6,661 | OpenAI | Proprietary |
| 21 | 19◄─►21 | kimi-k2-thinking-turbo | 1333 | +8/-8 | 9,556 | Moonshot | Modified MIT |
| 22 | 22◄─►23 | minimax-m2 | 1316 | +8/-8 | 8,997 | MiniMax | Apache 2.0 |
| 23 | 22◄─►26 | deepseek-v3.2 | 1299 | +10/-10 | 4,581 | DeepSeek | MIT |
| 24 | 23◄─►26 | claude-haiku-4-5-20251001 | 1298 | +8/-8 | 10,767 | Anthropic | Proprietary |
| 25 | 23◄─►26 | deepseek-v3.2-exp | 1289 | +10/-10 | 5,133 | DeepSeek | MIT |
| 26 | 23◄─►26 | qwen3-coder-480b-a35b-instruct | 1287 | +8/-8 | 10,516 | Alibaba | Apache 2.0 |
| 27 | 27◄─►29 | KAT-Coder-Pro-V1 | 1262 | +15/-15 | 1,956 | KwaiKAT | Proprietary |
| 28 | 27◄─►30 | gpt-5.1-codex-mini | 1247 | +17/-17 | 1,538 | OpenAI | Proprietary |
| 29 | 27◄─►30 | grok-4-1-fast-reasoning | 1240 | +11/-11 | 5,127 | xAI | Proprietary |
| 30 | 28◄─►32 | mistral-large-3 | 1225 | +20/-20 | 1,037 | Mistral | Apache 2.0 |
| 31 | 30◄─►32 | gemini-2.5-pro | 1209 | +13/-13 | 3,454 | Google | Proprietary |
| 32 | 30◄─►32 | grok-4.1-thinking | 1208 | +19/-19 | 1,266 | xAI | Proprietary |
| 33 | 33◄─►34 | grok-4-fast-reasoning | 1156 | +22/-22 | 970 | xAI | Proprietary |
| 34 | 33◄─►35 | grok-code-fast-1 | 1143 | +21/-21 | 1,017 | xAI | Proprietary |
| 35 | 34◄─►35 | devstral-medium-2507 | 1101 | +22/-22 | 1,020 | Mistral | Proprietary |
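
One rough way to read the Score and 95% CI columns together is to check whether two models' intervals overlap. The sketch below assumes the ± value is a symmetric half-width around the score. The `Entry` type and the `interval` and `intervals_overlap` helpers are hypothetical names introduced for illustration, and an overlap check is only a heuristic, not a formal significance test or the leaderboard's own ranking method.

```python
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    """One leaderboard row (hypothetical helper type, not part of the source)."""
    model: str
    score: float
    ci: float  # assumed half-width of the 95% interval, from the "95% CI (±)" column


def interval(entry: Entry) -> Tuple[float, float]:
    """Return the (lower, upper) bounds of the entry's score interval."""
    return (entry.score - entry.ci, entry.score + entry.ci)


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True when the two score intervals share at least one point."""
    a_lo, a_hi = interval(a)
    b_lo, b_hi = interval(b)
    return a_lo <= b_hi and b_lo <= a_hi


# Values copied from the first three rows of the table.
top = Entry("claude-opus-4-5-20251101-thinking-32k", 1504, 10)
second = Entry("gpt-5.2-high", 1475, 16)
third = Entry("claude-opus-4-5-20251101", 1467, 9)

if __name__ == "__main__":
    print(intervals_overlap(top, second))    # False: 1494..1514 vs 1459..1491
    print(intervals_overlap(second, third))  # True: 1459..1491 vs 1458..1476
```

Under this reading, the top entry's interval is separated from the second entry's, while ranks 2 and 3 overlap. That matches the Rank Spread column for these three rows (1◄─►1 for the top entry, 2◄─►5 for the next two).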