AI LLM Leaderboard (Artificial Analysis LLM Ranking)


The AI LLM Leaderboard from 值品工具箱 aggregates data from Artificial Analysis and tracks and ranks more than 500 mainstream large language models, refreshed every 24 hours.

AI LLM Leaderboard Data Center

Rank Model Overall Index Coding Price ($/1M tokens)
1 Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) 59.9 76.5 $20
2 Claude Opus 4.8 (Adaptive Reasoning, Max Effort) 55.7 74.3 $10
3 GPT-5.5 (xhigh) 54.8 74.9 $11.25
4 Claude Opus 4.7 (Adaptive Reasoning, Max Effort) 53.5 73.6 $10
5 GPT-5.5 (high) 53.1 71.6 $11.25
6 GPT-5.4 (xhigh) 51.4 71.1 $5.625
7 GLM-5.2 (max) 51.1 68.8 $2.15
8 GPT-5.5 (medium) 50.4 71.5 $11.25
9 Gemini 3.5 Flash (high) 50.2 70.1 $3.375
10 Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) 47.2 63 $6
11 Gemini 3.1 Pro Preview 46.5 68.8 $4.5
12 Qwen3.7 Max 46 66 $3.75
13 Gemini 3.5 Flash (medium) 45.4 - $3.375
14 MiniMax-M3 44.4 58.6 $0.525
15 GPT-5.3 Codex (xhigh) 44.3 - $4.813
16 DeepSeek V4 Pro (Reasoning, Max Effort) 44.3 59.4 $0.544
17 Claude Opus 4.6 (Adaptive Reasoning, Max Effort) 43.7 - $10
18 GPT-5.5 (low) 43.5 60.9 $11.25
19 Muse Spark 43.1 58.6 $0
20 Kimi K2.6 42.8 56 $1.712
21 Claude Opus 4.7 (Non-reasoning, High Effort) 42.7 - $10
22 MiMo-V2.5-Pro 42.2 60.2 $0.544
23 GPT-5.2 (xhigh) 42.2 - $4.813
24 Kimi K2.7 Code 41.9 60.8 $1.712
25 Nex-N2-Pro 41 59.1 $1
26 DeepSeek V4 Pro (Reasoning, High Effort) 40.8 - $0.544
27 Claude Opus 4.5 (Reasoning) 40.8 - $10
28 DeepSeek V4 Flash (Reasoning, Max Effort) 40.3 56.2 $0.175
29 MiMo-V2-Pro 40.3 - $1.5
30 GLM-5.1 (Reasoning) 40.2 55.8 $2.15
31 MiMo-V2.5 40.1 - $0.175
32 GPT-5.2 Codex (xhigh) 40.1 - $4.813
33 GPT-5.4 mini (xhigh) 40 56.1 $1.688
34 Qwen3.6 Max Preview 40 - $2.925
35 Grok Build 0.1 0616 39.8 51.5 $1.25
36 Qwen3.6 Plus 39.6 54.5 $1.125
37 Gemini 3 Pro Preview (high) 39.6 - $4.5
38 GLM-5 (Reasoning) 39.5 - $1.55
39 GPT-5.4 (low) 39.1 - $5.906
40 Qwen3.7 Plus 39 55.9 $0.59
41 GPT-5.1 (high) 38.9 - $3.438
42 GPT-5.4 nano (xhigh) 38.2 56.1 $0.463
43 MiniMax-M2.7 38.1 52.6 $0.525
44 GLM-5-Turbo 38.1 - $0
45 Kimi K2.5 (Reasoning) 38.1 - $1.185
46 GPT-5.2 (medium) 38 - $4.813
47 Nemotron 3 Ultra 550B A55B (Reasoning) 37.8 49.3 $1.175
48 Gemini 3 Flash Preview (Reasoning) 37.8 - $1.125
49 Claude Opus 4.6 (Non-reasoning, High Effort) 37.8 - $10
50 Grok 4.3 (high) 37.6 42.2 $1.563
51 DeepSeek V4 Flash (Reasoning, High Effort) 37.4 - $0.175
52 Qwen3.6 27B (Reasoning) 37.1 53.7 $1.35
53 Grok 4.20 0309 v2 (Reasoning) 37 - $3
54 Grok 4.20 0309 (Reasoning) 36.5 - $3
55 MiMo-V2-Omni-0327 36.4 - $0.8
56 GPT-5 (high) 36.1 - $3.438
57 GPT-5 Codex (high) 36.1 - $3.438
58 Grok 4.3 (medium) 36 - $1.563
59 Claude Sonnet 4.6 (Non-reasoning, High Effort) 35.9 - $6
60 GPT-5.5 (Non-reasoning) 35.4 56.5 $11.25
61 Grok 4.3 (low) 35.4 - $1.563
62 KAT Coder Pro V2 35.4 - $0.525
63 GLM-5.1 (Non-reasoning) 35.4 - $2.15
64 MiMo-V2-Omni 35 - $0
65 Gemini 3.5 Flash (minimal) 34.9 - $3.375
66 GPT-5.1 Codex (high) 34.7 - $3.438
67 Claude 4.5 Sonnet (Reasoning) 34.7 - $6
68 Claude Opus 4.5 (Non-reasoning) 34.7 - $10
69 Kimi K2.6 (Non-reasoning) 34.6 - $1.712
70 KAT-Coder-Pro V1 34.6 58.9 $0.525
71 GLM 5V Turbo (Reasoning) 34.5 - $0
72 Claude Sonnet 4.6 (Non-reasoning, Low Effort) 34.3 - $6
73 GLM-4.7 (Reasoning) 33.8 - $1
74 Qwen3.5 27B (Reasoning) 33.8 - $0.825
75 Qwen3.5 397B A17B (Reasoning) 33.7 48.2 $1.35
76 GPT-5 (medium) 33.7 - $3.438
77 Claude 4.1 Opus (Reasoning) 33.7 - $30
78 MiniMax-M2.5 33.7 - $0.525
79 Hy3-preview (Reasoning) 33.6 - $0.2
80 GPT-5.5 Instant (May 2026) 33.5 - $11.25
81 DeepSeek V3.2 (Reasoning) 33.4 - $0.337
82 Grok 4 33.3 - $11
83 MiMo-V2-Flash (Feb 2026) 33.2 - $0.15
84 Gemini 3 Pro Preview (low) 33.1 - $4.5
85 GPT-5 mini (high) 33 - $0.688
86 Kimi K2 Thinking 32.7 - $1.075
87 o3-pro 32.5 - $35
88 GLM-5 (Non-reasoning) 32.4 - $1.55
89 Qwen3.5 122B A10B (Reasoning) 32.3 45.7 $1.1
90 Qwen3.5 397B A17B (Non-reasoning) 32 - $1.35
91 Qwen3 Max Thinking 31.7 - $2.4
92 Qwen3.6 35B A3B (Reasoning) 31.6 41.9 $0.557
93 MiniMax-M2.1 31.4 - $0.525
94 DeepSeek V4 Pro (Non-reasoning) 31.2 - $0.544
95 GPT-5 (low) 31.2 - $3.438
96 MiMo-V2-Flash (Reasoning) 31.2 - $0.15
97 Claude 4 Opus (Reasoning) 31 - $30
98 GPT-5 mini (medium) 30.9 - $0.688
99 Qwen3.5 Omni Plus 30.6 - $1.5
100 Ring-2.6-1T 30.6 42.8 $0.85
101 GPT-5.1 Codex mini (high) 30.6 - $0.688
102 Grok 4.1 Fast (Reasoning) 30.6 - $0
103 o3 30.4 - $3.5
104 GPT-5.4 nano (medium) 30.2 - $0.463
105 Mistral Medium 3.5 29.9 46.9 $3
106 GPT-5.4 mini (medium) 29.8 - $1.688
107 Step 3.7 Flash 29.7 37.3 $0.438
108 Claude 4.5 Haiku (Reasoning) 29.6 43.9 $2
109 Gemma 4 31B (Reasoning) 29.4 43.4 $0
110 Kimi K2.5 (Non-reasoning) 29.4 - $1.2
111 Command A+ 29.3 - $0
112 Qwen3.6 27B (Non-reasoning) 29.3 - $1.35
113 Claude 4.5 Sonnet (Non-reasoning) 29.3 - $6
114 Qwen3.5 27B (Non-reasoning) 29.3 - $0.875
115 Qwen3.5 35B A3B (Reasoning) 29.3 - $0.688
116 GPT-5.5 Instant (June 2026) 28.9 39.4 $11.25
117 Claude 4 Sonnet (Reasoning) 28.9 37.6 $6
118 DeepSeek V4 Flash (Non-reasoning) 28.7 - $0.175
119 JT-35B-Flash 28.4 - $0
120 MiniMax-M2 28.3 - $0.525
121 Claude 4.1 Opus (Non-reasoning) 28.2 - $30
122 Qwen3.5 122B A10B (Non-reasoning) 28.1 - $1.1
123 MiMo-V2.5-Pro (Non-reasoning) 27.9 - $1.35
124 GPT-5.4 (Non-reasoning) 27.7 - $5.906
125 Gemini 3 Flash Preview (Non-reasoning) 27.4 - $1.125
126 Grok 4 Fast (Reasoning) 27.4 - $0.275
127 Claude 3.7 Sonnet (Reasoning) 27.1 36.4 $0
128 GLM-4.7 (Non-reasoning) 26.6 - $1
129 DeepSeek V3.1 Terminus (Reasoning) 26.3 - $1.914
130 Hy3-preview (Non-reasoning) 26.1 - $0.2
131 Ling-2.6-1T 26.1 - $0.85
132 Step 3.5 Flash 2603 26 - $0.15
133 Doubao Seed Code 26 - $0
134 GPT-5.2 (Non-reasoning) 26 - $4.813
135 Gemini 2.5 Pro 25.8 33.3 $3.438
136 Gemma 4 26B A4B (Reasoning) 25.7 39.3 $0.198
137 o4-mini (high) 25.6 - $1.925
138 Claude 4 Sonnet (Non-reasoning) 25.5 - $6
139 Claude 4 Opus (Non-reasoning) 25.5 - $30
140 Step 3.5 Flash 25.5 - $0.15
141 NVIDIA Nemotron 3 Super 120B A12B (Reasoning) 25.4 37.7 $0.412
142 DeepSeek V3.2 Exp (Reasoning) 25.4 - $0.31
143 Mercury 2 25.3 - $0.375
144 GLM-4.6 (Reasoning) 25.1 - $0.963
145 Gemini 3.1 Flash-Lite 25 34.7 $0.563
146 Qwen3.5 9B (Reasoning) 25 28.7 $0.113
147 Qwen3 Max Thinking (Preview) 25 - $2.4
148 Gemma 4 31B (Non-reasoning) 24.8 - $0.205
149 Grok 4.3 (Non-reasoning) 24.8 35.2 $1.563
150 K-EXAONE (Reasoning) 24.7 - $0
151 MiMo-V2-Flash (Non-reasoning) 24.7 49.8 $0.15
152 DeepSeek V3.2 (Non-reasoning) 24.7 - $0.775
153 Trinity Large Thinking 24.5 - $0.395
154 Qwen3.6 35B A3B (Non-reasoning) 24.2 - $0.844
155 Qwen3 Max 24 - $3.047
156 gpt-oss-120b (high) 23.8 30.4 $0.262
157 Gemini 2.5 Flash Preview (Sep '25) (Reasoning) 23.8 - $0
158 Claude 4.5 Haiku (Non-reasoning) 23.7 - $2
159 Claude 3.7 Sonnet (Non-reasoning) 23.5 - $6
160 Kimi K2 0905 23.5 - $1.075
161 Qwen3.5 35B A3B (Non-reasoning) 23.4 - $0.688
162 o1 23.4 39.7 $26.25
163 EXAONE 4.5 33B 23 - $0
164 Gemini 2.5 Pro Preview (Mar' 25) 23 46.7 $0
165 GLM-4.6 (Non-reasoning) 23 - $1
166 GLM-4.7-Flash (Reasoning) 22.9 - $0.153
167 Grok 3 mini Reasoning (high) 22.5 - $0.35
168 Grok 4.20 0309 (Non-reasoning) 22.5 - $3
169 Gemini 2.5 Pro Preview (May' 25) 22.3 - $3.438
170 Qwen3 235B A22B 2507 (Reasoning) 22.3 - $0.838
171 DeepSeek V3.2 Speciale 22.2 - $0
172 HyperNova 60B 2605 22.1 - $0.065
173 Gemma 4 12B (Reasoning) 22 - $0.15
174 ERNIE 5.0 Thinking Preview 21.9 - $0
175 Nova 2.0 Pro Preview (medium) 21.8 34 $3.438
176 Grok 4.20 0309 v2 (Non-reasoning) 21.8 - $3
177 Grok Code Fast 1 21.6 - $0
178 DeepSeek V3.1 Terminus (Non-reasoning) 21.4 - $0.453
179 Nemotron Cascade 2 30B A3B 21.3 - $0
180 DeepSeek V3.2 Exp (Non-reasoning) 21.3 - $0.31
181 Qwen3 Coder Next 21.2 - $0.563
182 Apriel-v1.5-15B-Thinker 21.2 - $0
183 DeepSeek V3.1 (Non-reasoning) 21 - $0.834
184 Nova 2.0 Omni (medium) 20.9 - $0.85
185 Mistral Small 4 (Reasoning) 20.8 - $0.262
186 DeepSeek V3.1 (Reasoning) 20.7 - $0.865
187 North Mini Code 20.6 - $0
188 Qwen3 VL 235B A22B (Reasoning) 20.6 - $2.174
189 Apriel-v1.6-15B-Thinker 20.5 - $0
190 GPT-5.1 (Non-reasoning) 20.4 - $3.438
191 Qwen3.5 9B (Non-reasoning) 20.3 23.5 $0
192 Gemma 4 26B A4B (Non-reasoning) 20.1 - $0.198
193 Qwen3.5 4B (Reasoning) 20.1 22.6 $0.06
194 Gemini 2.5 Flash (Reasoning) 20.1 - $0.85
195 DeepSeek R1 0528 (May '25) 20.1 - $2.063
196 GPT-5 nano (high) 19.9 - $0.138
197 Qwen3 Next 80B A3B (Reasoning) 19.8 - $1.875
198 Nova 2.0 Pro Preview (low) 19.6 25.9 $3.438
199 GLM-4.5 (Reasoning) 19.5 - $1
200 GPT-4.1 19.4 - $3.5
201 Kimi K2 19.4 - $1.039
202 Ling 2.6 Flash 19.3 - $0.15
203 Devstral 2 19.2 31.3 $0
204 Qwen3 Max (Preview) 19.2 - $2.4
205 Nova 2.0 Lite (medium) 19 - $0.85
206 Qwen3.5 Omni Flash 19 - $0.275
207 o3-mini 19 - $1.925
208 GPT-5 nano (medium) 19 - $0.138
209 o1-pro 18.9 - $262.5
210 Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) 18.8 - $0
211 JT-MINI 18.5 - $0
212 DeepSeek R1 (Jan '25) 18.5 24.6 $2.431
213 o3-mini (high) 18.4 42.1 $1.925
214 Grok 3 18.4 - $8
215 Seed-OSS-36B-Instruct 18.3 - $0.3
216 Nova 2.0 Lite (high) 18.2 23 $0.85
217 Qwen3 235B A22B 2507 Instruct 18.2 - $0.356
218 Qwen3 Coder 480B A35B Instruct 18 - $0.675
219 Magistral Medium 1.2 17.9 21.3 $2.75
220 Qwen3 VL 32B (Reasoning) 17.9 - $2.625
221 Nova 2.0 Lite (low) 17.8 - $0.85
222 Sonar Reasoning Pro 17.8 - $0
223 gpt-oss-120b (low) 17.7 - $0.262
224 MiniMax M1 80k 17.7 - $0.963
225 GPT-5.4 nano (Non-Reasoning) 17.6 - $0.463
226 Gemini 2.5 Flash Preview (Reasoning) 17.5 - $0
227 Devstral Small 2 17.4 29.3 $0
228 K2 Think V2 17.3 21 $0
229 LongCat Flash Lite 17.2 - $0
230 GPT-5 (minimal) 17.2 - $3.438
231 HyperCLOVA X SEED Think (32B) 17 - $0
232 o1-preview 17 34 $28.875
233 Grok 4.1 Fast (Non-reasoning) 16.9 - $0
234 GLM-4.6V (Reasoning) 16.8 - $0.45
235 K-EXAONE (Non-reasoning) 16.7 - $0
236 GPT-5.4 mini (Non-Reasoning) 16.6 - $1.688
237 Nova 2.0 Omni (low) 16.6 - $0.85
238 Grok 4 Fast (Non-reasoning) 16.5 - $0.275
239 GLM-4.5-Air 16.5 - $0.372
240 Mi:dm K 2.5 Pro 16.4 - $0
241 GPT-4.1 mini 16.3 - $0.7
242 Ring-1T 16.2 - $0
243 Qwen3.5 4B (Non-reasoning) 16 20.3 $0.06
244 Mistral Large 3 15.9 20.1 $0.75
245 Qwen3 30B A3B 2507 (Reasoning) 15.8 - $0.673
246 INTELLECT-3 15.6 - $0
247 GLM-4.7-Flash (Non-reasoning) 15.5 - $0.153
248 DeepSeek V3 0324 15.4 21.2 $1.209
249 GPT-5 (ChatGPT) 15.3 - $3.438
250 Solar Open 100B (Reasoning) 15.1 - $0
251 Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) 15.1 - $0.175
252 Grok 3 Reasoning Beta 15.1 - $0
253 gpt-oss-20B (high) 14.9 20.7 $0.088
254 Nemotron 3 Nano Omni 30B A3B Reasoning 14.9 - $0.131
255 Mistral Small 3.1 14.7 26.3 $0.138
256 Mistral Medium 3.1 14.7 20.5 $0.8
257 Nova 2.0 Pro Preview (Non-reasoning) 14.4 20.9 $3.438
258 MiniMax M1 40k 14.4 - $0
259 gpt-oss-20B (low) 14.3 - $0.095
260 Llama 4 Maverick 14.3 16.3 $0.475
261 GPT-5 mini (minimal) 14.3 - $0.688
262 Qwen3 VL 235B A22B Instruct 14.3 - $0.7
263 NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) 14.2 14.4 $0.096
264 K2-V2 (high) 14.2 - $0
265 DeepSeek V3 (Dec '24) 14.2 23 $0.523
266 Solar Pro 3 14.1 16.2 $0
267 Gemini 2.5 Flash (Non-reasoning) 14.1 - $0.85
268 o1-mini 14 - $0
269 Qwen3 Next 80B A3B Instruct 13.7 - $0.875
270 Tri-21B-think Preview 13.6 - $0
271 GPT-4.5 (Preview) 13.6 - $0
272 Qwen3 Coder 30B A3B Instruct 13.6 - $0.352
273 DiffusionGemma 26B A4B 13.5 19.7 $0
274 QwQ 32B 13.4 - $0.745
275 Qwen3 235B A22B (Reasoning) 13.4 - $2.625
276 Gemini 2.0 Flash Thinking Experimental (Jan '25) 13.3 24.1 $0
277 Qwen3 VL 30B A3B (Reasoning) 13.3 - $0.338
278 Gemma 4 12B (Non-reasoning) 13.2 - $0.15
279 Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) 13.1 - $0.175
280 Motif-2-12.7B-Reasoning 12.8 - $0
281 Ling-1T 12.8 - $0
282 Nova Premier 12.7 - $5
283 Gemma 4 E4B (Reasoning) 12.5 - $0
284 Mistral Medium 3 12.5 - $0.8
285 Magistral Medium 1 12.5 - $0
286 Solar Pro 2 (Preview) (Reasoning) 12.5 - $0
287 Mistral Small 4 (Non-reasoning) 12.4 - $0.262
288 Llama Nemotron Super 49B v1.5 (Reasoning) 12.4 - $0.175
289 K2-V2 (medium) 12.4 - $0
290 Tri-21B-Think 12.4 - $0
291 Devstral Medium 12.4 - $0.8
292 GPT-4o (March 2025, chatgpt-4o-latest) 12.3 - $0
293 Gemini 2.0 Flash (Feb '25) 12.3 - $0.262
294 Claude 3.5 Haiku 12.3 15.9 $1.6
295 Llama 3.3 Nemotron Super 49B v1 (Reasoning) 12.2 - $0
296 MiniCPM5-1B (Reasoning) 12 - $0
297 Qwen3 4B 2507 (Reasoning) 12 - $0
298 Sarvam 105B (high) 11.9 - $0.074
299 Nova 2.0 Lite (Non-reasoning) 11.8 - $0.85
300 Gemini 2.0 Pro Experimental (Feb '25) 11.8 25.5 $0
301 Claude 3 Opus 11.8 19.5 $30
302 Devstral Small (May '25) 11.8 - $0
303 MiniCPM5-1B (Non-reasoning) 11.7 - $0
304 Gemini 2.5 Flash Preview (Non-reasoning) 11.7 - $0
305 Sonar Reasoning 11.7 - $0
306 Qwen3 32B (Reasoning) 11.5 15.3 $0.276
307 Gemini 2.5 Flash-Lite (Reasoning) 11.4 - $0.175
308 Magistral Small 1.2 11.3 14.7 $0.75
309 GPT-4o (Nov '24) 11.2 - $4.375
310 Ministral 3 14B 11.1 14.4 $0.2
311 Qwen3 VL 32B Instruct 11.1 - $1.225
312 DeepSeek R1 Distill Qwen 32B 11 - $0
313 GLM-4.6V (Non-reasoning) 11 - $0.45
314 Qwen3 235B A22B (Non-reasoning) 10.9 - $0.787
315 Gemini 2.0 Flash (experimental) 10.7 - $0
316 Magistral Small 1 10.7 - $0
317 EXAONE 4.0 32B (Reasoning) 10.6 - $0
318 Qwen3 VL 8B (Reasoning) 10.6 - $0.66
319 Nova 2.0 Omni (Non-reasoning) 10.5 - $0.85
320 DeepSeek R1 0528 Qwen3 8B 10.4 - $0
321 Qwen3.5 2B (Reasoning) 10.2 19.7 $0.04
322 Qwen2.5 Max 10.2 - $2.8
323 Nanbeige4.1-3B 10.1 - $0
324 Qwen3 14B (Reasoning) 10.1 - $0.731
325 Llama 4 Scout 10 8.2 $0.292
326 Hermes 4 - Llama-3.1 70B (Reasoning) 10 - $0.198
327 Gemini 1.5 Pro (Sep '24) 10 23.6 $0
328 Solar Pro 2 (Preview) (Non-reasoning) 10 - $0
329 Qwen3 VL 30B A3B Instruct 10 - $0.3
330 Claude 3.5 Sonnet (Oct '24) 9.9 30.2 $6
331 DeepSeek R1 Distill Llama 70B 9.9 - $0.787
332 Falcon-H1R-7B 9.8 - $0
333 DeepSeek R1 Distill Qwen 14B 9.8 - $0
334 Ling-flash-2.0 9.7 - $0.247
335 Qwen3 Omni 30B A3B (Reasoning) 9.6 - $0.43
336 GPT-4o (Aug '24) 9.6 - $4.375
337 Qwen2.5 Instruct 72B 9.6 - $0.37
338 Step3 VL 10B 9.5 - $0
339 Sonar 9.5 - $0
340 Gemma 4 E2B (Reasoning) 9.3 - $0
341 Devstral Small (Jul '25) 9.3 - $0.15
342 Sonar Pro 9.3 - $0
343 Qwen3 30B A3B (Reasoning) 9.3 - $0.18
344 QwQ 32B-Preview 9.2 - $0
345 Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) 9.1 - $0.9
346 Mistral Large 2 (Nov '24) 9.1 - $3
347 Mistral Small 3.2 9.1 - $0.128
348 GLM-4.5V (Reasoning) 9.1 - $0.9
349 Qwen3 30B A3B 2507 Instruct 9.1 - $0.213
350 Ministral 3 8B 9 9.7 $0.15
351 Solar Pro 2 (Reasoning) 9 - $0
352 NVIDIA Nemotron Nano 12B v2 VL (Reasoning) 9 - $0.3
353 Hermes 4 - Llama-3.1 405B (Reasoning) 9 - $1.5
354 ERNIE 4.5 300B A47B 9 - $0.485
355 Gemma 4 E4B (Non-reasoning) 8.9 - $0
356 Granite 4.1 30B 8.9 10.4 $0
357 NVIDIA Nemotron 3 Nano 4B 8.8 - $0
358 NVIDIA Nemotron Nano 9B V2 (Reasoning) 8.8 - $0.07
359 Hermes 4 - Llama-3.1 405B (Non-reasoning) 8.8 - $1.5
360 Qwen3.5 2B (Non-reasoning) 8.8 17.4 $0.04
361 Gemini 2.0 Flash-Lite (Feb '25) 8.8 - $0
362 Llama Nemotron Super 49B v1.5 (Non-reasoning) 8.7 - $0.175
363 Llama 3.3 Instruct 70B 8.6 - $0.612
364 K2-V2 (low) 8.6 - $0
365 GPT-4o (May '24) 8.6 24.2 $7.5
366 Gemini 2.0 Flash-Lite (Preview) 8.6 - $0
367 Qwen3 32B (Non-reasoning) 8.6 - $0.26
368 Llama 3.1 Instruct 405B 8.5 - $3.688
369 Kimi Linear 48B A3B Instruct 8.5 - $0
370 Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) 8.5 - $0
371 Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) 8.5 - $0
372 Qwen3 4B (Reasoning) 8.4 - $0.398
373 Qwen3 VL 8B Instruct 8.4 - $0.31
374 LFM2.5-8B-A1B 8.3 - $0
375 Claude 3.5 Sonnet (June '24) 8.3 26 $6
376 Llama 3.1 Tulu3 405B 8.3 - $0
377 Ring-flash-2.0 8.2 - $0.247
378 GPT-4o (ChatGPT) 8.2 - $0
379 Olmo 3.1 32B Think 8.1 - $0
380 Pixtral Large 8.1 - $3
381 GPT-5 nano (minimal) 8 - $0.138
382 Gemini 1.5 Flash (Sep '24) 8 - $0
383 Grok 2 (Dec '24) 8 - $0
384 GPT-4 Turbo 7.9 21.5 $15
385 Qwen3 VL 4B (Reasoning) 7.9 - $0
386 Solar Pro 2 (Non-reasoning) 7.8 - $0
387 Command A 7.7 - $4.375
388 Nova Pro 7.7 - $1.4
389 Llama 3.1 Nemotron Instruct 70B 7.6 - $1.2
390 Llama 3.1 Instruct 8B 7.6 5.4 $0.1
391 Grok Beta 7.5 - $0
392 Qwen2.5 Instruct 32B 7.5 - $0
393 NVIDIA Nemotron Nano 9B V2 (Non-reasoning) 7.4 - $0.086
394 NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) 7.4 - $0.088
395 Gemma 3 27B Instruct 7.4 10.1 $0.145
396 Qwen3 8B (Reasoning) 7.4 - $0.37
397 GPT-4.1 nano 7.3 11.1 $0.175
398 Mistral Large 2 (Jul '24) 7.3 - $3
399 Qwen2.5 Coder Instruct 32B 7.1 - $0
400 Qwen3 4B 2507 Instruct 7.1 - $0
401 GPT-4 7 13.1 $37.5
402 GLM-4.5V (Non-reasoning) 7 - $0.9
403 Qwen3 14B (Non-reasoning) 7 - $0.381
404 Hermes 4 - Llama-3.1 70B (Non-reasoning) 6.9 - $0.198
405 GPT-4o mini 6.9 11.4 $0.262
406 Gemini 2.5 Flash-Lite (Non-reasoning) 6.9 - $0.175
407 Mistral Small 3 6.9 - $0.104
408 Nova Lite 6.9 - $0.105
409 Ministral 3 3B 6.8 4.8 $0.1
410 Llama 3.1 Instruct 70B 6.8 - $0.56
411 DeepSeek-V2.5 (Dec '24) 6.8 - $0
412 Qwen3 30B A3B (Non-reasoning) 6.8 - $0.133
413 Qwen3 4B (Non-reasoning) 6.8 - $0.188
414 Granite 4.1 8B 6.7 9.5 $0.063
415 Sarvam 30B (high) 6.6 - $0.047
416 Gemini 2.0 Flash Thinking Experimental (Dec '24) 6.6 - $0
417 DeepSeek-V2.5 6.6 - $0
418 Olmo 3.1 32B Instruct 6.5 - $0
419 Gemma 4 E2B (Non-reasoning) 6.4 - $0
420 Mistral Saba 6.4 - $0
421 DeepSeek R1 Distill Llama 8B 6.4 - $0
422 Olmo 3 32B Think 6.4 - $0
423 R1 1776 6.3 - $0
424 Gemini 1.5 Pro (May '24) 6.3 19.8 $0
425 Reka Flash (Sep '24) 6.3 - $0.35
426 Qwen2.5 Turbo 6.3 - $0.088
427 Llama 3.2 Instruct 90B (Vision) 6.2 - $1.38
428 Solar Mini 6.2 - $0.15
429 Grok-1 6 - $0
430 EXAONE 4.0 32B (Non-reasoning) 6 - $0
431 Qwen2 Instruct 72B 6 - $0
432 Gemini 1.5 Flash-8B 5.5 - $0
433 DeepHermes 3 - Mistral 24B Preview (Non-reasoning) 5.3 - $0
434 Jamba 1.7 Large 5.3 - $3.5
435 Granite 4.0 H Small 5.2 - $0.107
436 Qwen3 Omni 30B A3B Instruct 5.1 - $0.43
437 DeepSeek-Coder-V2 5.1 - $0
438 Hermes 3 - Llama-3.1 70B 5.1 - $0.3
439 Jamba 1.5 Large 5.1 - $3.5
440 Qwen3 8B (Non-reasoning) 5.1 - $0.185
441 Qwen3.5 0.8B (Reasoning) 5 15 $0.02
442 OLMo 2 32B 5 - $0
443 Jamba 1.6 Large 5 - $3.5
444 Phi-4 4.9 - $0.219
445 LFM2 24B A2B 4.9 - $0.052
446 Gemini 1.5 Flash (May '24) 4.9 - $0
447 Nova Micro 4.7 - $0.061
448 Claude 3 Sonnet 4.7 - $6
449 Mistral Small (Sep '24) 4.7 - $0.3
450 NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) 4.6 - $0.3
451 Gemma 3n E4B Instruct Preview (May '25) 4.6 - $0
452 Gemini 1.0 Ultra 4.6 17.6 $0
453 Phi-3 Mini Instruct 3.8B 4.6 - $0
454 Phi-4 Multimodal Instruct 4.5 - $0
455 Qwen2.5 Coder Instruct 7B 4.5 - $0
456 Qwen3.5 0.8B (Non-reasoning) 4.4 21.9 $0.02
457 Mixtral 8x22B Instruct 4.4 - $0
458 Mistral Large (Feb '24) 4.4 - $6
459 Llama 2 Chat 7B 4.3 - $0.1
460 MiniCPM-V 4.6 1.3B 4.2 0.7 $0
461 Llama 3.2 Instruct 3B 4.2 - $0.15
462 Reka Flash 3 4.1 - $0.35
463 Jamba Reasoning 3B 4.1 - $0
464 Qwen1.5 Chat 110B 4.1 - $0
465 Qwen3 VL 4B Instruct 4.1 - $0
466 Olmo 3 7B Think 4 - $0
467 Claude 3 Haiku 3.9 - $0.5
468 Claude 2.1 3.9 14 $0
469 OLMo 2 7B 3.9 - $0
470 Molmo 7B-D 3.8 - $0
471 Ling-mini-2.0 3.8 - $0
472 DeepSeek R1 Distill Qwen 1.5B 3.7 - $0
473 GPT-3.5 Turbo 3.6 10.7 $0.75
474 Claude 2.0 3.6 12.9 $0
475 Mistral Small (Feb '24) 3.6 - $1.5
476 Mistral Medium 3.6 - $4.088
477 DeepSeek-V2-Chat 3.6 - $0
478 Llama 3 Instruct 70B 3.5 - $1.175
479 Gemma 3 12B Instruct 3.4 - $0.14
480 LFM 40B 3.4 - $0
481 Arctic Instruct 3.4 - $0
482 Qwen Chat 72B 3.4 - $0
483 Llama 3.2 Instruct 11B (Vision) 3.3 - $0.245
484 Granite 4.1 3B 3.2 - $0
485 PALM-2 3.2 4.6 $0
486 Gemini 1.0 Pro 3.1 - $0
487 DeepSeek Coder V2 Lite Instruct 3.1 - $0
488 Phi-4 Mini Instruct 3 - $0
489 Llama 2 Chat 13B 3 - $0
490 Llama 2 Chat 70B 3 - $0
491 DeepSeek LLM 67B Chat (V1) 3 - $0
492 OpenChat 3.5 (1210) 3 - $0
493 DBRX Instruct 3 - $0
494 Sarvam M (Reasoning) 3 - $0
495 Command-R+ (Apr '24) 3 - $6
496 Exaone 4.0 1.2B (Reasoning) 2.9 - $0
497 Olmo 3 7B Instruct 2.8 - $0.125
498 Exaone 4.0 1.2B (Non-reasoning) 2.8 - $0
499 LFM2.5-1.2B-Instruct 2.7 - $0
500 LFM2.5-1.2B-Thinking 2.7 - $0
501 LFM2 2.6B 2.7 - $0
502 Granite 4.0 H 1B 2.7 - $0
503 Jamba 1.7 Mini 2.7 - $0
504 Jamba 1.5 Mini 2.7 - $0.25
505 Jamba 1.6 Mini 2.6 - $0.25
506 Qwen3 1.7B (Reasoning) 2.6 - $0.398
507 Gemma 3 270M 2.4 - $0
508 Granite 4.0 Micro 2.4 - $0
509 Apertus 70B Instruct 2.4 - $1.345
510 Mixtral 8x7B Instruct 2.4 - $0.512
511 DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) 2.3 - $0
512 Llama 65B 2.1 - $0
513 Qwen Chat 14B 2.1 - $0
514 Granite 4.0 1B 2.1 - $0
515 Claude Instant 2.1 7.8 $0
516 Mistral 7B Instruct 2.1 - $0.206
517 Command-R (Mar '24) 2.1 - $0.75
518 Molmo2-8B 2 - $0
519 LFM2 8B A1B 1.8 - $0
520 Granite 3.3 8B (Non-reasoning) 1.8 - $0.085
521 Qwen3 1.7B (Non-reasoning) 1.5 - $0.188
522 Qwen3 0.6B (Reasoning) 1.3 - $0.398
523 Llama 3 Instruct 8B 1.2 - $0.07
524 Gemma 3n E4B Instruct 1.2 3.2 $0.025
525 Llama 3.2 Instruct 1B 1.1 - $0.05
526 Gemma 3 4B Instruct 1.1 - $0.05
527 LFM2 1.2B 1.1 - $0
528 LFM2.5-VL-1.6B 1 - $0
529 Granite 4.0 H 350M 1 - $0
530 Granite 4.0 350M 1 - $0
531 Apertus 8B Instruct 1 - $0.125
532 Tiny Aya Global 1 - $0
533 Gemma 3n E2B Instruct 1 - $0
534 Gemma 3 1B Instruct 1 - $0
535 Qwen3 0.6B (Non-reasoning) 1 - $0.188
536 GPT-5.5 Pro (xhigh) - - $0
537 Gemini 3 Deep Think - - $0
538 EXAONE 4.5 33B (Non-reasoning) - - $0
539 Cogito v2.1 (Reasoning) - - $1.25
540 Mi:dm K 2.5 Pro Preview - - $0
541 GPT-4o mini Realtime (Dec '24) - - $0
542 GPT-5.4 Pro (xhigh) - - $67.5
543 GPT-3.5 Turbo (0613) - - $0
544 GPT-4o Realtime (Dec '24) - - $0

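How to Read the Leaderboard

When consulting the AI LLM Leaderboard, weigh the Overall Index against Price rather than looking at either one alone. For developers, the Coding score is the more important metric.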
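To compare the Overall Index, Coding score, and Price programmatically, the table rows first have to be split into fields. The following is a minimal sketch, assuming each row keeps the flattened `rank model index coding $price` layout used in the table and that `-` marks a missing score. `Row` and `parse_row` are hypothetical names introduced here for illustration; they are not part of the page or of any Artificial Analysis API.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    rank: int
    model: str
    index: Optional[float]   # Overall Index, None when the table shows "-"
    coding: Optional[float]  # Coding score, None when the table shows "-"
    price: float             # blended USD per 1M tokens


# Model names contain spaces and digits, so anchor on the last three fields.
ROW_RE = re.compile(r"^(\d+) (.+) (\S+) (\S+) \$(\S+)$")


def _score(text: str) -> Optional[float]:
    """Convert a score cell to float, treating "-" as missing."""
    return None if text == "-" else float(text)


def parse_row(line: str) -> Row:
    """Split one flattened table row into typed fields."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized row: {line!r}")
    rank, model, index, coding, price = m.groups()
    return Row(int(rank), model, _score(index), _score(coding), float(price))


# Three rows copied from the table.
rows = [parse_row(line) for line in [
    "7 GLM-5.2 (max) 51.1 68.8 $2.15",
    "13 Gemini 3.5 Flash (medium) 45.4 - $3.375",
    "401 GPT-4 7 13.1 $37.5",
]]

# Developers can rank on the Coding column, skipping rows without a coding score.
with_coding = [r for r in rows if r.coding is not None]
best_coder = max(with_coding, key=lambda r: r.coding)
print(best_coder.model, best_coder.coding)
```

Anchoring the pattern on the trailing fields keeps names such as "Gemini 3.5 Flash (medium)" intact even though they contain numbers.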

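值品工具箱 syncs the AI LLM Leaderboard data every 24 hours, so the performance comparison reflects recently released models.

Metric Definitions

  • Overall Index: measures general understanding and logical reasoning.
  • Price ($/1M): blended average cost per 1M tokens, assuming a 3:1 input-to-output token ratio.
  • Coding: measures the accuracy of generated code.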
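The blended price can be read as a weighted average of input and output token prices. Below is a minimal sketch, assuming the 3:1 ratio means three parts input tokens to one part output tokens; the function name `blended_price` and the example prices are hypothetical and not taken from any provider's price sheet.

```python
def blended_price(input_usd_per_m: float, output_usd_per_m: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output token mix."""
    total_parts = input_parts + output_parts
    weighted = input_parts * input_usd_per_m + output_parts * output_usd_per_m
    return weighted / total_parts


# Hypothetical list prices: $5 per 1M input tokens, $25 per 1M output tokens.
example = blended_price(input_usd_per_m=5.0, output_usd_per_m=25.0)
print(example)  # (3 * 5 + 1 * 25) / 4 = 10.0
```

Workloads that generate far more output than input will see a higher effective cost than the blended figure suggests, which is why the ratio is exposed as a parameter here.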

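AI LLM Leaderboard Frequently Asked Questions (FAQ)

Q1: How often is the AI LLM Leaderboard data updated?

The data is fetched automatically every 24 hours, so newly listed models are added to the table.

Q2: Does the AI LLM Leaderboard include Chinese models?

Yes. Any Chinese model that has been evaluated by Artificial Analysis appears on the leaderboard.

Q3: What does the Overall Index represent?

It represents a model's all-round performance. The leaderboard derives this composite score with a weighted algorithm.

Q4: How do I find the best-value model on the AI LLM Leaderboard?

On the leaderboard page, click the "Price" column header to sort, then look for models that combine a low price with a high score.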
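One way to make "low price, high score" concrete is to rank models by Overall Index points per dollar. This is a rough sketch using five rows copied from the table. The ratio itself and the name `rank_by_value` are illustrative assumptions, not a metric the leaderboard publishes.

```python
from typing import List, Tuple

# (model, Overall Index, blended price in USD per 1M tokens), copied from the table.
models: List[Tuple[str, float, float]] = [
    ("Claude Opus 4.8 (Adaptive Reasoning, Max Effort)", 55.7, 10.0),
    ("GLM-5.2 (max)", 51.1, 2.15),
    ("MiniMax-M3", 44.4, 0.525),
    ("Muse Spark", 43.1, 0.0),
    ("DeepSeek V4 Flash (Reasoning, Max Effort)", 40.3, 0.175),
]


def rank_by_value(entries: List[Tuple[str, float, float]],
                  min_index: float = 0.0) -> List[Tuple[str, float, float]]:
    """Sort by index points per dollar, highest first.

    Rows priced at $0 are skipped: the page does not say whether $0 means
    free or simply unpriced (assumption), and the ratio would be undefined.
    """
    priced = [(name, idx, price) for name, idx, price in entries
              if price > 0 and idx >= min_index]
    return sorted(priced, key=lambda e: e[1] / e[2], reverse=True)


for name, idx, price in rank_by_value(models, min_index=40.0):
    print(f"{name}: {idx / price:.1f} index points per $")
```

Setting `min_index` keeps very cheap but weak models from dominating the list, since a pure ratio favors low prices.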

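Q5: How reliable is the Coding score on the AI LLM Leaderboard?

The leaderboard draws on established benchmarks such as LiveCodeBench, so the Coding score is a strong reference point.

Q6: Why are some new models missing from the AI LLM Leaderboard?

A model has to pass a series of tests before it is listed. The update is usually completed within a few days of a new model's release.

Q7: How are prices on the AI LLM Leaderboard calculated?

The price is the cost per million tokens, computed by the leaderboard as a single blended figure across input and output.

Q8: Can I view the AI LLM Leaderboard on a phone?

Yes. The leaderboard page uses a responsive layout optimized for mobile devices.

Q9: Is the AI LLM Leaderboard free to use?

Yes. 值品工具箱 provides the AI LLM Leaderboard lookup service free of charge.

Q10: How should I use the AI LLM Leaderboard to choose a model?

For a customer-service chatbot, go by the Overall Index. For translation, look at language-related metrics rather than the Coding score.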
