AI Model Rankings: Artificial Analysis LLM Leaderboard


The Artificial Analysis LLM leaderboard compares and ranks the performance of more than 100 AI models (LLMs), evaluating them on intelligence, price, and results from common AI benchmarks.

LLM Leaderboard Data

Rank Model Composite Index ▼ Coding Price ($/1M)
1 GPT-5.5 (xhigh) 60.2 59.1 $11.25
2 GPT-5.5 (high) 58.9 58.5 $11.25
3 Claude Opus 4.7 (Adaptive Reasoning, Max Effort) 57.3 52.5 $10.938
4 Gemini 3.1 Pro Preview 57.2 55.5 $4.5
5 GPT-5.4 (xhigh) 56.8 57.2 $5.625
6 GPT-5.5 (medium) 56.7 56.2 $11.25
7 Kimi K2.6 53.9 47.1 $1.712
8 MiMo-V2.5-Pro 53.8 45.5 $1.5
9 GPT-5.3 Codex (xhigh) 53.6 53.1 $4.813
10 Grok 4.3 (high) 53.2 41 $1.563
11 Claude Opus 4.6 (Adaptive Reasoning, Max Effort) 52.9 48.1 $10.938
12 Muse Spark 52.2 47.5 $0
13 Claude Opus 4.7 (Non-reasoning, High Effort) 51.8 53.1 $10.938
14 Qwen3.6 Max Preview 51.8 44.9 $2.925
15 Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) 51.7 50.9 $6.563
16 DeepSeek V4 Pro (Reasoning, Max Effort) 51.5 47.5 $2.175
17 GLM-5.1 (Reasoning) 51.4 43.4 $2.15
18 GPT-5.2 (xhigh) 51.3 48.7 $4.813
19 GPT-5.5 (low) 50.8 52.1 $11.25
20 Qwen3.6 Plus 50 42.9 $1.125
21 DeepSeek V4 Pro (Reasoning, High Effort) 49.8 43.2 $2.175
22 GLM-5 (Reasoning) 49.8 44.2 $1.55
23 Claude Opus 4.5 (Reasoning) 49.7 47.8 $10.938
24 MiniMax-M2.7 49.6 41.9 $0.525
25 Grok 4.20 0309 v2 (Reasoning) 49.3 40.5 $3
26 MiMo-V2-Pro 49.2 41.4 $1.5
27 MiMo-V2.5 49 42.1 $0.72
28 GPT-5.2 Codex (xhigh) 49 43 $4.813
29 GPT-5.4 mini (xhigh) 48.9 51.5 $1.688
30 Grok 4.20 0309 (Reasoning) 48.5 42.2 $3
31 Gemini 3 Pro Preview (high) 48.4 46.5 $4.5
32 GPT-5.4 (low) 47.9 45.6 $5.625
33 GPT-5.1 (high) 47.7 44.7 $3.438
34 GLM-5-Turbo 46.8 36.8 $0
35 Kimi K2.5 (Reasoning) 46.8 39.6 $1.136
36 GPT-5.2 (medium) 46.6 44.2 $4.813
37 DeepSeek V4 Flash (Reasoning, Max Effort) 46.5 38.7 $0.175
38 Claude Opus 4.6 (Non-reasoning, High Effort) 46.5 47.6 $10.938
39 Gemini 3 Flash Preview (Reasoning) 46.4 42.6 $1.125
40 DeepSeek V4 Flash (Reasoning, High Effort) 46 39.8 $0.175
41 Qwen3.6 27B (Reasoning) 45.8 36.5 $1.35
42 Qwen3.5 397B A17B (Reasoning) 45 41.3 $1.35
43 MiMo-V2-Omni-0327 44.9 36.9 $0.8
44 GPT-5 (high) 44.6 36 $3.438
45 GPT-5 Codex (high) 44.6 38.9 $3.438
46 Claude Sonnet 4.6 (Non-reasoning, High Effort) 44.4 46.4 $6.563
47 GPT-5.4 nano (xhigh) 44 43.9 $0.463
48 KAT Coder Pro V2 43.8 45.6 $0.525
49 GLM-5.1 (Non-reasoning) 43.8 35.8 $2.15
50 Qwen3.6 35B A3B (Reasoning) 43.5 35.2 $0.557
51 MiMo-V2-Omni 43.4 35.5 $0
52 GPT-5.1 Codex (high) 43.1 36.6 $3.438
53 Claude Opus 4.5 (Non-reasoning) 43.1 42.9 $10.938
54 Claude 4.5 Sonnet (Reasoning) 43 38.6 $6.563
55 Kimi K2.6 (Non-reasoning) 42.9 38.4 $1.712
56 GLM 5V Turbo (Reasoning) 42.9 36.2 $0
57 Claude Sonnet 4.6 (Non-reasoning, Low Effort) 42.6 43 $6.563
58 GLM-4.7 (Reasoning) 42.1 36.3 $1
59 Qwen3.5 27B (Reasoning) 42.1 34.9 $0.825
60 GPT-5 (medium) 42 38.9 $3.438
61 Claude 4.1 Opus (Reasoning) 42 36.5 $32.813
62 Hy3-preview (Reasoning) 41.9 36.5 $0.2
63 MiniMax-M2.5 41.9 37.4 $0.525
64 DeepSeek V3.2 (Reasoning) 41.7 36.7 $0.337
65 Qwen3.5 122B A10B (Reasoning) 41.6 34.7 $1.1
66 MiMo-V2-Flash (Feb 2026) 41.5 33.5 $0.15
67 Grok 4 41.5 40.5 $8.5
68 Gemini 3 Pro Preview (low) 41.3 39.4 $4.5
69 GPT-5 mini (high) 41.2 35.3 $0.688
70 GPT-5.5 (Non-reasoning) 40.9 48.6 $11.25
71 Kimi K2 Thinking 40.9 34.8 $1.075
72 o3-pro 40.7 - $35
73 GLM-5 (Non-reasoning) 40.6 39 $1.55
74 Qwen3.5 397B A17B (Non-reasoning) 40.1 37.4 $1.35
75 Qwen3 Max Thinking 39.8 30.5 $2.4
76 MiniMax-M2.1 39.4 32.8 $0.525
77 DeepSeek V4 Pro (Non-reasoning) 39.3 38.4 $2.175
78 Gemma 4 31B (Reasoning) 39.2 38.7 $0
79 Mistral Medium 3.5 39.2 35.4 $3
80 GPT-5 (low) 39.2 30.7 $3.438
81 MiMo-V2-Flash (Reasoning) 39.2 31.8 $0.15
82 Claude 4 Opus (Reasoning) 39 34 $32.813
83 GPT-5 mini (medium) 38.9 32.8 $0.688
84 Claude 4 Sonnet (Reasoning) 38.7 34.1 $6.563
85 Grok 4.1 Fast (Reasoning) 38.6 30.9 $0.275
86 Qwen3.5 Omni Plus 38.6 27.6 $1.5
87 GPT-5.1 Codex mini (high) 38.6 36.4 $0.688
88 Step 3.5 Flash 2603 38.5 34.6 $0
89 o3 38.4 38.4 $3.5
90 GPT-5.4 nano (medium) 38.1 35 $0.463
91 Step 3.5 Flash 37.8 31.6 $0.15
92 GPT-5.4 mini (medium) 37.7 37.5 $1.688
93 Kimi K2.5 (Non-reasoning) 37.3 25.8 $1.2
94 Qwen3.5 27B (Non-reasoning) 37.2 33.4 $0.835
95 Claude 4.5 Haiku (Reasoning) 37.1 32.6 $2.188
96 Qwen3.6 27B (Non-reasoning) 37.1 26.6 $1.35
97 Claude 4.5 Sonnet (Non-reasoning) 37.1 33.5 $6.563
98 Qwen3.5 35B A3B (Reasoning) 37.1 30.3 $0.688
99 DeepSeek V4 Flash (Non-reasoning) 36.5 35.2 $0.175
100 JT-35B-Flash 36.1 28.9 $0
101 MiniMax-M2 36.1 29.2 $0.525
102 NVIDIA Nemotron 3 Super 120B A12B (Reasoning) 36 31.2 $0.412
103 KAT-Coder-Pro V1 36 18.3 $0.525
104 Claude 4.1 Opus (Non-reasoning) 36 - $32.813
105 Qwen3.5 122B A10B (Non-reasoning) 35.9 31.6 $1.1
106 Nova 2.0 Pro Preview (medium) 35.7 30.4 $3.438
107 MiMo-V2.5-Pro (Non-reasoning) 35.6 36.8 $1.5
108 GPT-5.4 (Non-reasoning) 35.4 41 $5.625
109 Grok 4 Fast (Reasoning) 35.1 27.4 $0.275
110 Gemini 3 Flash Preview (Non-reasoning) 35 37.8 $1.125
111 Claude 3.7 Sonnet (Reasoning) 34.7 27.6 $0
112 Gemini 2.5 Pro 34.6 32 $3.438
113 Nova 2.0 Lite (high) 34.5 23.4 $0.85
114 GLM-4.7 (Non-reasoning) 34.2 32 $1
115 DeepSeek V3.1 Terminus (Reasoning) 33.9 33.7 $1.914
116 Hy3-preview (Non-reasoning) 33.7 34.3 $0.2
117 Ling-2.6-1T 33.6 33.1 $0.85
118 GPT-5.2 (Non-reasoning) 33.6 34.7 $4.813
119 Gemini 3.1 Flash-Lite Preview 33.5 30.1 $0.563
120 Doubao Seed Code 33.5 31.3 $0
121 gpt-oss-120B (high) 33.3 28.6 $0.262
122 o4-mini (high) 33.1 25.6 $1.925
123 Claude 4 Sonnet (Non-reasoning) 33 30.6 $6.563
124 Claude 4 Opus (Non-reasoning) 33 - $32.813
125 DeepSeek V3.2 Exp (Reasoning) 32.9 33.3 $0.31
126 Mercury 2 32.8 30.6 $0.375
127 GLM-4.6 (Reasoning) 32.5 29.5 $0.963
128 Qwen3 Max Thinking (Preview) 32.5 24.5 $2.4
129 Qwen3.5 9B (Reasoning) 32.4 25.3 $0.113
130 Gemma 4 31B (Non-reasoning) 32.3 33.9 $0.205
131 Grok 3 mini Reasoning (high) 32.1 25.2 $0.35
132 K-EXAONE (Reasoning) 32.1 27 $0
133 DeepSeek V3.2 (Non-reasoning) 32.1 34.6 $0.775
134 Nova 2.0 Pro Preview (low) 31.9 24.5 $3.438
135 Trinity Large Thinking 31.9 27.2 $0.395
136 Qwen3.6 35B A3B (Non-reasoning) 31.5 17.6 $0.844
137 Qwen3 Max 31.4 26.4 $3.047
138 Gemma 4 26B A4B (Reasoning) 31.2 22.4 $0.198
139 Gemini 2.5 Flash Preview (Sep '25) (Reasoning) 31.1 24.6 $0
140 Claude 4.5 Haiku (Non-reasoning) 31 29.6 $2.188
141 Grok 4.3 (Non-reasoning) 31 25.1 $1.563
142 Kimi K2 0905 30.9 25.9 $1.075
143 Claude 3.7 Sonnet (Non-reasoning) 30.8 26.7 $6.563
144 Qwen3.5 35B A3B (Non-reasoning) 30.7 16.8 $0.688
145 o1 30.7 20.5 $26.25
146 MiMo-V2-Flash (Non-reasoning) 30.3 25.8 $0.15
147 Gemini 2.5 Pro Preview (Mar' 25) 30.3 46.7 $0
148 EXAONE 4.5 33B 30.2 23 $0
149 GLM-4.6 (Non-reasoning) 30.2 30.2 $1
150 GLM-4.7-Flash (Reasoning) 30.1 25.9 $0.153
151 Nova 2.0 Lite (medium) 29.7 23.9 $0.85
152 Grok 4.20 0309 (Non-reasoning) 29.7 25.4 $3
153 Gemini 2.5 Pro Preview (May' 25) 29.5 - $3.438
154 Qwen3 235B A22B 2507 (Reasoning) 29.5 23.2 $0.838
155 DeepSeek V3.2 Speciale 29.4 37.9 $0
156 ERNIE 5.0 Thinking Preview 29.1 29.2 $0
157 Grok 4.20 0309 v2 (Non-reasoning) 29 22 $3
158 Grok Code Fast 1 28.7 23.7 $0.525
159 DeepSeek V3.1 Terminus (Non-reasoning) 28.5 31.9 $0.453
160 Nemotron Cascade 2 30B A3B 28.4 25.8 $0
161 DeepSeek V3.2 Exp (Non-reasoning) 28.4 30 $0.31
162 Qwen3 Coder Next 28.3 22.9 $0.563
163 Apriel-v1.5-15B-Thinker 28.3 18.7 $0
164 DeepSeek V3.1 (Non-reasoning) 28.1 28.4 $0.834
165 Nova 2.0 Omni (medium) 28 15.1 $0.85
166 Mistral Small 4 (Reasoning) 27.8 24.3 $0.262
167 DeepSeek V3.1 (Reasoning) 27.7 29.7 $0.865
168 Apriel-v1.6-15B-Thinker 27.6 22 $0
169 Qwen3 VL 235B A22B (Reasoning) 27.6 20.9 $2.174
170 GPT-5.1 (Non-reasoning) 27.4 27.3 $3.438
171 Qwen3.5 9B (Non-reasoning) 27.3 21.3 $0
172 Gemma 4 26B A4B (Non-reasoning) 27.1 29.1 $0.198
173 Magistral Medium 1.2 27.1 21.7 $2.75
174 DeepSeek R1 0528 (May '25) 27.1 24 $2.063
175 Qwen3.5 4B (Reasoning) 27.1 17.5 $0.06
176 Gemini 2.5 Flash (Reasoning) 27 22.2 $0.85
177 GPT-5 nano (high) 26.8 20.3 $0.138
178 Qwen3 Next 80B A3B (Reasoning) 26.7 19.5 $1.875
179 GLM-4.5 (Reasoning) 26.4 26.3 $1
180 GPT-4.1 26.3 21.8 $3.5
181 Kimi K2 26.3 22.1 $1.039
182 Ling 2.6 Flash 26.2 23.2 $0.15
183 Qwen3 Max (Preview) 26.1 25.5 $2.4
184 Solar Pro 3 25.9 13.3 $0
185 Qwen3.5 Omni Flash 25.9 14 $0.275
186 GPT-5 nano (medium) 25.9 22.9 $0.138
187 o3-mini 25.9 17.9 $1.925
188 o1-pro 25.8 - $262.5
189 Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) 25.7 22.1 $0
190 JT-MINI 25.4 21.2 $0
191 o3-mini (high) 25.2 17.3 $1.925
192 Grok 3 25.2 19.8 $6
193 Seed-OSS-36B-Instruct 25.2 16.7 $0.3
194 Qwen3 235B A22B 2507 Instruct 25 22.1 $0.356
195 Qwen3 Coder 480B A35B Instruct 24.8 24.6 $0.675
196 Qwen3 VL 32B (Reasoning) 24.7 14.5 $2.625
197 Nova 2.0 Lite (low) 24.6 13.6 $0.85
198 Sonar Reasoning Pro 24.6 - $0
199 gpt-oss-120B (low) 24.5 15.5 $0.262
200 gpt-oss-20B (high) 24.5 18.5 $0.088
201 GPT-5.4 nano (Non-Reasoning) 24.4 27.9 $0.463
202 MiniMax M1 80k 24.4 14.5 $0.963
203 NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) 24.3 19 $0.096
204 Gemini 2.5 Flash Preview (Reasoning) 24.3 - $0
205 K2 Think V2 24.1 15.5 $0
206 LongCat Flash Lite 23.9 16.5 $0
207 GPT-5 (minimal) 23.9 25 $3.438
208 HyperCLOVA X SEED Think (32B) 23.7 17.5 $0
209 o1-preview 23.7 34 $28.875
210 Grok 4.1 Fast (Non-reasoning) 23.6 19.5 $0.275
211 K-EXAONE (Non-reasoning) 23.4 13.5 $0
212 GLM-4.6V (Reasoning) 23.4 19.7 $0.45
213 GPT-5.4 mini (Non-Reasoning) 23.3 25.3 $1.688
214 Nova 2.0 Omni (low) 23.2 13.9 $0.85
215 GLM-4.5-Air 23.2 23.8 $0.372
216 Nova 2.0 Pro Preview (Non-reasoning) 23.1 20.5 $3.438
217 Mi:dm K 2.5 Pro 23.1 12.6 $0
218 Grok 4 Fast (Non-reasoning) 23.1 19 $0.275
219 GPT-4.1 mini 22.9 18.5 $0.7
220 Mistral Large 3 22.8 22.7 $0.75
221 Ring-1T 22.8 16.8 $0
222 Qwen3.5 4B (Non-reasoning) 22.6 13.7 $0.06
223 Qwen3 30B A3B 2507 (Reasoning) 22.4 14.6 $0.673
224 DeepSeek V3 0324 22.3 22 $1.209
225 INTELLECT-3 22.2 19.1 $0
226 GLM-4.7-Flash (Non-reasoning) 22.1 11 $0.153
227 Devstral 2 22 23.7 $0
228 GPT-5 (ChatGPT) 21.8 21.2 $3.438
229 Solar Open 100B (Reasoning) 21.7 10.5 $0
230 Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) 21.6 18.2 $0.175
231 Grok 3 Reasoning Beta 21.6 - $0
232 Nemotron 3 Nano Omni 30B A3B Reasoning 21.4 14.8 $0.131
233 Mistral Medium 3.1 21.3 18.3 $0.8
234 MiniMax M1 40k 20.9 14.1 $0
235 gpt-oss-20B (low) 20.8 14.4 $0.095
236 Qwen3 VL 235B A22B Instruct 20.8 16.5 $0.7
237 GPT-5 mini (minimal) 20.7 21.9 $0.688
238 K2-V2 (high) 20.6 16.1 $0
239 Gemini 2.5 Flash (Non-reasoning) 20.6 17.8 $0.85
240 o1-mini 20.4 - $0
241 Qwen3 Next 80B A3B Instruct 20.1 15.3 $0.875
242 Tri-21B-think Preview 20 7.4 $0
243 GPT-4.5 (Preview) 20 - $0
244 Qwen3 Coder 30B A3B Instruct 20 19.4 $0.352
245 Qwen3 235B A22B (Reasoning) 19.8 17.4 $2.625
246 Qwen3 VL 30B A3B (Reasoning) 19.7 13.1 $0.338
247 QwQ 32B 19.7 - $0.745
248 Gemini 2.0 Flash Thinking Experimental (Jan '25) 19.6 24.1 $0
249 Devstral Small 2 19.5 20.7 $0
250 Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) 19.4 14.5 $0.175
251 Motif-2-12.7B-Reasoning 19.1 11.9 $0
252 Nova Premier 19 13.8 $5
253 Ling-1T 19 18.8 $0
254 Gemma 4 E4B (Reasoning) 18.8 13.7 $0
255 Mistral Medium 3 18.8 13.6 $0.8
256 Magistral Medium 1 18.8 16 $0
257 DeepSeek R1 (Jan '25) 18.8 15.9 $2.431
258 Solar Pro 2 (Preview) (Reasoning) 18.8 - $0
259 Llama Nemotron Super 49B v1.5 (Reasoning) 18.7 15.1 $0.175
260 K2-V2 (medium) 18.7 14 $0
261 Claude 3.5 Haiku 18.7 10.7 $1.75
262 Devstral Medium 18.7 15.9 $0.8
263 Mistral Small 4 (Non-reasoning) 18.6 16.4 $0.262
264 Hermes 4 - Llama-3.1 405B (Reasoning) 18.6 16 $1.5
265 Tri-21B-Think 18.6 6.3 $0
266 GPT-4o (Aug '24) 18.6 16.6 $4.375
267 GPT-4o (March 2025, chatgpt-4o-latest) 18.6 - $0
268 Llama 3.3 Nemotron Super 49B v1 (Reasoning) 18.5 9.4 $0
269 Gemini 2.0 Flash (Feb '25) 18.5 13.6 $0.262
270 Llama 4 Maverick 18.4 15.6 $0.475
271 Magistral Small 1.2 18.2 14.8 $0.75
272 Sarvam 105B (high) 18.2 9.8 $0
273 Qwen3 4B 2507 (Reasoning) 18.2 9.5 $0
274 Gemini 2.0 Pro Experimental (Feb '25) 18.1 25.5 $0
275 Nova 2.0 Lite (Non-reasoning) 18 12.5 $0.85
276 Claude 3 Opus 18 19.5 $32.813
277 Devstral Small (May '25) 18 12.2 $0
278 Sonar Reasoning 17.9 - $0
279 Gemini 2.5 Flash Preview (Non-reasoning) 17.8 - $0
280 Hermes 4 - Llama-3.1 405B (Non-reasoning) 17.6 18.1 $1.5
281 Gemini 2.5 Flash-Lite (Reasoning) 17.6 9.5 $0.175
282 Llama 3.1 Instruct 405B 17.4 14.5 $3.688
283 GPT-4o (Nov '24) 17.3 16.7 $4.375
284 DeepSeek R1 Distill Qwen 32B 17.2 - $0
285 Qwen3 VL 32B Instruct 17.2 15.6 $1.225
286 GLM-4.6V (Non-reasoning) 17.1 11.1 $0.45
287 Qwen3 235B A22B (Non-reasoning) 17 14 $0.787
288 Gemini 2.0 Flash (experimental) 16.8 - $0
289 Magistral Small 1 16.8 11.1 $0
290 EXAONE 4.0 32B (Reasoning) 16.7 14 $0
291 Qwen3 VL 8B (Reasoning) 16.7 9.8 $0.66
292 Nova 2.0 Omni (Non-reasoning) 16.6 13.8 $0.85
293 DeepSeek V3 (Dec '24) 16.5 16.4 $0.523
294 Qwen3 32B (Reasoning) 16.5 13.8 $0.276
295 DeepSeek R1 0528 Qwen3 8B 16.4 7.8 $0
296 Qwen3.5 2B (Reasoning) 16.3 3.5 $0.04
297 Qwen2.5 Max 16.3 - $2.8
298 Qwen3 14B (Reasoning) 16.2 13.1 $0.731
299 Nanbeige4.1-3B 16.1 8.9 $0
300 Ministral 3 14B 16 10.9 $0.2
301 DeepSeek R1 Distill Llama 70B 16 11.4 $0.787
302 Hermes 4 - Llama-3.1 70B (Reasoning) 16 14.4 $0.198
303 Gemini 1.5 Pro (Sep '24) 16 23.6 $0
304 Solar Pro 2 (Preview) (Non-reasoning) 16 - $0
305 Qwen3 VL 30B A3B Instruct 16 14.3 $0.3
306 Claude 3.5 Sonnet (Oct '24) 15.9 30.2 $6.563
307 Falcon-H1R-7B 15.8 9.8 $0
308 DeepSeek R1 Distill Qwen 14B 15.8 - $0
309 Ling-flash-2.0 15.7 16.7 $0.247
310 Qwen3 Omni 30B A3B (Reasoning) 15.6 12.7 $0.43
311 Qwen2.5 Instruct 72B 15.6 11.9 $0.37
312 Step3 VL 10B 15.5 13.9 $0
313 Sonar 15.5 - $0
314 Qwen3 30B A3B (Reasoning) 15.3 11 $0.18
315 Gemma 4 E2B (Reasoning) 15.2 9 $0
316 Devstral Small (Jul '25) 15.2 12.1 $0.15
317 Sonar Pro 15.2 - $0
318 QwQ 32B-Preview 15.2 - $0
319 Mistral Large 2 (Nov '24) 15.1 13.8 $3
320 Mistral Small 3.2 15.1 13.3 $0.128
321 GLM-4.5V (Reasoning) 15.1 10.9 $0.9
322 Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) 15 13.1 $0.9
323 ERNIE 4.5 300B A47B 15 14.5 $0.485
324 Qwen3 30B A3B 2507 Instruct 15 14.2 $0.213
325 Solar Pro 2 (Reasoning) 14.9 12.1 $0
326 NVIDIA Nemotron Nano 12B v2 VL (Reasoning) 14.9 11.7 $0.3
327 Gemma 4 E4B (Non-reasoning) 14.8 6.4 $0
328 Ministral 3 8B 14.8 10 $0.15
329 NVIDIA Nemotron Nano 9B V2 (Reasoning) 14.8 8.3 $0.07
330 NVIDIA Nemotron 3 Nano 4B 14.7 10 $0
331 Granite 4.1 30B 14.7 10.1 $0
332 Qwen3.5 2B (Non-reasoning) 14.7 4.9 $0.04
333 Gemini 2.0 Flash-Lite (Feb '25) 14.7 - $0
334 Llama Nemotron Super 49B v1.5 (Non-reasoning) 14.6 10.5 $0.175
335 Llama 3.3 Instruct 70B 14.5 10.7 $0.616
336 GPT-4o (May '24) 14.5 24.2 $7.5
337 Gemini 2.0 Flash-Lite (Preview) 14.5 - $0
338 Mistral Small 3.1 14.5 13.9 $0.138
339 Qwen3 32B (Non-reasoning) 14.5 - $0.26
340 Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) 14.4 - $0
341 Kimi Linear 48B A3B Instruct 14.4 14.2 $0
342 K2-V2 (low) 14.4 10.5 $0
343 Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) 14.3 7.6 $0
344 Qwen3 VL 8B Instruct 14.3 7.3 $0.31
345 Claude 3.5 Sonnet (June '24) 14.2 26 $6.563
346 Qwen3 4B (Reasoning) 14.2 - $0.398
347 GPT-4o (ChatGPT) 14.1 - $0
348 Llama 3.1 Tulu3 405B 14.1 - $0
349 Ring-flash-2.0 14 10.6 $0.247
350 Pixtral Large 14 - $3
351 Olmo 3.1 32B Think 13.9 9.8 $0
352 Grok 2 (Dec '24) 13.9 - $0
353 GPT-5 nano (minimal) 13.8 14.2 $0.138
354 Gemini 1.5 Flash (Sep '24) 13.8 - $0
355 GPT-4 Turbo 13.7 21.5 $15
356 Qwen3 VL 4B (Reasoning) 13.7 6.7 $0
357 Solar Pro 2 (Non-reasoning) 13.6 11.3 $0
358 Llama 4 Scout 13.5 6.7 $0.292
359 Command A 13.5 9.9 $4.375
360 Nova Pro 13.5 11 $1.4
361 Llama 3.1 Nemotron Instruct 70B 13.4 10.8 $1.2
362 Grok Beta 13.3 - $0
363 NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) 13.2 15.8 $0.088
364 NVIDIA Nemotron Nano 9B V2 (Non-reasoning) 13.2 7.5 $0.086
365 Qwen3 8B (Reasoning) 13.2 9 $0.37
366 Qwen2.5 Instruct 32B 13.2 - $0
367 GPT-4.1 nano 13 11.2 $0.175
368 Mistral Large 2 (Jul '24) 13 - $3
369 Qwen2.5 Coder Instruct 32B 12.9 - $0
370 Qwen3 4B 2507 Instruct 12.9 9 $0
371 GPT-4 12.8 13.1 $37.5
372 Qwen3 14B (Non-reasoning) 12.8 12.4 $0.381
373 MiniCPM-V 4.6 1.3B 12.7 0.7 $0
374 Gemini 2.5 Flash-Lite (Non-reasoning) 12.7 7.4 $0.175
375 Mistral Small 3 12.7 - $0.104
376 Nova Lite 12.7 5.1 $0.105
377 GLM-4.5V (Non-reasoning) 12.7 10.8 $0.9
378 Hermes 4 - Llama-3.1 70B (Non-reasoning) 12.6 9.2 $0.198
379 GPT-4o mini 12.6 - $0.262
380 Llama 3.1 Instruct 70B 12.5 10.9 $0.56
381 DeepSeek-V2.5 (Dec '24) 12.5 - $0
382 Qwen3 30B A3B (Non-reasoning) 12.5 13.3 $0.133
383 Qwen3 4B (Non-reasoning) 12.5 - $0.188
384 Granite 4.1 8B 12.4 7.3 $0.063
385 Sarvam 30B (high) 12.3 7.9 $0
386 Gemini 2.0 Flash Thinking Experimental (Dec '24) 12.3 - $0
387 Claude 3 Haiku 12.3 6.7 $0.5
388 DeepSeek-V2.5 12.3 - $0
389 Olmo 3.1 32B Instruct 12.2 5.6 $0
390 Gemma 4 E2B (Non-reasoning) 12.1 8.3 $0
391 Mistral Saba 12.1 - $0
392 DeepSeek R1 Distill Llama 8B 12.1 - $0
393 Olmo 3 32B Think 12.1 10.5 $0
394 R1 1776 12 - $0
395 Gemini 1.5 Pro (May '24) 12 19.8 $0
396 Reka Flash (Sep '24) 12 - $0.35
397 Qwen2.5 Turbo 12 - $0.088
398 Llama 3.2 Instruct 90B (Vision) 11.9 - $1.38
399 Solar Mini 11.9 - $0.15
400 Llama 3.1 Instruct 8B 11.8 4.9 $0.1
401 Grok-1 11.7 - $0
402 EXAONE 4.0 32B (Non-reasoning) 11.7 9.4 $0
403 Qwen2 Instruct 72B 11.7 - $0
404 Ministral 3 3B 11.2 4.8 $0.1
405 Gemini 1.5 Flash-8B 11.1 - $0
406 DeepHermes 3 - Mistral 24B Preview (Non-reasoning) 10.9 - $0
407 Jamba 1.7 Large 10.9 7.8 $3.5
408 Granite 4.0 H Small 10.8 8.5 $0.107
409 Qwen3 Omni 30B A3B Instruct 10.7 7.2 $0.43
410 Jamba 1.5 Large 10.7 - $3.5
411 DeepSeek-Coder-V2 10.6 - $0
412 OLMo 2 32B 10.6 2.7 $0
413 Hermes 3 - Llama-3.1 70B 10.6 - $0.3
414 Jamba 1.6 Large 10.6 - $3.5
415 Qwen3 8B (Non-reasoning) 10.6 7.1 $0.185
416 LFM2 24B A2B 10.5 3.6 $0.052
417 Qwen3.5 0.8B (Reasoning) 10.5 0 $0.02
418 Gemini 1.5 Flash (May '24) 10.5 - $0
419 Phi-4 10.4 11.2 $0.219
420 Nova Micro 10.3 4.1 $0.061
421 Gemma 3 27B Instruct 10.3 9.6 $0.145
422 Claude 3 Sonnet 10.3 - $6
423 Mistral Small (Sep '24) 10.2 - $0.3
424 NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) 10.1 5.9 $0.3
425 Gemma 3n E4B Instruct Preview (May '25) 10.1 - $0
426 Gemini 1.0 Ultra 10.1 17.6 $0
427 Phi-3 Mini Instruct 3.8B 10.1 3 $0
428 Phi-4 Multimodal Instruct 10 - $0
429 Qwen2.5 Coder Instruct 7B 10 - $0
430 Qwen3.5 0.8B (Non-reasoning) 9.9 1 $0.02
431 Mistral Large (Feb '24) 9.9 - $6
432 Mixtral 8x22B Instruct 9.8 - $0
433 Llama 3.2 Instruct 3B 9.7 - $0.15
434 Llama 2 Chat 7B 9.7 - $0.1
435 Jamba Reasoning 3B 9.6 2.5 $0
436 Qwen3 VL 4B Instruct 9.6 4.6 $0
437 Reka Flash 3 9.5 8.9 $0.35
438 Qwen1.5 Chat 110B 9.5 - $0
439 Olmo 3 7B Think 9.4 7.6 $0
440 Claude 2.1 9.3 14 $0
441 OLMo 2 7B 9.3 1.2 $0
442 Molmo 7B-D 9.2 1.2 $0
443 Ling-mini-2.0 9.2 5 $0
444 Claude 2.0 9.1 12.9 $0
445 DeepSeek R1 Distill Qwen 1.5B 9.1 - $0
446 DeepSeek-V2-Chat 9.1 - $0
447 GPT-3.5 Turbo 9 10.7 $0.75
448 Mistral Small (Feb '24) 9 - $1.5
449 Mistral Medium 9 - $4.088
450 Llama 3 Instruct 70B 8.9 6.8 $1.175
451 Gemma 3 12B Instruct 8.8 6.3 $0.14
452 LFM 40B 8.8 - $0
453 Arctic Instruct 8.8 - $0
454 Qwen Chat 72B 8.8 - $0
455 Llama 3.2 Instruct 11B (Vision) 8.7 4.2 $0.245
456 PALM-2 8.6 4.6 $0
457 Granite 4.1 3B 8.5 5.5 $0
458 Gemini 1.0 Pro 8.5 - $0
459 DeepSeek Coder V2 Lite Instruct 8.5 - $0
460 Phi-4 Mini Instruct 8.4 3.6 $0
461 Llama 2 Chat 70B 8.4 - $0
462 Llama 2 Chat 13B 8.4 - $0
463 DeepSeek LLM 67B Chat (V1) 8.4 - $0
464 Sarvam M (Reasoning) 8.4 7.5 $0
465 Exaone 4.0 1.2B (Reasoning) 8.3 3.1 $0
466 OpenChat 3.5 (1210) 8.3 - $0
467 DBRX Instruct 8.3 - $0
468 Command-R+ (Apr '24) 8.3 - $6
469 LFM2.5-1.2B-Thinking 8.1 1.4 $0
470 Olmo 3 7B Instruct 8.1 3.4 $0.125
471 Exaone 4.0 1.2B (Non-reasoning) 8.1 2.5 $0
472 Jamba 1.7 Mini 8.1 3.1 $0
473 LFM2.5-1.2B-Instruct 8 0.8 $0
474 LFM2 2.6B 8 1.4 $0
475 Granite 4.0 H 1B 8 2.7 $0
476 Jamba 1.5 Mini 8 - $0.25
477 Qwen3 1.7B (Reasoning) 8 1.4 $0.398
478 Jamba 1.6 Mini 7.9 - $0.25
479 Gemma 3 270M 7.7 0 $0
480 Granite 4.0 Micro 7.7 5 $0
481 Apertus 70B Instruct 7.7 1.9 $1.345
482 Mixtral 8x7B Instruct 7.7 - $0.512
483 DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) 7.6 - $0
484 Llama 65B 7.4 - $0
485 Qwen Chat 14B 7.4 - $0
486 Claude Instant 7.4 7.8 $0
487 Mistral 7B Instruct 7.4 - $0.206
488 Command-R (Mar '24) 7.4 - $0.75
489 Molmo2-8B 7.3 4.4 $0
490 Granite 4.0 1B 7.3 2.9 $0
491 LFM2 8B A1B 7 2.3 $0
492 Granite 3.3 8B (Non-reasoning) 7 3.4 $0.085
493 Qwen3 1.7B (Non-reasoning) 6.8 2.3 $0.188
494 Qwen3 0.6B (Reasoning) 6.5 0.9 $0.398
495 Llama 3 Instruct 8B 6.4 4 $0.07
496 Gemma 3n E4B Instruct 6.4 4.2 $0.025
497 Llama 3.2 Instruct 1B 6.3 0.6 $0.05
498 Gemma 3 4B Instruct 6.3 2.9 $0.05
499 LFM2 1.2B 6.3 0.8 $0
500 LFM2.5-VL-1.6B 6.2 1 $0
501 Granite 4.0 350M 6.1 0.3 $0
502 Apertus 8B Instruct 5.9 1.4 $0.125
503 Qwen3 0.6B (Non-reasoning) 5.7 1.4 $0.188
504 Gemma 3 1B Instruct 5.6 0.2 $0
505 Granite 4.0 H 350M 5.4 0.6 $0
506 Gemma 3n E2B Instruct 4.8 2.2 $0
507 Tiny Aya Global 4.7 1.2 $0
508 GPT-5.5 Pro (xhigh) - - $0
509 Gemini 3 Deep Think - - $0
510 EXAONE 4.5 33B (Non-reasoning) - - $0
511 Cogito v2.1 (Reasoning) - 24.8 $1.25
512 Mi:dm K 2.5 Pro Preview - 11.9 $0
513 GPT-4o mini Realtime (Dec '24) - - $0
514 GPT-4o Realtime (Dec '24) - - $0
515 GPT-3.5 Turbo (0613) - - $0
516 GPT-5.4 Pro (xhigh) - - $67.5

How to Read the Leaderboard

When using the leaderboard, weigh the Composite Index against price. If you are a developer, the Coding score is the more relevant metric.

The leaderboard data mirrored by 值品工具箱 is refreshed every 24 hours, so the model comparison you see is always up to date.
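The advice above (balance the Composite Index against price) can be sketched as a simple score-per-dollar ranking. The rows below are copied from the table; the `value_rank` helper is illustrative and not part of the leaderboard itself, and models with an unlisted ($0) price are skipped:

```python
# A few (model, composite index, blended $/1M price) rows from the table.
rows = [
    ("GPT-5.5 (xhigh)", 60.2, 11.25),
    ("Gemini 3.1 Pro Preview", 57.2, 4.5),
    ("Kimi K2.6", 53.9, 1.712),
    ("MiMo-V2.5-Pro", 53.8, 1.5),
    ("MiniMax-M2.7", 49.6, 0.525),
]

def value_rank(rows):
    """Sort rows by composite index per dollar, best value first."""
    priced = [r for r in rows if r[2] > 0]  # skip unlisted ($0) prices
    return sorted(priced, key=lambda r: r[1] / r[2], reverse=True)

for name, score, price in value_rank(rows):
    print(f"{name}: {score / price:.1f} index points per $")
```

On this sample, a cheap mid-tier model like MiniMax-M2.7 tops the value ranking even though it trails the frontier models on raw score, which is exactly the trade-off the section above describes.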

Metric Definitions

  • Composite Index: measures general understanding and reasoning.
  • Price ($/1M): average cost per million tokens, blended at a 3:1 input-to-output ratio.
  • Coding: measures code-generation accuracy.
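The blended price above can be computed as a weighted mean over the 3:1 token mix. This is a minimal sketch assuming that interpretation (3 input tokens for every 1 output token); the example prices are made up, not drawn from the table:

```python
def blended_price(input_per_1m: float, output_per_1m: float) -> float:
    """Blend input/output token prices at a 3:1 input:output ratio,
    i.e. a weighted mean over 3 input tokens per 1 output token."""
    return (3 * input_per_1m + 1 * output_per_1m) / 4

# Example: $2.50/1M input and $10.00/1M output blend to $4.375/1M.
print(blended_price(2.50, 10.00))
```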

LLM Leaderboard FAQ

Q1: How often is the leaderboard data updated?

The data is automatically refreshed every 24 hours, so newly released models are added promptly.

Q2: Does the leaderboard include Chinese models?

Yes. Any Chinese model that has passed Artificial Analysis's global evaluation appears on the leaderboard.

Q3: What does the Composite Index represent?

It represents a model's all-round performance; the leaderboard derives it as a weighted aggregate score.

Q4: How do I find the most cost-effective model on the leaderboard?

Click the Price column header to sort, then look for low-price, high-score models.

Q5: How reliable are the leaderboard's coding scores?

The coding scores draw on established benchmarks such as LiveCodeBench, making them a strong reference point.

Q6: Why are some new models missing from the leaderboard?

A model must complete a battery of tests before being listed; this usually happens within a few days of release.

Q7: How are the prices on the leaderboard calculated?

Prices are the cost per million tokens, computed by the leaderboard as a uniform blend of input and output rates.

Q8: Can I view the leaderboard on a phone?

Yes. The page is fully responsive and optimized for mobile.

Q9: Is this leaderboard tool free?

Yes. The lookup service is provided free of charge by 值品工具箱.

Q10: How should I use the leaderboard for model selection?

For a customer-service assistant, start from the Composite Index; for translation work, look at language metrics rather than the coding score.
