Find the Best LLM for Your Hardware
Select your GPU and system RAM to see models ranked by VRAM fit. Get deployment-ready configurations optimized for your infrastructure.
| Model | Organization | Parameters | Context | Capabilities | Quants | Fit |
|---|---|---|---|---|---|---|
| Kimi K2 Instruct | Moonshot AI | 1026.47B | 128K | Code, Tool Calls | 18 | — |
| Kimi K2.5 | Moonshot AI | 1016.23B | 256K | Code, Thinking, Tool Calls, Vision | 18 | — |
| DeepSeek V3.2 | DeepSeek | 685.4B | 160K | Code, Multilingual, Thinking, Tool Calls | 18 | — |
| DeepSeek V3.1 | DeepSeek | 684.53B | 160K | Code, Multilingual, Thinking, Tool Calls | 18 | — |
| Mistral Large 3 675B Instruct 2512 | Mistral AI | 675B | 288K | Code, Multilingual, Tool Calls | 18 | — |
| Qwen3.5 397B A17B | Qwen | 403.4B | 256K | Code, Multilingual, Thinking, Tool Calls, Vision | 22 | — |
| Llama 4 Maverick 17B 128E Instruct | Meta | 396.58B | 1M | Code, Multilingual, Tool Calls, Vision | 18 | — |
| GLM 4.7 | Zai Org | 358.34B | 198K | Code, Thinking, Tool Calls | 18 | — |
| Qwen3 235B A22B | Qwen | 235.09B | 40K | Code, Multilingual, Thinking, Tool Calls | 17 | — |
| MiniMax M2 | MiniMax | 228.7B | 192K | Code, Thinking, Tool Calls | 18 | — |
| Kimi K2 Thinking | Moonshot AI | 170.27B | 256K | Code, Thinking, Tool Calls | 18 | — |
| Devstral 2 123B Instruct 2512 | Mistral AI | 125.03B | 256K | Code, Multilingual, Tool Calls | 18 | — |
| NVIDIA Nemotron 3 Super 120B A12B | NVIDIA | 123.61B | 256K | Code, Multilingual, Thinking, Tool Calls | 15 | — |
| Mistral Large Instruct 2411 | Mistral AI | 122.61B | 128K | Multilingual, Tool Calls | 14 | — |
| GPT OSS 120B | OpenAI | 120.41B | 128K | Multilingual, Thinking, Tool Calls | 16 | — |
| Mistral Small 4 119B 2603 | Mistral AI | 119.4B | 256K | Code, Multilingual, Thinking, Tool Calls, Vision | 3 | — |
| Qwen3 Next 80B A3B Thinking | Qwen | 81.32B | 256K | Code, Multilingual, Thinking, Tool Calls | 18 | — |
| Qwen3 Next 80B A3B Instruct | Qwen | 81.32B | 256K | Code, Multilingual, Tool Calls | 18 | — |
| Qwen3 Coder Next | Qwen | 79.67B | 256K | Code, Multilingual, Tool Calls | 19 | — |
| Qwen2.5 72B Instruct | Qwen | 72.71B | 32K | Code, Multilingual, Tool Calls | 9 | — |
| DeepSeek R1 Distill Llama 70B | DeepSeek | 70.55B | 128K | Code, Multilingual, Thinking, Tool Calls | 19 | — |
| Llama 3.3 70B Instruct | Meta | 70B | 128K | Code, Multilingual, Tool Calls | 20 | — |
| Meta Llama 3.1 70B Instruct | Meta | 70B | 128K | Code, Multilingual, Tool Calls | 15 | — |
| Qwen3.5 35B A3B | Qwen | 35.95B | 256K | Code, Multilingual, Thinking, Tool Calls, Vision | 11 | — |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | 32.76B | 128K | Code, Multilingual, Thinking, Tool Calls | 8 | — |
| Qwen3 32B | Qwen | 32B | 40K | Code, Multilingual, Thinking, Tool Calls | 18 | — |
| NVIDIA Nemotron 3 Nano 30B A3B | NVIDIA | 31.58B | 256K | Code, Multilingual, Thinking, Tool Calls | 17 | — |
| GLM 4.7 Flash | Zai Org | 31.22B | 198K | Code, Thinking, Tool Calls | 17 | — |
| Qwen3.5 27B | Qwen | 27.78B | 256K | Code, Multilingual, Thinking, Tool Calls, Vision | 16 | — |
| Devstral Small 2 24B Instruct 2512 | Mistral AI | 24.01B | 384K | Code, Multilingual, Tool Calls | 18 | — |
| Mistral Small 3.1 24B Instruct 2503 | Mistral AI | 24B | 128K | Code, Multilingual, Tool Calls, Vision | 18 | — |
| Mistral Small 24B Instruct 2501 | Mistral AI | 23.57B | 32K | Code, Multilingual, Tool Calls | 19 | — |
| GPT OSS 20B | OpenAI | 21.51B | 128K | Multilingual, Thinking, Tool Calls | 16 | — |
| Llama 4 Scout 17B 16E Instruct | Meta | 17B | 10M | Code, Multilingual, Tool Calls, Vision | 18 | — |
| DeepSeek Coder V2 Lite Instruct | DeepSeek | 15.71B | 160K | Code, Multilingual | 5 | — |
| DeepSeek R1 Distill Qwen 14B | DeepSeek | 14.77B | 128K | Code, Multilingual, Thinking, Tool Calls | 8 | — |
| Qwen2.5 14B Instruct | Qwen | 14.77B | 32K | Code, Multilingual, Tool Calls | 9 | — |
| Phi 4 | Microsoft | 14.66B | 16K | Code | 14 | — |
| Qwen3.5 9B | Qwen | 9.65B | 256K | Code, Multilingual, Thinking, Tool Calls, Vision | 16 | — |
| DeepSeek R1 0528 Qwen3 8B | DeepSeek | 8.19B | 128K | Code, Multilingual, Thinking, Tool Calls | 18 | — |
| Meta Llama 3.1 8B Instruct | Meta | 8B | 128K | Code, Multilingual, Tool Calls | 19 | — |
| Qwen3 8B | Qwen | 8B | 40K | Code, Multilingual, Thinking, Tool Calls | 17 | — |
| DeepSeek R1 Distill Qwen 7B | DeepSeek | 7.62B | 128K | Code, Multilingual, Thinking, Tool Calls | 8 | — |
| Qwen2.5 7B Instruct | Qwen | 7.62B | 32K | Code, Multilingual, Tool Calls | 9 | — |
| Granite 4.0 Tiny Base Preview | IBM | 6.67B | 128K | Code, Multilingual | 15 | — |
| Qwen3.5 4B | Qwen | 4.66B | 256K | Code, Multilingual, Thinking, Tool Calls, Vision | 16 | — |
| NVIDIA Nemotron 3 Nano 4B | NVIDIA | 3.97B | 256K | Code, Thinking, Tool Calls | 22 | — |
| Phi 3 mini 4k instruct | Microsoft | 3.82B | 4K | Code | 2 | — |
| Qwen3.5 2B | Qwen | 2.27B | 256K | Code, Multilingual, Thinking, Tool Calls, Vision | 16 | — |
| LFM2.5 1.2B Thinking | Liquid AI | 1.17B | 125K | Multilingual, Thinking, Tool Calls | 6 | — |
| Qwen3.5 0.8B | Qwen | 0.87B | 256K | Code, Multilingual, Thinking, Tool Calls, Vision | 16 | — |
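
The Fit column stays empty until a GPU is selected. As a rough illustration of how a VRAM-fit ranking could work, here is a minimal Python sketch. It is not this page's actual method: the formula (parameters × bits per weight ÷ 8), the 1.2 overhead factor, and the names `weights_gib`, `fits`, and `rank_by_fit` are all assumptions made for illustration. It also ignores KV-cache growth with context length, which can be substantial.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    params_b: float  # parameter count in billions (Parameters column)


def weights_gib(params_b: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = params_b * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_b: float, bits_per_weight: float, vram_gib: float,
         overhead: float = 1.2) -> bool:
    """Assumed rule of thumb: weights plus 20% headroom must fit in VRAM."""
    return weights_gib(params_b, bits_per_weight) * overhead <= vram_gib


def rank_by_fit(models: list[Model], bits_per_weight: float,
                vram_gib: float) -> list[Model]:
    """Keep only models that fit, largest first."""
    fitting = [m for m in models if fits(m.params_b, bits_per_weight, vram_gib)]
    return sorted(fitting, key=lambda m: m.params_b, reverse=True)


# A few rows taken from the table above.
models = [
    Model("Qwen3 8B", 8.0),
    Model("Phi 4", 14.66),
    Model("Qwen3 32B", 32.0),
    Model("Llama 3.3 70B Instruct", 70.0),
]

# Hypothetical setup: a 24 GiB GPU and a 4-bit quantization.
ranked = rank_by_fit(models, bits_per_weight=4, vram_gib=24)
print([m.name for m in ranked])
```

Under these assumptions, the 70B model is excluded at 4 bits on a 24 GiB card, because its weights alone come to roughly 32.6 GiB. For mixture-of-experts models the total parameter count is the relevant figure, since all experts must be resident in memory.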
Last updated: March 20, 2026