Mistral Large 3 675B Instruct 2512
Mistral AI
Code · Multilingual · Tool Calls
Mistral Large 3 675B Instruct 2512 is a 675-billion-parameter granular Mixture-of-Experts model from Mistral AI, activating 4 of 128 routed experts plus 1 shared expert per token for efficient large-scale inference. It is Mistral AI's flagship open-weight model, designed for general-purpose reasoning, agentic workflows, and enterprise applications. The model supports tool calling, code generation, and 11 languages, including English, French, Spanish, and Arabic. With a 288K context window and flash attention it handles long-document analysis, while its MoE architecture keeps per-token compute manageable for GGUF-quantized self-hosted deployment.
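The routing described above implies that only a small fraction of the expert parameters are active for any given token. A rough back-of-envelope sketch, assuming all experts are equally sized and ignoring the always-active dense parameters (attention, embeddings):

```python
# Rough estimate: fraction of expert parameters active per token under the
# routing described above (4 routed + 1 shared expert out of 128 + 1).
# Assumes equal-sized experts and ignores dense (attention/embedding) weights.
TOTAL_ROUTED = 128
ACTIVE_ROUTED = 4
SHARED = 1

active_fraction = (ACTIVE_ROUTED + SHARED) / (TOTAL_ROUTED + SHARED)
print(f"~{active_fraction:.1%} of expert parameters active per token")
```

This is why a 675B-parameter model can have per-token compute closer to a dense model a fraction of its size, even though the full weights must still fit in memory.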
Hardware Configuration
| Quantization | Quality | Size |
|---|---|---|
| Q8_0 | High | 666.55 GB |
| Q8_K_XL | High | 720.39 GB |
| Q6_K | High | 515.30 GB |
| Q6_K_XL | High | 536.90 GB |
| Q5_K_M | Medium | 445.15 GB |
| Q5_K_S | Medium | 432.56 GB |
| Q5_K_XL | Medium | 446.87 GB |
| Q4_K_M | Medium | 379.04 GB |
| Q4_K_S | Medium | 356.38 GB |
| Q4_K_XL | Medium | 361.26 GB |
| Q4_0 | Medium | 355.48 GB |
| Q4_1 | Medium | 393.34 GB |
| Q3_K_M | Low | 299.72 GB |
| Q3_K_S | Low | 271.83 GB |
| Q3_K_XL | Low | 280.14 GB |
| Q2_K | Low | 230.13 GB |
| Q2_K_L | Low | 230.33 GB |
| Q2_K_XL | Low | 238.76 GB |
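A simple way to use the table is to pick the highest-quality quantization that fits your combined RAM/VRAM budget. The sketch below uses a subset of the sizes from the table; the 10% headroom for KV cache and runtime buffers is a rough assumption, not a measured figure.

```python
# Pick the highest-quality GGUF quantization fitting a memory budget.
# Sizes (GB) are taken from the table above; the headroom factor is a
# rough assumption to leave room for KV cache and runtime buffers.
QUANT_SIZES_GB = {
    "Q8_0": 666.55, "Q6_K": 515.30, "Q5_K_M": 445.15,
    "Q4_K_M": 379.04, "Q3_K_M": 299.72, "Q2_K": 230.13,
}

def best_fit(budget_gb: float, headroom: float = 0.10):
    """Return the first (highest-quality) quant whose file fits the budget."""
    usable = budget_gb * (1 - headroom)
    for name, size in QUANT_SIZES_GB.items():  # ordered highest quality first
        if size <= usable:
            return name
    return None

print(best_fit(512))   # 512 GB budget -> ~460.8 GB usable
print(best_fit(200))   # too small for any listed quant -> None
```

Note this only checks that the weights fit; actual memory use grows with context length, so long-context workloads may need a lower quantization than this naive check suggests.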
Last updated: March 5, 2026