
Mistral Large 3 675B Instruct 2512

Mistral AI
Tags: Code · Multilingual · Tool Calls

Mistral Large 3 675B Instruct 2512 is a 675-billion-parameter granular Mixture-of-Experts model from Mistral AI, activating 4 of 128 experts plus 1 shared expert per token for efficient large-scale inference. It represents Mistral AI's flagship open-weight model, designed for general-purpose reasoning, agentic workflows, and enterprise applications. The model supports tool calling, code generation, and 11 languages including English, French, Spanish, and Arabic. With a 288K context window and flash attention, it handles long-document analysis while its MoE architecture keeps per-token compute manageable for GGUF-quantized self-hosted deployment.
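The routing scheme described above (4 of 128 routed experts plus 1 shared expert per token) can be sketched as follows. This is a minimal, illustrative top-k gating example, not Mistral AI's actual router implementation; the function and variable names are assumptions for the sketch.

```python
import math
import random

NUM_EXPERTS = 128   # routed experts, per the model description
TOP_K = 4           # routed experts activated per token
# A shared expert additionally processes every token unconditionally.

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_logits, top_k=TOP_K):
    """Select the top_k experts for one token and renormalize
    their gate weights so they sum to 1."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    gates = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, gates))

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
assignment = route_token(logits)
# Per-token compute touches only TOP_K routed experts plus the shared
# expert, not all 128 -- this is what keeps MoE inference efficient.
print(assignment)
```

Each token's output is then the gate-weighted sum of its chosen experts' outputs plus the shared expert's output, so only a small fraction of the 675B parameters is active per token.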

Hardware Configuration

Quantization  Quality  Size
Q8_0          High     666.55 GB
Q8_K_XL       High     720.39 GB
Q6_K          High     515.30 GB
Q6_K_XL       High     536.90 GB
Q5_K_M        Medium   445.15 GB
Q5_K_S        Medium   432.56 GB
Q5_K_XL       Medium   446.87 GB
Q4_K_M        Medium   379.04 GB
Q4_K_S        Medium   356.38 GB
Q4_K_XL       Medium   361.26 GB
Q4_0          Medium   355.48 GB
Q4_1          Medium   393.34 GB
Q3_K_M        Low      299.72 GB
Q3_K_S        Low      271.83 GB
Q3_K_XL       Low      280.14 GB
Q2_K          Low      230.13 GB
Q2_K_L        Low      230.33 GB
Q2_K_XL       Low      238.76 GB
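A simple way to use the table above is to pick the largest quantization whose file size, plus some headroom for the KV cache and activations, fits your total GPU/CPU memory pool. The sketch below hard-codes the sizes from the table; the 20 GB overhead allowance is an illustrative assumption, not a measured value, and real headroom depends on context length and batch size.

```python
# Quantization sizes in GB, copied from the table above.
QUANTS = {
    "Q8_0": 666.55, "Q8_K_XL": 720.39, "Q6_K": 515.30, "Q6_K_XL": 536.90,
    "Q5_K_M": 445.15, "Q5_K_S": 432.56, "Q5_K_XL": 446.87,
    "Q4_K_M": 379.04, "Q4_K_S": 356.38, "Q4_K_XL": 361.26,
    "Q4_0": 355.48, "Q4_1": 393.34,
    "Q3_K_M": 299.72, "Q3_K_S": 271.83, "Q3_K_XL": 280.14,
    "Q2_K": 230.13, "Q2_K_L": 230.33, "Q2_K_XL": 238.76,
}

def largest_fitting_quant(budget_gb, overhead_gb=20.0):
    """Return the name of the largest quant whose weights plus an
    assumed KV-cache/activation allowance fit budget_gb, else None."""
    candidates = [(size, name) for name, size in QUANTS.items()
                  if size + overhead_gb <= budget_gb]
    return max(candidates)[1] if candidates else None

print(largest_fitting_quant(400))   # 400 GB pool -> Q4_K_M (379.04 GB)
print(largest_fitting_quant(100))   # too small for any quant -> None
```

Higher-bit quants (Q6/Q8) preserve more quality but roughly double the footprint of the Q3/Q2 variants, so the usual trade-off is to take the highest-quality quant that still leaves inference headroom.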
Last updated: March 5, 2026