Mistral Large 3 675B Instruct 2512
Mistral AI
Code · Multilingual · Tool Calls
Mistral Large 3 675B Instruct 2512 is a 675-billion-parameter granular Mixture-of-Experts model from Mistral AI, activating 4 of 128 routed experts plus 1 shared expert per token for efficient large-scale inference. It is Mistral AI's flagship open-weight model, designed for general-purpose reasoning, agentic workflows, and enterprise applications. The model supports tool calling, code generation, and 11 languages, including English, French, Spanish, and Arabic. With a 288K context window and flash attention it handles long-document analysis, while its MoE architecture keeps per-token compute manageable for GGUF-quantized self-hosted deployment.
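The routing described above implies that only a small fraction of the expert parameters are active for any given token. A rough back-of-envelope sketch, assuming all experts are equally sized and ignoring the always-active dense parameters (attention, embeddings):

```python
# Rough estimate: fraction of expert parameters active per token under the
# routing described above (4 routed + 1 shared expert out of 128 + 1).
# Assumes equal-sized experts and ignores dense (attention/embedding) weights.
TOTAL_ROUTED = 128
ACTIVE_ROUTED = 4
SHARED = 1

active_fraction = (ACTIVE_ROUTED + SHARED) / (TOTAL_ROUTED + SHARED)
print(f"~{active_fraction:.1%} of expert parameters active per token")
```

This is why a 675B-parameter model can have per-token compute closer to a dense model a fraction of its size, even though the full weights must still fit in memory.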
Hardware Configuration
| Quantization | Quality | Size |
|---|---|---|
| Q8_0 | High | 666.55 GB |
| Q8_K_XL | High | 720.39 GB |
| Q6_K | High | 515.30 GB |
| Q6_K_XL | High | 536.90 GB |
| Q5_K_M | Medium | 445.15 GB |
| Q5_K_S | Medium | 432.56 GB |
| Q5_K_XL | Medium | 446.87 GB |
| Q4_K_M | Medium | 379.04 GB |
| Q4_K_S | Medium | 356.38 GB |
| Q4_K_XL | Medium | 361.26 GB |
| Q4_0 | Medium | 355.48 GB |
| Q4_1 | Medium | 393.34 GB |
| Q3_K_M | Low | 299.72 GB |
| Q3_K_S | Low | 271.83 GB |
| Q3_K_XL | Low | 280.14 GB |
| Q2_K | Low | 230.13 GB |
| Q2_K_L | Low | 230.33 GB |
| Q2_K_XL | Low | 238.76 GB |
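A simple way to use the table is to pick the highest-quality quantization that fits your combined RAM/VRAM budget. The sketch below uses a subset of the sizes from the table; the 10% headroom for KV cache and runtime buffers is a rough assumption, not a measured figure.

```python
# Pick the highest-quality GGUF quantization fitting a memory budget.
# Sizes (GB) are taken from the table above; the headroom factor is a
# rough assumption to leave room for KV cache and runtime buffers.
QUANT_SIZES_GB = {
    "Q8_0": 666.55, "Q6_K": 515.30, "Q5_K_M": 445.15,
    "Q4_K_M": 379.04, "Q3_K_M": 299.72, "Q2_K": 230.13,
}

def best_fit(budget_gb: float, headroom: float = 0.10):
    """Return the first (highest-quality) quant whose file fits the budget."""
    usable = budget_gb * (1 - headroom)
    for name, size in QUANT_SIZES_GB.items():  # ordered highest quality first
        if size <= usable:
            return name
    return None

print(best_fit(512))   # 512 GB budget -> ~460.8 GB usable
print(best_fit(200))   # too small for any listed quant -> None
```

Note this only checks that the weights fit; actual memory use grows with context length, so long-context workloads may need a lower quantization than this naive check suggests.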
Last updated: March 5, 2026