Mistral Small 24B Instruct 2501
Mistral AI
Code · Multilingual · Tool Calls
Mistral Small 24B Instruct 2501 is a 23.57-billion-parameter dense transformer from Mistral AI, optimized for instruction following, code generation, and multilingual conversation. It occupies a mid-range parameter class that offers strong performance relative to its size, competing with larger 30B models on many benchmarks. The model supports tool calling and 10 languages including English, French, Chinese, and Japanese. With a 32K context window and flash attention, it fits on a single consumer GPU at Q4 quantization for efficient self-hosted inference.
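Tool calling with this model typically goes through an OpenAI-style function-calling schema on a compatible server. A minimal sketch of a request payload, assuming a local OpenAI-compatible runtime (the endpoint URL, model name, and `get_weather` tool are hypothetical placeholders):

```python
import json

# Hypothetical local endpoint and model name -- adjust to your server
# (e.g. an OpenAI-compatible runtime such as vLLM or llama.cpp server).
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "mistral-small-24b-instruct-2501"

# One tool definition in the OpenAI-style function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

print(json.dumps(payload, indent=2))
```

POSTing this payload to the server's chat-completions route should return either a plain assistant message or a `tool_calls` entry containing the function name and JSON arguments for your code to execute.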
Hardware Configuration
| Quantization | Quality | Size |
|---|---|---|
| FP32 | Full precision | 87.82 GB |
| FP16 | Half precision | 43.92 GB |
| Q8_0 | High | 23.33 GB |
| Q6_K_L | High | 18.32 GB |
| Q6_K | High | 18.02 GB |
| Q5_K_L | Medium | 16 GB |
| Q5_K_M | Medium | 15.61 GB |
| Q5_K_S | Medium | 15.18 GB |
| Q4_1 | Medium | 13.85 GB |
| Q4_K_L | Medium | 13.81 GB |
| Q4_K_M | Medium | 13.35 GB |
| Q4_K_S | Medium | 12.62 GB |
| Q4_0 | Medium | 12.57 GB |
| Q3_K_XL | Low | 12.1 GB |
| Q3_K_L | Low | 11.55 GB |
| Q3_K_M | Low | 10.69 GB |
| Q3_K_S | Low | 9.69 GB |
| Q2_K_L | Low | 8.89 GB |
| Q2_K | Low | 8.28 GB |
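The sizes above follow directly from parameter count times bits per weight. A quick sketch of that arithmetic (the bits-per-weight figures are approximate llama.cpp-style averages, an assumption; mixed-precision K-quants vary slightly by tensor):

```python
# Rough weights-only footprint: parameter count x bits per weight.
# Results come out in GiB, in line with the table above.
PARAMS = 23.57e9  # Mistral Small 24B Instruct 2501

BITS_PER_WEIGHT = {
    "FP32": 32.0,
    "FP16": 16.0,
    "Q8_0": 8.5,     # 8-bit weights plus per-block scale
    "Q4_K_M": 4.85,  # approximate average for the mixed K-quant
}

def size_gib(params: float, bpw: float) -> float:
    """Weights-only size in GiB (no KV cache or activation overhead)."""
    return params * bpw / 8 / 2**30

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: {size_gib(PARAMS, bpw):.2f} GiB")
```

This reproduces FP32 at about 87.8 and FP16 at about 43.9, matching the table; budget extra VRAM on top for the KV cache, which grows with the context length you actually use up to the 32K window.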
Last updated: March 12, 2026