Mistral Small 24B Instruct 2501
Mistral AI
Code · Multilingual · Tool Calls
Mistral Small 24B Instruct 2501 is a 23.57-billion-parameter dense transformer from Mistral AI, optimized for instruction following, code generation, and multilingual conversation. It occupies a mid-range parameter class that offers strong performance relative to its size, competing with larger 30B models on many benchmarks. The model supports tool calling and 10 languages including English, French, Chinese, and Japanese. With a 32K context window and flash attention, it fits on a single consumer GPU at Q4 quantization for efficient self-hosted inference.
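Tool calling with this model typically goes through an OpenAI-style function-calling schema on a compatible server. A minimal sketch of a request payload, assuming a local OpenAI-compatible runtime (the endpoint URL, model name, and `get_weather` tool are hypothetical placeholders):

```python
import json

# Hypothetical local endpoint and model name -- adjust to your server
# (e.g. an OpenAI-compatible runtime such as vLLM or llama.cpp server).
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "mistral-small-24b-instruct-2501"

# One tool definition in the OpenAI-style function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

print(json.dumps(payload, indent=2))
```

POSTing this payload to the server's chat-completions route should return either a plain assistant message or a `tool_calls` entry containing the function name and JSON arguments for your code to execute.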
Hardware Configuration
| Quantization | Quality | Size |
|---|---|---|
| FP32 | Full precision | 87.82 GB |
| FP16 | Half precision | 43.92 GB |
| Q8_0 | High | 23.33 GB |
| Q6_K_L | High | 18.32 GB |
| Q6_K | High | 18.02 GB |
| Q5_K_L | Medium | 16 GB |
| Q5_K_M | Medium | 15.61 GB |
| Q5_K_S | Medium | 15.18 GB |
| Q4_1 | Medium | 13.85 GB |
| Q4_K_L | Medium | 13.81 GB |
| Q4_K_M | Medium | 13.35 GB |
| Q4_K_S | Medium | 12.62 GB |
| Q4_0 | Medium | 12.57 GB |
| Q3_K_XL | Low | 12.1 GB |
| Q3_K_L | Low | 11.55 GB |
| Q3_K_M | Low | 10.69 GB |
| Q3_K_S | Low | 9.69 GB |
| Q2_K_L | Low | 8.89 GB |
| Q2_K | Low | 8.28 GB |
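The sizes above follow directly from parameter count times bits per weight. A quick sketch of that arithmetic (the bits-per-weight figures are approximate llama.cpp-style averages, an assumption; mixed-precision K-quants vary slightly by tensor):

```python
# Rough weights-only footprint: parameter count x bits per weight.
# Results come out in GiB, in line with the table above.
PARAMS = 23.57e9  # Mistral Small 24B Instruct 2501

BITS_PER_WEIGHT = {
    "FP32": 32.0,
    "FP16": 16.0,
    "Q8_0": 8.5,     # 8-bit weights plus per-block scale
    "Q4_K_M": 4.85,  # approximate average for the mixed K-quant
}

def size_gib(params: float, bpw: float) -> float:
    """Weights-only size in GiB (no KV cache or activation overhead)."""
    return params * bpw / 8 / 2**30

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: {size_gib(PARAMS, bpw):.2f} GiB")
```

This reproduces FP32 at about 87.8 and FP16 at about 43.9, matching the table; budget extra VRAM on top for the KV cache, which grows with the context length you actually use up to the 32K window.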
Last updated: March 12, 2026