MiniMax M2
MiniMax
Code · Thinking · Tool Calls
MiniMax M2 is a 228.7-billion-parameter Mixture-of-Experts model from MiniMax, with 256 experts of which 8 are active per token, optimized for coding and agentic workflows. It uses interleaved chain-of-thought reasoning and ranks among the top open-source models for multi-step task execution and code generation. The model supports tool calling, with strong performance across shell, browser, and code-runner toolchains. With a 192K context window and flash attention, it handles long-horizon tasks, and GGUF quantizations down to Q2 make self-hosted multi-GPU deployment feasible.
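As a sketch of how the tool-calling support is typically exercised, the snippet below builds an OpenAI-compatible chat-completions payload with a single tool definition. The endpoint, model name, and `run_shell` tool schema are illustrative assumptions, not documented values; a self-hosted server (e.g. llama.cpp or vLLM with an OpenAI-compatible API) would receive this as the JSON body of a `/v1/chat/completions` request.

```python
import json

# Hypothetical tool schema: a shell-command runner, the kind of toolchain
# the model is reported to perform well with.
tools = [{
    "type": "function",
    "function": {
        "name": "run_shell",  # illustrative tool name, not part of the model spec
        "description": "Execute a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

# Chat payload in the OpenAI-compatible format most self-hosted servers accept.
payload = {
    "model": "MiniMax-M2",  # assumed served model name
    "messages": [{"role": "user", "content": "List the files in the repo root."}],
    "tools": tools,
}

# POST json.dumps(payload) to the server's /v1/chat/completions endpoint;
# the response's tool_calls field carries the model's chosen call and arguments.
print(json.dumps(payload, indent=2))
```

The response would then be fed back as a `tool` role message containing the command output, letting the model iterate over multiple steps.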
Hardware Configuration
| Quantization | Quality | Size (GB) |
|---|---|---|
| Q8_0 | High | 226.43 |
| Q8_K_XL | High | 243.43 |
| Q6_K | High | 174.87 |
| Q6_K_XL | High | 180.95 |
| Q5_K_M | Medium | 151.16 |
| Q5_K_S | Medium | 146.67 |
| Q5_K_XL | Medium | 150.96 |
| Q4_K_M | Medium | 128.84 |
| Q4_K_S | Medium | 121.10 |
| Q4_K_XL | Medium | 122.58 |
| Q4_0 | Medium | 120.61 |
| Q4_1 | Medium | 133.39 |
| Q3_K_M | Low | 101.77 |
| Q3_K_S | Low | 91.92 |
| Q3_K_XL | Low | 94.48 |
| Q2_K | Low | 77.58 |
| Q2_K_L | Low | 77.71 |
| Q2_K_XL | Low | 79.87 |
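To judge which quantization fits a given multi-GPU setup, a rough rule is that the weights plus an overhead margin for the KV cache and runtime buffers must fit in total VRAM. The sketch below uses a subset of the sizes from the table; the 15% overhead fraction is a hypothetical allowance, not a measured figure, and real headroom depends on context length and the inference runtime.

```python
# Weight sizes in GB for selected MiniMax M2 GGUF quants (from the table above).
QUANT_SIZES_GB = {
    "Q8_0": 226.43,
    "Q6_K": 174.87,
    "Q4_K_M": 128.84,
    "Q3_K_M": 101.77,
    "Q2_K": 77.58,
}

def fits(quant: str, total_vram_gb: float, overhead: float = 0.15) -> bool:
    """True if the quant's weights plus an overhead margin fit in total VRAM.

    The overhead fraction is an assumed allowance for KV cache and buffers;
    long contexts (toward the 192K window) need considerably more.
    """
    return QUANT_SIZES_GB[quant] * (1 + overhead) <= total_vram_gb

# Example: four 48 GB GPUs, 192 GB of pooled VRAM.
viable = [q for q in QUANT_SIZES_GB if fits(q, 192.0)]
print(viable)  # → ['Q4_K_M', 'Q3_K_M', 'Q2_K']
```

Under these assumptions a 192 GB rig lands on Q4_K_M as the highest-quality option, which matches the intuition that the Q4 tier is the sweet spot for four-GPU nodes.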
Last updated: March 5, 2026