MiniMax M2

Tags: Code · Thinking · Tool Calls

MiniMax M2 is a 228.7-billion-parameter Mixture-of-Experts model from MiniMax with 256 experts and 8 active per token, optimized for coding and agentic workflows. It uses interleaved chain-of-thought reasoning and ranks among the top open-source models for multi-step task execution and code generation. The model supports tool calling with strong performance across shell, browser, and code runner toolchains. With a 192K context window and flash attention it handles long-horizon tasks, and GGUF quantizations down to the Q2 level make self-hosted multi-GPU deployments practical.
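As a sketch of the tool-calling support described above, the snippet below builds an OpenAI-compatible chat request with a single shell tool. The model id "MiniMax-M2" and the `run_shell` tool are assumptions for illustration; adapt them to your serving stack (vLLM, llama.cpp server, etc.) and actual toolchain.

```python
import json

def build_request(user_msg: str) -> dict:
    """Assemble an OpenAI-style chat-completions payload with one tool."""
    return {
        "model": "MiniMax-M2",  # assumed model id; check your server's model list
        "messages": [{"role": "user", "content": user_msg}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "run_shell",  # hypothetical tool, for illustration only
                "description": "Execute a shell command and return its stdout.",
                "parameters": {
                    "type": "object",
                    "properties": {"command": {"type": "string"}},
                    "required": ["command"],
                },
            },
        }],
    }

# Serialize for an HTTP POST to a /v1/chat/completions endpoint.
payload = json.dumps(build_request("List the files in the current directory."))
print(payload[:50])
```

When the model decides to call the tool, the response carries a `tool_calls` entry whose arguments you execute and feed back as a `tool` role message.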

Hardware Configuration

Quantization   Quality   Size
Q8_0           High      226.43 GB
Q8_K_XL        High      243.43 GB
Q6_K           High      174.87 GB
Q6_K_XL        High      180.95 GB
Q5_K_M         Medium    151.16 GB
Q5_K_S         Medium    146.67 GB
Q5_K_XL        Medium    150.96 GB
Q4_K_M         Medium    128.84 GB
Q4_K_S         Medium    121.1 GB
Q4_K_XL        Medium    122.58 GB
Q4_0           Medium    120.61 GB
Q4_1           Medium    133.39 GB
Q3_K_M         Low       101.77 GB
Q3_K_S         Low       91.92 GB
Q3_K_XL        Low       94.48 GB
Q2_K           Low       77.58 GB
Q2_K_L         Low       77.71 GB
Q2_K_XL        Low       79.87 GB
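A minimal sketch of turning the size table above into a deployment choice: pick the largest quantization whose weight file fits a given memory budget. The sizes are the file sizes listed here; the 10% headroom for KV cache and runtime overhead is an assumption, not a measured figure.

```python
# GGUF file sizes in GB, from the quantization table above.
QUANT_SIZES_GB = {
    "Q8_0": 226.43, "Q8_K_XL": 243.43,
    "Q6_K": 174.87, "Q6_K_XL": 180.95,
    "Q5_K_M": 151.16, "Q5_K_S": 146.67, "Q5_K_XL": 150.96,
    "Q4_K_M": 128.84, "Q4_K_S": 121.10, "Q4_K_XL": 122.58,
    "Q4_0": 120.61, "Q4_1": 133.39,
    "Q3_K_M": 101.77, "Q3_K_S": 91.92, "Q3_K_XL": 94.48,
    "Q2_K": 77.58, "Q2_K_L": 77.71, "Q2_K_XL": 79.87,
}

def best_fit(total_memory_gb: float, headroom: float = 0.10):
    """Largest quantization whose weights fit after reserving headroom
    for KV cache and runtime overhead (headroom fraction is assumed)."""
    budget = total_memory_gb * (1.0 - headroom)
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items()
               if size <= budget]
    return max(fitting)[1] if fitting else None

# e.g. four 48 GB GPUs -> 192 GB total
print(best_fit(192))
```

Real deployments also spend memory on the KV cache, which grows with context length, so treat the headroom as a floor rather than a guarantee at the full 192K window.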
Last updated: March 5, 2026