MiniMax M2
MiniMax
Code · Thinking · Tool Calls
MiniMax M2 is a 228.7-billion-parameter Mixture-of-Experts model from MiniMax, with 256 experts of which 8 are active per token, optimized for coding and agentic workflows. It uses interleaved chain-of-thought reasoning and ranks among the top open-source models for multi-step task execution and code generation. The model supports tool calling, with strong performance across shell, browser, and code-runner toolchains. With a 192K context window and flash attention, it handles long-horizon tasks, and GGUF quantizations down to Q2 make self-hosted multi-GPU deployment feasible.
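As a sketch of how the tool-calling support is typically exercised, the snippet below builds an OpenAI-compatible chat-completions payload with a single tool definition. The endpoint, model name, and `run_shell` tool schema are illustrative assumptions, not documented values; a self-hosted server (e.g. llama.cpp or vLLM with an OpenAI-compatible API) would receive this as the JSON body of a `/v1/chat/completions` request.

```python
import json

# Hypothetical tool schema: a shell-command runner, the kind of toolchain
# the model is reported to perform well with.
tools = [{
    "type": "function",
    "function": {
        "name": "run_shell",  # illustrative tool name, not part of the model spec
        "description": "Execute a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

# Chat payload in the OpenAI-compatible format most self-hosted servers accept.
payload = {
    "model": "MiniMax-M2",  # assumed served model name
    "messages": [{"role": "user", "content": "List the files in the repo root."}],
    "tools": tools,
}

# POST json.dumps(payload) to the server's /v1/chat/completions endpoint;
# the response's tool_calls field carries the model's chosen call and arguments.
print(json.dumps(payload, indent=2))
```

The response would then be fed back as a `tool` role message containing the command output, letting the model iterate over multiple steps.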
Hardware Configuration
| Quantization | Quality | Size (GB) |
|---|---|---|
| Q8_0 | High | 226.43 |
| Q8_K_XL | High | 243.43 |
| Q6_K | High | 174.87 |
| Q6_K_XL | High | 180.95 |
| Q5_K_M | Medium | 151.16 |
| Q5_K_S | Medium | 146.67 |
| Q5_K_XL | Medium | 150.96 |
| Q4_K_M | Medium | 128.84 |
| Q4_K_S | Medium | 121.10 |
| Q4_K_XL | Medium | 122.58 |
| Q4_0 | Medium | 120.61 |
| Q4_1 | Medium | 133.39 |
| Q3_K_M | Low | 101.77 |
| Q3_K_S | Low | 91.92 |
| Q3_K_XL | Low | 94.48 |
| Q2_K | Low | 77.58 |
| Q2_K_L | Low | 77.71 |
| Q2_K_XL | Low | 79.87 |
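To judge which quantization fits a given multi-GPU setup, a rough rule is that the weights plus an overhead margin for the KV cache and runtime buffers must fit in total VRAM. The sketch below uses a subset of the sizes from the table; the 15% overhead fraction is a hypothetical allowance, not a measured figure, and real headroom depends on context length and the inference runtime.

```python
# Weight sizes in GB for selected MiniMax M2 GGUF quants (from the table above).
QUANT_SIZES_GB = {
    "Q8_0": 226.43,
    "Q6_K": 174.87,
    "Q4_K_M": 128.84,
    "Q3_K_M": 101.77,
    "Q2_K": 77.58,
}

def fits(quant: str, total_vram_gb: float, overhead: float = 0.15) -> bool:
    """True if the quant's weights plus an overhead margin fit in total VRAM.

    The overhead fraction is an assumed allowance for KV cache and buffers;
    long contexts (toward the 192K window) need considerably more.
    """
    return QUANT_SIZES_GB[quant] * (1 + overhead) <= total_vram_gb

# Example: four 48 GB GPUs, 192 GB of pooled VRAM.
viable = [q for q in QUANT_SIZES_GB if fits(q, 192.0)]
print(viable)  # → ['Q4_K_M', 'Q3_K_M', 'Q2_K']
```

Under these assumptions a 192 GB rig lands on Q4_K_M as the highest-quality option, which matches the intuition that the Q4 tier is the sweet spot for four-GPU nodes.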
Last updated: March 5, 2026