
Kimi K2 Thinking

Moonshot AI
Code, Thinking, Tool Calls

Kimi K2 Thinking is a roughly 1-trillion-parameter Mixture-of-Experts model from Moonshot AI, trained end-to-end for extended chain-of-thought reasoning with integrated tool use. (A total near 1 trillion is consistent with the 1016.07 GB Q8_0 file listed below; a 170-billion-parameter model would be several times smaller at that precision.) For each token it activates 8 of 384 routed experts plus 1 shared expert, so only a small fraction of the weights take part in any single forward pass. The model targets complex math, coding, and agentic benchmarks, and is designed to stay coherent across hundreds of consecutive tool invocations. It supports code generation, deep thinking, and tool calling in English and Chinese. With a 256K context window and flash attention, it is suited to multi-step agentic workflows that require sustained goal-directed reasoning.
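
The routing pattern described above (8 of 384 routed experts, plus 1 shared expert that always runs) can be illustrated with a small sketch. This is not Moonshot AI's implementation: the function name `route_token`, the random router scores, and the choice to softmax over only the selected experts are all assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS = 384  # routed experts, from the description above
TOP_K = 8          # routed experts activated per token


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Hypothetical helper: the real model's gating function is not described
    in the source, so a softmax over the selected logits is assumed here.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:top_k]
    peak = max(router_logits[i] for i in top)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in router scores for one token (random, for illustration only).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)

# The shared expert always runs, so 8 + 1 of 384 + 1 experts see each token.
active_fraction = (TOP_K + 1) / (NUM_EXPERTS + 1)
```

Under these assumptions, `selected` holds eight distinct expert indices whose weights sum to 1, and `active_fraction` shows that only about 2.3% of the experts are evaluated per token.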

Hardware Configuration

Quantization  Quality  Size (GB)
Q8_0          High     1016.07
Q8_K_XL       High     1108.35
Q6_K          High     785.01
Q6_K_XL       High     818.73
Q5_K_M        Medium   678.67
Q5_K_S        Medium   658.39
Q5_K_XL       Medium   679.68
Q4_K_M        Medium   578.6
Q4_K_S        Medium   543.23
Q4_K_XL       Medium   601.86
Q4_0          Medium   541.24
Q4_1          Medium   598.79
Q3_K_M        Low      456.33
Q3_K_S        Low      412.7
Q3_K_XL       Low      423.87
Q2_K          Low      348.4
Q2_K_L        Low      348.65
Q2_K_XL       Low      359.82
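
As a rough aid for reading the table above, the sketch below picks the largest quantization whose listed size fits within a memory budget. The sizes are copied from the table; the helper name `largest_fit` and the `headroom_gb` parameter (space reserved for KV cache and runtime buffers) are assumptions, and the listed sizes are treated as a lower bound on the memory actually needed.

```python
# (name, size in GB) pairs copied from the quantization table above.
QUANT_SIZES_GB = [
    ("Q8_0", 1016.07), ("Q8_K_XL", 1108.35), ("Q6_K", 785.01),
    ("Q6_K_XL", 818.73), ("Q5_K_M", 678.67), ("Q5_K_S", 658.39),
    ("Q5_K_XL", 679.68), ("Q4_K_M", 578.6), ("Q4_K_S", 543.23),
    ("Q4_K_XL", 601.86), ("Q4_0", 541.24), ("Q4_1", 598.79),
    ("Q3_K_M", 456.33), ("Q3_K_S", 412.7), ("Q3_K_XL", 423.87),
    ("Q2_K", 348.4), ("Q2_K_L", 348.65), ("Q2_K_XL", 359.82),
]


def largest_fit(budget_gb, headroom_gb=0.0):
    """Return the (name, size) of the largest quantization that fits.

    Hypothetical helper: headroom_gb is an assumed allowance for KV cache
    and runtime buffers, which the table does not account for.
    Returns None when nothing fits.
    """
    usable = budget_gb - headroom_gb
    candidates = [(name, size) for name, size in QUANT_SIZES_GB if size <= usable]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])


# Example: 640 GB of combined memory, with and without 50 GB of headroom.
print(largest_fit(640.0))
print(largest_fit(640.0, headroom_gb=50.0))
```

Selecting by size alone is a simplification: two quantizations of similar size can differ in quality, so the Quality column is still worth checking before settling on a choice.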
Last updated: March 5, 2026