
Kimi K2.5

Moonshot AI
Code, Thinking, Tool Calls, Vision

Kimi K2.5 is a 1,016.23-billion-parameter Mixture-of-Experts (MoE) model from Moonshot AI that combines native vision with agentic reasoning. Each token is routed to 8 of 384 experts plus 1 shared expert, which keeps per-token compute comparable to that of a 32B dense model. The model reaches frontier-level performance on coding, math, and multimodal benchmarks, and it supports code generation, extended thinking, tool calling, and image understanding in English and Chinese. Its 256K-token context window, together with flash attention, suits long-document analysis and multi-step agentic workflows with visual inputs.
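The routing figures above (8 of 384 routed experts, plus 1 shared expert) can be illustrated with a small sketch of generic top-k expert selection. This is only an illustration under stated assumptions: the function name `route_token`, the random router scores, and the softmax-over-selected-experts weighting are hypothetical choices for the example, not the model's documented gating function.

```python
import math
import random

NUM_ROUTED_EXPERTS = 384  # from the model description
TOP_K = 8                 # routed experts activated per token
NUM_SHARED_EXPERTS = 1    # shared expert that every token passes through


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are a softmax over the selected logits only. That convention is
    an assumption made for this sketch; real models may gate differently.
    """
    if len(router_logits) < top_k:
        raise ValueError("need at least top_k router logits")
    # Indices of the top_k largest logits.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical router scores for a single token (seeded for repeatability).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_ROUTED_EXPERTS)]

selected = route_token(logits)
active_fraction = TOP_K / NUM_ROUTED_EXPERTS
print(f"routed experts used: {len(selected)} of {NUM_ROUTED_EXPERTS} "
      f"({active_fraction:.2%}), plus {NUM_SHARED_EXPERTS} shared")
```

Only about 2% of the routed experts run for any given token, which is why per-token compute can stay close to that of a much smaller dense model even though all experts must be held in memory.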

Hardware Configuration

Quantization  Quality  Size
Q8_0          High     1016.23 GB
Q8_K_XL       High     1108.05 GB
Q6_K          High      785.01 GB
Q6_K_XL       High      817.42 GB
Q5_K_M        Medium    678.67 GB
Q5_K_S        Medium    658.39 GB
Q5_K_XL       Medium    681.05 GB
Q4_K_M        Medium    578.60 GB
Q4_K_S        Medium    543.22 GB
Q4_K_XL       Medium    579.29 GB
Q4_0          Medium    541.23 GB
Q4_1          Medium    598.79 GB
Q3_K_M        Low       456.13 GB
Q3_K_S        Low       412.70 GB
Q3_K_XL       Low       456.76 GB
Q2_K          Low       348.09 GB
Q2_K_L        Low       348.35 GB
Q2_K_XL       Low       349.01 GB
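
To check which quantization fits a given machine, the sizes listed above can be compared against a memory budget. The sketch below is a rough aid rather than a deployment recommendation: the helper name `largest_fit` and the `headroom_gb` parameter (space reserved for the KV cache, activations, and runtime overhead) are assumptions introduced for this example, and file size is used only as a crude proxy for quality.

```python
# Sizes in GB, copied from the quantization table above.
QUANT_SIZES_GB = {
    "Q8_0": 1016.23, "Q8_K_XL": 1108.05,
    "Q6_K": 785.01, "Q6_K_XL": 817.42,
    "Q5_K_M": 678.67, "Q5_K_S": 658.39, "Q5_K_XL": 681.05,
    "Q4_K_M": 578.60, "Q4_K_S": 543.22, "Q4_K_XL": 579.29,
    "Q4_0": 541.23, "Q4_1": 598.79,
    "Q3_K_M": 456.13, "Q3_K_S": 412.70, "Q3_K_XL": 456.76,
    "Q2_K": 348.09, "Q2_K_L": 348.35, "Q2_K_XL": 349.01,
}


def largest_fit(memory_gb, headroom_gb=0.0):
    """Return (name, size_gb) of the largest quantization within budget.

    headroom_gb is a hypothetical reserve for everything other than the
    weights. Returns None when no listed quantization fits.
    """
    budget = memory_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items()
               if size <= budget]
    if not fitting:
        return None
    size, name = max(fitting)  # largest file that still fits
    return name, size


# Example: 640 GB of combined memory with 40 GB held back as headroom.
print(largest_fit(640, headroom_gb=40))
# Example: a 256 GB machine cannot hold even the smallest listed file.
print(largest_fit(256))
```

A larger file does not always mean better output at the same bit width (for instance, K-quants versus the legacy Q4_1 format), so the result is best treated as an upper bound on what fits, not as a ranking.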
Last updated: March 5, 2026