Kimi K2.5
Moonshot AI
Code, Thinking, Tool Calls, Vision
Kimi K2.5 is a Mixture-of-Experts model from Moonshot AI with 1,016.23 billion (roughly 1 trillion) total parameters, combining native vision with agentic reasoning. Each token activates 8 of 384 routed experts plus 1 shared expert, which keeps per-token compute comparable to a 32B dense model while delivering frontier performance on coding, math, and multimodal benchmarks. The model supports code generation, extended thinking, tool calling, and image understanding in English and Chinese. With a 256K context window and flash attention, it handles long-document analysis and multi-step agentic workflows that include visual inputs.
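To make the "8 of 384 experts" figure concrete, here is a minimal sketch of top-k expert routing for a single token. The expert counts come from the description above; the function name `route_token`, the random router logits, and the softmax-over-selected-experts weighting are assumptions for illustration only, not the model's confirmed gating scheme.

```python
import math
import random

NUM_ROUTED_EXPERTS = 384  # from the model description
TOP_K = 8                 # routed experts activated per token
NUM_SHARED_EXPERTS = 1    # shared expert that runs for every token


def route_token(router_logits, k=TOP_K):
    """Return (expert_ids, weights) for the k highest-scoring routed experts.

    Weights are a softmax over the selected logits only. That normalization
    is an illustrative assumption, not the model's documented behavior.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]


# Hypothetical router scores for one token (random stand-ins for real logits).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_ROUTED_EXPERTS)]

expert_ids, weights = route_token(logits)
active_fraction = TOP_K / NUM_ROUTED_EXPERTS  # share of routed experts used per token

print(f"experts run per token: {len(expert_ids) + NUM_SHARED_EXPERTS}")
print(f"fraction of routed experts active: {active_fraction:.2%}")
```

Only about 2% of the routed experts run for any given token, which is why per-token compute stays far below what the total parameter count suggests. All expert weights still have to be resident in memory, so the quantized file sizes remain the limiting factor for deployment.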
| Quantization | Quality | Size |
|---|---|---|
| Q8_0 | High | 1016.23 GB |
| Q8_K_XL | High | 1108.05 GB |
| Q6_K | High | 785.01 GB |
| Q6_K_XL | High | 817.42 GB |
| Q5_K_M | Medium | 678.67 GB |
| Q5_K_S | Medium | 658.39 GB |
| Q5_K_XL | Medium | 681.05 GB |
| Q4_K_M | Medium | 578.60 GB |
| Q4_K_S | Medium | 543.22 GB |
| Q4_K_XL | Medium | 579.29 GB |
| Q4_0 | Medium | 541.23 GB |
| Q4_1 | Medium | 598.79 GB |
| Q3_K_M | Low | 456.13 GB |
| Q3_K_S | Low | 412.70 GB |
| Q3_K_XL | Low | 456.76 GB |
| Q2_K | Low | 348.09 GB |
| Q2_K_L | Low | 348.35 GB |
| Q2_K_XL | Low | 349.01 GB |
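As a rough way to read the table against a given machine, the sketch below picks the largest quantization whose file size fits a memory budget. The sizes are copied from the table; the helper name `largest_fitting` and the `headroom_gb` parameter are hypothetical, and the check compares file size only. Real deployments also need memory for the KV cache, activations, and runtime overhead, so treat the result as an upper bound rather than a deployment recommendation.

```python
# File sizes in GB, copied from the quantization table.
QUANT_SIZES_GB = {
    "Q8_K_XL": 1108.05, "Q8_0": 1016.23,
    "Q6_K_XL": 817.42, "Q6_K": 785.01,
    "Q5_K_XL": 681.05, "Q5_K_M": 678.67, "Q5_K_S": 658.39,
    "Q4_1": 598.79, "Q4_K_XL": 579.29, "Q4_K_M": 578.60,
    "Q4_K_S": 543.22, "Q4_0": 541.23,
    "Q3_K_XL": 456.76, "Q3_K_M": 456.13, "Q3_K_S": 412.70,
    "Q2_K_XL": 349.01, "Q2_K_L": 348.35, "Q2_K": 348.09,
}


def largest_fitting(available_gb, headroom_gb=0.0):
    """Return the name of the largest quantization that fits, or None.

    headroom_gb is memory reserved for everything besides the weights
    (KV cache, activations, runtime); its value is the caller's estimate.
    """
    budget = available_gb - headroom_gb
    candidates = [(size, name) for name, size in QUANT_SIZES_GB.items()
                  if size <= budget]
    if not candidates:
        return None
    return max(candidates)[1]  # largest file size that fits the budget


print(largest_fitting(640))                  # Q4_1
print(largest_fitting(640, headroom_gb=64))  # Q4_K_S
print(largest_fitting(300))                  # None
```

Picking by file size favors the highest-precision variant that fits. Within a size class, the quality label in the table is a better tiebreaker than a difference of a few gigabytes.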
Last updated: March 5, 2026