Kimi K2.5

Code, Thinking, Tool Calls, Vision

Kimi K2.5 is a Mixture-of-Experts (MoE) model from Moonshot AI with 1,016.23 billion total parameters, combining native vision with agentic reasoning. For each token it activates 8 of 384 routed experts plus 1 shared expert, achieving frontier performance on coding, math, and multimodal benchmarks while keeping per-token compute roughly equivalent to that of a 32B dense model. The model supports code generation, extended thinking, tool calling, and image understanding in English and Chinese. Its 256K-token context window, together with flash attention, lets it handle long-document analysis and multi-step agentic workflows that include visual inputs.
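
The sparse activation described above can be illustrated with a small routing sketch. Only the counts (384 experts, 8 active, 1 shared) come from the description. The function names, the use of a softmax over the selected logits, and the idea of counting "expert blocks" are assumptions made for illustration and may not match how the model actually gates or normalizes its experts.

```python
import math

NUM_ROUTED_EXPERTS = 384  # from the model description
TOP_K = 8                 # routed experts activated per token
NUM_SHARED_EXPERTS = 1    # shared expert that runs for every token


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k routed experts.

    Weights are a softmax over the selected logits only. This normalization
    is an illustrative assumption, not a documented detail of the model.
    """
    if len(router_logits) < top_k:
        raise ValueError("need at least top_k router logits")
    # Indices of the top_k largest logits, highest first.
    chosen = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:top_k]
    # Numerically stable softmax restricted to the chosen experts.
    max_logit = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - max_logit) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def active_expert_fraction():
    """Fraction of expert blocks (routed plus shared) that run for one token."""
    active = TOP_K + NUM_SHARED_EXPERTS
    available = NUM_ROUTED_EXPERTS + NUM_SHARED_EXPERTS
    return active / available
```

Under these assumptions only 9 of 385 expert blocks run per token, which is the intuition behind per-token compute being far below what the total parameter count suggests.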

Hardware Configuration

Quantization	Quality	Size
Q8_0	High	1016.23 GB
Q8_K_XL	High	1108.05 GB
Q6_K	High	785.01 GB
Q6_K_XL	High	817.42 GB
Q5_K_M	Medium	678.67 GB
Q5_K_S	Medium	658.39 GB
Q5_K_XL	Medium	681.05 GB
Q4_K_M	Medium	578.60 GB
Q4_K_S	Medium	543.22 GB
Q4_K_XL	Medium	579.29 GB
Q4_0	Medium	541.23 GB
Q4_1	Medium	598.79 GB
Q3_K_M	Low	456.13 GB
Q3_K_S	Low	412.70 GB
Q3_K_XL	Low	456.76 GB
Q2_K	Low	348.09 GB
Q2_K_L	Low	348.35 GB
Q2_K_XL	Low	349.01 GB
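
As a rough aid for reading the table, the sketch below lists which quantizations fall within a given memory budget. The sizes are copied from the table. The function name, the choice to add VRAM and system RAM into one budget, and the `headroom_gb` parameter are hypothetical. File size is only a lower bound on memory use, since the KV cache and activations need additional space, and splitting a model across VRAM and system RAM assumes a runtime that supports offloading.

```python
# File sizes in GB, copied from the quantization table.
QUANT_SIZES_GB = {
    "Q8_0": 1016.23, "Q8_K_XL": 1108.05,
    "Q6_K": 785.01, "Q6_K_XL": 817.42,
    "Q5_K_M": 678.67, "Q5_K_S": 658.39, "Q5_K_XL": 681.05,
    "Q4_K_M": 578.60, "Q4_K_S": 543.22, "Q4_K_XL": 579.29,
    "Q4_0": 541.23, "Q4_1": 598.79,
    "Q3_K_M": 456.13, "Q3_K_S": 412.70, "Q3_K_XL": 456.76,
    "Q2_K": 348.09, "Q2_K_L": 348.35, "Q2_K_XL": 349.01,
}


def quantizations_that_fit(vram_gb, system_ram_gb=0.0, headroom_gb=0.0):
    """Return (name, size_gb) pairs that fit the budget, largest first.

    The budget is vram_gb + system_ram_gb - headroom_gb. headroom_gb is a
    hypothetical allowance for KV cache and runtime overhead; pick a value
    that matches your own context length and runtime.
    """
    budget = vram_gb + system_ram_gb - headroom_gb
    fits = [(name, size) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    return sorted(fits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Example: 640 GB of VRAM, no system RAM offload, no headroom.
    for name, size in quantizations_that_fit(vram_gb=640):
        print(f"{name}\t{size:.2f} GB")
```

With a 640 GB budget and no headroom, the largest entry returned is Q4_1 (598.79 GB). A budget below 348.09 GB returns an empty list, meaning none of the listed files fit.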

Last updated: April 29, 2026