Kimi K2 Instruct

Code, Tool Calls

Kimi K2 Instruct is a Mixture-of-Experts model from Moonshot AI with 1,026.47 billion (roughly 1 trillion) total parameters, fine-tuned for instruction following, code generation, and autonomous tool use. Each token is routed to 8 of 384 experts plus 1 shared expert, so per-token compute stays close to that of a 32B dense model while the model still performs strongly on coding and agentic benchmarks. It was trained on 15.5 trillion tokens with the MuonClip optimizer and supports code generation and tool calling in English and Chinese. With a 128K context window and flash attention, it is suited to agentic deployments that require reliable tool orchestration.
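The routing described above (8 of 384 experts per token, plus 1 shared expert that always runs) can be illustrated with a small top-k selection sketch. This is not Moonshot AI's implementation: the function name `route_token`, the random router scores, and the softmax normalization over the selected experts are all assumptions made for illustration, since the page does not describe the gating function.

```python
import heapq
import math
import random

NUM_EXPERTS = 384  # routed experts, from the description above
TOP_K = 8          # routed experts activated per token


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k highest-scoring experts and normalize their weights.

    Returns a list of (expert_index, weight) pairs, highest score first.
    Softmax over the selected logits is an assumption for this sketch.
    """
    if len(router_logits) < top_k:
        raise ValueError("need at least top_k router logits")
    # nlargest returns the pairs sorted by score, descending.
    top = heapq.nlargest(top_k, enumerate(router_logits), key=lambda p: p[1])
    max_logit = top[0][1]  # subtract the max for numerical stability
    exps = [math.exp(logit - max_logit) for _, logit in top]
    total = sum(exps)
    return [(idx, e / total) for (idx, _), e in zip(top, exps)]


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical router scores for a single token.
    logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    selected = route_token(logits)
    print("routed experts chosen:", [idx for idx, _ in selected])
    print(f"share of routed experts used per token: {TOP_K / NUM_EXPERTS:.2%}")
```

Only about 2% of the routed experts run for any one token, which is why per-token compute is far below what the total parameter count suggests.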

Quantization	Quality	Size
Q8_0	High	1016.15 GB
Q8_K_XL	High	1108.35 GB
Q6_K	High	784.82 GB
Q6_K_XL	High	818.73 GB
Q5_K_M	Medium	678.37 GB
Q5_K_S	Medium	658.07 GB
Q5_K_XL	Medium	680.38 GB
Q4_K_M	Medium	578.14 GB
Q4_K_S	Medium	542.74 GB
Q4_K_XL	Medium	546.77 GB
Q4_0	Medium	540.76 GB
Q4_1	Medium	598.41 GB
Q3_K_M	Low	455.73 GB
Q3_K_S	Low	412.01 GB
Q3_K_XL	Low	421.03 GB
Q2_K	Low	347.56 GB
Q2_K_L	Low	347.82 GB
Q2_K_XL	Low	355.65 GB
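To relate the sizes in the table to a specific machine, the sketch below estimates the average bits stored per parameter and checks whether a quantization fits in a given amount of VRAM and system RAM. The fit rule, the function names, and the `overhead_gb` parameter are hypothetical and are not the rule this page uses. The sketch also assumes decimal gigabytes (10^9 bytes); if the listed sizes are binary GiB, the bits-per-weight figures would be about 7% higher.

```python
PARAMS_BILLION = 1026.47  # total parameters, from the description above

# A few (quantization, size in GB) rows copied from the table above.
QUANT_SIZES_GB = {
    "Q8_0": 1016.15,
    "Q6_K": 784.82,
    "Q5_K_M": 678.37,
    "Q4_K_M": 578.14,
    "Q3_K_M": 455.73,
    "Q2_K": 347.56,
}


def bits_per_weight(size_gb, params_billion=PARAMS_BILLION, bytes_per_gb=1e9):
    """Average bits per parameter for a file of the given size."""
    return size_gb * bytes_per_gb * 8 / (params_billion * 1e9)


def fit(size_gb, vram_gb, ram_gb=0.0, overhead_gb=0.0):
    """Hypothetical fit rule: weights plus overhead (KV cache, buffers)
    must fit in VRAM alone, or in VRAM plus system RAM with offloading."""
    needed = size_gb + overhead_gb
    if needed <= vram_gb:
        return "fits in VRAM"
    if needed <= vram_gb + ram_gb:
        return "fits with CPU offload"
    return "does not fit"


def largest_fitting(vram_gb, ram_gb=0.0, overhead_gb=0.0):
    """Name of the largest listed quantization that fits, or None."""
    candidates = [
        (size, name)
        for name, size in QUANT_SIZES_GB.items()
        if fit(size, vram_gb, ram_gb, overhead_gb) != "does not fit"
    ]
    return max(candidates)[1] if candidates else None


if __name__ == "__main__":
    for name, size in QUANT_SIZES_GB.items():
        print(f"{name}: {bits_per_weight(size):.2f} bits/weight")
    # Example machine: 8 GPUs with 80 GB each and no CPU offload.
    print(largest_fitting(vram_gb=8 * 80))
```

Even the smallest listed quantization needs roughly 350 GB of combined memory, so most single-GPU setups can only run this model with substantial CPU offload.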

Last updated: April 29, 2026