
Qwen2.5 72B Instruct

Qwen
Code Multilingual Tool Calls

Qwen2.5 72B Instruct is a 72.71-billion-parameter dense transformer from Alibaba's Qwen team, fine-tuned for instruction following, code generation, and multilingual tasks. It competes with other leading instruct models in the 70B class and supports 14 languages, including English, Chinese, Arabic, and Japanese. The model provides native tool calling and structured output. Grouped-query attention keeps the key-value cache compact across its 32K context window, and quantized builds bring the weights within reach of self-hosted inference on high-end consumer or server-class GPU configurations.
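As a rough illustration of why grouped-query attention matters at a 32K context, the sketch below estimates the key-value cache size. The layer count, head counts, and head dimension are assumptions taken from the publicly released model configuration, not from this page, and the cache is assumed to be stored in FP16; treat the result as a back-of-the-envelope figure rather than a measured one.

```python
# Assumed architecture values (from the published model config, not stated on this page).
NUM_LAYERS = 80
NUM_QUERY_HEADS = 64
NUM_KV_HEADS = 8       # grouped-query attention: 8 KV heads shared by 64 query heads
HEAD_DIM = 128
BYTES_PER_VALUE = 2    # assumption: FP16 cache entries

GIB = 1024 ** 3


def kv_cache_bytes(tokens: int, kv_heads: int = NUM_KV_HEADS) -> int:
    """Bytes needed to cache keys and values for `tokens` positions."""
    # The leading factor of 2 accounts for storing both a key and a value.
    return 2 * NUM_LAYERS * kv_heads * HEAD_DIM * BYTES_PER_VALUE * tokens


context = 32 * 1024
gqa_gib = kv_cache_bytes(context) / GIB
mha_gib = kv_cache_bytes(context, kv_heads=NUM_QUERY_HEADS) / GIB

print(f"GQA cache at 32K tokens: {gqa_gib:.1f} GiB")
print(f"Hypothetical full multi-head cache: {mha_gib:.1f} GiB")
```

Under these assumptions the cache for a single 32K-token sequence is about one eighth of what the same model would need with one KV head per query head. This memory is needed in addition to the model weights.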

Hardware Configuration

Quantization    Quality           Size
FP16            Full precision    135.84 GB
Q8_0            High              72.27 GB
Q6_K            High              55.76 GB
Q5_K_M          Medium            48.1 GB
Q4_K_M          Medium            40.97 GB
Q4_0            Medium            38.51 GB
Q3_K_M          Low               33.02 GB
Q2_K            Low               25.45 GB
Q5_0            Low               46.89 GB
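The sizes above can be turned into a quick fit check against a memory budget. The following is a minimal sketch, not the page's own recommendation logic: the function name and the fixed `overhead_gb` allowance for the KV cache and runtime buffers are hypothetical, and the real overhead depends on context length, batch size, and inference engine.

```python
# Weight sizes copied from the table above, in GB.
QUANT_SIZES_GB = {
    "FP16": 135.84,
    "Q8_0": 72.27,
    "Q6_K": 55.76,
    "Q5_K_M": 48.1,
    "Q4_K_M": 40.97,
    "Q4_0": 38.51,
    "Q3_K_M": 33.02,
    "Q2_K": 25.45,
    "Q5_0": 46.89,
}


def quantizations_that_fit(vram_gb: float, overhead_gb: float = 4.0) -> list[str]:
    """Return quantizations whose weights fit in the budget, largest first.

    `overhead_gb` is a hypothetical flat allowance for KV cache and buffers.
    """
    budget = vram_gb - overhead_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    return [name for _size, name in sorted(fitting, reverse=True)]


# Example: two 24 GB GPUs pooled into a 48 GB budget.
print(quantizations_that_fit(48.0))
# -> ['Q4_K_M', 'Q4_0', 'Q3_K_M', 'Q2_K']
```

Sorting largest first puts the highest-fidelity option that fits at the front of the list, which is usually the one to try first.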
Last updated: March 5, 2026