Qwen2.5 72B Instruct
Qwen
Code · Multilingual · Tool Calls
Qwen2.5 72B Instruct is a 72.71-billion-parameter dense transformer from Alibaba's Qwen team, fine-tuned for instruction following, code generation, and multilingual tasks. It competes with other leading 70B-class instruct models and supports more than 29 languages, including English, Chinese, Arabic, and Japanese. The model provides native tool calling and structured output. Grouped-query attention keeps the KV cache small across its 32K context window, and the weights quantize well enough for self-hosted inference on high-end consumer or server-class GPU configurations.
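To make the memory effect of grouped-query attention concrete, here is a rough sketch of the KV cache size at the full 32K context. The architecture values (80 layers, 64 query heads, 8 key/value heads, head dimension 128) are assumptions taken from the published model configuration, not from this page, and the cache is assumed to be stored in FP16.

```python
# Rough KV cache estimate. Architecture values are assumed from the
# published Qwen2.5-72B configuration; they are not stated on this page.
NUM_LAYERS = 80
NUM_QUERY_HEADS = 64
NUM_KV_HEADS = 8       # grouped-query attention: 8 query heads share one KV head
HEAD_DIM = 128
BYTES_PER_VALUE = 2    # assumes an FP16 cache

GIB = 1024 ** 3


def kv_cache_bytes(context_tokens: int, kv_heads: int = NUM_KV_HEADS) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2 covers the separate key and value tensors in each layer.
    per_token = 2 * NUM_LAYERS * kv_heads * HEAD_DIM * BYTES_PER_VALUE
    return context_tokens * per_token


context = 32 * 1024
gqa_gib = kv_cache_bytes(context) / GIB
mha_gib = kv_cache_bytes(context, kv_heads=NUM_QUERY_HEADS) / GIB
print(f"GQA cache at 32K tokens: {gqa_gib:.1f} GiB")
print(f"Same model with one KV head per query head: {mha_gib:.1f} GiB")
```

Under these assumptions the cache for a single 32K-token sequence comes to about 10 GiB, compared with about 80 GiB if every query head had its own key/value head. This memory is needed in addition to the weights.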
| Quantization | Quality | Size |
|---|---|---|
| FP16 | Full precision | 135.84 GB |
| Q8_0 | High | 72.27 GB |
| Q6_K | High | 55.76 GB |
| Q5_K_M | Medium | 48.1 GB |
| Q5_0 | Medium | 46.89 GB |
| Q4_K_M | Medium | 40.97 GB |
| Q4_0 | Medium | 38.51 GB |
| Q3_K_M | Low | 33.02 GB |
| Q2_K | Low | 25.45 GB |
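As a minimal sketch of how the sizes in the table can guide a deployment choice, the snippet below picks the largest quantization whose weights fit a given memory budget. The function name and the `overhead_gb` parameter are hypothetical; the overhead stands in for KV cache and runtime buffers, which depend on context length and the inference engine. File size is used here as a rough proxy for quality.

```python
from typing import Optional

# Weight sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "FP16": 135.84,
    "Q8_0": 72.27,
    "Q6_K": 55.76,
    "Q5_K_M": 48.1,
    "Q5_0": 46.89,
    "Q4_K_M": 40.97,
    "Q4_0": 38.51,
    "Q3_K_M": 33.02,
    "Q2_K": 25.45,
}


def largest_fitting_quant(vram_gb: float, overhead_gb: float = 0.0) -> Optional[str]:
    """Return the largest quantization that fits, or None if none does.

    overhead_gb is an assumed reserve for KV cache and runtime buffers.
    """
    budget = vram_gb - overhead_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    # Larger files keep more bits per weight, so prefer the largest that fits.
    return max(fitting, key=fitting.get)


print(largest_fitting_quant(48, overhead_gb=6))  # two 24 GB cards, 6 GB reserved
print(largest_fitting_quant(80, overhead_gb=6))  # one 80 GB card, 6 GB reserved
print(largest_fitting_quant(24))                 # a single 24 GB card
```

With a 6 GB reserve, a 48 GB budget selects Q4_K_M and an 80 GB budget selects Q8_0. A single 24 GB card returns `None`, since even Q2_K needs 25.45 GB for the weights alone.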
Last updated: March 5, 2026