Qwen3 Next 80B A3B Instruct
Qwen
Code, Multilingual, Tool Calls
Qwen3 Next 80B A3B Instruct is a Mixture-of-Experts model from Alibaba's Qwen team with 81.32 billion total parameters, fine-tuned for instruction following and tool-use workflows. Each token is routed to 10 of 512 experts, so only around 3 billion parameters are active per token, which lets the model match the performance of much larger models at far lower compute cost per token. It supports code generation, tool calling, and 13 languages including English and Chinese. It has a 262K-token context window with flash attention, so it processes long documents natively. It also quantizes well to GGUF for self-hosted inference on consumer-grade multi-GPU setups.
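To make the "10 of 512 experts" figure concrete, here is a minimal sketch of generic top-k expert routing. It is an illustration under stated assumptions, not the model's actual router: the function name `route_token`, the random stand-in logits, and the choice to renormalize weights over the selected experts are all hypothetical.

```python
import math
import random

NUM_EXPERTS = 512  # total experts, from the description above
TOP_K = 10         # experts activated per token, from the description above


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    This is generic top-k gating, assumed here purely for illustration.
    """
    # Indices of the top_k largest logits.
    chosen = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:top_k]
    # Numerically stable softmax restricted to the chosen experts.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


rng = random.Random(0)
# Stand-in router scores; a real router computes these from the token's hidden state.
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)
active_fraction = TOP_K / NUM_EXPERTS
print(len(selected), f"{active_fraction:.2%}")
```

Only the selected experts' feed-forward weights are used for that token, which is why the active parameter count (around 3 billion) is so much smaller than the 81.32 billion total.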
| Quantization | Quality | Size |
|---|---|---|
| Q8_0 | High | 78.99 GB |
| Q8_K_XL | High | 86.69 GB |
| Q6_K | High | 61.04 GB |
| Q6_K_XL | High | 63.81 GB |
| Q5_K_M | Medium | 52.91 GB |
| Q5_K_S | Medium | 51.24 GB |
| Q5_K_XL | Medium | 52.77 GB |
| Q4_K_M | Medium | 45.17 GB |
| Q4_K_S | Medium | 42.38 GB |
| Q4_K_XL | Medium | 42.90 GB |
| Q4_0 | Medium | 42.20 GB |
| Q4_1 | Medium | 46.61 GB |
| Q3_K_M | Low | 35.67 GB |
| Q3_K_S | Low | 32.21 GB |
| Q3_K_XL | Low | 33.19 GB |
| Q2_K | Low | 27.17 GB |
| Q2_K_L | Low | 27.24 GB |
| Q2_K_XL | Low | 28.06 GB |
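The sizes in the table can be checked against a memory budget to shortlist quantizations for a given machine. The sketch below is a rough filter only: the helper name `fitting_quants` and the default `headroom_gb` reserve (for KV cache, activations, and runtime buffers) are assumptions, and real headroom depends on context length and the inference runtime.

```python
# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q8_0": 78.99, "Q8_K_XL": 86.69,
    "Q6_K": 61.04, "Q6_K_XL": 63.81,
    "Q5_K_M": 52.91, "Q5_K_S": 51.24, "Q5_K_XL": 52.77,
    "Q4_K_M": 45.17, "Q4_K_S": 42.38, "Q4_K_XL": 42.9,
    "Q4_0": 42.2, "Q4_1": 46.61,
    "Q3_K_M": 35.67, "Q3_K_S": 32.21, "Q3_K_XL": 33.19,
    "Q2_K": 27.17, "Q2_K_L": 27.24, "Q2_K_XL": 28.06,
}


def fitting_quants(total_memory_gb, headroom_gb=4.0):
    """List (name, size_gb) pairs whose weights fit in the budget, largest first.

    headroom_gb is an assumed reserve for everything other than the weights;
    tune it for your context length and runtime.
    """
    budget = total_memory_gb - headroom_gb
    fits = [(name, size) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    return sorted(fits, key=lambda pair: pair[1], reverse=True)


# Example: two 24 GB GPUs, treated as a single 48 GB pool.
for name, size in fitting_quants(48.0)[:3]:
    print(f"{name}: {size:.2f} GB")
```

Sorting largest first puts the least aggressively quantized option that still fits at the top of the list. Treating several GPUs as one pool is itself a simplification, since layer splits rarely divide memory perfectly.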
Last updated: March 5, 2026