Qwen3 Next 80B A3B Thinking
Qwen
Tags: Code, Multilingual, Thinking, Tool Calls
Qwen3 Next 80B A3B Thinking is a reasoning-focused Mixture-of-Experts model from Alibaba's Qwen team with 81.32 billion total parameters, optimized for chain-of-thought inference on complex math, logic, and coding tasks. Only about 3 billion parameters are active per token, with 10 of the 512 experts routed per layer, so the model achieves strong reasoning performance at a fraction of the compute cost of dense alternatives. It supports code generation, tool calling, and 13 languages, including English and Chinese. With a 262K-token context window and flash attention, it handles long reasoning traces natively and quantizes well to GGUF for self-hosted deployment.
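As a concrete illustration of how a self-hosted GGUF build is typically queried, the sketch below sends a chat request through an OpenAI-compatible endpoint such as the one llama-server or vLLM exposes. The base URL, API key, and served model name are assumptions for illustration, not values from this page.

```python
# Minimal sketch: querying a locally served Qwen3 Next 80B A3B Thinking
# build through an OpenAI-compatible chat endpoint. The base_url, api_key,
# and model name are assumptions; match them to however you actually serve
# the GGUF (e.g. llama-server, vLLM, or similar).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical local server address
    api_key="not-needed-for-local",       # placeholder; local servers often ignore it
)

response = client.chat.completions.create(
    model="qwen3-next-80b-a3b-thinking",  # hypothetical served model name
    messages=[
        {"role": "user", "content": "Prove that the sum of two odd integers is even."}
    ],
    max_tokens=4096,  # thinking models emit long reasoning traces; leave headroom
)

# Thinking variants interleave a reasoning trace with the final answer.
# Depending on the server, the trace may arrive inside <think>...</think>
# tags in the content or in a separate reasoning field.
print(response.choices[0].message.content)
```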
Hardware Configuration
| Quantization | Quality | File Size |
|---|---|---|
| Q8_0 | High | 78.99 GB |
| Q8_K_XL | High | 86.69 GB |
| Q6_K | High | 61.04 GB |
| Q6_K_XL | High | 63.81 GB |
| Q5_K_M | Medium | 52.91 GB |
| Q5_K_S | Medium | 51.24 GB |
| Q5_K_XL | Medium | 52.77 GB |
| Q4_K_M | Medium | 45.17 GB |
| Q4_K_S | Medium | 42.38 GB |
| Q4_K_XL | Medium | 42.78 GB |
| Q4_0 | Medium | 42.2 GB |
| Q4_1 | Medium | 46.61 GB |
| Q3_K_M | Low | 35.67 GB |
| Q3_K_S | Low | 32.21 GB |
| Q3_K_XL | Low | 33.06 GB |
| Q2_K | Low | 27.17 GB |
| Q2_K_L | Low | 27.24 GB |
| Q2_K_XL | Low | 28.06 GB |
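To make the table actionable, here is a small helper that picks the highest-quality quantization fitting a given memory budget. The sizes are copied from the table above; reserving roughly 15% of the budget for KV cache and runtime overhead is an assumed rule of thumb, not guidance from this page.

```python
# Sketch: choose the largest (highest-quality) quantization that fits a
# given memory budget. File sizes in GB come from the table above; the 15%
# headroom for KV cache and runtime overhead is an assumed rule of thumb.
QUANT_SIZES_GB = {
    "Q8_K_XL": 86.69, "Q8_0": 78.99, "Q6_K_XL": 63.81, "Q6_K": 61.04,
    "Q5_K_M": 52.91, "Q5_K_XL": 52.77, "Q5_K_S": 51.24, "Q4_1": 46.61,
    "Q4_K_M": 45.17, "Q4_K_XL": 42.78, "Q4_K_S": 42.38, "Q4_0": 42.20,
    "Q3_K_M": 35.67, "Q3_K_XL": 33.06, "Q3_K_S": 32.21, "Q2_K_XL": 28.06,
    "Q2_K_L": 27.24, "Q2_K": 27.17,
}

def pick_quant(memory_gb: float, headroom: float = 0.15) -> str | None:
    """Return the largest quantization whose file fits in memory_gb
    after reserving `headroom` of the budget for KV cache and overhead."""
    budget = memory_gb * (1.0 - headroom)
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= budget}
    if not fitting:
        return None  # nothing fits; even Q2_K needs ~27 GB plus overhead
    return max(fitting, key=fitting.get)

# Example: a 64 GB budget lands on Q5_K_M; 48 GB falls back to Q3_K_M.
print(pick_quant(64.0))  # -> Q5_K_M
print(pick_quant(48.0))  # -> Q3_K_M
```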
Last updated: March 5, 2026