
Qwen3 Next 80B A3B Thinking

Qwen · Code · Multilingual · Thinking · Tool Calls

Qwen3 Next 80B A3B Thinking is a reasoning-focused Mixture-of-Experts model from Alibaba's Qwen team with 81.32 billion total parameters, optimized for chain-of-thought inference on complex math, logic, and coding tasks. Only about 3 billion parameters (10 of 512 experts) are active per token, giving strong reasoning performance at a fraction of the compute cost of dense alternatives. The model supports code generation, tool calling, and 13 languages, including English and Chinese. With a 262K-token context window and flash attention, it handles long reasoning traces natively, and it quantizes well to GGUF for self-hosted deployment.
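The sparse-activation arithmetic above can be checked with a quick sketch. The figures come from the description; the rough rule that per-token compute scales with active parameters is a simplifying assumption, not an exact cost model:

```python
# Sparse-activation arithmetic for Qwen3 Next 80B A3B Thinking.
# Figures are from the model description; per-token FLOPs for a
# transformer scale roughly with the number of ACTIVE parameters,
# so the ratio below approximates the compute savings vs. a dense
# model of the same total size.

TOTAL_PARAMS = 81.32e9   # total parameters
ACTIVE_PARAMS = 3e9      # ~parameters active per token (10 of 512 experts)

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
compute_savings = TOTAL_PARAMS / ACTIVE_PARAMS

print(f"Active fraction per token: {active_fraction:.1%}")        # ~3.7%
print(f"Approx. compute savings vs. dense: ~{compute_savings:.0f}x")
```

Note that memory requirements still track total parameters (all experts must be resident), which is why the quantized file sizes below are far larger than the 3B active footprint would suggest.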

Hardware Configuration

Quantization   Quality   Size
Q8_0           High      78.99 GB
Q8_K_XL        High      86.69 GB
Q6_K           High      61.04 GB
Q6_K_XL        High      63.81 GB
Q5_K_M         Medium    52.91 GB
Q5_K_S         Medium    51.24 GB
Q5_K_XL        Medium    52.77 GB
Q4_K_M         Medium    45.17 GB
Q4_K_S         Medium    42.38 GB
Q4_K_XL        Medium    42.78 GB
Q4_0           Medium    42.20 GB
Q4_1           Medium    46.61 GB
Q3_K_M         Low       35.67 GB
Q3_K_S         Low       32.21 GB
Q3_K_XL        Low       33.06 GB
Q2_K           Low       27.17 GB
Q2_K_L         Low       27.24 GB
Q2_K_XL        Low       28.06 GB
Last updated: March 5, 2026