Qwen3 Next 80B A3B Instruct

Code, Multilingual, Tool Calls

Qwen3 Next 80B A3B Instruct is a Mixture-of-Experts model from Alibaba's Qwen team with 81.32 billion total parameters, fine-tuned for instruction following and tool-use workflows. Each token is routed to 10 of 512 experts, so only around 3 billion parameters are active per token. This lets the model match the performance of much larger models at a dramatically lower compute cost. It supports code generation, tool calling, and 13 languages, including English and Chinese. Its 262K context window and flash attention let it process long documents natively. It also quantizes well to GGUF for self-hosted inference on consumer-grade multi-GPU setups.
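
To make the "10 of 512 experts" figure concrete, here is a minimal sketch of top-k expert routing for a single token. It is a toy illustration only: the random router scores, the `route_token` helper, and the softmax over the selected scores are assumptions made for the example, not confirmed details of how this model's router works.

```python
import math
import random

NUM_EXPERTS = 512  # total experts, from the description above
TOP_K = 10         # experts activated per token, from the description above


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    The weights are a softmax over the selected logits only. That
    normalisation is an assumption for illustration, not a confirmed
    detail of the real model.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:top_k]
    max_logit = max(router_logits[i] for i in top)  # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - max_logit) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical router scores for one token (random, seeded for repeatability).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_token(logits)
active_expert_fraction = TOP_K / NUM_EXPERTS

print(f"experts used for this token: {len(selected)} of {NUM_EXPERTS}")
print(f"share of experts active: {active_expert_fraction:.2%}")
```

Only the selected experts run for the token, which is why per-token compute tracks the roughly 3 billion active parameters rather than the full 81.32 billion.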

Hardware Configuration

Quantization	Quality	Size	Fit
Q8_0	High	78.99 GB	—
Q8_K_XL	High	86.69 GB	—
Q6_K	High	61.04 GB	—
Q6_K_XL	High	63.81 GB	—
Q5_K_M	Medium	52.91 GB	—
Q5_K_S	Medium	51.24 GB	—
Q5_K_XL	Medium	52.77 GB	—
Q4_K_M	Medium	45.17 GB	—
Q4_K_S	Medium	42.38 GB	—
Q4_K_XL	Medium	42.90 GB	—
Q4_0	Medium	42.2 GB	—
Q4_1	Medium	46.61 GB	—
Q3_K_M	Low	35.67 GB	—
Q3_K_S	Low	32.21 GB	—
Q3_K_XL	Low	33.19 GB	—
Q2_K	Low	27.17 GB	—
Q2_K_L	Low	27.24 GB	—
Q2_K_XL	Low	28.06 GB	—

Last updated: April 29, 2026