Qwen3 8B

Qwen
Code Multilingual Thinking Tool Calls

Qwen3 8B is an 8-billion-parameter dense transformer from Alibaba's Qwen team with built-in thinking (chain-of-thought) capabilities alongside code generation, tool calling, and multilingual support. It improves on Qwen2.5 in reasoning while staying small enough for local deployment. The model covers 14 languages, including English, Chinese, and Arabic. With a 40K-token context window and flash attention support, it fits on a single consumer GPU, and its quantized builds keep self-hosted reasoning workloads inexpensive.
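Whether the full 40K context fits alongside the weights depends largely on the key/value cache. The sketch below is a rough back-of-the-envelope estimate, not a measured figure. The layer count, KV-head count, and head dimension are assumptions that do not come from this page and should be checked against the model's published config; "40K" is taken to mean 40 × 1024 tokens, and the cache is assumed to be stored in 16-bit precision.

```python
def kv_cache_bytes(
    seq_len: int,
    n_layers: int = 36,       # assumption: verify against the model config
    n_kv_heads: int = 8,      # assumption: grouped-query attention KV heads
    head_dim: int = 128,      # assumption: per-head dimension
    bytes_per_elem: int = 2,  # assumption: fp16/bf16 cache
) -> int:
    """Estimate KV-cache size in bytes for one sequence of seq_len tokens."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


if __name__ == "__main__":
    tokens = 40 * 1024  # assumption: "40K" read as 40,960 tokens
    print(f"Estimated KV cache at {tokens} tokens: "
          f"{kv_cache_bytes(tokens) / 1e9:.2f} GB")
```

Under these assumptions the cache at full context is on the order of several gigabytes, so it should be budgeted on top of the model file size. Shorter contexts or a quantized cache reduce it proportionally.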

Quantization  Quality  Size
Q8_0          High     8.11 GB
Q8_K_XL       High     10.08 GB
Q6_K          High     6.26 GB
Q6_K_XL       High     6.98 GB
Q5_K_M        Medium   5.45 GB
Q5_K_S        Medium   5.33 GB
Q5_K_XL       Medium   5.47 GB
Q4_K_M        Medium   4.68 GB
Q4_K_S        Medium   4.47 GB
Q4_K_XL       Medium   4.78 GB
Q4_1          Medium   4.89 GB
Q3_K_M        Low      3.84 GB
Q3_K_S        Low      3.51 GB
Q3_K_XL       Low      4.01 GB
Q2_K          Low      3.06 GB
Q2_K_L        Low      3.19 GB
Q2_K_XL       Low      3.26 GB
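
One way to use the table is to pick the largest file that fits a given memory budget. The following is a minimal sketch of that idea. The names and sizes are copied from the table above, while `pick_quant` and the 1.5 GB default headroom for context cache and runtime buffers are hypothetical choices made for illustration, not recommendations from this page. Using file size as a stand-in for quality is also a simplification.

```python
from typing import Optional

# (name, quality label, file size in GB), copied from the table above.
QUANTS = [
    ("Q8_0", "High", 8.11), ("Q8_K_XL", "High", 10.08),
    ("Q6_K", "High", 6.26), ("Q6_K_XL", "High", 6.98),
    ("Q5_K_M", "Medium", 5.45), ("Q5_K_S", "Medium", 5.33),
    ("Q5_K_XL", "Medium", 5.47), ("Q4_K_M", "Medium", 4.68),
    ("Q4_K_S", "Medium", 4.47), ("Q4_K_XL", "Medium", 4.78),
    ("Q4_1", "Medium", 4.89), ("Q3_K_M", "Low", 3.84),
    ("Q3_K_S", "Low", 3.51), ("Q3_K_XL", "Low", 4.01),
    ("Q2_K", "Low", 3.06), ("Q2_K_L", "Low", 3.19),
    ("Q2_K_XL", "Low", 3.26),
]


def pick_quant(vram_gb: float, overhead_gb: float = 1.5) -> Optional[str]:
    """Return the largest quantization whose file fits in vram_gb.

    overhead_gb is a hypothetical allowance for the KV cache and runtime
    buffers; tune it for the context length and runtime actually used.
    """
    budget = vram_gb - overhead_gb
    fitting = [q for q in QUANTS if q[2] <= budget]
    if not fitting:
        return None  # nothing fits; consider CPU offload or a smaller model
    # Largest file is used as a rough proxy for best quality.
    return max(fitting, key=lambda q: q[2])[0]


if __name__ == "__main__":
    for vram in (4, 6, 8, 12, 24):
        print(f"{vram} GB -> {pick_quant(vram)}")
```

Long contexts need more headroom than the default here, so the result should be read as a starting point to validate on the target hardware.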
Last updated: March 5, 2026