
Qwen3 235B A22B

Published by Qwen
Tags: Code, Multilingual, Thinking, Tool Calls

Qwen3 235B A22B is a 235.09-billion-parameter Mixture-of-Experts model from Alibaba's Qwen team, optimized for both thinking and non-thinking inference modes. It activates 8 of its 128 experts per token, roughly 22 billion active parameters (the "A22B" in the name), delivering frontier-level reasoning at a fraction of the compute cost of comparable dense models. The model supports code generation, tool calling, and 14 languages including English, Chinese, Japanese, and Arabic. With a 40K context window and flash attention, it targets multi-GPU server deployments and quantizes well to GGUF for self-hosted inference on high-end hardware.
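The dual-mode behavior is driven by the chat template rather than by separate checkpoints. Below is a minimal Python sketch of toggling it through the Hugging Face tokenizer; the Qwen/Qwen3-235B-A22B repo id and the enable_thinking template switch follow the published Qwen3 model cards, so verify both against the revision you actually pull.

from transformers import AutoTokenizer

# Repo id per the official Qwen3 release on Hugging Face.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-235B-A22B")

messages = [
    {"role": "user", "content": "Explain mixture-of-experts routing in two sentences."}
]

# The Qwen3 chat template accepts an enable_thinking switch (per the model
# card): True produces a <think>...</think> reasoning block before the
# answer, False skips straight to the final response.
thinking_prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
direct_prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
print(direct_prompt)

With enable_thinking=False, the template suppresses the reasoning block, which is the non-thinking mode this listing refers to.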

Hardware Configuration

The table below lists the available GGUF quantizations and their approximate file sizes; lower-bit quants trade output quality for memory.
Quantization   Quality   Size
Q8_0           High      232.76 GB
Q8_K_XL        High      246.89 GB
Q6_K           High      179.76 GB
Q6_K_XL        High      185.20 GB
Q5_K_M         Medium    155.36 GB
Q5_K_S         Medium    150.76 GB
Q5_K_XL        Medium    155.43 GB
Q4_K_M         Medium    132.39 GB
Q4_K_S         Medium    124.51 GB
Q4_K_XL        Medium    124.91 GB
Q4_1           Medium    137.12 GB
Q3_K_M         Low       104.73 GB
Q3_K_S         Low       94.47 GB
Q3_K_XL        Low       96.61 GB
Q2_K           Low       79.81 GB
Q2_K_L         Low       79.94 GB
Q2_K_XL        Low       81.97 GB
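Given the sizes above, a quick way to shortlist quantizations for a particular machine is to subtract a working-memory allowance from total VRAM and compare against the file sizes. The Python sketch below does exactly that; the 20 GB default headroom for KV cache and activations at long context is a rough assumption, not a measured figure.

# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q8_0": 232.76, "Q8_K_XL": 246.89, "Q6_K": 179.76, "Q6_K_XL": 185.20,
    "Q5_K_M": 155.36, "Q5_K_S": 150.76, "Q5_K_XL": 155.43,
    "Q4_K_M": 132.39, "Q4_K_S": 124.51, "Q4_K_XL": 124.91, "Q4_1": 137.12,
    "Q3_K_M": 104.73, "Q3_K_S": 94.47, "Q3_K_XL": 96.61,
    "Q2_K": 79.81, "Q2_K_L": 79.94, "Q2_K_XL": 81.97,
}

def quants_that_fit(total_vram_gb: float, headroom_gb: float = 20.0) -> list[str]:
    """Return quants whose weights fit after reserving headroom for KV cache
    and activations. The 20 GB default is a rough assumption; measure on
    your own stack before committing to a quant."""
    budget = total_vram_gb - headroom_gb
    fitting = (name for name, size in QUANT_SIZES_GB.items() if size <= budget)
    # Largest (highest-quality) fitting quant first.
    return sorted(fitting, key=QUANT_SIZES_GB.get, reverse=True)

print(quants_that_fit(8 * 24))   # e.g. eight 24 GB GPUs -> Q5_K_XL and below
print(quants_that_fit(8 * 32))   # e.g. eight 32 GB GPUs -> up to Q8_0

The first element of the returned list is the highest-quality quant that fits the budget, which is usually the one worth trying first.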
Last updated: March 5, 2026