Qwen3.5 397B A17B
Qwen
Code Multilingual Thinking Tool Calls Vision
Qwen3.5 397B A17B is the largest Mixture-of-Experts model from Alibaba's Qwen team. It has 403 billion total parameters, of which only 17 billion are active per token, routed across 512 experts, so per-token compute stays far below that of a dense model of the same size. It is natively multimodal, processing text, images, and video, and has a built-in thinking mode for chain-of-thought reasoning. The model supports a 262K-token context window and covers over 200 languages. Released under the Apache 2.0 license, it targets frontier-level performance on math, code, and vision benchmarks, making it a strong self-hosted option for demanding workloads.
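As a rough illustration of the sparsity described above, the following sketch computes the share of parameters active per token from the two figures quoted in the description (403B total, 17B active). The function name is hypothetical, and the result says nothing about memory: all expert weights still have to be stored, even though only a small fraction is used for each token.

```python
# Figures taken from the description above, in billions of parameters.
TOTAL_PARAMS_B = 403.0
ACTIVE_PARAMS_B = 17.0


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters used per token (hypothetical helper)."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"Active per token: {share:.1%} of total parameters")
```

With the stated figures, roughly 4% of the parameters participate in each token's forward pass.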
Hardware Configuration
| Quantization | Quality | Size |
|---|---|---|
| BF16 | Full precision | 738.5 GB |
| MXFP4_MOE | Very high | 221.04 GB |
| Q8_0 | High | 392.55 GB |
| Q8_K_XL | High | 398.37 GB |
| Q6_K | High | 304.17 GB |
| Q6_K_XL | High | 337.44 GB |
| Q5_K_M | Medium | 273.49 GB |
| Q5_K_S | Medium | 257.57 GB |
| Q5_K_XL | Medium | 274.61 GB |
| Q4_K_M | Medium | 227.33 GB |
| Q4_K_S | Medium | 212.33 GB |
| Q4_K_XL | Medium | 228.44 GB |
| IQ4_NL | Medium | 180.45 GB |
| IQ4_XS | Medium | 176.7 GB |
| Q3_K_M | Low | 165.22 GB |
| Q3_K_S | Low | 153.03 GB |
| Q3_K_XL | Low | 166.46 GB |
| IQ3_S | Low | 136.32 GB |
| IQ3_XXS | Low | 130.69 GB |
| IQ2_M | Low | 114.61 GB |
| IQ2_XXS | Low | 106.99 GB |
| IQ1_M | Low | 99.49 GB |
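To check which of the quantizations above could fit a given machine, a minimal sketch like the following may help. The sizes are copied from the table. The function name and the `headroom_fraction` parameter are hypothetical, and the 10% default reserve for KV cache, activations, and runtime overhead is only an assumption; real overhead depends on context length, batch size, and the inference runtime.

```python
# Weight sizes in GB, copied from the quantization table above.
QUANT_SIZES_GB = {
    "BF16": 738.5, "MXFP4_MOE": 221.04, "Q8_0": 392.55, "Q8_K_XL": 398.37,
    "Q6_K": 304.17, "Q6_K_XL": 337.44, "Q5_K_M": 273.49, "Q5_K_S": 257.57,
    "Q5_K_XL": 274.61, "Q4_K_M": 227.33, "Q4_K_S": 212.33, "Q4_K_XL": 228.44,
    "IQ4_NL": 180.45, "IQ4_XS": 176.7, "Q3_K_M": 165.22, "Q3_K_S": 153.03,
    "Q3_K_XL": 166.46, "IQ3_S": 136.32, "IQ3_XXS": 130.69, "IQ2_M": 114.61,
    "IQ2_XXS": 106.99, "IQ1_M": 99.49,
}


def quantizations_that_fit(memory_gb: float, headroom_fraction: float = 0.10):
    """Return (name, size_gb) pairs whose weights fit, largest first.

    headroom_fraction is an assumed share of memory held back for KV cache,
    activations, and runtime overhead; it is not a measured value.
    """
    if not 0.0 <= headroom_fraction < 1.0:
        raise ValueError("headroom_fraction must be in [0, 1)")
    usable_gb = memory_gb * (1.0 - headroom_fraction)
    fits = [(name, size) for name, size in QUANT_SIZES_GB.items()
            if size <= usable_gb]
    # Sort by file size, descending; size is only a proxy for quality.
    return sorted(fits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Example: 256 GB of combined memory available for the model.
    for name, size in quantizations_that_fit(256.0)[:3]:
        print(f"{name}: {size} GB")
```

Sorting by size is a simplification: the table rates MXFP4_MOE as "Very high" quality at 221.04 GB, so the largest file that fits is not necessarily the best choice.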
Last updated: March 20, 2026