Qwen3.5 9B
Qwen
Code, Multilingual, Thinking, Tool Calls, Vision
Qwen3.5 9B is the flagship small model in Alibaba's Qwen 3.5 family. Built on the Gated Delta Networks hybrid architecture with 9.65 billion parameters, it scores 81.7 on GPQA Diamond versus 80.1 for gpt-oss-120B, a model roughly twelve times its size. It is natively multimodal, processing text, images, and video, and has built-in thinking capabilities for chain-of-thought reasoning. The model supports a 262K-token context window and covers over 200 languages. It is released under the Apache 2.0 license, and its Q4 quantizations need roughly 5 GB of VRAM for the weights, which makes it practical to self-host on consumer hardware.
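As a rough cross-check on the memory figure, weight storage scales with parameter count times bits per weight. The sketch below assumes decimal gigabytes (1 GB = 1e9 bytes) and counts weights only, ignoring KV cache and runtime buffers. The function name and the sample bits-per-weight values are illustrative assumptions, not taken from any quantization specification.

```python
PARAMS = 9.65e9  # parameter count stated in the description


def weight_size_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Approximate weight storage in decimal GB (assumption: 1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Illustrative bit widths only; real quantization formats mix precisions.
    for bpw in (4.0, 4.5, 8.0, 16.0):
        print(f"{bpw:>4} bits/weight -> {weight_size_gb(bpw):.2f} GB")
```

Read in reverse under the same assumptions, the Q4 files in the table (5.01 to 5.56 GB) correspond to roughly 4.2 to 4.6 effective bits per weight.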
| Quantization | Quality | Size |
|---|---|---|
| Q8_0 | High | 8.87 GB |
| Q8_K_XL | High | 12.08 GB |
| Q6_K | High | 6.95 GB |
| Q6_K_XL | High | 8.16 GB |
| Q5_K_M | Medium | 6.13 GB |
| Q5_K_S | Medium | 5.92 GB |
| Q5_K_XL | Medium | 6.28 GB |
| Q4_K_M | Medium | 5.29 GB |
| Q4_K_S | Medium | 5.02 GB |
| Q4_K_XL | Medium | 5.56 GB |
| Q4_0 | Medium | 5.01 GB |
| Q4_1 | Medium | 5.44 GB |
| Q3_K_M | Low | 4.35 GB |
| Q3_K_S | Low | 4.02 GB |
| Q3_K_XL | Low | 4.71 GB |
| Q2_K_XL | Low | 3.84 GB |
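One way to use the table is to pick the largest quantization that fits a given VRAM budget. The following is a minimal sketch, not a deployment tool. The sizes are copied from the table. The helper name `largest_fit` and the default `overhead_gb` allowance for KV cache and runtime buffers are assumptions for illustration, and file size is used only as a rough proxy for quality.

```python
from typing import Optional

# File sizes in GB, copied from the quantization table.
QUANT_SIZES_GB = {
    "Q8_0": 8.87, "Q8_K_XL": 12.08, "Q6_K": 6.95, "Q6_K_XL": 8.16,
    "Q5_K_M": 6.13, "Q5_K_S": 5.92, "Q5_K_XL": 6.28,
    "Q4_K_M": 5.29, "Q4_K_S": 5.02, "Q4_K_XL": 5.56,
    "Q4_0": 5.01, "Q4_1": 5.44,
    "Q3_K_M": 4.35, "Q3_K_S": 4.02, "Q3_K_XL": 4.71, "Q2_K_XL": 3.84,
}


def largest_fit(vram_gb: float, overhead_gb: float = 1.5) -> Optional[str]:
    """Return the largest quantization whose file fits in the VRAM budget.

    overhead_gb is an assumed allowance for KV cache and runtime buffers;
    actual overhead depends on context length and the inference engine.
    """
    budget = vram_gb - overhead_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None  # nothing fits under these assumptions
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for vram in (6.0, 8.0, 24.0):
        print(f"{vram} GB VRAM -> {largest_fit(vram)}")
```

Long contexts enlarge the KV cache considerably, so raise `overhead_gb` when planning to use a large share of the context window.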
Last updated: March 13, 2026