Qwen3.5 4B

Code, Multilingual, Thinking, Tool Calls, Vision

Qwen3.5 4B is a 4.66-billion-parameter model from Alibaba's Qwen 3.5 family, built on a hybrid Gated Delta Networks architecture. The community widely regards it as a sweet spot for performance per watt. It is natively multimodal, accepting text, images, and video, and has built-in thinking capabilities for chain-of-thought reasoning. The model supports a 262K-token context window and covers more than 200 languages, and on coding benchmarks it nearly matches previous-generation 80B MoE models. It is released under the Apache 2.0 license and needs roughly 3 GB of VRAM at Q4, which makes for fast, stable self-hosted deployment on consumer hardware.


Quantization	Quality	Size
Q8_0	High	4.17 GB
Q8_K_XL	High	5.54 GB
Q6_K	High	3.28 GB
Q6_K_XL	High	3.86 GB
Q5_K_M	Medium	2.93 GB
Q5_K_S	Medium	2.82 GB
Q5_K_XL	Medium	3.03 GB
Q4_K_M	Medium	2.55 GB
Q4_K_S	Medium	2.41 GB
Q4_K_XL	Medium	2.71 GB
Q4_0	Medium	2.41 GB
Q4_1	Medium	2.59 GB
Q3_K_M	Low	2.14 GB
Q3_K_S	Low	1.96 GB
Q3_K_XL	Low	2.27 GB
Q2_K_XL	Low	1.81 GB
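
The sizes in the table can be turned into a rough fit check for a given GPU. The sketch below is illustrative only: the function names and the 1 GB headroom reserved for the KV cache and runtime buffers are assumptions rather than figures from this page, and real memory use grows with context length. It also treats the largest file that fits as the preferred choice, which is a simplification.

```python
# Quantization file sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q8_0": 4.17, "Q8_K_XL": 5.54,
    "Q6_K": 3.28, "Q6_K_XL": 3.86,
    "Q5_K_M": 2.93, "Q5_K_S": 2.82, "Q5_K_XL": 3.03,
    "Q4_K_M": 2.55, "Q4_K_S": 2.41, "Q4_K_XL": 2.71,
    "Q4_0": 2.41, "Q4_1": 2.59,
    "Q3_K_M": 2.14, "Q3_K_S": 1.96, "Q3_K_XL": 2.27,
    "Q2_K_XL": 1.81,
}


def quants_that_fit(vram_gb, overhead_gb=1.0):
    """Return (name, size_gb) pairs that fit in VRAM, largest first.

    overhead_gb is an assumed allowance for the KV cache and runtime
    buffers; it is a placeholder, not a measured value.
    """
    budget = vram_gb - overhead_gb
    fitting = [(name, size) for name, size in QUANT_SIZES_GB.items()
               if size <= budget]
    return sorted(fitting, key=lambda pair: pair[1], reverse=True)


def best_quant(vram_gb, overhead_gb=1.0):
    """Return the largest quantization that fits, or None if none do."""
    fitting = quants_that_fit(vram_gb, overhead_gb)
    return fitting[0][0] if fitting else None


if __name__ == "__main__":
    for vram in (2.0, 4.0, 6.0, 8.0):
        print(f"{vram:.0f} GB VRAM -> {best_quant(vram)}")
```

With the assumed 1 GB headroom, a 4 GB card selects Q5_K_M (2.93 GB) and an 8 GB card selects Q8_K_XL (5.54 GB); a 2 GB card fits none of the listed files.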

Last updated: March 24, 2026