Qwen3.5 397B A17B

Code, Multilingual, Thinking, Tool Calls, Vision

Qwen3.5 397B A17B is the largest Mixture-of-Experts (MoE) model from Alibaba's Qwen team. It has 403 billion total parameters, of which only 17 billion are active per token, with routing across 512 experts, so per-token compute stays far below what the total size suggests. It is natively multimodal, accepting text, images, and video, and has built-in thinking capabilities for chain-of-thought reasoning. The model supports a 262K-token context window and covers over 200 languages. Released under the Apache 2.0 license, it delivers frontier-level performance on math, code, and vision benchmarks, making it a compelling self-hosted alternative for demanding workloads.
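
To make the "active per token" figure concrete, the short sketch below computes the share of parameters used for each token from the numbers quoted above. The variable names are illustrative only, and the calculation says nothing about how many of the 512 experts are selected per token, which is not stated here.

```python
# Figures quoted in the description above (in billions of parameters).
TOTAL_PARAMS_B = 403
ACTIVE_PARAMS_B = 17

# Fraction of the model's weights that participate in computing one token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(f"Active per token: {active_fraction:.1%} of all parameters")
```

Note that all weights still have to be stored somewhere (VRAM, system RAM, or disk), so the small active fraction reduces compute per token rather than the storage footprint.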

Hardware Configuration


Quantization	Quality	Size
BF16	Full precision	738.5 GB
Q8_0	High	392.55 GB
Q8_K_XL	High	398.37 GB
Q6_K	High	304.17 GB
Q6_K_XL	High	337.44 GB
Q5_K_M	Medium	273.49 GB
Q5_K_S	Medium	257.57 GB
Q5_K_XL	Medium	274.61 GB
Q4_K_M	Medium	227.33 GB
Q4_K_S	Medium	212.33 GB
Q4_K_XL	Medium	228.44 GB
MXFP4_MOE	Medium	221.04 GB
IQ4_NL	Medium	180.45 GB
IQ4_XS	Medium	176.7 GB
Q3_K_M	Low	165.22 GB
Q3_K_S	Low	153.03 GB
Q3_K_XL	Low	166.46 GB
IQ3_S	Low	136.32 GB
IQ3_XXS	Low	130.69 GB
IQ2_M	Low	114.61 GB
IQ2_XXS	Low	106.99 GB
IQ1_M	Low	99.49 GB
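
As a rough way to read the table against your own hardware, the sketch below picks the largest quantization whose file size fits into a combined VRAM plus system RAM budget. It is a minimal illustration, not the method this page uses: the function name `largest_fitting_quant` and the `overhead_gb` reserve are assumptions, only a subset of rows is included, and KV cache growth with context length is not modeled.

```python
# Sizes in GB copied from a subset of the table above.
QUANT_SIZES_GB = {
    "BF16": 738.5,
    "Q8_0": 392.55,
    "Q6_K": 304.17,
    "Q5_K_M": 273.49,
    "Q4_K_M": 227.33,
    "IQ4_XS": 176.7,
    "Q3_K_M": 165.22,
    "IQ2_M": 114.61,
    "IQ1_M": 99.49,
}


def largest_fitting_quant(vram_gb, ram_gb=0.0, overhead_gb=8.0):
    """Return (name, size_gb) of the largest listed quantization that fits.

    overhead_gb is an assumed flat reserve for runtime buffers and the OS;
    it is a placeholder, not a measured value. Returns None if nothing fits.
    """
    budget = vram_gb + ram_gb - overhead_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    if not fitting:
        return None
    size, name = max(fitting)  # largest file size that still fits the budget
    return name, size


# Example: 192 GB of VRAM plus 64 GB of system RAM.
print(largest_fitting_quant(192, 64))
```

Treat the result as an upper bound: weights that spill from VRAM into system RAM will usually run slower, and long contexts need additional memory beyond the file size.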

Last updated: April 29, 2026