Qwen3.6 27B

Qwen
Code · Multilingual · Thinking · Tool Calls · Vision

Qwen3.6 27B is a dense large language model from Alibaba's Qwen team with 27 billion parameters across 64 layers, built on the hybrid Gated DeltaNet and Gated Attention architecture it shares with its larger Mixture-of-Experts siblings. It is natively multimodal, processing text, images, and video, and ships with built-in thinking and tool calling over a 262K-token context window, extensible to roughly one million tokens via YaRN. The model is released under the Apache 2.0 license. At Q4 quantization it requires around 16 GB of VRAM, making it a strong fit for self-hosted deployment on a single high-end consumer GPU.
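The VRAM figures above follow from simple arithmetic: weight size is roughly parameter count times bits per weight. A minimal sketch, assuming approximate average bits-per-weight for common GGUF schemes (real files mix precisions per tensor, and the listed file sizes also include metadata, so numbers differ slightly; runtime use adds KV cache and activation overhead on top):

```python
# Rough weight-size estimate: parameters x bits-per-weight / 8 bytes.
# Bits-per-weight values below are approximate averages (an assumption),
# not exact figures for any specific GGUF file.

PARAMS = 27e9  # Qwen3.6 27B

def weight_gb(bits_per_weight: float) -> float:
    """Size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_weight / 8 / 1e9

print(f"BF16: {weight_gb(16):.1f} GB")  # ~54 GB of weights alone
print(f"Q8  : {weight_gb(8.5):.1f} GB")
print(f"Q4  : {weight_gb(4.5):.1f} GB")  # ~15 GB, matching the ~16 GB figure above
```

This is why a single 24 GB consumer GPU comfortably holds a Q4 or Q5 quant of this model but not the full-precision weights.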

| Quantization | Quality        | Size     |
|--------------|----------------|----------|
| BF16         | Full precision | 50.11 GB |
| Q8_0         | High           | 26.63 GB |
| Q8_K_XL      | High           | 32.90 GB |
| Q6_K         | High           | 20.98 GB |
| Q6_K_XL      | High           | 23.88 GB |
| Q5_K_M       | Medium         | 18.17 GB |
| Q5_K_S       | Medium         | 17.66 GB |
| Q5_K_XL      | Medium         | 18.66 GB |
| Q4_K_M       | Medium         | 15.66 GB |
| Q4_K_S       | Medium         | 14.77 GB |
| Q4_K_XL      | Medium         | 16.40 GB |
| IQ4_NL       | Medium         | 14.97 GB |
| IQ4_XS       | Medium         | 14.38 GB |
| Q4_0         | Medium         | 14.71 GB |
| Q4_1         | Medium         | 16.07 GB |
| Q3_K_M       | Low            | 12.65 GB |
| Q3_K_S       | Low            | 11.51 GB |
| Q3_K_XL      | Low            | 13.48 GB |
| IQ3_XXS      | Low            | 11.17 GB |
| Q2_K_XL      | Low            | 11.04 GB |
| IQ2_M        | Low            | 10.10 GB |
| IQ2_XXS      | Low            | 8.74 GB  |
Last updated: April 29, 2026