Qwen3.6 27B
Vendor: Qwen
Capabilities: Code · Multilingual · Thinking · Tool Calls · Vision
Qwen3.6 27B is a dense large language model from Alibaba's Qwen team, with 27 billion parameters across 64 layers, built on the hybrid Gated DeltaNet and Gated Attention architecture shared with its larger Mixture-of-Experts siblings. It is natively multimodal, processing text, images, and video, and ships with built-in thinking and tool calling. The native context window is 262K tokens, extensible to roughly one million tokens via YaRN. The model is released under the Apache 2.0 license. At Q4 quantization it requires around 16 GB of VRAM, making it a strong fit for self-hosted deployment on a single high-end consumer GPU.
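The relationship between the native window and the YaRN-extended one can be sketched as simple arithmetic. This is an illustrative calculation only: it assumes "262K" means 262,144 tokens and that the ~1M target is reached with a uniform scaling factor; actual deployment settings may differ.

```python
# Illustrative sketch: YaRN context-extension factor for Qwen3.6 27B.
# Assumes the native 262K window is 262,144 tokens (an assumption, not
# an official figure) and a ~1M-token target after extension.
NATIVE_CTX = 262_144      # native context window ("262K")
TARGET_CTX = 1_048_576    # ~1M-token extended window

factor = TARGET_CTX / NATIVE_CTX
print(f"YaRN scaling factor: {factor:.1f}x")
```

Under these assumptions the extension corresponds to a 4x rope-scaling factor.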
Hardware Configuration
| Quantization | Quality | Size |
|---|---|---|
| BF16 | Full precision | 50.11 GB |
| Q8_0 | High | 26.63 GB |
| Q8_K_XL | High | 32.9 GB |
| Q6_K | High | 20.98 GB |
| Q6_K_XL | High | 23.88 GB |
| Q5_K_M | Medium | 18.17 GB |
| Q5_K_S | Medium | 17.66 GB |
| Q5_K_XL | Medium | 18.66 GB |
| Q4_K_M | Medium | 15.66 GB |
| Q4_K_S | Medium | 14.77 GB |
| Q4_K_XL | Medium | 16.4 GB |
| IQ4_NL | Medium | 14.97 GB |
| IQ4_XS | Medium | 14.38 GB |
| Q4_0 | Medium | 14.71 GB |
| Q4_1 | Medium | 16.07 GB |
| Q3_K_M | Low | 12.65 GB |
| Q3_K_S | Low | 11.51 GB |
| Q3_K_XL | Low | 13.48 GB |
| IQ3_XXS | Low | 11.17 GB |
| Q2_K_XL | Low | 11.04 GB |
| IQ2_M | Low | 10.1 GB |
| IQ2_XXS | Low | 8.74 GB |
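As a rough guide to reading the table, the sketch below checks which quantizations fit in a given VRAM budget. The file sizes are taken from a few rows above; `OVERHEAD_GB` is an assumed margin for the KV cache and runtime buffers, not an official figure, and real headroom grows with context length.

```python
# Minimal sketch: which Qwen3.6 27B quantizations fit in a VRAM budget.
# Sizes (GB) are a subset of the table above; OVERHEAD_GB is an assumed
# margin for KV cache and activations, not a measured value.
QUANT_SIZES_GB = {
    "Q8_0": 26.63, "Q6_K": 20.98, "Q5_K_M": 18.17,
    "Q4_K_M": 15.66, "Q4_K_S": 14.77, "IQ4_XS": 14.38,
    "Q3_K_M": 12.65, "IQ2_M": 10.1,
}
OVERHEAD_GB = 2.0  # assumed headroom for KV cache / runtime buffers

def fits(vram_gb: float) -> list[str]:
    """Return quantizations whose weights plus overhead fit in vram_gb."""
    return [q for q, size in QUANT_SIZES_GB.items()
            if size + OVERHEAD_GB <= vram_gb]

print(fits(24.0))  # candidates for a 24 GB GPU
print(fits(16.0))  # candidates for a 16 GB GPU
```

On a 24 GB card everything from Q6_K down clears this margin, while Q8_0 does not; on a 16 GB card only the Q3/IQ2-class quantizations leave room under these assumptions.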
Last updated: April 29, 2026