DeepSeek R1 0528 Qwen3 8B
DeepSeek
Code · Multilingual · Thinking · Tool Calls
DeepSeek R1 0528 Qwen3 8B is an 8.19-billion-parameter dense transformer from DeepSeek, distilled from the R1-0528 reasoning model into a Qwen3-based architecture. It brings chain-of-thought reasoning to the 8B class, matching far larger models on math benchmarks while remaining deployable on a single consumer GPU. It supports code generation, tool calls, and nine languages including English, Chinese, and major European languages. With a 128K context window and flash attention, it quantizes efficiently to GGUF for resource-conscious self-hosted inference.
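For self-hosted use, the GGUF quantizations listed below can be served with llama.cpp or bindings such as llama-cpp-python. The following is a minimal sketch, assuming a locally downloaded Q4_K_M file; the file name, context size, and prompt are illustrative and not part of the official distribution.

```python
# Minimal local-inference sketch with llama-cpp-python.
# The GGUF file name below is an assumption -- point model_path at
# whichever quantization from the table you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-0528-Qwen3-8B-Q4_K_M.gguf",  # hypothetical path
    n_ctx=32768,      # the model supports up to 128K; smaller contexts need less KV-cache memory
    n_gpu_layers=-1,  # offload every layer to the GPU
    flash_attn=True,  # enable flash attention, as noted above
)

result = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Prove that the sum of two odd numbers is even."}
    ],
    max_tokens=1024,
)
print(result["choices"][0]["message"]["content"])
```

Note that R1-distilled reasoning models emit their chain of thought before the final answer, so budget `max_tokens` generously for non-trivial problems.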
Hardware Configuration
| Quantization | Quality | Size (GB) |
|---|---|---|
| Q8_0 | High | 8.11 |
| Q8_K_XL | High | 10.08 |
| Q6_K | High | 6.26 |
| Q6_K_XL | High | 6.98 |
| Q5_K_M | Medium | 5.45 |
| Q5_K_S | Medium | 5.33 |
| Q5_K_XL | Medium | 5.48 |
| Q4_K_M | Medium | 4.68 |
| Q4_K_S | Medium | 4.47 |
| Q4_K_XL | Medium | 4.77 |
| Q4_0 | Medium | 4.46 |
| Q4_1 | Medium | 4.89 |
| Q3_K_M | Low | 3.84 |
| Q3_K_S | Low | 3.51 |
| Q3_K_XL | Low | 4.02 |
| Q2_K | Low | 3.06 |
| Q2_K_L | Low | 3.19 |
| Q2_K_XL | Low | 3.26 |
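When sizing a GPU, total memory use is roughly the model file size plus KV cache and runtime buffers. The sketch below is an illustrative rule of thumb against a subset of the table, not an exact calculator; the 1.5 GB overhead allowance is an assumption, and real usage grows with context length.

```python
# Rough fit check: model file size plus an assumed fixed overhead for
# KV cache and runtime buffers. Actual usage varies with context length
# and KV-cache precision.
QUANT_SIZES_GB = {
    "Q8_0": 8.11, "Q6_K": 6.26, "Q5_K_M": 5.45,
    "Q4_K_M": 4.68, "Q3_K_M": 3.84, "Q2_K": 3.06,
}

def fits(quant: str, vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """Return True if the quantization should fit in vram_gb of VRAM.

    overhead_gb is an assumed allowance for KV cache and buffers,
    not a published figure.
    """
    return QUANT_SIZES_GB[quant] + overhead_gb <= vram_gb

for quant in QUANT_SIZES_GB:
    verdict = "fits" if fits(quant, vram_gb=8.0) else "too large"
    print(f"{quant}: {verdict} for an 8 GB GPU")
```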
Last updated: March 5, 2026