DeepSeek R1 0528 Qwen3 8B
DeepSeek
Code · Multilingual · Thinking · Tool Calls
DeepSeek R1 0528 Qwen3 8B is an 8.19-billion-parameter dense transformer from DeepSeek, distilled from the R1-0528 reasoning model into a Qwen3-based architecture. It brings chain-of-thought reasoning to the 8B class, matching far larger models on math benchmarks while remaining deployable on a single consumer GPU. It supports code generation, tool calls, and nine languages including English, Chinese, and major European languages. With a 128K context window and flash attention, it quantizes efficiently to GGUF for resource-conscious self-hosted inference.
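For self-hosted use, the GGUF quantizations listed below can be served with llama.cpp or bindings such as llama-cpp-python. The following is a minimal sketch, assuming a locally downloaded Q4_K_M file; the file name, context size, and prompt are illustrative and not part of the official distribution.

```python
# Minimal local-inference sketch with llama-cpp-python.
# The GGUF file name below is an assumption -- point model_path at
# whichever quantization from the table you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-0528-Qwen3-8B-Q4_K_M.gguf",  # hypothetical path
    n_ctx=32768,      # the model supports up to 128K; smaller contexts need less KV-cache memory
    n_gpu_layers=-1,  # offload every layer to the GPU
    flash_attn=True,  # enable flash attention, as noted above
)

result = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Prove that the sum of two odd numbers is even."}
    ],
    max_tokens=1024,
)
print(result["choices"][0]["message"]["content"])
```

Note that R1-distilled reasoning models emit their chain of thought before the final answer, so budget `max_tokens` generously for non-trivial problems.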
Hardware Configuration
| Quantization | Quality | Size (GB) |
|---|---|---|
| Q8_0 | High | 8.11 |
| Q8_K_XL | High | 10.08 |
| Q6_K | High | 6.26 |
| Q6_K_XL | High | 6.98 |
| Q5_K_M | Medium | 5.45 |
| Q5_K_S | Medium | 5.33 |
| Q5_K_XL | Medium | 5.48 |
| Q4_K_M | Medium | 4.68 |
| Q4_K_S | Medium | 4.47 |
| Q4_K_XL | Medium | 4.77 |
| Q4_0 | Medium | 4.46 |
| Q4_1 | Medium | 4.89 |
| Q3_K_M | Low | 3.84 |
| Q3_K_S | Low | 3.51 |
| Q3_K_XL | Low | 4.02 |
| Q2_K | Low | 3.06 |
| Q2_K_L | Low | 3.19 |
| Q2_K_XL | Low | 3.26 |
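When sizing a GPU, total memory use is roughly the model file size plus KV cache and runtime buffers. The sketch below is an illustrative rule of thumb against a subset of the table, not an exact calculator; the 1.5 GB overhead allowance is an assumption, and real usage grows with context length.

```python
# Rough fit check: model file size plus an assumed fixed overhead for
# KV cache and runtime buffers. Actual usage varies with context length
# and KV-cache precision.
QUANT_SIZES_GB = {
    "Q8_0": 8.11, "Q6_K": 6.26, "Q5_K_M": 5.45,
    "Q4_K_M": 4.68, "Q3_K_M": 3.84, "Q2_K": 3.06,
}

def fits(quant: str, vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """Return True if the quantization should fit in vram_gb of VRAM.

    overhead_gb is an assumed allowance for KV cache and buffers,
    not a published figure.
    """
    return QUANT_SIZES_GB[quant] + overhead_gb <= vram_gb

for quant in QUANT_SIZES_GB:
    verdict = "fits" if fits(quant, vram_gb=8.0) else "too large"
    print(f"{quant}: {verdict} for an 8 GB GPU")
```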
Last updated: March 5, 2026