# DeepSeek R1 Distill Qwen 7B

**Publisher:** DeepSeek
**Tags:** Code, Multilingual, Thinking, Tool Calls
DeepSeek R1 Distill Qwen 7B is a 7.62-billion-parameter dense transformer from DeepSeek, distilled from the R1 reasoning model into a compact Qwen-based architecture. It brings chain-of-thought reasoning and thinking capabilities to the 7B class, punching above its weight on math and logic tasks and offering noticeably stronger structured reasoning than standard 7B instruct models. With a 128K-token context window and nine supported languages, it fits on a single consumer GPU and quantizes well for efficient self-hosted deployment.
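Because the model emits its chain of thought before the final answer, R1-family completions conventionally wrap the reasoning in `<think>…</think>` tags. A minimal sketch of separating the two parts of a completion (the tag convention is an assumption based on the R1 family, and `split_thinking` is a hypothetical helper, not part of any official SDK):

```python
import re

def split_thinking(output: str) -> tuple[str, str]:
    """Split an R1-style completion into (reasoning, final answer).

    Assumes the reasoning trace is wrapped in <think>...</think>;
    if no such block is present, the whole output is the answer.
    """
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if not match:
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()
    return reasoning, answer

sample = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_thinking(sample)
print(reasoning)  # → 2 + 2 is 4.
print(answer)     # → The answer is 4.
```

Stripping the thinking block this way is useful when only the final answer should be shown to users or passed to downstream tools.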
## Hardware Configuration
| Quantization | Quality | Size (GB) |
|---|---|---|
| FP16 | Full precision | 14.19 |
| Q8_0 | High | 7.54 |
| Q6_K | High | 5.82 |
| Q5_K_M | Medium | 5.07 |
| Q4_K_M | Medium | 4.36 |
| Q3_K_M | Low | 3.55 |
| Q2_K_L | Low | 2.93 |
| Q2_K | Low | 2.81 |
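The sizes above follow directly from the parameter count and the average bits stored per weight. A minimal sketch of the arithmetic, assuming the table's "GB" figures are binary gibibytes (the FP16 row matches 7.62B parameters × 2 bytes almost exactly under that reading; real GGUF files mix tensor types, so per-quant averages are approximate):

```python
GIB = 1024 ** 3          # table sizes appear to be binary gibibytes
PARAMS = 7.62e9          # parameter count stated in the description

def est_size_gib(bits_per_weight: float, params: float = PARAMS) -> float:
    """Approximate file size in GiB for a given average bits per weight."""
    return params * bits_per_weight / 8 / GIB

def bits_per_weight(size_gib: float, params: float = PARAMS) -> float:
    """Infer the average bits per weight implied by a quantized file size."""
    return size_gib * GIB * 8 / params

print(round(est_size_gib(16), 2))       # → 14.19 (matches the FP16 row)
print(round(bits_per_weight(4.36), 1))  # → 4.9 (Q4_K_M averages ~4.9 bits/weight)
```

The same arithmetic gives a quick feasibility check: add headroom for the KV cache and activations on top of the weight size before assuming a quant fits a given GPU.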
Last updated: March 5, 2026