DeepSeek R1 Distill Qwen 14B
DeepSeek
Tags: Code · Multilingual · Thinking · Tool Calls
DeepSeek R1 Distill Qwen 14B is a 14.77-billion-parameter dense transformer from DeepSeek, distilled from the R1 reasoning model into a Qwen2.5-based architecture. It brings chain-of-thought reasoning to the 14B class, outperforming comparable instruct models on math and coding benchmarks through reasoning distillation. It supports code generation, tool calls, and nine languages including English, Chinese, and major European languages. With a 128K context window and flash attention, it fits on a single mid-range GPU and quantizes efficiently to GGUF for self-hosted deployment.
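For self-hosted use, a minimal sketch with llama-cpp-python is shown below. The GGUF filename, context size, and sampling settings are illustrative assumptions, not values from this page.

```python
# Minimal sketch: running a GGUF quantization of the model locally
# with llama-cpp-python. Filename and settings below are assumptions.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf",  # hypothetical filename
    n_ctx=32768,      # model supports up to 128K; smaller contexts save KV-cache memory
    n_gpu_layers=-1,  # offload all layers to the GPU if they fit
    flash_attn=True,  # enable flash attention, as noted above
)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 17 * 23? Think step by step."}],
    max_tokens=1024,
    temperature=0.6,  # DeepSeek recommends 0.5-0.7 for the R1 distills
)

# R1 distills emit their chain of thought inside <think>...</think>
# before the final answer.
print(resp["choices"][0]["message"]["content"])
```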
Quantization Options
| Quantization | Quality | Size |
|---|---|---|
| FP16 | Full precision | 27.52 GB |
| Q8_0 | High | 14.62 GB |
| Q6_K | High | 11.29 GB |
| Q5_K_M | Medium | 9.79 GB |
| Q4_K_M | Medium | 8.37 GB |
| Q3_K_M | Low | 6.84 GB |
| Q2_K_L | Low | 5.54 GB |
| Q2_K | Low | 5.37 GB |
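As a rough way to read the Size column, the sketch below checks whether a given quantization fits in a GPU's VRAM. The 20% allowance for KV cache and runtime buffers is a hand-wavy assumption; real usage depends on context length, batch size, and backend.

```python
# Rough VRAM-fit check for the quantizations listed above.
# The 20% overhead for KV cache and runtime buffers is an assumption.
QUANT_SIZES_GB = {
    "FP16": 27.52, "Q8_0": 14.62, "Q6_K": 11.29, "Q5_K_M": 9.79,
    "Q4_K_M": 8.37, "Q3_K_M": 6.84, "Q2_K_L": 5.54, "Q2_K": 5.37,
}

def fits(quant: str, vram_gb: float, overhead: float = 0.20) -> bool:
    """True if the quantized weights plus estimated overhead fit in VRAM."""
    return QUANT_SIZES_GB[quant] * (1 + overhead) <= vram_gb

# Example: which quantizations fit on a 12 GB card?
for quant, size in QUANT_SIZES_GB.items():
    print(f"{quant:7s} {size:6.2f} GB -> fits in 12 GB: {fits(quant, 12.0)}")
```

Under this heuristic, Q4_K_M and smaller fit comfortably on a 12 GB GPU, which matches the single-mid-range-GPU claim above.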
Last updated: March 5, 2026