# DeepSeek V3.2

Tags: Code · Multilingual · Thinking · Tool Calls
DeepSeek V3.2 is a 685-billion-parameter Mixture-of-Experts model from DeepSeek, activating 8 of 256 routed experts per token plus one shared expert. It delivers frontier-level performance on code generation, reasoning, and multilingual tasks while using far fewer active parameters per inference step than comparably sized dense models, and it supports thinking mode and tool calling. With a 163K-token context window, full-precision deployment requires a multi-GPU or distributed setup, though quantization down to Q2 substantially reduces the VRAM footprint.
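The sparse-activation idea above (each token is routed to only 8 of 256 experts, with one shared expert always active) can be illustrated with a minimal top-k gating sketch. This is a generic MoE router, not DeepSeek's actual implementation; all function and variable names are illustrative:

```python
import numpy as np

def route_tokens(hidden, gate_w, top_k=8):
    """Minimal top-k MoE routing sketch (illustrative, not DeepSeek's code).

    hidden:  (tokens, d_model) token representations
    gate_w:  (d_model, num_experts) router weights
    Returns per-token indices of the top_k selected experts and their
    normalized routing weights.
    """
    logits = hidden @ gate_w                                  # (tokens, num_experts)
    # Select the top_k highest-scoring experts for each token.
    top_idx = np.argsort(logits, axis=-1)[:, -top_k:]
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    # Softmax over only the selected experts, so their weights sum to 1.
    weights = np.exp(top_logits - top_logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return top_idx, weights

rng = np.random.default_rng(0)
idx, w = route_tokens(rng.standard_normal((4, 64)),
                      rng.standard_normal((64, 256)))
print(idx.shape, w.shape)  # (4, 8) (4, 8): 8 of 256 experts per token
# A shared expert would additionally process every token alongside the routed 8.
```

Only the selected experts' parameters participate in each token's forward pass, which is why the active-parameter count is a small fraction of the 685B total.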
## Hardware Configuration
| Quantization | Quality | Size |
|---|---|---|
| Q8_0 | High | 664.33 GB |
| Q8_K_XL | High | 726.67 GB |
| Q6_K | High | 513.41 GB |
| Q6_K_XL | High | 534.55 GB |
| Q5_K_M | Medium | 443.48 GB |
| Q5_K_S | Medium | 430.87 GB |
| Q5_K_XL | Medium | 448.80 GB |
| Q4_K_M | Medium | 377.56 GB |
| Q4_K_S | Medium | 354.89 GB |
| Q4_K_XL | Medium | 379.80 GB |
| Q4_0 | Medium | 353.99 GB |
| Q4_1 | Medium | 391.86 GB |
| Q3_K_M | Low | 298.21 GB |
| Q3_K_S | Low | 270.49 GB |
| Q3_K_XL | Low | 298.99 GB |
| Q2_K | Low | 228.52 GB |
| Q2_K_L | Low | 228.73 GB |
| Q2_K_XL | Low | 229.68 GB |
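As a rough first-pass check on whether a quantization from the table fits a given multi-GPU setup, the weight file size plus a cushion for KV cache and activations must fit in total GPU memory. The 10% overhead figure below is an illustrative assumption, not a measured value; real requirements depend on context length, batch size, and the inference engine:

```python
def fits(quant_size_gb, gpu_mem_gb, overhead_frac=0.10):
    """Rough fit check: weights plus an assumed overhead cushion
    (KV cache, activations) vs. total GPU memory across all devices."""
    required = quant_size_gb * (1 + overhead_frac)
    return required <= sum(gpu_mem_gb)

# Q2_K (228.52 GB from the table) across 4x 80 GB GPUs (320 GB total):
print(fits(228.52, [80, 80, 80, 80]))  # True: ~251 GB needed, 320 GB available
# Q4_K_M (377.56 GB) on the same setup:
print(fits(377.56, [80, 80, 80, 80]))  # False: ~415 GB needed
```

Long contexts toward the 163K limit grow the KV cache well beyond a fixed 10% cushion, so treat this as a lower bound on memory needs.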
Last updated: March 19, 2026