
DeepSeek V3.1

Publisher: DeepSeek
Tags: Code, Multilingual, Thinking, Tool Calls

DeepSeek V3.1 is a 685-billion-parameter Mixture-of-Experts model from DeepSeek that activates 8 of 256 routed experts per token, plus one shared expert that is always active. It delivers frontier-level performance on code generation, reasoning, and multilingual tasks while using far fewer active parameters per inference step than comparably sized dense models. The model supports thinking mode, tool calling, and nine languages, and offers a 160K context window. At this scale it requires multi-GPU or distributed setups, though it quantizes down to Q2 levels for a reduced VRAM footprint.
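
The routing pattern is what keeps the active-parameter count low: although each MoE layer holds 256 routed experts, a token only passes through the 8 experts its router scores highest, plus the shared expert. The sketch below illustrates that top-k gating in Python with toy dimensions; the softmax gate, weight shapes, and single-matrix experts are simplifying assumptions for illustration, not DeepSeek's exact implementation.

import numpy as np

N_EXPERTS, TOP_K, D_MODEL = 256, 8, 64  # toy d_model; the real model is far larger

rng = np.random.default_rng(0)
router_w = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02          # router projection
experts = rng.standard_normal((N_EXPERTS, D_MODEL, D_MODEL)) * 0.02  # 256 routed experts
shared = rng.standard_normal((D_MODEL, D_MODEL)) * 0.02              # always-on shared expert

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector x of shape (D_MODEL,) through the top-k routed
    experts plus the shared expert, and return the gated sum of their outputs."""
    logits = x @ router_w                      # (256,) affinity score per expert
    top = np.argsort(logits)[-TOP_K:]          # indices of the 8 best-scoring experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                       # softmax gate over the selected 8 only
    out = shared.T @ x                         # shared expert runs unconditionally
    for g, i in zip(gates, top):
        out += g * (experts[i].T @ x)          # weighted sum of the 8 routed experts
    return out

token = rng.standard_normal(D_MODEL)
y = moe_layer(token)
print(y.shape)  # (64,) -- only 9 of the 257 expert networks touched this token

Every other expert's weights sit idle for that token, which is why inference cost tracks the roughly 37B active parameters rather than the full 685B.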

Hardware Configuration

Quantization   Quality   Size
Q8_0           High      664.33 GB
Q8_K_XL        High      726.99 GB
Q6_K           High      513.41 GB
Q6_K_XL        High      535.03 GB
Q5_K_M         Medium    443.48 GB
Q5_K_S         Medium    430.87 GB
Q5_K_XL        Medium    451.30 GB
Q4_K_M         Medium    377.56 GB
Q4_K_S         Medium    354.90 GB
Q4_K_XL        Medium    360.33 GB
Q4_0           Medium    354.00 GB
Q4_1           Medium    391.86 GB
Q3_K_M         Low       298.46 GB
Q3_K_S         Low       270.49 GB
Q3_K_XL        Low       279.43 GB
Q2_K           Low       228.82 GB
Q2_K_L         Low       229.02 GB
Q2_K_XL        Low       238.17 GB
Last updated: March 5, 2026
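
To connect the table above to a concrete deployment decision, the sketch below picks the highest-quality quantization whose weight files fit a given memory budget. The sizes are a subset of the table; the 10% headroom reserved for KV cache and activations is an illustrative assumption, not a vendor recommendation.

QUANT_SIZES_GB = {
    "Q8_0": 664.33, "Q6_K": 513.41, "Q5_K_M": 443.48,
    "Q4_K_M": 377.56, "Q3_K_M": 298.46, "Q2_K": 228.82,
}  # subset of the table above, ordered highest to lowest quality

def pick_quant(total_memory_gb: float, headroom: float = 0.10) -> str | None:
    """Return the first (highest-quality) quant whose weights fit after
    reserving a fraction of memory for KV cache and activations."""
    budget = total_memory_gb * (1.0 - headroom)
    for name, size in QUANT_SIZES_GB.items():
        if size <= budget:
            return name
    return None  # even Q2_K does not fit in this budget

print(pick_quant(512.0))  # e.g. eight 64 GB GPUs -> 'Q5_K_M'

Even the smallest Q2 variants need well over 200 GB for weights alone, which is consistent with the multi-GPU or distributed requirement noted above.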