Granite 4.0 Tiny Base Preview

Provider: IBM
Tags: Code, Multilingual

Granite 4.0 Tiny Base Preview is a 6.67-billion-parameter fine-grained Mixture-of-Experts (MoE) model from IBM, designed for efficient instruction following and code generation. Each token is routed to 6 of the model's 62 experts, so only a fraction of the weights are used per token; this lets it deliver strong reasoning at a fraction of the compute cost of a dense model of the same size. The model handles code-related tasks and multilingual conversation across 12 languages, including English, Chinese, and Japanese. Its 128K-token context window, combined with flash attention, supports long-document workflows, and it quantizes well to GGUF for lightweight self-hosted deployments.
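
To illustrate what "62 experts and 6 active per token" means in practice, here is a minimal sketch of generic top-k softmax routing in plain Python. It is an assumption for illustration only: the page does not describe Granite's actual router, and the function name `route_token` and the random router scores are hypothetical.

```python
import math
import random

NUM_EXPERTS = 62  # expert count stated in the description
TOP_K = 6         # experts active per token, stated in the description


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Generic top-k softmax gating; not confirmed to match Granite's router.
    """
    # Indices of the top_k largest router scores, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:top_k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up router scores for a single token (hypothetical data).
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    for idx, weight in route_token(logits):
        print(f"expert {idx:2d}  weight {weight:.3f}")
```

Under this scheme only the 6 selected experts run for a given token, which is where the compute saving relative to a dense model comes from.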

Hardware Configuration

Quantization   Quality          Size (GB)
FP16           Full precision   12.44
Q8_0           High             6.62
Q6_K           High             5.11
Q5_K_M         Medium           4.42
Q5_K_S         Medium           4.3
Q4_K_M         Medium           3.77
Q4_K_S         Medium           3.56
Q4_0           Medium           3.53
Q4_1           Medium           3.91
Q3_K_M         Low              2.98
Q3_K_S         Low              2.71
Q2_K           Low              2.28
Q3_K_L         Low              3.2
Q5_0           Low              4.3
Q5_1           Low              4.68
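
As a rough way to choose among the quantizations listed above, the sketch below picks the largest file that fits a given memory budget. The sizes and quality labels are copied from the listing; the helper name `largest_fit` and the 1.5 GB default headroom for KV cache and runtime buffers are assumptions, not values from this page, and real headroom depends on context length and runtime.

```python
# (name, quality label, file size in GB), copied from the listing above.
QUANTS = [
    ("FP16", "Full precision", 12.44),
    ("Q8_0", "High", 6.62),
    ("Q6_K", "High", 5.11),
    ("Q5_K_M", "Medium", 4.42),
    ("Q5_K_S", "Medium", 4.3),
    ("Q4_K_M", "Medium", 3.77),
    ("Q4_K_S", "Medium", 3.56),
    ("Q4_0", "Medium", 3.53),
    ("Q4_1", "Medium", 3.91),
    ("Q3_K_M", "Low", 2.98),
    ("Q3_K_S", "Low", 2.71),
    ("Q2_K", "Low", 2.28),
    ("Q3_K_L", "Low", 3.2),
    ("Q5_0", "Low", 4.3),
    ("Q5_1", "Low", 4.68),
]


def largest_fit(memory_gb, overhead_gb=1.5):
    """Return the largest quantization that fits in memory_gb, or None.

    overhead_gb is a hypothetical allowance for KV cache and runtime
    buffers; tune it for your context length and inference runtime.
    """
    budget = memory_gb - overhead_gb
    fitting = [q for q in QUANTS if q[2] <= budget]
    # Largest file among those that fit; ties keep the earlier table row.
    return max(fitting, key=lambda q: q[2]) if fitting else None


if __name__ == "__main__":
    for mem in (4, 6, 8, 16):
        print(mem, "GB ->", largest_fit(mem))
```

File size is only a proxy here: a larger file is not always higher quality (the listing labels Q5_0 as Low but the same-sized Q5_K_S as Medium), so the quality label is worth checking alongside the size.
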
Last updated: March 5, 2026