# Phi 4
Microsoft · Code
Phi 4 is a 14.66-billion-parameter dense transformer from Microsoft, trained on 9.8 trillion tokens with an emphasis on curated synthetic data for advanced reasoning. It outperforms many larger models on science and math benchmarks, making it a strong choice for reasoning-intensive workloads at moderate scale. The model is English-focused, with strong capabilities in code generation and mathematical problem solving. A 16K context window and flash attention allow efficient inference, and it quantizes well to GGUF for self-hosted GPU deployments.
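As a rough sanity check on quantized file sizes, the effective bits per weight follow from the file size and the parameter count. A minimal sketch, assuming the sizes listed in the table below are GiB and that essentially all 14.66B parameters are quantized (GGUF metadata adds a small extra overhead):

```python
# Effective bits per weight for a GGUF quant of Phi 4.
# Assumption: table sizes are GiB; parameter count is 14.66e9.
PARAMS = 14.66e9

def bits_per_weight(size_gib: float) -> float:
    """Convert a GGUF file size in GiB to approximate bits per weight."""
    return size_gib * 2**30 * 8 / PARAMS

print(round(bits_per_weight(14.51), 2))  # Q8_0  -> ~8.5 bpw
print(round(bits_per_weight(7.86), 2))   # Q4_K_S -> ~4.6 bpw
```

The Q8_0 result landing near the well-known ~8.5 bits per weight for that format suggests the GiB reading of the table is consistent.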
## Hardware Configuration
| Quantization | Quality | Size |
|---|---|---|
| Q8_0 | High | 14.51 GB |
| Q6_K | High | 11.2 GB |
| Q5_1 | Low | 10.28 GB |
| Q5_K | Low | 9.88 GB |
| Q5_K_S | Medium | 9.45 GB |
| Q5_0 | Low | 9.45 GB |
| Q4_1 | Medium | 8.63 GB |
| Q4_K | Low | 8.43 GB |
| Q4_K_S | Medium | 7.86 GB |
| Q4_0 | Medium | 7.81 GB |
| Q3_K_L | Low | 7.39 GB |
| Q3_K | Low | 6.86 GB |
| Q3_K_S | Low | 6.06 GB |
| Q2_K | Low | 5.17 GB |
Last updated: March 5, 2026