
LFM2.5 1.2B Thinking

Liquid AI
Multilingual · Thinking · Tool Calls

LFM2.5 1.2B Thinking is a 1.17-billion-parameter hybrid convolution-attention model from Liquid AI, optimized for on-device chain-of-thought reasoning. It produces thinking traces before answering, delivering math and logic performance that rivals models with 40% more parameters. The model supports tool calling and eight languages including English, French, German, and Spanish. With a 128K context window and flash attention, it fits under 1 GB as a Q4 GGUF for efficient edge deployment on mobile and consumer hardware.
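Since the model emits a thinking trace before its answer, client code usually needs to separate the trace from the final reply. A minimal sketch, assuming the trace is wrapped in `<think>…</think>` tags (the exact delimiter LFM2.5 uses is an assumption here, not confirmed by this page):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a model response into (thinking_trace, answer).

    Assumes the reasoning is delimited by <think>...</think> tags;
    the actual delimiter emitted by LFM2.5 may differ.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No trace found: treat the whole response as the answer.
        return "", text.strip()
    trace = match.group(1).strip()
    answer = text[match.end():].strip()
    return trace, answer

raw = "<think>2 + 2 equals 4.</think>The answer is 4."
trace, answer = split_thinking(raw)
```

In a chat UI, the trace would typically be shown in a collapsible panel while only the answer is rendered inline.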

Hardware Configuration

Quantization  Quality         Size
FP16          Full precision  2.18 GB
Q8_0          High            1.16 GB
Q6_K          High            0.90 GB
Q5_K_M        Medium          0.79 GB
Q4_K_M        Medium          0.68 GB
Q4_0          Medium          0.65 GB
Last updated: March 5, 2026