AI Glossary
Small language model (SLM)
Also known as: Small Language Model, SLM, small LLM
A small language model (SLM) is a compact language model that can run locally or on modest hardware. It trades parameter count, and with it some breadth of general knowledge, for lower cost and greater control than large models offer.
- Has far fewer parameters than frontier LLMs, at the cost of breadth of general knowledge.
- Can run locally or on-premise, on a laptop or a single graphics card.
- Excels at narrow, repeatable tasks, especially after fine-tuning on a company's own data.
A small language model (SLM) is a compact counterpart to a large language model (LLM), designed to run on modest hardware: a single graphics card, a company server, sometimes even a laptop. The line between an SLM and an LLM is a matter of convention and shifts over time, but the term generally refers to models with one or two orders of magnitude fewer parameters than the largest systems.
Unlike a large model, which prioritizes broad general knowledge and the ability to handle arbitrary tasks, a small model gives up some of that versatility in exchange for low cost, speed and the ability to operate without an external API. Fine-tuning on a company's own data improves its results on the target task, and quantization (storing the weights at lower numerical precision) reduces hardware requirements even further.
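The effect of parameter count and quantization on hardware requirements can be estimated with simple arithmetic: the memory needed for the weights is roughly the number of parameters times the bits stored per weight. The sketch below is a rough illustration, not a sizing tool. It assumes the weights dominate memory use and ignores activations, the key-value cache and runtime overhead. The function name and the two model sizes are hypothetical, chosen only to show a gap of two orders of magnitude.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical parameter counts, for illustration only.
models = {
    "small (3B params)": 3e9,
    "large (300B params)": 300e9,
}

for name, params in models.items():
    for bits in (16, 8, 4):  # 16-bit baseline, then 8-bit and 4-bit quantization
        gib = weight_memory_gib(params, bits)
        print(f"{name:<20} {bits:>2}-bit weights: {gib:8.1f} GiB")
```

Under these assumptions, halving the bits per weight halves the memory for the weights, which is why a quantized small model can fit on a single graphics card or a laptop while a large model at the same precision cannot.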
From a deployment perspective, an SLM is the natural choice wherever data privacy, predictable inference cost and independence from a vendor matter. It works well for narrow, repeatable tasks — classifying documents, extracting data, handling routine queries — where broad general knowledge is not needed and the priorities are control and cost across a high volume of calls.