AI Glossary

Transformer

Also known as: transformer architecture, transformer model

A transformer is a neural network architecture built on the attention mechanism, which lets the model weigh the relationships between every token in a sequence. It is the foundation of today's large language models.

A transformer is a type of neural network first described in 2017, in the paper "Attention Is All You Need". Its key component is the attention mechanism, which lets the model assess how strongly each token relates to the others in the same sequence. This makes it better than earlier recurrent architectures at handling context and long-range dependencies.
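To make the idea of "how strongly each token relates to the others" concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions, not a production implementation: the function names `softmax` and `self_attention` and the toy 2-dimensional token vectors are hypothetical, and the raw vectors are used directly as queries, keys, and values, whereas a real transformer first passes them through learned projections.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over one sequence.

    Each argument is a list of vectors, one per token.
    Returns (outputs, weights), where weights[i][j] says how strongly
    token i attends to token j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this token's query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w[j] * values[j][c] for j in range(len(values)))
               for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Toy input (assumption): three tokens with made-up 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)

for i, row in enumerate(weights):
    print(f"token {i} attends with weights {[round(w, 3) for w in row]}")
```

Each row of `weights` sums to 1, so every output vector is a weighted mix of all tokens in the sequence. In this toy input the first token puts less weight on the second token, whose vector is orthogonal to its own, than on itself or the third token.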

Unlike older recurrent architectures, which read a sequence one token at a time, a transformer processes all tokens of a sequence in parallel. This makes good use of modern hardware such as GPUs and makes very large models practical to train. It is this property that made it the foundation of today's large language models.
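The difference in data dependencies can be sketched in a few lines of Python. This is a toy illustration with hypothetical names (`recurrent_states`, `attention_row`) and scalar "tokens", not how any real model is implemented: the recurrent loop needs the previous state before it can compute the next one, while each attention output reads only the inputs and can therefore be computed in any order, or all at once on parallel hardware.

```python
import math

def recurrent_states(xs, decay=0.5):
    """Toy recurrence: each state depends on the previous one, so steps run in order."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(decay * h + x)
        states.append(h)
    return states

def attention_row(i, xs):
    """Toy attention output for position i; it reads only the inputs, not other outputs."""
    scores = [xs[i] * x for x in xs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return sum(e / total * x for e, x in zip(exps, xs))

# Made-up scalar inputs (assumption), one per token.
xs = [0.2, -1.0, 0.7, 1.5]

# Sequential: no way to get the last state without computing the earlier ones.
states = recurrent_states(xs)

# Order-independent: positions computed in sequence order and in a shuffled order.
in_order = [attention_row(i, xs) for i in range(len(xs))]
shuffled = {i: attention_row(i, xs) for i in [2, 0, 3, 1]}

print(in_order == [shuffled[i] for i in range(len(xs))])
```

Because no attention output depends on another, frameworks can compute all positions as a single batched matrix operation, which is where the training speed-up comes from.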
