Aurora AITell us your case

Offering

ServicesProductsCase studies

For whom

Private EquityEnterpriseSMB
ServicesProductsCase studiesAboutBlogContact

Knowledge base

Start hereWikiGlossaryGuides

AI Glossary

Prompt injection

prompt injection attack, prompt attack

Prompt injection is an attack in which hidden instructions in the input hijack a model's behavior and coax it into breaking its own rules. It is especially dangerous for agents that read content from external sources.

Prompt injection exploits the fact that a model treats all the text it is given as context. An attacker places a command inside data the model is going to read anyway — for example in the body of a web page, a document, or a message. The model may then treat the hidden instruction as its own and ignore its original rules.

The risk grows in systems with external content retrieval and in agents that have access to tools, because the outcome can be a data leak or the execution of an unwanted operation. There is no single complete safeguard. You limit the impact instead: separating instructions from data, applying guardrails, and narrowing the permissions the model holds.

Related terms

In guides