What is the OWASP LLM Top 10?

The OWASP Top 10 for Large Language Model Applications is a list of the most critical security risks for applications that use LLMs (like GPT-4, Claude, Gemini, Llama). It covers prompt injection, insecure output handling, training data poisoning, model denial of service, supply chain risks, sensitive information disclosure, insecure plugin design, excessive agency, overreliance, and model theft. Published by the OWASP Foundation, it is the canonical taxonomy security teams use to scope LLM application security reviews.

How do I secure an LLM application?

Securing an LLM application means defending against ten distinct risk classes (the OWASP LLM Top 10). The minimum baseline: validate and constrain prompt inputs to defend LLM01 (prompt injection); sanitize and encode all LLM outputs before passing them to downstream systems to defend LLM02; pin model and dependency versions to defend LLM05 (supply chain); never let an LLM see secrets it does not need (LLM06 sensitive information disclosure); apply least-privilege to any tool, plugin, or agent the LLM can invoke (LLM07, LLM08); and treat every LLM output as untrusted user input. Bachao.AI runs an AI-native security review of LLM applications mapped to all 10 risks.

What is prompt injection (LLM01)?

Prompt injection is when an attacker manipulates an LLM's behaviour by inserting malicious instructions into the prompt — either directly (typing 'ignore previous instructions and...') or indirectly (planting instructions in a webpage, email, or document the LLM will read). It is OWASP's number-one ranked LLM risk because traditional input validation does not work — LLMs are trained to follow natural-language instructions. Defenses include input/output filtering, separating system and user prompts, treating LLM outputs as untrusted, and constraining the agent's tool access.

What is indirect prompt injection?

Indirect prompt injection is when an attacker plants malicious instructions in third-party content that an LLM agent will later read — for example, a hidden instruction in a webpage that an LLM-powered web browser visits, or a poisoned document in a knowledge base the LLM uses for retrieval-augmented generation (RAG). The user never typed the malicious instruction; the LLM encountered it while doing its job. Indirect prompt injection is the dominant attack surface for agentic AI systems and the hardest class of LLM bug to defend.

What is insecure output handling (LLM02)?

Insecure output handling is when an application passes LLM-generated text directly to downstream systems without sanitization — for example, rendering LLM output as HTML (XSS), passing it as SQL (SQL injection), executing it as shell commands (RCE), or sending it to an email/SMS gateway without filtering. Treat every LLM output as untrusted user input. Sanitize, encode, and constrain before downstream use.

What is excessive agency (LLM08)?

Excessive agency is when an LLM agent has more permissions, tools, or autonomy than the task requires — for example, a customer support bot with full database write access, or a coding agent that can execute arbitrary shell commands. When (not if) the agent is prompt-injected, the blast radius is the full permission set. Defenses: least-privilege tool access, human-in-the-loop for high-risk actions, scoped credentials per invocation, dry-run mode for destructive operations.

What is training data poisoning (LLM03)?

Training data poisoning is when an attacker introduces malicious content into the dataset used to train or fine-tune an LLM, causing the model to behave incorrectly on specific triggers. For most production LLM applications using API-provided models (OpenAI, Anthropic, Google), this risk applies to fine-tuning data and RAG content rather than the base model. Defenses include verifying training data provenance, isolating fine-tuning corpora, and continuously monitoring model outputs for backdoor triggers.

How do I test an LLM application for prompt injection?

Testing an LLM application for prompt injection requires both manual red-team and automated probe-based testing. Manual: try direct instruction overrides ('ignore previous instructions'), role-play attacks ('you are now in developer mode'), and indirect injection via documents the LLM reads. Automated: tools like Garak, PyRIT, and Bachao.AI's LLM probe suite run thousands of probe variations against your endpoint. Coverage should include OWASP LLM Top 10, MITRE ATLAS techniques, and AI Village's prompt-injection corpus.

Does Bachao.AI test LLM applications?

Yes. Bachao.AI's AI security review runs an LLM application against the full OWASP LLM Top 10, MITRE ATLAS adversarial techniques, and a proprietary library of indirect-prompt-injection probes against your RAG sources and agent tools. The deliverable is a CERT-In aligned report mapping every finding to OWASP LLM, ATLAS, and (where applicable) DPDP Act 2023 Schedule I for Indian buyers. Free first scan covers a baseline injection + sensitive-info probe.

Is the OWASP LLM Top 10 different from the OWASP Top 10?

Yes. The OWASP Top 10 (web app) covers XSS, SQL injection, broken auth, etc. — risks that apply to traditional web applications. The OWASP LLM Top 10 covers risks specific to applications using large language models — prompt injection, training data poisoning, model theft, excessive agency, etc. An LLM-powered web application is exposed to both lists. A full security review covers both.

OWASP LLM Top 10 — Defense Playbook for AI Applications

The new attack surface

LLM applications get prompt-injected, agents get over-privileged, outputs flow into databases unsanitized. OWASP ranked the 10 risks — most teams have not heard of half of them.

Prompt injection (LLM01), insecure output handling (LLM02), excessive agency (LLM08) — the new categories Indian SaaS teams need to defend against.

10OWASP LLM risks covered

Freefirst injection probe

MITREATLAS aligned

AEOanswer-block optimized

Book an LLM security review Free first scan

OWASP LLM Top 10MITRE ATLASIndirect injection probes

LLM01: Prompt Injection

An attacker manipulates an LLM's behaviour by inserting instructions into the prompt — directly ('ignore previous instructions') or indirectly (a malicious instruction planted in a webpage, document, or RAG corpus the LLM reads). Traditional input validation fails because LLMs are trained to follow natural-language instructions. Defenses: separate system and user prompts using structured APIs, sanitize and filter inputs, treat external content as untrusted, constrain agent tool access.

LLM02: Insecure Output Handling

An application passes LLM output to a downstream system without sanitization. Rendering as HTML → XSS. Passing as SQL → SQL injection. Executing as shell → RCE. Sending as email → spam/phishing. Defense: treat every LLM output as untrusted user input. Encode, parameterise, validate against an allow-list before downstream use.

LLM03: Training Data Poisoning

An attacker introduces malicious content into the dataset used to train or fine-tune an LLM, causing backdoor behaviour on specific triggers. Most production apps using OpenAI / Anthropic / Google API models inherit base-model defenses, but the risk applies to your fine-tuning corpus and your RAG sources. Defense: verify training data provenance, isolate fine-tuning sets, monitor outputs for trigger-based anomalies.

LLM04: Model Denial of Service

An attacker submits prompts that consume disproportionate compute, hitting context-window limits, triggering loops, or running up an unbounded bill. Defense: per-user prompt rate limits, max-token enforcement, timeout budgets per request, billing alerts on cost-per-user anomalies.

LLM05: Supply Chain Vulnerabilities

Risk in the LLM dependency tree — frameworks (LangChain, LlamaIndex), vector DB clients, embedding libraries, model artifacts. A compromised library shipped through normal package management can exfiltrate prompts or steer outputs. Defense: pin model versions, audit dependency tree, subscribe to provider security advisories, use SBOM for AI pipelines.

LLM06: Sensitive Information Disclosure

The LLM reveals secrets, PII, training data, or system prompts because they were included in the prompt or fine-tuning. Defense: never send secrets into the prompt unless absolutely necessary, redact at the boundary, classify RAG documents and gate retrieval by user permission, audit logs for accidental disclosure.

LLM07: Insecure Plugin Design

A plugin or tool the LLM can invoke has weak input validation, exposes admin-level operations, or trusts LLM-generated parameters without verification. Defense: tight tool input schemas, server-side validation of every tool invocation, no admin operations exposed to LLM-callable surface, audit every tool call.

LLM08: Excessive Agency

An LLM agent has more permissions, tools, or autonomy than the task requires. When (not if) it is prompt-injected, the blast radius is the full permission set. Defense: least-privilege tool access, human-in-the-loop for high-risk actions, scoped credentials per invocation, dry-run mode for destructive operations, default-deny on unrecognised intents.

LLM09: Overreliance

Users or downstream systems trust LLM output as authoritative when it should be reviewed. Hallucinations enter business logic, code, medical advice, legal text. Defense: human-in-the-loop checkpoints for high-stakes outputs, display confidence + provenance, never present LLM output without traceable sourcing for regulated domains.

LLM10: Model Theft

An attacker exfiltrates a proprietary model — weights, fine-tuning data, system prompts — through API abuse, side-channel attacks, or insider access. Defense: rate-limit + monitor for distillation patterns, segment model artifacts by access tier, encrypt at rest, audit every model-export action, watermark outputs where feasible.

How to test an LLM application for these risks

Testing LLM applications is hybrid manual + automated. Manual red-team explores novel prompt vectors — role-play, multi-turn escalation, indirect injection via documents the agent reads. Automated probes — Garak, PyRIT, Bachao.AI's LLM probe suite — run thousands of variations against your endpoint and report regressions on every deploy. Coverage should span OWASP LLM Top 10, MITRE ATLAS, and AI Village's indirect-prompt-injection corpus. Treat findings as standard P1/P2 security bugs, with a defined SLA for remediation.

Get an LLM security review for your AI application

Free first probe covers baseline prompt injection + sensitive-info disclosure. Full review maps every finding to OWASP LLM + MITRE ATLAS.

Book a free probe Talk to founder

Explore more products

Bachao.AI covers your entire security surface — from code to cloud to compliance.

AI VAPT Scanner

Automated penetration testing for web apps and APIs. Results in under 2 hours.

Learn more →

API Security Testing

OWASP API Top 10 coverage for REST and GraphQL endpoints.

Learn more →

Cloud Security Audit

AWS, Azure & GCP misconfiguration detection with DPDP mapping.

Learn more →

The new attack surface

LLM applications get prompt-injected, agents get over-privileged, outputs flow into databases unsanitized. OWASP ranked the 10 risks — most teams have not heard of half of them.

Prompt injection (LLM01), insecure output handling (LLM02), excessive agency (LLM08) — the new categories Indian SaaS teams need to defend against.

10OWASP LLM risks covered

Freefirst injection probe

MITREATLAS aligned

AEOanswer-block optimized

Book an LLM security review Free first scan

OWASP LLM Top 10MITRE ATLASIndirect injection probes

LLM01: Prompt Injection

LLM02: Insecure Output Handling

LLM03: Training Data Poisoning

LLM04: Model Denial of Service

LLM05: Supply Chain Vulnerabilities

LLM06: Sensitive Information Disclosure

LLM07: Insecure Plugin Design

LLM08: Excessive Agency

LLM09: Overreliance

LLM10: Model Theft

How to test an LLM application for these risks

Get an LLM security review for your AI application

Free first probe covers baseline prompt injection + sensitive-info disclosure. Full review maps every finding to OWASP LLM + MITRE ATLAS.

Book a free probe Talk to founder

Explore more products

Bachao.AI covers your entire security surface — from code to cloud to compliance.

OWASP LLM Top 10 — A Defense Playbook for AI Applications

LLM01: Prompt Injection

LLM02: Insecure Output Handling

LLM03: Training Data Poisoning

LLM04: Model Denial of Service

LLM05: Supply Chain Vulnerabilities

LLM06: Sensitive Information Disclosure

LLM07: Insecure Plugin Design

LLM08: Excessive Agency

LLM09: Overreliance

LLM10: Model Theft

How to test an LLM application for these risks

Get an LLM security review for your AI application

Explore more products

AI VAPT Scanner

API Security Testing

Cloud Security Audit

OWASP LLM Top 10 — A Defense Playbook for AI Applications

LLM01: Prompt Injection

LLM02: Insecure Output Handling

LLM03: Training Data Poisoning

LLM04: Model Denial of Service

LLM05: Supply Chain Vulnerabilities

LLM06: Sensitive Information Disclosure

LLM07: Insecure Plugin Design

LLM08: Excessive Agency

LLM09: Overreliance

LLM10: Model Theft

How to test an LLM application for these risks

Get an LLM security review for your AI application

Explore more products

AI VAPT Scanner

API Security Testing

Cloud Security Audit