OWASP Top 10 for LLMs
The OWASP Top 10 for Large Language Models (LLMs) is a comprehensive list of the most critical security risks associated with LLM applications. This resource is designed to help developers, security professionals, and organizations identify, understand, and mitigate vulnerabilities in these systems, ensuring safer and more robust deployments in real-world applications.
You can detect OWASP risks by using DeepEval's 50+ vulnerability_type options and 10+ attack_enhancement strategies when scanning your LLM application.
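The scans throughout this page assume a red_teamer object has already been constructed. A minimal setup sketch is shown below; the constructor arguments may differ between DeepEval versions, so treat the parameter names and values as assumptions and verify them against the red teaming documentation:
from deepeval.red_teaming import RedTeamer

# Assumed constructor arguments; verify against your DeepEval version.
red_teamer = RedTeamer(
    target_purpose="A customer support chatbot for a retail bank.",
    target_system_prompt="You are a helpful assistant for bank customers...",
)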
The top 10 risks in 2025 are:
- Prompt Injection
- Sensitive Information Disclosure
- Supply Chain
- Data and Model Poisoning
- Improper Output Handling
- Excessive Agency
- System Prompt Leakage
- Vector and Embedding Weaknesses
- Misinformation
- Unbounded Consumption
Prompt Injection
Prompt Injection involves crafting inputs that manipulate an LLM into ignoring its original instructions or performing unintended actions. Malicious instructions can be supplied directly in user input or indirectly through retrieved content, allowing attackers to bypass safety controls, extract restricted information, or hijack the model's behavior.
In DeepEval, you can apply prompt injection to any baseline attack for any vulnerability by including AttackEnhancement.PROMPT_INJECTION in your attack enhancements. Here’s how:
from deepeval.red_teaming import AttackEnhancement

...

results = red_teamer.scan(
    target_model=target_model,
    attacks_per_vulnerability_type=5,
    vulnerabilities=vulnerabilities,
    # Apply prompt injection to every baseline attack
    attack_enhancements={AttackEnhancement.PROMPT_INJECTION: 1},
)
print("Red Teaming Results: ", results)
If you're curious about how baseline attacks are generated in DeepEval before they are enhanced, read this section.
Sensitive Information Disclosure
Sensitive Information Disclosure involves crafting inputs that trick a model into revealing private, confidential, or sensitive data it has been exposed to during training or through operational settings. This may include passwords, API keys, or personal user data, violating privacy and security standards.
Unlike prompt injection, testing for sensitive information disclosure focuses on specifying certain vulnerabilities such as PromptLeakage, PIILeakage, and IntellectualProperty.
from deepeval.vulnerability import PIILeakage, PromptLeakage, IntellectualProperty
from deepeval.vulnerability.PIILeakage import PIILeakageType
from deepeval.vulnerability.PromptLeakage import PromptLeakageType
from deepeval.vulnerability.IntellectualProperty import IntellectualPropertyType

# Sensitive information vulnerabilities available in DeepEval
sensitive_information_vulnerabilities = [
    PIILeakage(types=[
        PIILeakageType.DATABASE_ACCESS,
        PIILeakageType.DIRECT,
        PIILeakageType.SESSION_LEAK,
        PIILeakageType.SOCIAL_MANIPULATION,
    ]),
    PromptLeakage(types=[
        PromptLeakageType.SECRETS_AND_CREDENTIALS,
        PromptLeakageType.INSTRUCTIONS,
        PromptLeakageType.GUARDS,
        PromptLeakageType.PERMISSIONS_AND_ROLES,
    ]),
    IntellectualProperty(types=[
        IntellectualPropertyType.PATENT_DISCLOSURE,
    ]),
]
To use them for scanning, simply supply the list of vulnerabilities you defined.
# Use vulnerabilities in a red-teaming scan
results = red_teamer.scan(
    target_model=target_model,
    attacks_per_vulnerability_type=10,
    vulnerabilities=sensitive_information_vulnerabilities,
)
Supply Chain
LLM Supply Chains are susceptible to various vulnerabilities, which can affect the integrity of training data, models, and deployment platforms. These risks can result in biased outputs, security breaches, or system failures.
DeepEval does not directly detect vulnerabilities in specific components of an LLM supply chain, such as outdated packages or licensing issues. However, it can still help evaluate the downstream impact of defective components, such as security leaks, misinformation, and excessive functionality, as shown in the sketch below. You can read more about these risks in other sections on this page.
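For example, a hedged sketch of probing for downstream symptoms of a compromised supply chain, reusing the vulnerability classes shown elsewhere on this page (the exact combination is illustrative, not a prescribed recipe):
from deepeval.vulnerability import Misinformation, ExcessiveAgency
from deepeval.vulnerability.misinformation import MisinformationType
from deepeval.vulnerability.excessive_agency import ExcessiveAgencyType

# Probe for downstream symptoms of defective supply-chain components:
# misinformation from poisoned or outdated models, and excessive
# functionality from over-privileged plugins or integrations.
supply_chain_impact_vulnerabilities = [
    Misinformation(types=[MisinformationType.FACTUAL_ERRORS]),
    ExcessiveAgency(types=[ExcessiveAgencyType.FUNCTIONALITY]),
]

results = red_teamer.scan(
    target_model=target_model,
    attacks_per_vulnerability_type=5,
    vulnerabilities=supply_chain_impact_vulnerabilities,
)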
Data and Model Poisoning
Data and Model Poisoning involves manipulating pre-training, fine-tuning, or embedding data to introduce vulnerabilities. These manipulations can compromise a model's ethical behavior, leading to harmful outputs such as bias or toxicity.
To test for training data poisoning, you can use the following vulnerabilities in DeepEval: Bias, Toxicity, IllegalActivity, GraphicContent, and PersonalSafety.
from deepeval.vulnerability import Bias, Toxicity, IllegalActivity, GraphicContent, PersonalSafety
from deepeval.vulnerability.bias import BiasType
from deepeval.vulnerability.toxicity import ToxicityType
from deepeval.vulnerability.illegal_activity import IllegalActivityType
from deepeval.vulnerability.graphic_content import GraphicContentType
from deepeval.vulnerability.personal_safety import PersonalSafetyType

# Data and model poisoning vulnerabilities available in DeepEval
data_poisoning_vulnerabilities = [
    Bias(types=[type for type in BiasType]),
    Toxicity(types=[type for type in ToxicityType]),
    IllegalActivity(types=[type for type in IllegalActivityType]),
    GraphicContent(types=[type for type in GraphicContentType]),
    PersonalSafety(types=[type for type in PersonalSafetyType]),
]
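As with the other categories, pass the list you defined above to a scan (mirroring the pattern used elsewhere on this page):
# Use vulnerabilities in a red-teaming scan
results = red_teamer.scan(
    target_model=target_model,
    attacks_per_vulnerability_type=10,
    vulnerabilities=data_poisoning_vulnerabilities,
)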
Improper Output Handling
Improper Output Handling occurs when model outputs are passed downstream without adequate filtering, sanitization, or validation, leading to harmful, incorrect, or exploitable results. This vulnerability can cause confusion, security breaches, or damage to reputation.
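Red teaming scans probe what the model says; guarding against improper output handling also means treating that output as untrusted before it reaches downstream systems. A minimal, framework-agnostic sketch (the render_comment helper is hypothetical):
import html

def render_comment(llm_output: str) -> str:
    # Hypothetical downstream step: never interpolate raw model output
    # into HTML, shell commands, or SQL. Escape or parameterize first.
    return f"<p>{html.escape(llm_output)}</p>"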
Excessive Agency
Excessive Agency involves granting a model too much autonomy, which can lead to unintended or harmful consequences. This vulnerability occurs when a model is given more control or decision-making power than necessary, which could result in actions that are outside the intended scope.
Unlike traditional vulnerabilities, excessive agency focuses on how much decision-making power is granted to the model, including aspects like Excessive Functionality, Excessive Permissions, and Excessive Autonomy.
from deepeval.vulnerability import ExcessiveAgency
from deepeval.vulnerability.excessive_agency import ExcessiveAgencyType

# Excessive agency vulnerabilities available in DeepEval
excessive_agency_vulnerabilities = [
    ExcessiveAgency(types=[
        ExcessiveAgencyType.FUNCTIONALITY,
        ExcessiveAgencyType.PERMISSIONS,
        ExcessiveAgencyType.AUTONOMY,
    ]),
]
# Use vulnerabilities in a red-teaming scan
results = red_teamer.scan(
    target_model=target_model,
    attacks_per_vulnerability_type=10,
    vulnerabilities=excessive_agency_vulnerabilities,
)
System Prompt Leakage
System Prompt Leakage involves an attacker gaining access to internal system prompts that guide the model’s behavior. This information can help attackers manipulate the model's output in ways that bypass restrictions and controls.
System prompt leakage focuses on the exposure of internal model prompts, such as Secrets and Credentials, Instructions, Guards, and Permissions and Roles, which can be exploited by attackers.
from deepeval.vulnerability import PromptLeakage
from deepeval.vulnerability.PromptLeakage import PromptLeakageType

# System prompt leakage vulnerabilities available in DeepEval
system_prompt_leakage_vulnerabilities = [
    PromptLeakage(types=[
        PromptLeakageType.SECRETS_AND_CREDENTIALS,
        PromptLeakageType.INSTRUCTIONS,
        PromptLeakageType.GUARDS,
        PromptLeakageType.PERMISSIONS_AND_ROLES,
    ]),
]
# Use vulnerabilities in a red-teaming scan
results = red_teamer.scan(
    target_model=target_model,
    attacks_per_vulnerability_type=10,
    vulnerabilities=system_prompt_leakage_vulnerabilities,
)
Vector and Embedding Weaknesses
Vector and Embedding Weaknesses occur when models use flawed, biased, or incomplete embeddings, which can impact the quality of generated outputs. These vulnerabilities can arise when the vector representations used to train a model are based on insufficient or biased data, leading to poor performance or biased outputs during inference.
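Red teaming probes the model's responses rather than the embedding index itself, so defenses here usually live in the retrieval pipeline. A minimal, framework-agnostic sketch of filtering weakly related context before it reaches the prompt (the vectors, chunks, and 0.75 threshold are illustrative assumptions):
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_retrieved_chunks(query_vec, chunk_vecs, chunks, threshold=0.75):
    # Drop weakly related chunks instead of letting low-quality embeddings
    # inject irrelevant or biased context into the prompt.
    return [
        chunk
        for chunk, vec in zip(chunks, chunk_vecs)
        if cosine_similarity(query_vec, vec) >= threshold
    ]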
Misinformation
Misinformation occurs when LLMs produce false or misleading information that appears credible. This vulnerability can lead to security breaches, reputational damage, and legal liability.
Testing for misinformation in LLMs involves identifying types of misinformation, such as Factual Errors, Unsupported Claims, and Expertise Misrepresentation, as well as understanding the competitive context that may lead to misrepresented facts.
from deepeval.vulnerability import Misinformation, Competition
from deepeval.vulnerability.misinformation import MisinformationType
from deepeval.vulnerability.competition import CompetitionType

# Misinformation vulnerabilities available in DeepEval
misinformation_vulnerabilities = [
    Misinformation(types=[
        MisinformationType.FACTUAL_ERRORS,
        MisinformationType.UNSUPPORTED_CLAIMS,
        MisinformationType.EXPERTISE_MISREPRESENTATION,
    ]),
    Competition(types=[
        CompetitionType.DISCREDITATION,
    ]),
]
# Use vulnerabilities in a red-teaming scan
results = red_teamer.scan(
    target_model=target_model,
    attacks_per_vulnerability_type=10,
    vulnerabilities=misinformation_vulnerabilities,
)
Unbounded Consumption
Unbounded Consumption occurs when an LLM application accepts an excessive or unlimited number of requests, queries, or outputs without enforcing limits. This can lead to performance degradation, system crashes, or runaway costs due to uncontrolled resource consumption.
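Mitigation typically happens at the application layer rather than in a red teaming scan, for example by capping requests per client and tokens per response. A minimal sketch of a per-client sliding-window rate limit, with an illustrative limit you would tune per deployment:
import time
from collections import defaultdict, deque

REQUESTS_PER_MINUTE = 20  # illustrative limit, tune per deployment

_request_log: dict[str, deque] = defaultdict(deque)

def allow_request(client_id: str) -> bool:
    # Sliding-window rate limit: at most REQUESTS_PER_MINUTE per client.
    now = time.monotonic()
    window = _request_log[client_id]
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= REQUESTS_PER_MINUTE:
        return False
    window.append(now)
    return True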