Vulnerabilities

DeepEval lets you scan for 40+ types of vulnerabilities, covering a broad range of potential risks in your LLM application. These vulnerabilities fall into five key LLM risk categories:

  • Responsible AI Risks
  • Illegal Activities Risks
  • Brand Image Risks
  • Data Privacy Risks
  • Unauthorized Access Risks
Tip: All five LLM risk categories deserve attention, but concentrating on the ones that matter most to your organization, such as data privacy or brand image, is often a better use of resources. Prioritizing the highest-risk areas keeps mitigation targeted while maintaining overall security and compliance.
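The prioritization described above can be sketched in a few lines. This is a minimal, standalone illustration and not DeepEval's own API: `RiskCategory`, `VULNERABILITIES_BY_CATEGORY`, and `select_vulnerabilities` are hypothetical names introduced here, and the vulnerability names are copied as plain strings from the tables in the sections that follow.

```python
from enum import Enum


class RiskCategory(Enum):
    """Hypothetical local stand-in for the five risk categories listed above."""
    RESPONSIBLE_AI = "Responsible AI Risks"
    ILLEGAL_ACTIVITIES = "Illegal Activities Risks"
    BRAND_IMAGE = "Brand Image Risks"
    DATA_PRIVACY = "Data Privacy Risks"
    UNAUTHORIZED_ACCESS = "Unauthorized Access Risks"


# Vulnerability names as plain strings, grouped as in this page's tables.
# This mapping is an assumption for illustration, not an object DeepEval exports.
VULNERABILITIES_BY_CATEGORY = {
    RiskCategory.RESPONSIBLE_AI: [
        "BIAS", "POLITICS", "RELIGION", "HATE",
        "RADICALIZATION", "OFFENSIVE", "PROFANITY", "INSULTS",
    ],
    RiskCategory.ILLEGAL_ACTIVITIES: [
        "VIOLENT_CRIME", "NON_VIOLENT_CRIME", "SEX_CRIME", "CYBERCRIME",
        "CHILD_EXPLOITATION", "ILLEGAL_DRUGS", "ILLEGAL_ACTIVITIES",
        "UNSAFE_PRACTICES", "SELF_HARM", "HARASSMENT_BULLYING",
        "SEXUAL_CONTENT", "GRAPHIC_CONTENT", "CHEMICAL_BIOLOGICAL_WEAPONS",
        "INDISCRIMINATE_WEAPONS", "COPYRIGHT_VIOLATIONS",
        "INTELLECTUAL_PROPERTY",
    ],
    RiskCategory.BRAND_IMAGE: [
        "COMPETITORS", "IMITATION", "MISINFORMATION_DISINFORMATION",
        "HALLUCINATION", "OVERRELIANCE", "SPECIALIZED_FINANCIAL_ADVICE",
        "CONTRACTS", "EXCESSIVE_AGENCY", "HIJACKING",
    ],
    RiskCategory.DATA_PRIVACY: [
        "PII_API_DB", "PII_DIRECT", "PII_SESSION",
        "PII_SOCIAL", "DATA_LEAKAGE", "PRIVACY",
    ],
    RiskCategory.UNAUTHORIZED_ACCESS: [
        "DEBUG_ACCESS", "RBAC", "SHELL_INJECTION", "SQL_INJECTION",
        "SSRF", "BFLA", "BOLA", "PROMPT_EXTRACTION",
    ],
}


def select_vulnerabilities(priorities):
    """Return the vulnerability names for the given categories, in priority order."""
    selected = []
    for category in priorities:
        selected.extend(VULNERABILITIES_BY_CATEGORY[category])
    return selected


def total_vulnerabilities():
    """Count every vulnerability name across all five categories."""
    return sum(len(names) for names in VULNERABILITIES_BY_CATEGORY.values())


if __name__ == "__main__":
    # Example: an organization that cares most about data privacy, then brand image.
    focus = select_vulnerabilities(
        [RiskCategory.DATA_PRIVACY, RiskCategory.BRAND_IMAGE]
    )
    print(focus)
    print(total_vulnerabilities())
```

A scan restricted to the selected names would cover the highest-priority categories first; the remaining categories can be added in later runs. How the names are passed to an actual scan depends on the DeepEval version in use and is not shown here.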

Responsible AI Risks

Responsible AI vulnerabilities focus on ensuring that the model behaves ethically and responsibly, without generating biased or offensive content.

| Vulnerability | Description |
|---|---|
| Vulnerability.BIAS | Encourages the LLM to produce biased or discriminatory responses, violating fairness and equity principles. |
| Vulnerability.POLITICS | Prompts the LLM to express political opinions or engage in biased political discussions, potentially leading to misleading or contentious content. |
| Vulnerability.RELIGION | Encourages the LLM to generate content involving religious beliefs, which could be misinterpreted or offensive to certain groups. |
| Vulnerability.HATE | Drives the LLM to generate content promoting hate or hostility towards individuals or groups based on characteristics such as race, religion, or nationality. |
| Vulnerability.RADICALIZATION | Prompts the LLM to create content that advocates for radical or extreme ideologies, potentially encouraging dangerous behavior. |
| Vulnerability.OFFENSIVE | Induces the model to use offensive or derogatory language, violating standards of respectful communication. |
| Vulnerability.PROFANITY | Encourages the LLM to use explicit or vulgar language. |
| Vulnerability.INSULTS | Induces the LLM to generate personal attacks or derogatory statements directed towards individuals or groups. |

Illegal Activities Risks

Illegal Activities Risks involve vulnerabilities that encourage the model to generate content that breaks the law or promotes criminal behavior.

| Vulnerability | Description |
|---|---|
| Vulnerability.VIOLENT_CRIME | Prompts the LLM to generate content that provides instructions or incites violent criminal acts. |
| Vulnerability.NON_VIOLENT_CRIME | Encourages the model to produce content about non-violent illegal activities, such as fraud or theft. |
| Vulnerability.SEX_CRIME | Induces the LLM to generate content involving sexual crimes or illegal conduct of a sexual nature. |
| Vulnerability.CYBERCRIME | Encourages the LLM to provide information on unauthorized hacking, malware use, or other cybercrimes. |
| Vulnerability.CHILD_EXPLOITATION | Prompts the model to produce content that could facilitate or promote child exploitation, which is illegal and harmful. |
| Vulnerability.ILLEGAL_DRUGS | Drives the LLM to provide information on manufacturing, distributing, or using illegal drugs. |
| Vulnerability.ILLEGAL_ACTIVITIES | Induces the model to generate content promoting illegal activities such as scams or fraud. |
| Vulnerability.UNSAFE_PRACTICES | Encourages the LLM to provide instructions on unsafe behavior that could lead to harm. |
| Vulnerability.SELF_HARM | Prompts the LLM to provide guidance on how to harm oneself. |
| Vulnerability.HARASSMENT_BULLYING | Induces the LLM to generate content that supports harassment or bullying behavior. |
| Vulnerability.SEXUAL_CONTENT | Encourages the generation of explicit or inappropriate sexual material. |
| Vulnerability.GRAPHIC_CONTENT | Prompts the model to produce disturbing or explicit content not suitable for general audiences. |
| Vulnerability.CHEMICAL_BIOLOGICAL_WEAPONS | Drives the LLM to generate content related to chemical or biological weaponry. |
| Vulnerability.INDISCRIMINATE_WEAPONS | Encourages the LLM to provide information on weapons that could be used indiscriminately, without targeting specific victims. |
| Vulnerability.COPYRIGHT_VIOLATIONS | Leads the LLM to generate or disclose copyrighted material without authorization. |
| Vulnerability.INTELLECTUAL_PROPERTY | Prompts the model to generate content that violates intellectual property rights. |

Brand Image Risks

Brand Image Risks involve vulnerabilities that can harm the perception of an organization or brand by spreading incorrect or misleading information, or competition-related content. These risks can undermine trust, damage reputation, and lead to long-term consequences for brand credibility.

| Vulnerability | Description |
|---|---|
| Vulnerability.COMPETITORS | Prompts the LLM to refer to competitors in an inappropriate way, potentially leading to legal issues or reputational damage. |
| Vulnerability.IMITATION | Induces the LLM to replicate or imitate others' work, impacting the originality and authenticity of generated content. |
| Vulnerability.MISINFORMATION_DISINFORMATION | Encourages the LLM to generate false information, which could lead to user mistrust or legal consequences. |
| Vulnerability.HALLUCINATION | Prompts the model to produce confident but factually incorrect content, which could mislead users. |
| Vulnerability.OVERRELIANCE | Encourages excessive trust in AI-generated content, potentially leading to over-dependence and poor decision-making. |
| Vulnerability.SPECIALIZED_FINANCIAL_ADVICE | Induces the LLM to generate financial advice without necessary expertise, potentially causing harm to users. |
| Vulnerability.CONTRACTS | Leads the LLM to misrepresent contract details, which could cause legal misunderstandings or conflicts. |
| Vulnerability.EXCESSIVE_AGENCY | Prompts the LLM to provide recommendations or take actions that go beyond its intended scope, which could be risky for users. |
| Vulnerability.HIJACKING | Attempts to lead the LLM conversation in a harmful direction, potentially leading to brand damage. |

Data Privacy Risks

Data Privacy Risks involve vulnerabilities that expose confidential information or personal data, leading to potential privacy violations.

| Vulnerability | Description |
|---|---|
| Vulnerability.PII_API_DB | Encourages the LLM to reveal sensitive API or database-related information, risking data breaches. |
| Vulnerability.PII_DIRECT | Prompts the LLM to directly expose personally identifiable information (PII). |
| Vulnerability.PII_SESSION | Encourages the model to disclose personal information during session-based interactions. |
| Vulnerability.PII_SOCIAL | Uses social engineering tactics to induce the model to reveal sensitive information. |
| Vulnerability.DATA_LEAKAGE | Drives the LLM to unintentionally provide confidential or sensitive data. |
| Vulnerability.PRIVACY | Prompts the model to disclose information that could violate privacy norms and regulations. |

Unauthorized Access Risks

Unauthorized Access Risks involve vulnerabilities that could allow attackers to exploit the LLM to gain unauthorized system access or execute unintended commands.

| Vulnerability | Description |
|---|---|
| Vulnerability.DEBUG_ACCESS | Induces the LLM to reveal debugging information, which could be used to identify system weaknesses. |
| Vulnerability.RBAC | Attempts to exploit role-based access controls through generated responses. |
| Vulnerability.SHELL_INJECTION | Encourages the LLM to generate shell commands that could be used for malicious purposes. |
| Vulnerability.SQL_INJECTION | Drives the model to create SQL commands that could compromise databases. |
| Vulnerability.SSRF | Prompts the LLM to generate content that enables server-side request forgery attacks. |
| Vulnerability.BFLA | Attempts to exploit broken function-level authorization weaknesses through LLM output. |
| Vulnerability.BOLA | Encourages the model to generate content that could exploit broken object-level authorization. |
| Vulnerability.PROMPT_EXTRACTION | Attempts to extract sensitive system-level prompts or metadata from the LLM, potentially compromising security. |