AI & LLM Penetration Testing
Attack simulations for secure and AI-Act-compliant AI systems
With the growing adoption of AI technologies, particularly Large Language Models (LLMs), securing these systems has become a critical priority. LLMs power a wide range of applications – from chatbots and RAG systems to autonomous AI agents. They introduce new attack vectors that are not covered by traditional penetration testing.
Our AI penetration tests specifically examine your LLM-integrated applications for vulnerabilities arising from the use of language models. We follow the OWASP Top 10 for LLM Applications and the Alliance for Cyber Security (BSI) guidelines for LLM penetration testing.
Why AI Pentesting?
Traditional security testing focused on deterministic systems such as servers or networks is insufficient for LLM-based applications. LLMs present unique challenges:
- Non-deterministic behavior: Due to their probabilistic nature, small variations in inputs can produce unpredictable and potentially harmful outputs.
- Novel attack vectors: Prompt injection, jailbreaks, data exfiltration through model outputs, and RAG system manipulation are threats that do not exist in traditional IT.
- Autonomy and agent systems: AI agents with tool access and decision-making autonomy significantly expand the attack surface.
- Regulatory requirements: The EU AI Act, NIS-2, and DORA mandate demonstrable security testing for AI systems.
Security Threats in AI & LLM
The threat landscape for LLM-integrated applications spans multiple categories:
Input-Based Attacks
- Prompt Injection: Malicious instructions in user inputs manipulate model behavior and bypass security policies.
- Indirect Prompt Injection: Third-party content (e.g., websites, documents) contains hidden instructions that the LLM incorrectly interprets as commands.
- Jailbreaks: Targeted prompts cause the model to disregard its trained behavioral rules.
Data and Context Attacks
- Information Leakage: Disclosure of confidential information, trade secrets, or personal data through model outputs.
- System Prompt Leakage: Exposure of internal system prompts and control logic, giving attackers valuable insights into the application architecture.
- RAG Manipulation: Manipulation of Retrieval-Augmented Generation systems through poisoned data sources or manipulated embeddings.
- Data Extraction: Analysis of model outputs to reconstruct training data information – such as personal data or business secrets.
Supply Chain and Model
- Data & Model Poisoning: Manipulation of training or fine-tuning data to influence model behavior in the attacker’s favor.
- Compromised Models: Use of models or adapters without integrity verification from untrusted sources.
Output and Downstream Systems
- Improper Output Handling: LLM-generated content is passed to downstream systems without validation, enabling XSS, SQL injection, or command injection.
- Hallucinations and Misinformation: Plausible but factually incorrect outputs can lead to flawed decisions.
Agent Systems and Operations
- Excessive Agency: Overly broad autonomy or permissions for AI agents enable unintended or malicious actions.
- Tool Misuse: Abuse of tools and APIs available to an agent.
- Unbounded Consumption: Uncontrolled inference calls leading to denial-of-service or excessive costs (“Denial of Wallet”).
OWASP Top 10 for LLM Applications
Our testing is aligned with the OWASP Top 10 for LLM Applications, the industry-recognized standard for LLM system security:
| No. | Risk | Description |
|---|---|---|
| LLM01 | Prompt Injection | Manipulation of model behavior through malicious inputs |
| LLM02 | Sensitive Information Disclosure | Exposure of confidential data through model outputs |
| LLM03 | Supply Chain | Compromised models, adapters, or dependencies |
| LLM04 | Data and Model Poisoning | Manipulation of training or fine-tuning data |
| LLM05 | Improper Output Handling | Missing validation of LLM outputs in downstream systems |
| LLM06 | Excessive Agency | Overly broad permissions or autonomy |
| LLM07 | System Prompt Leakage | Exposure of internal system prompts and configurations |
| LLM08 | Vector and Embedding Weakness | Manipulation of retrieval systems and embeddings |
| LLM09 | Misinformation | Generation of plausible but incorrect information |
| LLM10 | Unbounded Consumption | Uncontrolled resource consumption and DoS scenarios |
Our Testing Approach
Our AI penetration test follows a structured, four-phase process based on the Alliance for Cyber Security guidelines:
1. Business Understanding and Assessment
We develop a thorough understanding of your AI application, its deployment context, and the chosen deployment model. We analyze the architecture at infrastructure, API, and application levels to identify relevant attack vectors and define the scope.
2. Threat Modeling and Test Planning
Based on the assessment, we create a threat model using methods such as STRIDE or ATT&CK. From this, we derive a prioritized test strategy tailored to your specific risks and compliance requirements.
3. Testing and Documentation
We conduct targeted attacks on your AI system – from prompt injection to data exfiltration and adversarial manipulations. All tests are thoroughly documented to ensure reproducibility and traceability.
4. Evaluation, Risk Assessment, and Reporting
Identified vulnerabilities are evaluated by severity and business impact. You receive a detailed report including:
- Executive summary for decision-makers
- Technical report with reproduction steps and CVSS ratings
- Actionable recommendations for hardening your system
- Long-term strategy for continuous security testing
What We Test
Our AI penetration test covers all relevant aspects of LLM-integrated applications:
- LLM chatbots and assistants – Simple and complex chat interfaces
- RAG systems – Retrieval-Augmented Generation with external knowledge sources
- AI agent systems – Autonomous agents with tool access and decision logic
- API integrations – LLM APIs and their protection against misuse
- Workflows and pipelines – Chained LLM calls and automated processing workflows
- Microsoft Copilot and enterprise AI – M365 Copilot, Copilot Studio, and comparable products
Regulatory Context
AI penetration testing is not only technically necessary but increasingly required by regulation:
EU AI Act
The EU AI Regulation requires high-risk AI systems to implement a risk management system that includes penetration testing as evidence of accuracy, robustness, and cybersecurity (Art. 9 and Art. 15 AI Act).
NIS-2 Directive
Affected entities must implement technical, operational, and organizational measures for risk mitigation. Penetration testing is an established means of assessing the effectiveness of these measures.
DORA
The Digital Operational Resilience Act requires financial sector organizations to conduct continuous resilience testing of their ICT systems – including AI-based applications.
GDPR
Art. 32 GDPR requires regular assessment of the effectiveness of technical and organizational security measures. Penetration testing is a proven method, especially when AI systems process personal data.
Your Advantage with softScheck
- Specialized expertise: Our penetration testers combine traditional IT security experience with deep knowledge of AI architectures and LLM-specific attack techniques.
- Standards-based: We work according to the OWASP Top 10 for LLM Applications, the BSI guidelines for LLM penetration testing, and other recognized standards.
- Compliance-ready: Our reports are structured to serve as evidence for regulatory requirements (EU AI Act, NIS-2, DORA, GDPR).
- Vendor-independent: We test independently of model, platform, and provider – whether OpenAI, Anthropic, open-source models, or custom fine-tunes.
Don’t expose your AI systems to unnecessary risk. Contact us for individual guidance on your AI penetration test!