Organizations are racing to deploy LLMs in production. But traditional AppSec tools weren't designed to test the model interaction layer. And as AI agents gain access to tools, delegated authority, and multi-step workflows, the attack surface compounds.
HackerOne security engineers Manjesh S and Parveen Yadav, drawing on their BSides presentation on adversarial LLM techniques, will walk through live demonstrations of the most critical LLM attack vectors, using real exploits with real business impact.
You'll learn:
- How attackers extract system prompts, API keys, and internal logic from production LLMs, and why this is often the first step in a broader attack chain
- Why jailbreaking and prompt injection bypass guardrails, content filters, and safety layers, including through multi-turn attack chains that progressively erode model defenses
- How indirect prompt injection enables data exfiltration, XSS, and unauthorized tool execution, turning your AI into an attack vector against your own users
- What a structured AI red teaming program looks like, from attack surface mapping to least-privilege tool design to defenses that go beyond system-prompt-level controls
- Where human-led offensive testing outperforms automated AI safety tools, and how to combine both for continuous assurance at scale
AI agents are already making decisions on behalf of your organization. See what attackers can do to them now, before you find out the hard way in production.