The Challenge:
As Snap built and deployed generative AI features, including Lens and MyAI Text2Image, the company needed to know what could go wrong before users and bad actors found out. Traditional security testing had no playbook for this. Automated scanners couldn't think like adversaries. Internal teams couldn't simulate the creative range of real-world attack attempts.
Snap needed a way to probe AI systems for both safety failures (generating harmful content) and security failures (compromising confidentiality, integrity, or availability), at a scale and diversity that internal resources alone couldn't provide.
The Solution:
Snap partnered with HackerOne to build one of the first enterprise AI red teaming programs of its kind. Using CTF-style exercises, Snap engaged 21 researchers from around the world, selected specifically for the diversity of perspective they'd bring to identifying harmful content and novel exploits.
Hai, HackerOne’s coordinated system of AI agents, translated researcher submissions across seven European languages in real time, enabling global collaboration without communication friction. Bounties were dynamically adjusted across more than 100 flags to optimize researcher engagement and push beyond expected findings.
The Outcome:
Snap surfaced previously unknown vulnerabilities in its generative AI systems that adversarial datasets and automated tools had missed. The safety benchmarks developed through this process have since become a reference point for harmful content testing across the tech industry. After a decade of partnership and $1M in bounties paid, Snap continues to push the program into new territory, including hardware products and LLM agent simulations.
"AI red teaming allows us to explore the possibilities of what attackers might achieve, not just what's likely. Working with HackerOne has shown us that human ingenuity often outperforms adversarial datasets or AI-generated attacks."
—Ilana Arbisser, Technical Lead, AI Safety at Snap Inc.