HackerOne Partners with IBM to Advance AI Protections for Granite Models
At HackerOne, we believe that the safest AI is the most thoroughly tested AI. That’s why we’re excited to announce a new partnership with IBM to launch a bug bounty program aimed at strengthening the security of IBM’s Granite family of AI models.
With up to $100,000 in bounty payouts available, this program invites select researchers from the HackerOne community to adversarially test IBM’s Granite models, specifically to uncover jailbreaks and flaws in enterprise-like environments where guardrails, such as Granite Guardian, are active.
As AI adoption grows and foundational models move from research environments into enterprise production systems, the stakes are higher than ever. Organizations must ensure their models perform as intended—securely, reliably, and ethically. Our collaboration with IBM is designed to test Granite in the wild, simulating the real-world conditions and threat scenarios enterprises face today.
Researchers will be tasked with identifying flaws in the AI model or guardrails that can result in unintended behaviors, revealing potential harms to the end user or paths cybercriminals could exploit. These insights won’t just harden the Granite models, they’ll also help IBM’s AI policy, safety, and governance teams proactively generate synthetic data to align future versions of the model more effectively.
At HackerOne, we’ve seen firsthand how impactful this kind of community-powered adversarial testing can be.
“HackerOne's community of researchers has proven invaluable in testing the safety and security of real-world AI systems. More than finding flaws, they are advancing the frontier of AI security—probing edge cases, exposing novel failure modes, and surfacing risks before anyone else sees them. This partnership with IBM builds on that momentum, showing how community-driven insights can power safer development, strengthen trust, and accelerate adoption.”
—Dane Sherrets, Staff Innovation Architect at HackerOne
Granite Guardian, an open-source guardrail model released by IBM, will be in place from day one. The objective isn’t to test the model in a sandbox but in the kinds of deployment scenarios IBM expects its enterprise customers to use. This elevates the value of the program: researchers aren’t just testing the model, they’re shaping its future deployment.
“Granite Guardian enforces secure control flow over model inferences, like a software firewall for AI. It's central to our efforts to secure AI behavior at the system level, and through HackerOne we are stress-testing this foundation to ensure safe and robust model deployment.”
—Ambrish Rawat, a senior research scientist and Master Inventor at IBM Research, who specializes in AI safety and security
Built for AI Security
This partnership is a continuation of HackerOne’s broader mission to secure AI across the development life cycle. Our AI Red Teaming solution combines cutting-edge generative model testing with our unmatched community of security researchers. It’s purpose-built to:
- Uncover prompt injection, jailbreaks, and hallucinations
- Stress test guardrails and mitigation controls
- Expose privacy, bias, and safety risks
- Enable model developers and deployers to integrate insights into tuning, retraining, and model alignment
Trusted by Anthropic, Adobe, and Snap Inc., HackerOne gives you a safe environment to discover the unknowns before your adversaries do. The first cohort of researchers is already being assembled to begin testing Granite, set to begin in the coming weeks.