Rethinking Risk in the Age of AI

Kara Sprague
CEO
Image
Layers of Defense for AI Security Graphic

CISOs are navigating a security landscape that is rapidly transforming thanks to generative AI. Attack surface areas are quickly expanding as enterprises are deploying AI-powered applications, attackers are leveraging sophisticated AI-powered capabilities, and vendors are racing to deliver AI-powered solutions for defenders.

AI adoption is accelerating at an unprecedented pace. According to McKinsey, 92% of executives expect to increase AI spending in the next three years, with more than half planning to boost budgets by 10% or more. Reported use of AI increased significantly in 2024: 78% of respondents now say their organizations use AI in at least one business function, up from 72% earlier in the year and 55% a year before. This surge in deployment underscores the urgency for mature, tested security strategies that can keep pace with innovation.

AI is rewriting the rules of engagement for adversaries and defenders alike. At HackerOne, we're seeing firsthand how attack surfaces are expanding with the broad adoption and deployment of generative AI technologies. We’re also seeing the breadth of new attack vectors targeting these AI systems–prompt injections, jailbreaking, dependency attacks, data poisoning, denial of service, misuse and abuse, and many more. This isn’t hypothetical risk; it’s operational reality. And it’s why forward-leaning organizations such as Anthropic and Snap are redefining what responsible security for AI looks like.

Lessons from the Frontlines: Anthropic and Snap

When Anthropic wanted to put its AI safety defenses to the test, they didn’t turn inward. They turned to the global community of ethical hackers on HackerOne. Their jailbreak challenge surfaced high-impact issues in reinforcement learning environments and informed stronger policy guardrails. The wide range of techniques employed by researchers during the challenge played a key role in enhancing Claude’s overall safety and resilience. 

Anthropic observed several particularly effective strategies, including encoded prompts to evade classifiers, role-play scenarios to manipulate responses, keyword substitution, and advanced prompt injection attacks. These discoveries helped identify fringe cases and key areas for Anthropic to reexamine, while also validating where guardrails remained strong. It was an early public test of its kind, pioneering in both transparency and community engagement, and it proved the value of outside-in testing to validate safety claims.

Similarly, Snap has spent more than a decade partnering with HackerOne to proactively test and harden its infrastructure. Most recently, that included a dual-track AI red teaming strategy:

  • AI Safety Red Teaming focused on preventing the generation of harmful content, such as offensive language or instructions for dangerous activities.
  • AI Security Red Teaming ensured that bad actors couldn’t exploit AI systems to compromise confidentiality, integrity, or availability.

This comprehensive approach tested generative AI systems for novel abuse cases from both behavioral and systemic perspectives.

What CISOs Need to Know

  1. AI systems require a new security playbook. They introduce entirely new security challenges that require purpose-built defenses. The same models that generate useful content—text, code, images—can also be exploited to produce misleading, harmful, or non-compliant outputs.
     
  2. Human ingenuity still matters. While AI can support detection and analysis, it cannot replace the creativity, intuition, or contextual understanding of skilled security professionals. Hybrid testing approaches—combining automation with human insight—are essential.
     
  3. The cybersecurity skills gap requires scalable solutions. With the cybersecurity talent shortage showing no signs of easing, relying solely on internal expertise is no longer viable. Crowdsourced security platforms like HackerOne provide on-demand access to a global community of ethical hackers, delivering specialized expertise, diverse attack techniques, and real-world insight into emerging AI threats.
     
  4. Outside perspectives improve security outcomes. Leading AI organizations like Anthropic and Snap are turning to independent security researchers to harden their systems. These collaborations accelerate threat discovery and drive faster security improvements.
     
  5. Return on Mitigation (RoM) is a better way to quantify offensive security value. Traditional ROI calculations don’t fully capture the value of proactive security. Return on Mitigation (RoM) is a new approach that quantifies the potential loss avoided through proactive vulnerability discovery and resolution.

The Path Forward

Building trust in AI requires us to treat security as a design principle, not a patch. That means involving security researchers early, integrating AI system testing into model development lifecycles, and recognizing that innovation without guardrails is a liability.

As CISOs, you are stewards of both business risk and strategy. The question is no longer whether AI will impact your threat model—it already has. The question is: How will you respond?

At HackerOne, we’re committed to helping you answer with confidence, clarity, and the backing of the world’s most skilled security researchers. Because the strongest AI defenses aren’t built in isolation. They’re built together.

To dive deeper into how leading organizations are building trust and resilience into their AI systems, download our new ebook: Securing the Future of AI.

About the Author

Kara Sprague
Kara Sprague
CEO

Kara Sprague is the CEO of HackerOne. She holds over 20 years of experience assisting public and private technology companies in growing their businesses at a global scale.