Responsible AI at HackerOne: 2026 Update

In 2023, we published our first take on how HackerOne approaches AI responsibly. A lot has changed since then: the models are more capable, the attack surface has grown, and the industry's understanding of what "responsible AI" actually requires has matured considerably. What hasn't changed is our core conviction: the best security outcomes come from human judgment amplified by AI, not replaced by it.

This post is an update. We want to be direct about what we're building, how it works under the hood, and our commitment to protecting customer and researcher data.

The Guiding Principle: Human Judgment & Accountability

We design our AI systems to be agentic, meaning they can reason, take multi-step actions, and operate autonomously across complex security workflows. But agentic doesn't mean unsupervised.

Every agent action and its reasoning are traceable. We log agent decisions and maintain full audit trails so that any step in an automated workflow can be reviewed. For decisions that carry real consequences, such as paying out a bounty, the default configuration is a human-in-the-loop must verify before anything happens.

Our No-Training Commitment

We do not train, fine-tune, or otherwise improve GenAI or large language models on confidential customer or researcher data. We hold our approved AI inference partners (including AWS Bedrock and Anthropic) to the same rigorous standard: zero data retention and no use of inputs or outputs for model training.

Our Guardrails

Feature	Our Policy
Model Training	We do not use confidential customer or researcher data to train, fine-tune, or otherwise improve our models or those of our vendors.
Data Usage and Retention	We use stateless inference; once a task is complete, the data is not stored by our AI partners.
Sensitive Actions	Customers retain full control over all critical workflows; by default, agents cannot execute sensitive actions, including bounty payouts or scope modifications, without human oversight.
Third-Party Processing	We only use enterprise-grade partners for AI processing: Amazon Web Services (AWS) and Anthropic. These partners undergo rigorous review to ensure they meet our high bar for security and privacy standards.

What We’re Building: AI as a Security Force Multiplier

The examples below give a sense of the use cases where AI is being applied across the platform. This is a snapshot of a growing set of capabilities, not an exhaustive list.

For Customers and Security Teams

To eliminate manual triage fatigue: Agentic validation automatically evaluates researcher submissions for noise, scope alignment, and duplicates, ensuring the security team only spends time on high-quality, actionable findings.
To achieve continuous security testing coverage at scale: Agentic penetration testing autonomously runs the full reconnaissance, exploitation and reporting lifecycle with human experts validating findings before they reach engineering teams, providing a testing cadence that traditional methods can’t match.
To secure AI deployments: Agentic prompt injection testing executes structured, adversarial attacks against your system stack to prove your guardrails hold under real-world pressure.
To prioritize remediation with higher confidence: Exploit verification agents attempt to generate functional proofs-of-concept directly from eligible reports to confirm exploitability instantly.
To turn raw program data into strategic intelligence: Chat-based program insights allow you to query your findings repository in natural language for instant trend analysis and program statistics.
To fix vulnerabilities at the root cause: Remediation intelligence analyzes recurring trends across your program to suggest systemic fixes that eliminate entire classes of issues.

For Researchers

To maximize your earning potential and faster turnaround: Report submission assistance helps you structure findings and fill technical gaps, leading to faster triage and quicker payouts.
To reduce the wait time for feedback: Agentic validation provides near-instant initial feedback, allowing you to iterate faster and move on to your next finding.

How We Build: Privacy-First Architecture

Secure, Stateless Processing

Our AI system is built on agentic orchestration across multiple models. Inference is stateless and remains entirely within the infrastructural boundaries of our approved, private AI processing partners (e.g., AWS Bedrock and Anthropic). There is no side door through AI.

Permission-Based Access

Permissions are enforced at every layer:

Agent-based workflows (such as agentic validation) operate within the permissions granted to the agent or the invoking user (On Behalf Of) — the program and findings scope for which the agent has been authorized — following the same approach as user-level access controls.
Chat-based interactions are anchored to the querying user's existing permission set. If you can't see a finding in the UI, you can't surface it through AI chat either.

Scoped Data Usage: The Principle of Progressive Context

To be effective, agents require specific context, but we ensure they only see what they need to see. We use a just-in-time processing model where agents are granted transient access to the minimum data necessary to complete a workflow. Once the task is finished, the session is cleared.

Our AI system processes data to deliver specific security outcomes, including:

Submission Context: Researcher submissions and vulnerability data and metadata
Environmental Mapping: Attack surface, asset, and reconnaissance data
Other Security and Business Context: Context from integrations when enabled (e.g., source code, issue trackers, change management tickets)

How We Validate Our Own AI Security

We don't just claim our AI systems are secure; we test them, openly.

Our agentic AI system and its capabilities are in scope for HackerOne's own bug bounty program. If there's a vulnerability in how we've built this, we want researchers to find it and report it through the same process we use for everything else.

For new capability rollouts, we run targeted evaluations including Spot Checks, AI Red Teaming, and Agentic Prompt Injection Testing.

Where This Is Going

AI capabilities will continue to expand: the models will get more capable, the workflows more autonomous, and the coverage broader. What won't change is the foundation: human accountability for consequential decisions, extensive auditability of agent reasoning and actions, and a commitment to not train, fine-tune, or otherwise improve AI models with confidential customer or researcher data.

Security is ultimately about trust. We believe the right way to earn it is to be transparent about how these systems work, hold them to the same standards we apply everywhere else on the platform, and keep humans in control and accountable for what matters most.

For a deeper look at the architecture behind these principles, including how RAG, function calling, and autonomous agents each interact, see our technical Hai Security & Trust documentation.