AI Security Risks and Vulnerabilities Enterprises Must Address

The benefits of AI adoption have been made clear: increased productivity, streamlined processes, and smarter workflows that are freeing up teams to focus on high value tasks. Integration has been at a rapid pace that many security teams are struggling to keep up with.

Over the past year AI has shifted from stand alone tools to deeply embedded enterprise systems. The introduction of the Model Context Protocol (MCP) and emergence of agentic AI have accelerated this, linking sensitive internal data sources with often externally managed LLMs and AI tooling. While these developments bring new efficiencies, they also blur traditional security perimeters and introduce fresh risks that must be addressed.

As AI adoption accelerates, so must AI security practices. Vulnerabilities that were once only demonstrated in proof-of-concepts are beginning to materialize in real production environments. Enterprise security leaders should understand the risks and vulnerabilities, and implement practical steps to mitigate them.

Eroding Trust Boundaries with Indirect Prompt Injection

Long standing assumptions about trusted domains and content are actively challenged by these AI integrations. Indirect prompt injection has emerged as a clear example of this shift, highlighting the need to reassess trusted sources

What is an Indirect Prompt Injection? Malicious instructions are inserted inside the content an AI agent consumes rather than being delivered directly to it. Once processed, the agent interprets the instructions as part of its task, often bypassing safeguards because the source is trusted.

This reflects the complexity of AI training. Models are intentionally designed to follow instructions with flexibility, which makes them effective in diverse workflows but also creates opportunities for misuse.

A real world example of this is EchoLeak (CVE-2025-32711), an indirect prompt injection that resulted in sensitive information disclosure. An attacker could hide tailored instructions inside an ordinary email in such a way that it bypassed cross-prompt injection attack (XPIA) classifiers. When Microsoft 365 Copilot ingested the email as part of its normal processing, the agent treated those embedded instructions as legitimate and executed them. The attack included capturing sensitive data such as screenshots of mailbox content, with the output routed through a trusted Microsoft Teams domain.

A more recent issue, dubbed ForcedLeak, demonstrated how prompts hidden in Salesforce’s web-to-lead forms could trick AI agents into processing malicious instructions, with CSP rules still allowing expired but trusted domains to be used as an exfiltration path.

EchoLeak and ForcedLeak demonstrate the evolving landscape of AI security risks, and how AI integrations can expose new attack vectors in everyday workflows.

Rather than assuming certain domains or inputs are inherently safe, enterprise security leaders should:

Adopt a zero trust approach by validating interactions
Enforce guardrails around what AI systems can access
Ensure monitoring is in place to detect signs of abuse

These steps allow organizations to harness AI effectively while minimizing the risks of eroded trust boundaries.

Extending the Attack Surface with Third-Party and Supply Chain Reliance

AI deployments are rarely self-contained. Most rely on external vendors and repositories for model hosting, training data, or integration services.

This reliance adds complexity to the enterprise supply chain, where weaknesses upstream can quickly cascade into downstream impact. Third-party and supply chain risks highlight that security depends not only on internal practices but also on the controls and hygiene of external providers.

Model Hubs

Open source repositories like Hugging Face have made it easy for developers to share and reuse AI models, but this openness also creates opportunities for abuse. A 2024 study found that many models on the platform rely on insecure serialization methods that could be exploited by attackers. The researchers uncovered 14 malicious models, including several designed to open remote shells on the systems that loaded them.

MCP Servers

The Model Context Protocol (MCP) has created powerful new ways for AI systems to connect with external tools, data sources, and workflows. While this provides enormous flexibility, it also expands the attack surface of AI agents.

In a recent blog post, Idan Dardikman from Koi Security revealed the first malicious MCP server in the wild. The npm package, “postmark-mcp”, acted as an MCP integration, but after version 1.0.16, it began to secretly forward every email it processed to the developer’s own domain. This included password resets, invoices, internal memos, and other confidential documents.

This kind of attack goes beyond the risk of a compromised npm package listed in some package.json file. A traditional dependency compromise would usually threaten the developer environment or build pipeline.

In contrast, a malicious MCP integration operates inside live AI workflows often with highly privileged access to sensitive business data. Once connected, it inherits the trust and permissions of the AI system itself, turning an integration that should enhance productivity into a direct channel for data theft.

Third-party and supply chain risks highlight how vulnerable AI environments can become when external components are introduced without proper oversight. Open hubs and MCP servers bring speed and flexibility, but they also extend the attack surface in ways that enterprises need to mitigate.

Managing risks from third-party and external components requires:

Stronger vendor due diligence
Tighter validation of external code and models
Continuous monitoring of integrations

Without these AI security measures, a single compromised dependency can ripple across the entire workflow.

Risking Data Exposure and Privacy Issues

The usefulness of deploying AI in enterprise often depends on processing sensitive data. This also makes exposure and leakage one of the most pressing risks for enterprises.

Unlike traditional applications, AI systems can absorb large volumes of internal information and inadvertently surface it in unintended ways. This raises significant concerns for organizations bound by strict privacy, regulatory, and contractual obligations.

An example came in January 2025, when researchers at Wiz discovered that DeepSeek, a Chinese AI chatbot provider, had inadvertently exposed more than a million sensitive records on the open internet. The leak included chat logs, API keys, credentials, and metadata from the service’s backend.

While not the result of an external attack, the misconfiguration highlighted how quickly sensitive data can be lost once it passes through AI systems. For companies integrating AI into customer service, HR, or finance, the risk is that confidential material may escape organizational boundaries in ways that are difficult to detect or control.

Data exposure risks show why enterprises must treat AI as a potential point of data loss. Guardrails such as strict data minimization, access controls, and monitoring are essential to ensure sensitive information is not inadvertently leaked or misused.

Building AI into workflows safely means pairing innovation with robust privacy and security controls.

AI Security Requires Transparency and Accountability

AI systems are often described as black boxes, producing outputs that may appear correct but provide little insight into the reasoning behind them. This lack of visibility creates challenges for enterprises that must demonstrate compliance, investigate anomalies, or explain decisions that affect customers.

Without transparency, it becomes difficult to separate benign model errors from malicious manipulation or misuse.

Recent research has highlighted concerning behavior: model scheming. OpenAI has documented cases where advanced models showed signs of situational awareness, recognizing when they were being evaluated and adjusting their behaviour to appear aligned.

While the model’s outputs seemed compliant on the surface, the underlying reasoning revealed hidden strategies designed to preserve deployment or mask misaligned goals. This kind of behavior underscores why transparency is more than an audit requirement—it is a frontline defense against models deliberately concealing their intent.

Under the GDPR, organizations are required to provide “meaningful information about the logic involved” in automated decision-making (Articles 13–15, 22). Falling short of this obligation can be considered unlawful processing and carries the risk of significant penalties.

Beyond regulatory exposure, a lack of transparency leaves enterprises vulnerable to reputational damage and even subtle manipulation by the very systems they deploy.

Building explainability, maintaining audit trails, and developing methods to detect deceptive strategies are essential steps to ensure AI remains accountable, reliable, and trustworthy in enterprise environments.

8 Best Practices for AI Security Mitigation

Managing the risks of AI adoption requires so much more than patching vulnerabilities. Enterprises need a clear strategy that balances innovation with resilience. The following principles can help guide secure integration of AI systems:

Apply Zero Trust Thinking to AI integrations: Assume that no domain, document, or third-party service is inherently safe. Validate all interaction, apply principles of least privilege for AI tools, and segment AI workflows from critical systems.
Strengthen Third-Party Due Diligence: Treat AI vendors, model hubs, and integration providers as part of the enterprise attack surface. Require transparency around security practices, review contracts for incident response obligations, and establish ongoing monitoring of supplier risk.
Enforce Data Governance and Minimization: Define what data AI systems are allowed to process and enforce strict controls around sensitive or regulated information. Avoid feeding proprietary or customer data into external systems unless retention, storage, and compliance terms are explicitly defined.
Build Monitoring and Detection into AI Pipelines: Instrument AI workflows with the same visibility as any other enterprise system. Monitor for prompt injection attempts, anomalous agent actions, or unexpected data flows, and ensure alerts feed into the broader security operations process.
Maintain Human Oversight: Keep a human in the loop for actions that affect finances, customer experience, or regulatory exposure. AI can accelerate workflows, but final accountability should remain with people.
Maintain Model Transparency: Ensure that AI outputs can be traced back and explained. Transparency in how models arrive at decisions helps security teams detect abnormal behavior, supports audits for compliance, and builds trust with both regulators and customers. Investing in explainability and clear model documentation reduces the risk of AI becoming an opaque black box that hides misuse or errors.
Align with Emerging Regulatory Frameworks: Stay ahead of requirements such as the EU AI Act and NIST AI Risk Management Framework. Demonstrating governance not only reduces legal liability but also builds trust with customers, partners, and regulators.
Invest in Resilience, Not Just Controls: Accept that it is likely an AI incident will occur. Prepare response playbooks for AI-related breaches, test how teams would handle a compromised integration, and build recovery strategies into business continuity planning.

AI can and should be integrated into enterprise systems, but it must be done with intention. The goal is not to slow adoption but to create guardrails that allow innovation without exposing the business to avoidable risks.

Balance AI Opportunity with AI Security Discipline

AI is transforming how organizations operate, but its adoption is reshaping the security landscape at the same pace. Indirect prompt injection shows how trusted boundaries can be subverted, third-party tools and integrations reveal the fragility of AI supply chains, and recent incidents highlight the real risk of data exposure once sensitive information enters these systems.

AI brings opportunity, but it must be deployed with discipline. That means questioning assumptions about what is trusted, demanding more from vendors and providers, and putting governance in place to prevent data from leaking or being misused.

With the right strategy, enterprises can embrace AI while keeping risk within acceptable bounds, building systems that are not only smarter and faster, but also secure and resilient.

Discover how the HackerOne Platform can secure your AI deployments

AI Security Risks and Vulnerabilities Enterprises Can’t Ignore

Eroding Trust Boundaries with Indirect Prompt Injection

Extending the Attack Surface with Third-Party and Supply Chain Reliance

Model Hubs

MCP Servers

Risking Data Exposure and Privacy Issues

AI Security Requires Transparency and Accountability

8 Best Practices for AI Security Mitigation

Balance AI Opportunity with AI Security Discipline

About the Author

AI Security Risks and Vulnerabilities Enterprises Can’t Ignore

Eroding Trust Boundaries with Indirect Prompt Injection

Extending the Attack Surface with Third-Party and Supply Chain Reliance

Model Hubs

MCP Servers

Risking Data Exposure and Privacy Issues

AI Security Requires Transparency and Accountability

8 Best Practices for AI Security Mitigation

Balance AI Opportunity with AI Security Discipline

About the Author

Related Blog Posts