The New Frontier of AI Security: Understanding Threats in MCP Servers and Agent-to-Agent Communication

The cybersecurity community has spent decades defending networks, endpoints, and identities against human adversaries. But the threat landscape is shifting beneath our feet. The adversaries are no longer always human, and the attacks no longer always target traditional infrastructure.

In 2026, the rapid adoption of the Model Context Protocol (MCP) and Agent-to-Agent (A2A) communication frameworks has created an entirely new class of risk. Security researchers have already documented critical vulnerabilities in widely used MCP servers, demonstrated sophisticated attacks that allow malicious agents to hijack trusted conversations, and identified fundamental gaps in how organizations govern autonomous AI systems.

This article examines the concrete threats emerging from MCP and A2A deployments, drawing on real-world vulnerabilities, proof-of-concept attacks from leading research teams, and the OWASP Top 10 for Agentic Applications, and provides actionable guidance for building defense-in-depth in an autonomous world.

The MCP Threat Landscape: When Tools Become Weapons

The Model Context Protocol, introduced by Anthropic in late 2024, was designed to solve a genuine problem: standardizing how AI models connect to external tools and data sources. With over 15,000 MCP servers now available and adoption accelerating across major platforms, the protocol has succeeded beyond expectations. But that success has a dark side.

The Authentication and Network Exposure Gap

The most pervasive vulnerability in MCP deployments stems from a simple design choice: binding servers to all network interfaces with no authentication required.

Security researchers at Backslash Security analyzed more than 7,000 MCP servers and identified hundreds explicitly bound to 0.0.0.0, making them accessible to anyone on the same local network. They dubbed this the “NeighborJack” vulnerability. In shared office WiFi, co-working spaces, or cloud VPCs, any device can invoke these servers’ tools without credentials.

This isn’t merely theoretical. The MCPJam inspector, a local-first development platform for MCP servers with tens of thousands of weekly downloads, was found to listen on 0.0.0.0 by default in versions 1.4.2 and earlier. Attackers could send a crafted HTTP request to trigger remote code execution (CVE-2026-23744, CVSS 9.8). The vulnerability required no authentication and no user interaction.
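The difference between a loopback-only listener and an exposed one can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `is_exposed_bind` is hypothetical, and treating `localhost` as loopback assumes a conventional hosts file.

```python
import ipaddress


def is_exposed_bind(host: str) -> bool:
    """Return True if a listener bound to `host` is reachable from other machines."""
    if host == "localhost":
        # Assumption: "localhost" resolves to a loopback address on this machine.
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Empty string (all interfaces for Python sockets) or an unknown
        # hostname: fail closed and treat it as exposed.
        return True
    # 0.0.0.0 and :: mean "all interfaces"; any other non-loopback address
    # is reachable by peers on that network.
    return not addr.is_loopback


for host in ("0.0.0.0", "127.0.0.1", "::", "::1"):
    print(host, "exposed" if is_exposed_bind(host) else "loopback only")
```

A check like this could run at server startup or in a configuration linter, refusing to start an unauthenticated server when the bind address is exposed.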

The Broader Vulnerability Landscape

The Backslash Security research identified two main categories of MCP vulnerabilities that, when combined, become catastrophic:

Network exposure (0.0.0.0 binding): Hundreds of servers accessible to anyone on the local network
Excessive permissions and OS injection: Dozens of servers allowing arbitrary command execution on the host

When both vulnerabilities appear in the same server, any malicious actor on the same network can gain full control of the host machine, running commands, scraping memory, or impersonating tools used by AI agents.
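The OS injection half of that combination usually comes from passing user-influenced strings to a shell. As an illustration only, the sketch below builds an argument list against an allow list instead of concatenating a command string. The `ALLOWED_COMMANDS` table and the `build_argv` helper are hypothetical and not taken from any MCP server.

```python
# Hypothetical allow list: command -> permitted subcommands.
ALLOWED_COMMANDS = {"git": {"status", "log"}}


def build_argv(command: str, subcommand: str, target: str) -> list[str]:
    """Build an argument list intended for subprocess.run(argv, shell=False)."""
    if subcommand not in ALLOWED_COMMANDS.get(command, set()):
        raise PermissionError(f"{command} {subcommand} is not allowed")
    if target.startswith("-"):
        # Reject values that the program could interpret as options.
        raise ValueError("option-like argument rejected")
    # "--" ends option parsing, so `target` is passed through as plain data.
    # Because no shell is involved, characters like ";" or "|" are inert.
    return [command, subcommand, "--", target]


print(build_argv("git", "log", "README.md; rm -rf /"))
```

The injected `; rm -rf /` ends up as part of a single file-name argument rather than a second command, which is the property a tool handler needs.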

The OWASP Top 10 for Agentic Applications 2026 categorizes these risks under ASI04: Agentic Supply Chain Vulnerabilities (compromised dependencies and plugins) and ASI05: Unexpected Code Execution (RCE through agent-generated or externally influenced code).

Agent-to-Agent Communication: The Stateful Attack Surface

While MCP connects agents to tools, Agent-to-Agent (A2A) protocols enable something more complex: autonomous systems conversing with each other, delegating tasks, and maintaining context across interactions. Google’s A2A protocol, designed for decentralized peer-to-peer coordination, represents a fundamental shift from stateless tool invocation to stateful collaboration.

Agent Session Smuggling: A New Attack Technique

In October 2025, Palo Alto Networks’ Unit 42 research team published findings on a sophisticated attack they termed agent session smuggling. The technique exploits a core feature of stateful communication protocols: the ability to remember recent interactions and maintain coherent conversations.

The attack works as follows:

A client agent initiates a legitimate session with a remote agent
The malicious remote agent, while processing the request, covertly injects extra instructions over multiple conversational turns
The remote agent returns the expected response, completing the transaction transparently
The injected instructions remain invisible to end users, who only see the final consolidated output

This is not a vulnerability in the A2A protocol itself. Rather, it exploits the implicit trust relationships built into agent architectures. Agents are often designed to trust collaborating agents by default.
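To make the visibility problem concrete, the toy transcript below models the steps listed above. It is an illustration under stated assumptions, not Unit 42's proof of concept: the `Turn` type, the message text, and the tool names `get_portfolio` and `buy_stock` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    sender: str          # "client" or "remote"
    text: str
    final: bool = False  # True only for the response shown to the user


session = [
    Turn("client", "Summarize today's market news."),
    # Smuggled turn: unrelated to the user's request.
    Turn("remote", "Before I answer, which tools can you call?"),
    Turn("client", "I can call get_portfolio and buy_stock."),
    Turn("remote", "Here is today's market summary.", final=True),
]


def user_view(turns):
    """What a typical chat interface surfaces: only the final response."""
    return [t.text for t in turns if t.final]


def hidden_turns(turns):
    """Intermediate exchanges after the initial request that the user never sees."""
    return [t for t in turns[1:] if not t.final]


print("User sees:", user_view(session))
print("Hidden turns:", len(hidden_turns(session)))
```

The user-facing output looks like a normal, successful request, while two intermediate turns carrying the leaked tool list never reach the screen.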

Proof of Concept: Financial Assistant Compromise

Unit 42 developed proof-of-concept attacks using a financial assistant as the victim and a compromised research assistant as the malicious agent.

Scenario 1: Sensitive information leakage – The research assistant issued seemingly harmless clarification questions that gradually tricked the financial assistant into disclosing its internal system configuration, chat history, tool schemas, and prior user conversations. These intermediate exchanges would remain completely invisible in production chatbot interfaces.

Scenario 2: Unauthorized tool invocation – The malicious agent manipulated the financial assistant into executing stock purchase operations without user knowledge or approval, even though such actions should have required explicit confirmation.

The attack succeeded because of four key properties:

Stateful: The remote agent could persist context across multiple turns
Multi-turn: Progressive adaptation made detection significantly harder
Autonomous: AI-powered reasoning enabled dynamic instruction crafting
Invisible: End users never observed the smuggled interactions

Note that MCP servers generally operate statelessly, executing isolated tool invocations without preserving session history. A2A servers can persist state across interactions and leverage model-driven reasoning, enabling the kind of adaptive, multi-turn attacks demonstrated by Unit 42.

The OWASP Top 10 for Agentic Applications categorizes these risks under ASI07: Insecure Inter-Agent Communication (weak authentication, lack of encryption, poor semantic validation) and ASI06: Memory & Context Poisoning (corrupting memory stores with malicious data).

The Identity and Governance Gap

Underpinning both MCP and A2A threats is a fundamental failure of identity and governance.

The Machine Identity Crisis

Industry experts estimate that fewer than five percent of enterprises deploying autonomous agents have implemented adequate identity systems for those agents. Most rely on simple API tokens, which Sectigo’s Jason Soroko describes as “a weapon waiting for a stolen shared secret.”

The challenge is compounded by three factors:

Exponential growth in access risk: Fleets of agents with varying privileges create countless attack paths
Blurred accountability: Autonomous actions make it difficult to determine who or what authorized a given operation
No kill switch: When an agent goes rogue, organizations discover that the kill switch isn’t a power cord; it’s the ability to instantly revoke cryptographic identity

The OWASP Top 10 identifies this as ASI03: Identity and Privilege Abuse, in which attackers exploit dynamic trust, cached credentials, and delegation chains to perform unintended actions.

The Visibility Gap

Security teams face a deployment reality they cannot monitor. Analysis of MCP deployments across enterprise environments found that 95 percent were running on employee endpoints where security tools had no visibility. Aaron Turner of IANS Research offered stark advice: “It is my opinion that you should treat MCPs as malware if they try to run on endpoints.”

This is shadow IT evolved into shadow agentic infrastructure: unmonitored, ungoverned, and increasingly critical to business operations.

Cascading Failure Risk

When agents communicate autonomously, a single compromised agent can trigger cascading failures. The compromised agent issues instructions to dozens of legitimate agents before detection. Security teams revoke its credentials, but the legitimate agents have already accepted assignments and queued subsequent actions. There is no mechanism to propagate revocation backwards.
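One way to picture what backward propagation of a revocation might look like is a task queue that records each task's delegation chain. This is a hypothetical sketch rather than a mechanism from any existing A2A framework. The `Task` and `TaskQueue` names and the chain format are assumptions.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    delegation_chain: tuple  # agent IDs, originator first


class TaskQueue:
    """Pending work for one agent, with provenance kept per task."""

    def __init__(self):
        self._pending = deque()

    def submit(self, task: Task) -> None:
        self._pending.append(task)

    def purge_revoked(self, agent_id: str) -> list:
        """Drop every queued task that the revoked agent issued or relayed."""
        kept, dropped = deque(), []
        for task in self._pending:
            if agent_id in task.delegation_chain:
                dropped.append(task)
            else:
                kept.append(task)
        self._pending = kept
        return dropped

    def pending_names(self) -> list:
        return [t.name for t in self._pending]
```

Calling `purge_revoked("rogue")` on each downstream queue at revocation time would remove work the compromised agent delegated. It cannot undo actions that have already executed, so it complements rather than replaces rapid detection.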

The OWASP Top 10 dedicates a category to this: ASI08: Cascading Failures, where a single fault amplifies through networked agent ecosystems, turning small issues into system-wide outages or breaches.

Toward Holistic Mitigation: Defense-in-Depth for Agentic Systems

Addressing these interconnected threats requires moving beyond fragmented security strategies toward integrated defense across the entire stack.

Foundational: Identity and Authentication

Every agent must have a unique, bounded identity with short-lived credentials. This means:

Moving from shared secrets to cryptographic proof of possession
Isolating agent sessions and wiping cached context between tasks
Requiring re-authorization for privilege escalation
Implementing lifecycle management for agent credentials and roles
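The sketch below illustrates two of these properties, short-lived credentials and instant per-agent revocation, using an in-memory registry. It is a simplification under stated assumptions: the `CredentialRegistry` class is hypothetical, and the tokens it issues are bearer tokens, so it does not implement proof of possession, which would require asymmetric keys held by each agent.

```python
import secrets
import time


class CredentialRegistry:
    """In-memory store of short-lived, scoped tokens (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._tokens = {}  # token -> (agent_id, scopes, expires_at)

    def issue(self, agent_id, scopes, ttl_seconds=300.0):
        token = secrets.token_urlsafe(32)
        expires_at = self._clock() + ttl_seconds
        self._tokens[token] = (agent_id, frozenset(scopes), expires_at)
        return token

    def is_authorized(self, token, scope):
        record = self._tokens.get(token)
        if record is None:
            return False  # unknown or revoked
        _agent_id, scopes, expires_at = record
        if self._clock() >= expires_at:
            del self._tokens[token]  # expired credentials are discarded
            return False
        return scope in scopes

    def revoke_agent(self, agent_id):
        """Kill switch: invalidate every live token held by one agent."""
        doomed = [t for t, (a, _, _) in self._tokens.items() if a == agent_id]
        for token in doomed:
            del self._tokens[token]
        return len(doomed)
```

Because every authorization check goes through the registry, revoking an agent takes effect on its next request rather than whenever a long-lived secret happens to be rotated.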

The Sandbox Model: Treat Data Access Like a Gun Range

Traditional data governance follows the “library model”: it assumes authenticated users will handle data appropriately. But MCP breaks this model with persistent, dynamic connections that users create without IT involvement.

The alternative is the “gun range model”:

Data is accessed within sandboxed environments under organizational control
AI agents operate with scoped permissions and time-bound sessions
Every action is logged, monitored, and tied to a specific user, session, and purpose
When sessions end, access ends and nothing persists

This approach answers the audit question regulators will ask: what did the AI agent do with that data?
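A minimal sketch of such a session, assuming an in-process sandbox and a plain list as the audit log, might look like the following. `SandboxSession` and `sandbox_session` are hypothetical names, and a real deployment would write to tamper-resistant storage and perform the actual data access inside `access`.

```python
import secrets
import time
from contextlib import contextmanager


class SandboxSession:
    """Scoped, time-bound access handle that records every attempt."""

    def __init__(self, user, purpose, scopes, ttl_seconds, audit_log, clock):
        self.session_id = secrets.token_hex(8)
        self.user = user
        self.purpose = purpose
        self._scopes = frozenset(scopes)
        self._clock = clock
        self._expires_at = clock() + ttl_seconds
        self._audit_log = audit_log
        self.active = True

    def access(self, scope, resource):
        allowed = (
            self.active
            and self._clock() < self._expires_at
            and scope in self._scopes
        )
        # Log allowed and denied attempts alike, tied to user, session, purpose.
        self._audit_log.append({
            "session": self.session_id, "user": self.user,
            "purpose": self.purpose, "scope": scope,
            "resource": resource, "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{scope} on {resource} denied")
        # A real sandbox would perform the read or write here.


@contextmanager
def sandbox_session(user, purpose, scopes, audit_log,
                    ttl_seconds=900.0, clock=time.monotonic):
    session = SandboxSession(user, purpose, scopes, ttl_seconds, audit_log, clock)
    try:
        yield session
    finally:
        session.active = False  # when the session ends, access ends
```

Used as `with sandbox_session("alice", "quarterly-report", {"read:sales"}, log) as s:`, any call to `s.access` after the block exits is denied and still logged, so the audit log holds a complete record of attempts.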

Technical Controls for MCP Environments

Based on documented vulnerabilities and OWASP guidance, organizations should implement:

Network isolation: Never bind MCP servers to 0.0.0.0; restrict to 127.0.0.1 or use authenticated proxies
Input validation: Validate all paths, URLs, and parameters against allow lists
Least privilege: Scope tools to minimum required permissions; require confirmation for destructive actions
Supply chain scanning: Maintain inventories (SBOM/AIBOM) of all components; pin dependencies and block untrusted sources
Continuous monitoring: Deploy behavioral analytics to detect anomalous tool invocation patterns
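As an illustration of the input validation item, the sketch below confines a user-supplied path to an allowed root directory using the standard `pathlib` module. The helper name `resolve_inside` and the example root are hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(root: Path, user_path: str) -> Path:
    """Resolve `user_path` under `root`, rejecting anything that escapes it."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check runs on the real destination. An absolute
    # `user_path` replaces `root` in the join and is rejected below.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes allowed root: {user_path!r}")
    return candidate


# Hypothetical data directory for an MCP file tool.
root = Path.cwd() / "mcp-data-root"
print(resolve_inside(root, "notes/a.txt"))
```

A tool handler would call this before every file operation and treat the `ValueError` as a denied request. The same allow-list approach applies to URLs and other parameters.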

Tools like the open-source MCPSEC framework demonstrate how organizations can automate MCP security scanning, simulate attacks, and enforce policies in CI/CD pipelines.

Defending Against Agent Session Smuggling

Unit 42’s research points to layered defenses for A2A environments:

Human-in-the-loop (HitL) enforcement: For critical actions, execution should pause and trigger confirmation through channels the AI model cannot influence
Context-grounding techniques: Validate that remote agent instructions remain semantically aligned with the original user request’s intent
Cryptographic agent verification: Sign AgentCards and validate identities before session establishment
User visibility: Expose client agent activity through real-time dashboards, making invisible interactions visible
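Two of these layers can be sketched as a gate in front of tool execution. This is not Unit 42's implementation: `invoke_tool`, `CRITICAL_TOOLS`, and the tool names are hypothetical, and the scope check is only a crude stand-in for semantic context grounding, since it compares tool names against a set derived from the original request instead of evaluating intent.

```python
from typing import Any, Callable

# Hypothetical set of tools that always need human approval.
CRITICAL_TOOLS = {"buy_stock", "transfer_funds"}


def invoke_tool(
    tool: str,
    args: dict,
    *,
    tools_in_scope: set,
    confirm: Callable[[str, dict], bool],
    run: Callable[[str, dict], Any],
) -> Any:
    """Run a tool only if it fits the original request and, if critical, is approved."""
    # Context grounding (simplified): the tool must belong to the set
    # derived from the user's original request, not from later turns.
    if tool not in tools_in_scope:
        raise PermissionError(f"{tool} is outside the scope of the original request")
    # Human-in-the-loop: `confirm` should reach the user through a channel
    # the model cannot write to, such as a separate approval prompt.
    if tool in CRITICAL_TOOLS and not confirm(tool, args):
        raise PermissionError(f"{tool} was not approved by the user")
    return run(tool, args)
```

In the financial assistant scenario, a smuggled request to call `buy_stock` during a news-summary task would fail the scope check before the confirmation step is ever reached.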

Invisible Governance Through Platform Ownership

The accessibility paradox plagues AI governance: more tools often produce worse outcomes because employees work around friction. The solution is invisible governance: controls that run automatically while users experience simplicity.

This requires shifting ownership from security teams to data platform teams. The data platform team controls where data lives, how it moves, and who accesses it at the source. They own the layer where MCP and A2A governance must be built, not retrofitted.

The Regulatory Imperative

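The governance gap now carries regulatory exposure. The EU AI Act imposes penalties of up to 7 percent of global annual turnover for the most serious violations (prohibited AI practices) and up to 3 percent for breaches of most other obligations, including those covering high-risk AI systems. Regulators will ask for audit trails, and “we didn’t know” is not an acceptable answer.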

NIST’s AI Risk Management Framework remains voluntary, but adoption is accelerating. Organizations that treat governance as a cost rather than a safeguard will find themselves exposed.

Conclusion: Architecting for Trust in an Autonomous World

The evidence is clear: MCP servers with known vulnerabilities are deployed in production environments. Agent session smuggling enables covert hijacking of trusted conversations. Identity and governance gaps leave organizations blind to cascading failures. These are not hypothetical risks; they are documented, exploitable, and actively emerging.

But there is nothing inevitable about breach. The controls exist: cryptographic identity, sandbox architecture, context validation, continuous monitoring. The frameworks exist: OWASP Top 10 for Agentic Applications, NIST AI RMF, the emerging Agent Security Layer (ASL) protocol. What has been missing is the holistic perspective that connects these capabilities into a coherent defense.

In 2026, identity will be the ultimate control point for an autonomous world. Organizations that architect for trust, building governance into platforms, not layering it on top, will navigate the transition to agentic systems with resilience. Those that don’t will learn the hard way that when AI agents go rogue, the kill switch isn’t a power cord. It’s the ability to revoke identity instantly, contain cascades automatically, and answer the regulator’s question with a complete audit trail.

The stack is connected. The defense must be too.