Zscalerのブログ

Zscalerの最新ブログ情報を受信

Products & Solutions

MCP, A2A, and WebSockets: Why Firewalls Fail on AI Traffic (and the Fix)

image
MATT MCCABE
May 05, 2026 - 12 分で読了

Overview

AI traffic breaks legacy security because it’s conversational, persistent, and tool-driven—often over WebSockets and agent protocols like MCP and A2A. Firewalls can see connections and domains, but they can’t inspect multi-turn prompts/responses, agent actions, or fragmented streaming payloads. The fix is session-aware, inline content inspection with AI-aware access controls, DLP on prompts/responses, and continuous discovery (AI-SPM) to govern shadow and embedded AI.

MCP, A2A, and WebSockets: What they are and why they matter

These three protocols are increasingly common in agentic systems. Together, they shift security from inspecting individual requests to understanding entire workflows, which is a fundamentally harder problem.

Model Context Protocol (MCP)

MCP is emerging as a common way for AI systems to interact with databases, file systems, APIs, and development environments without requiring custom integrations for each one. In practice, MCP is what allows an AI-powered code editor to read a codebase, retrieve documentation, and execute commands within a single interaction. 

That same capability creates security blind spots:

  • Tool-driven workflows: A single user prompt triggers multiple backend calls that your security tools cannot see.
  • Identity gaps: MCP servers act on your behalf using delegated permissions, but traditional identity systems struggle to verify these automated actions.
  • High-velocity exchanges: Models and tools exchange information faster than legacy inspection systems can process.

Because these interactions occur at machine speed, inspection systems built for sequential, request-based analysis struggle to keep up.

Application-to-application (A2A)

A2A communication enables autonomous agents to coordinate workflows across different services. While MCP connects models to tools, A2A connects entire applications to each other.

This is what enables agent-driven workflows and embedded AI functionality within enterprise SaaS platforms. From a security perspective, this introduces activity that often occurs without clear visibility:

  • East-west data movement: Sensitive information flows between services without users uploading files or clicking buttons.
  • Permission sprawl: Each autonomous workflow requires tokens, service accounts, and access rights that accumulate faster than you can track.
  • Impersonation risks: A2A communications might claim to represent users or services without strong verification.

As these connections increase, it becomes harder to answer a fundamental question: which system is acting, and under whose authority?

WebSockets

WebSockets enable real-time AI interactions by maintaining persistent, bidirectional connections between users and services. Instead of opening and closing connections with each request, they keep a continuous stream active.

This is what allows AI tools to feel responsive and interactive. It also breaks how most inspection systems operate:

  • Incremental content delivery: Your data loss prevention (DLP) tools expect complete payloads to analyze, but WebSocket streams deliver content in fragments.
  • Session persistence: A WebSocket connection might stay open for hours, providing a long-lived channel that resembles a backdoor.
  • Real-time inspection gaps: By the time your security tools piece together enough fragments to analyze, much of the conversation has already completed.

AI protocol security: How MCP, A2A, and WebSockets break firewalls

Your firewall cannot read a conversation.

Enterprise artificial intelligence (AI) and machine learning (ML) traffic grew 83% year over year, according to the Zscaler ThreatLabz 2026 AI Security Report. The attack surface did not gradually expand. It accelerated before most security teams had a chance to adjust.

At the same time, the nature of traffic itself shifted.

AI interactions no longer follow predictable request and response patterns. They unfold across multi-turn conversations, trigger actions across systems, and move data through persistent connections. Legacy security models were not designed for that behavior.

Firewalls still see domains and connections. They do not see the source code pasted into a prompt, the sensitive data shared across multiple turns, or the actions an AI agent takes on behalf of a user.

That gap is structural.

What changed in AI traffic

Traditional web browsing is predictable. Your browser sends a request, gets a response, the connection closes. Security tools were built for exactly that pattern and they are good at it.

AI does not work that way.

Modern AI maintains ongoing conversations. It remembers context across turns, triggers chains of tool integrations, and streams data through persistent connections that stay open for minutes or hours. A single interaction can touch a dozen backend systems without the user clicking anything beyond "send."

That shift breaks nearly every assumption your security stack was designed around:

  • Multi-turn memory: The AI recalls what you shared three prompts ago and builds on it. Your firewall sees individual packets. It has no idea a conversation is even happening.
  • Tool-driven fan-out: One prompt to an AI coding assistant can trigger five separate API calls, covering codebase access, documentation queries, and file writes. Each call is a potential exposure point your tools never see.
  • Multimodal content: Text, code, images, and documents all flow through the same session. Web filtering was not built to track mixed content inside persistent connections.

The result is three risk categories that existing controls were not designed to catch:

  • Shadow AI proliferation: Employees adopt unsanctioned AI tools faster than any governance process can track, often to solve real problems, with no malicious intent.
  • AI-native attacks: Prompt injection manipulates AI behavior through crafted inputs; context poisoning corrupts the information AI relies on to make decisions.
  • Embedded AI by default: Enterprise SaaS platforms activate AI features automatically, often without the security team knowing it happened.

Why firewall-centric policies fail on AI interactions

Here is the core mismatch: firewalls were built for linear, transaction-style traffic. AI traffic is conversational, contextual, and continuous. Those are not compatible inspection models, and no amount of tuning closes that gap.

Your firewall knows a user connected to ChatGPT. It has no idea what they sent, what came back, or whether any of it contained regulated data, proprietary IP, or a prompt crafted to extract something it should not have.

The same applies to embedded file transfers. When users paste code snippets, configuration files, or internal documents into an AI conversation, that content travels inside an encrypted session stream. Traditional file monitoring never sees it.

Keyword-based DLP fares no better:

  • Users paraphrase sensitive content just enough to bypass detection rules
  • Multilingual prompts sail past English-focused keyword filters
  • Multi-turn leakage spreads exposure across dozens of turns, each one individually harmless, collectively significant

A common workaround is to isolate AI access inside virtual desktop infrastructure (VDI). It does not solve the problem. VDI adds overhead and latency while still lacking prompt-aware controls. You have contained the session. You have not inspected it. Isolation without inspection is not security.

Don't treat AI like web traffic. Treat it as multi-turn, contextual interactions that require inline, content-layer inspection and control.

What you actually need is inline, content-layer controls built for how AI traffic behaves, not how web traffic used to.

Know your AI estate first: The case for AI Security Posture Management (AI-SPM)

Before you can control AI, you have to know what you are dealing with.

Most security teams cannot answer the basic questions: 

  1. Which AI apps do employees actually use? 
  2. What data moves through them? 
  3. Which agents can act on behalf of users?
  4. Where are AI models running across your cloud infrastructure?

If those questions feel uncomfortable, that is exactly the visibility gap AI-SPM is designed to close. Enforcement built on an incomplete inventory is just guesswork with extra steps.

Here is what AI-SPM surfaces that traditional tools miss:

AI-SPM capabilityWhat it discoversTraditional security gap
AI Bill of MaterialsData sources, models, and runtime usage connectionsNo AI-specific asset tracking exists
Shadow AI detectionUnsanctioned applications and developer toolsGeneric web filtering only identifies known domains
Embedded SaaS AI mappingCopilots and agents within enterprise applicationsNo visibility into AI features inside approved SaaS
Permission analysisExcessive access rights granted to AI servicesStandard identity tools miss AI-specific context

Discovery is not a one-time exercise. As new AI tools get adopted, new agents get deployed, and embedded SaaS AI expands, your inventory has to stay current, or every policy downstream becomes unreliable.

Controls to prioritize

The goal is not to stop AI. It's to enable sanctioned AI securely while discovering and controlling shadow usage.

Here is what to prioritize, in order.

Access policy controls 

You cannot write access policies for applications you do not know exist. Start with discovery across every department, tool, and user group. Then enforce from there.

  • Shadow AI discovery: Find unsanctioned applications before they become incidents
  • Risk-based access: Configure allow, block, warn (caution), or coach by user role and application risk, not blanket rules
  • Isolation policies: Contain unknown or higher-risk tools without shutting down access entirely

Prompt-aware inspection

Your DLP sees file uploads. It does not see what employees type directly into an AI chat window, which is where most sensitive data actually leaks. Session-based inspection changes that.

  • Conversation visibility: Extract and classify prompts and responses across multi-turn sessions, not just individual requests
  • Sensitive data protection: Apply inline DLP using comprehensive dictionaries for source code, personally identifiable information (PII), and regulated data
  • AI-native threat detection: Identify prompt injection attempts, jailbreak patterns, and multi-turn policy evasion before they succeed

Browser isolation for risk reduction

Not every AI tool can be blocked outright, and blanket blocking is rarely the right answer. Browser isolation lets users keep working while containing the interaction.

  • Preserve productivity without removing access
  • Contain AI interactions from corporate resources
  • Apply granular controls, including copy/paste, downloads, and uploads, by user, app, and risk context

Developer AI environment security

Developer tools are your fastest-growing, least-governed AI attack surface. AI-powered code editors, command-line interfaces, and agent frameworks access proprietary source code, internal documentation, and development credentials without any of the controls applied to end-user AI apps.

The risk is structural. When a developer uses an MCP-connected integrated development environment (IDE), that session can trigger multiple back-end calls to internal systems. The traffic looks like generic app traffic to legacy tools. It is not.

  • Apply zero trust access and inline controls to AI developer environments, including IDEs, command-line interfaces (CLIs), and agent platforms, the same way you govern end-user generative AI apps
  • Inspect MCP-driven traffic flows, not just HTTP-based requests
  • Enforce allow/block/warn/isolate policies consistently across developer tools
  • Extend AI Bill of Materials (AI-BOM) visibility to include developer tool connections to large language models (LLMs), MCP servers, and agent frameworks

Audit and compliance logging 

Controls without evidence are unenforceable. AI security logging is different from traditional application monitoring. You need conversational context, not just connection metadata. That distinction matters for incident response, policy refinement, and demonstrating compliance.

  • Capture interactions across all AI tools, including prompt and response content
  • Store logs with enough context to support investigation and misuse detection
  • Use log data actively to refine what gets warned vs. blocked and where isolation is needed

What this looks like in one platform

Point solutions give you fragmented visibility and inconsistent enforcement. When access controls, posture management, and runtime protection each live in separate tools, each one sees only part of the problem. The gaps between them are exactly where risk accumulates.

Zscaler organizes AI security into three integrated capabilities across the full lifecycle:

  • AI Asset Management: Continuously discovers AI across users, apps, agents, models, and infrastructure. It prioritizes risk with scoring and delivers guided remediation through AI-SPM.
  • Secure Access to AI Apps and Models: Enforces zero trust access governance with granular controls, applies, prompt-aware inspection with DLP, and content moderation, and extends the Zero Trust Exchange™ coverage to developer AI tooling and unmanaged devices.
  • Secure AI Infrastructure and Apps: Runs automated adversarial testing using simulated attack techniques, provides runtime protection against prompt injection, jailbreaks, and data leakage, and generates closed-loop policies that translate red teaming findings directly into enforceable runtime guardrails.

Discovery informs access policy. Access policy feeds posture assessment. Red teaming findings become runtime controls. That closed loop is what point solutions cannot replicate.

AI security requires zero trust, not more firewalls

The gap between what legacy tools can inspect and what AI is actually doing is already significant. It will widen. Autonomous agents are taking on more complex workflows. AI is embedding more deeply into core business processes. The window for getting ahead of this closes faster than most security programs are moving.

Organizations that act now will not just reduce risk. They will move faster. Teams that can use AI confidently, without working around security controls, have a real operational advantage over those that cannot.

The path forward is not blocking AI. It is knowing what AI runs in your environment, governing who can use it and how, and inspecting what moves through it, all on one platform, not five.

See how Zscaler AI Protect inspects prompt and response traffic across multi-turn sessions. [Request a demo]

See how AI traffic is evolving across the enterprise. [Read the ThreatLabz 2026 AI Security Report]

FAQ

AI interactions are multi-turn sessions with memory, tools, file uploads, and streaming responses. Data moves through unpredictable, high-velocity workflows, which means controls must understand context across an entire conversation, not just a single request. Legacy tools built for discrete HTTP transactions have no mechanism for this.

Model Context Protocol (MCP) connects AI models to tools and data sources. Application-to-Application (A2A) lets apps and agents exchange data and actions autonomously. WebSockets keep a persistent stream open for real-time interactions. Together, they turn AI into tool chains and continuous sessions that legacy inspection misses entirely.

Firewalls assume linear, transaction-style browsing. Generative AI requires prompt and response visibility, upload awareness, and session tracking across tool calls and streams, plus intent-aware decisions like coaching vs. blocking, not just domain and port filtering. Static allow and blocking is the wrong fit for the traffic pattern.

Inline controls sit between users, agents, and AI services to extract and classify prompts and responses, apply data loss prevention (DLP) and content moderation, and enforce actions, including allow, warn, block, or isolate, consistently across multi-turn sessions, including embedded AI in SaaS.

Start with discovery: you cannot govern what you cannot see. From there, prioritize access policies for sanctioned vs. shadow AI, prompt-aware inspection with DLP and content moderation, isolation for high-risk use cases, developer AI tooling governance, and logging for incident response and policy refinement, backed by continuous AI posture management.

Sometimes, but often not well. WebSocket upgrades to a long-lived, bidirectional stream, and most payloads are TLS-encrypted. A firewall can only inspect effectively if it supports TLS decryption, WebSocket protocol awareness, and content/DLP scanning at scale. Otherwise it sees metadata only, missing prompt and data exfiltration.

Auditors typically want an end-to-end audit trail: AI agent/tool inventory; identity and token issuance; policy decisions for each tool call; prompt/response and data-classification tags; DLP blocks/alerts; connector and API access logs; approvals and change records for workflows; and monitoring of exceptions, rate limits, and remediation actions. Correlate logs with ticket IDs and owners.

form submtited
お読みいただきありがとうございました

このブログは役に立ちましたか?

免責事項:このブログは、Zscalerが情報提供のみを目的として作成したものであり、「現状のまま」提供されています。記載された内容の正確性、完全性、信頼性については一切保証されません。Zscalerは、ブログ内の情報の誤りや欠如、またはその情報に基づいて行われるいかなる行為に関して一切の責任を負いません。また、ブログ内でリンクされているサードパーティーのWebサイトおよびリソースは、利便性のみを目的として提供されており、その内容や運用についても一切の責任を負いません。すべての内容は予告なく変更される場合があります。このブログにアクセスすることで、これらの条件に同意し、情報の確認および使用は自己責任で行うことを理解したものとみなされます。

Zscalerの最新ブログ情報を受信

このフォームを送信することで、Zscalerのプライバシー ポリシーに同意したものとみなされます。