
Shadow AI Data Risk: Your 30-Day Containment Strategy

Matt McCabe
April 24, 2026 - 16 minute read

Overview

Your employees shared sensitive data with artificial intelligence (AI) tools today. They did it to work faster, solve problems, and meet deadlines. They did it without malicious intent and without your security team's knowledge.

According to the Zscaler ThreatLabz 2026 AI Security Report, ChatGPT alone generated more than 410 million data loss prevention (DLP) policy violations in 2025, each one representing sensitive data that attempted to leave an organization through an AI tool. That is not a future risk. It is what happened last year, quietly, across organizations that thought they had reasonable controls in place.

A developer pastes production logs into ChatGPT to debug a live issue. A recruiter uploads a spreadsheet of candidate records to an AI summarization tool. A sales rep asks an AI assistant to draft a proposal using confidential pricing data. Each interaction feels like productivity. Each one sends company data to systems outside your control, and none of them shows up in your existing security logs.

This is what makes shadow AI fundamentally different from shadow IT. Shadow IT was about unauthorized devices and apps connecting to your network. Shadow AI is about sensitive data leaving through behavior that looks completely normal. The risk does not announce itself.

The good news is that you do not have to choose between enabling AI and protecting your data. What follows is a practical path forward: where data leaks actually happen, how to spot them before they become incidents, which controls work without killing productivity, and a 30-day plan to get from zero visibility to a defensible baseline.

Key takeaways

  • Shadow AI is the use of AI tools (including GenAI) for work without company approval or security oversight, often causing sensitive data to leave the organization through prompts, file uploads, and embedded assistants.
  • Biggest risks: data leakage (PII/source code/credentials), compliance exposure, and untracked AI access inside SaaS apps.
  • Fastest first steps (30 days): discover AI apps in use, classify tools (sanctioned/unsanctioned/unreviewed), enable prompt/upload inspection with inline DLP, apply role-based controls + coaching.

What is shadow AI, and why is it different from shadow IT?

Shadow AI is any AI tool that employees use for work without company approval. This means your team members are already using ChatGPT, Grammarly, or AI-powered browser extensions to get their jobs done faster, but your security team has no visibility into what data flows through these tools.

The key difference comes down to data flow. Shadow IT created risk by connecting unauthorized devices to your network. Shadow AI creates risk by sending sensitive data out through behavior that looks like normal work.

The definition has also expanded beyond public chatbots. Shadow AI now includes agentic AI, which refers to AI systems embedded inside platforms your organization already trusts and pays for. Microsoft Copilot, Salesforce Einstein, and ServiceNow AI features operate with user-level permissions inside your existing software-as-a-service (SaaS) environment. Unlike a public chatbot an employee chooses to open, these agents can act autonomously on behalf of users, reading, summarizing, and acting on data without a deliberate copy-paste decision. That makes them harder to detect and harder to govern with traditional controls.

Here is a small table comparing shadow AI to shadow IT:

          | Primary risk                                       | Typical signal
Shadow IT | Unauthorized apps/devices on the network           | Unknown device/app access
Shadow AI | Sensitive data leaving via prompts/uploads/agents  | AI web traffic + prompt content

Common shadow AI categories

The most common types of unsanctioned AI tools appearing in your environment include:

  • Public chatbots (ChatGPT, Gemini, Claude): Users paste sensitive content directly into prompts, often without realizing that many free-tier tools use conversation data to improve their models.
  • Writing assistants (Grammarly, Jasper): These tools access full document content and maintain session history, meaning sensitive drafts and communications persist beyond a single interaction.
  • Meeting tools (Otter.ai, Zoom AI): Complete audio and video recordings are captured and stored on third-party servers, often including unscripted discussion of confidential decisions.
  • Developer coding assistants (GitHub Copilot, CodeWhisperer): These process source code in real time, including embedded credentials, proprietary logic, and internal architecture details.
  • Embedded SaaS AI (Microsoft Copilot, Salesforce Einstein, ServiceNow AI): These operate inside platforms your teams already trust, with elevated permissions, making them the least visible and most underestimated shadow AI risk.
  • Browser extensions with AI features: AI-powered add-ons that request broad "read and change all website data" permissions can access everything visible in a browser session, including authenticated enterprise portals, customer relationship management (CRM) data, and internal documentation.

Where data leaks happen

Your existing security tools were built to catch file downloads, email attachments, and USB transfers. They were not built for AI. The result is a growing class of data exposure that produces no alerts, no logs, and no incident tickets until something goes wrong.

Enterprises transferred more than 18,000 terabytes of data to AI applications in 2025, a 93% increase year-over-year, according to ThreatLabz. That volume represents an enormous and largely uninspected data flow moving through tools that operate outside most organizations' security controls.

Prompts and copy-paste interactions

Picture a developer troubleshooting a production issue who copies an error log into ChatGPT for analysis. That log contains database connection strings, internal server names, API keys, and customer identifiers. The most common DLP violations detected in AI interactions include name leakage, Social Security numbers, source code, medical information, and credit card data: the full spectrum of regulated and sensitive enterprise content.

The most frequently exposed data types through prompts include:

  • Source code, often containing embedded credentials and proprietary business logic
  • Personal information such as customer records, employee data, and payment details
  • Credentials, including API keys, passwords, and access tokens, shared for troubleshooting
  • Business documents such as contracts, strategic plans, and confidential communications

File and media uploads

Document uploads multiply your risk exponentially. A single spreadsheet uploaded for AI analysis might contain thousands of customer records. Meeting recordings capture unscripted conversations where participants discuss confidential matters freely, and those recordings are stored on third-party servers, often without explicit participant awareness.

AI responses and outputs

AI responses are an underappreciated leak vector. An AI system can reconstruct sensitive information from prior inputs and surface it in later responses, even in a different user's session if data isolation is inadequate. Beyond echo-back risk, AI outputs can generate hallucinated legal or compliance guidance that employees act on, produce content that violates regulatory requirements, or surface confidential context from earlier in a conversation thread. A single AI interaction rarely feels like a security event. The output it produces can create one.

Browser extensions and embedded assistants

Browser extensions operate with persistent access to your authenticated sessions. An AI extension with "read and change all website data" permissions can access everything visible in a browser session, including enterprise applications, CRM portals, and internal documentation systems. Embedded SaaS AI features carry similar risk: they operate inside platforms employees already trust, often with elevated permissions and without the same visibility or guardrails as standalone AI tools.

Data type      | Primary leak vector   | Common scenario
Source code    | Prompts, file uploads | Developer debugging in public AI tools
Personal data  | File uploads, prompts | HR team summarizing employee records
Credentials    | Prompts               | API keys shared for troubleshooting help
Contracts      | File uploads          | Legal team reviewing documents in AI tools
System details | Screenshots, prompts  | IT team uploading diagrams for analysis

How to detect shadow AI usage patterns

Most security teams have a meaningful visibility gap when it comes to AI traffic. Legacy monitoring tools were designed to inspect HTTP transactions. They were not built to govern multi-turn, WebSocket-based AI sessions or classify prompt content as it moves to external systems. Detecting shadow AI requires purpose-built visibility that can identify AI applications by type, inspect session content, and classify what is being sent in real time.

According to ThreatLabz, organizations blocked 39% of AI/ML transactions in 2025, a sign of governance in action. But that means the majority of AI traffic is passing through environments without consistent inspection or policy enforcement. You cannot govern what you cannot see.

Discover the GenAI apps in use

Start by building a complete inventory of every AI application accessed across your environment. This inventory should capture which users access which tools, from which departments, and on which devices. Classify each discovered application into three categories (a minimal tracking sketch follows the list):

  • Sanctioned: Approved for use with appropriate safeguards
  • Unsanctioned: Prohibited due to security or compliance concerns
  • Unreviewed: Awaiting security evaluation and policy decision
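
To make the classification concrete, here is a minimal sketch of what that inventory could look like in code. The app names, fields, and logic are illustrative assumptions, not a Zscaler API; discovery products maintain this inventory for you.

```python
# A minimal sketch of an AI app inventory with the three-way classification
# described above. App names, fields, and logic are illustrative assumptions.
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    SANCTIONED = "sanctioned"
    UNSANCTIONED = "unsanctioned"
    UNREVIEWED = "unreviewed"

@dataclass
class AIApp:
    name: str
    users: set = field(default_factory=set)
    departments: set = field(default_factory=set)
    status: Status = Status.UNREVIEWED  # default until security reviews it

inventory = {}

def record_access(app_name, user, department):
    """Upsert an observed AI app; anything newly seen starts as UNREVIEWED."""
    entry = inventory.setdefault(app_name, AIApp(name=app_name))
    entry.users.add(user)
    entry.departments.add(department)
    return entry

record_access("chatgpt.com", "dev01", "Engineering")
print([(a.name, a.status.value) for a in inventory.values()])
# [('chatgpt.com', 'unreviewed')]
```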

Track newly seen AI apps as a high-signal indicator of an expanding shadow AI footprint. New applications emerging faster than they can be reviewed is one of the clearest signs that governance is lagging adoption.

Inspect prompts and responses

You need visibility into the actual prompts users send and the responses they receive. Effective inspection capabilities automatically classify sensitive data types, flagging personal information, credentials, and source code before it reaches external systems. This is the difference between reactive incident response and proactive data protection.
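
For intuition, here is a minimal sketch of regex-based prompt classification. These hypothetical detectors are illustrative only; production inline DLP combines exact data matching, validators, and ML classifiers rather than bare regexes.

```python
# A minimal sketch of classifying an outbound prompt against a few
# hypothetical sensitive-data detectors. Illustrative, not production-grade.
import re

DETECTORS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def classify_prompt(prompt):
    """Return the sensitive data types detected in an outbound prompt."""
    return [name for name, pattern in DETECTORS.items() if pattern.search(prompt)]

sample = "Debug this: connection failed using key AKIA1234567890ABCDEF"
print(classify_prompt(sample))  # ['aws_access_key']
```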

Identify high-signal behavior patterns

Look for these patterns that suggest problematic usage (a detection sketch follows the list):

  • Repeated sessions: Habitual use of the same unsanctioned tool suggests embedded workflow dependency and a harder containment challenge ahead.
  • File upload attempts: Frequent uploads to unmanaged AI apps indicate a potential bulk data exposure path.
  • Tool hopping: Users switching between multiple AI tools signals they encountered a block or warning on one tool and are actively working around it, making their actual data exposure harder to track across multiple unsanctioned systems.
  • Department spikes: Unusual AI usage increases in Finance, HR, Legal, and Engineering; each team carries a distinct data risk profile worth monitoring separately.
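
A rough sketch of how two of these patterns, tool hopping and upload attempts, could be flagged from access logs. The log shape and the threshold are assumptions for illustration.

```python
# Flag tool hopping and upload attempts from a simple access log.
# Log shape and threshold are illustrative assumptions.
from collections import defaultdict

logs = [
    {"user": "u1", "app": "chatgpt.com", "action": "prompt"},
    {"user": "u1", "app": "claude.ai", "action": "prompt"},
    {"user": "u1", "app": "gemini.google.com", "action": "upload"},
    {"user": "u2", "app": "chatgpt.com", "action": "prompt"},
]

def flag_tool_hopping(events, threshold=3):
    """Flag users touching `threshold` or more distinct AI apps."""
    apps_by_user = defaultdict(set)
    for e in events:
        apps_by_user[e["user"]].add(e["app"])
    return [u for u, apps in apps_by_user.items() if len(apps) >= threshold]

def flag_upload_attempts(events):
    """Flag users attempting file uploads to AI apps."""
    return sorted({e["user"] for e in events if e["action"] == "upload"})

print(flag_tool_hopping(logs))     # ['u1']
print(flag_upload_attempts(logs))  # ['u1']
```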

Employee self-audit checklist

Before using any AI tool for work, ask:

  1. Does this tool require a personal login rather than company single sign-on?
  2. Did this tool request permission to "read and change all websites"?
  3. Does the privacy policy mention using inputs for model training or improvement?
  4. Does it auto-appear inside your work apps without IT installation?


Controls that reduce risk without blocking productivity

Your goal should be enabling AI adoption safely, not preventing it entirely. Heavy-handed restrictions push usage underground, converting visible shadow AI into invisible shadow AI that creates even greater risk. The right controls let you say yes to AI safely, not just no to everything.

Control who accesses what AI

Granular access policies let you make nuanced decisions rather than simple allow-or-block choices. Role-based policies recognize that appropriate AI use varies significantly by job function:

  • Engineering teams: Need access to code-assistance tools but require guardrails around source code and credentials. Data shows engineering accounts for nearly half of all enterprise AI transactions, making it the highest-priority department for policy coverage.
  • Finance and HR teams: Handle regulated and personally identifiable information (PII) so stricter prompt inspection and upload restrictions apply.
  • Legal teams: Work with privileged and confidential documents that carry specific regulatory handling requirements.
  • Sales teams: Require content-generation tools but should be restricted from inputting confidential pricing, contracts, or customer data into unsanctioned platforms.

Conditional access factors in device management status, user risk score, and location, allowing you to apply tighter controls on unmanaged devices without blocking productivity on managed ones.
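
As an illustration, here is a minimal policy-decision sketch combining role, data class, device status, and app sanction state. The rule table is hypothetical, not Zscaler's policy engine.

```python
# An illustrative access-decision function. The rules are assumptions that
# mirror the role-based guidance above, not a real policy engine.

def decide(role, data_class, device_managed, app_sanctioned):
    """Return one of: allow, warn, isolate, block."""
    if data_class in {"credentials", "regulated"}:
        return "block"  # red data never leaves, regardless of context
    if not app_sanctioned:
        # Prefer isolation over a hard block on managed devices
        return "isolate" if device_managed else "block"
    if role in {"finance", "hr", "legal"} and data_class == "internal":
        return "warn"  # coach before sensitive use
    return "allow"

print(decide("engineering", "public", True, True))  # allow
print(decide("finance", "internal", True, True))    # warn
print(decide("sales", "internal", False, False))    # block
```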

Protect data in motion

Inline DLP capabilities inspect content as it flows to AI applications, detecting and blocking sensitive data types, including credentials, source code, PII, and regulated data before they leave your environment. Zscaler's inline inspection does this across both prompts and file uploads without requiring traffic to be rerouted through a separate DLP tool.

Browser isolation provides a middle ground: allow users to interact with AI tools while restricting cut, copy, paste, upload, and download, reducing risk without hard blocks for high-risk but necessary AI interactions.

Enforce acceptable use

Content moderation rules define what types of interactions are permissible beyond just data sensitivity. Comprehensive audit trails capture user identity, application accessed, prompt content, and response received, providing the evidence trail needed for compliance requirements and incident response.

Coaching workflows matter here. When a policy is triggered, guide the user rather than just blocking and moving on. Explaining why an action was restricted and suggesting alternatives builds a security culture that scales better than enforcement alone.

Govern private and internally built AI

Internal teams building AI applications also require governance. Runtime guardrails protect against prompt injection and data leakage in privately deployed models. Developer-built AI often escapes traditional security review processes. In fact, Zscaler red teaming found critical vulnerabilities in 100% of enterprise AI systems tested, with most systems breachable in just 16 minutes. That applies to internally built apps as much as public ones.
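
For intuition, here is a heavily simplified sketch of a runtime guardrail wrapping a privately deployed model: screen inputs for obvious injection markers and outputs for sensitive echoes before returning. The patterns and the model callable are assumptions; real guardrails are far more robust than this.

```python
# A heavily simplified runtime guardrail sketch. Patterns and the `model`
# callable are illustrative assumptions only.
import re

INJECTION_MARKERS = re.compile(
    r"ignore (all )?(previous|prior) instructions", re.IGNORECASE)
SENSITIVE_OUTPUT = re.compile(r"\bAKIA[0-9A-Z]{16}\b")  # e.g., AWS keys

def guarded_call(model, prompt):
    if INJECTION_MARKERS.search(prompt):
        return "[blocked: possible prompt injection]"
    response = model(prompt)
    if SENSITIVE_OUTPUT.search(response):
        return "[blocked: sensitive data in model output]"
    return response

# Stand-in model for demonstration purposes only.
print(guarded_call(lambda p: f"echo: {p}", "Summarize our Q3 notes"))
```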

A simple three-tier policy framework helps employees understand acceptable use:

The traffic light policy model

  • Green: Approved tools, used with public or non-sensitive information only. No restrictions apply.
  • Yellow: Sanctioned tools with safeguards. Data redaction required, managed device only, no regulated data in prompts or uploads.
  • Red: Prohibited. This includes credentials, regulated data, unreleased product plans, employee records, and confidential contracts.

Employees who want to use an AI tool not currently on the approved list should have a clear path to request a review. Define a simple intake process, such as a form, a Slack channel, or a ticketing workflow, so that tool requests go to security for evaluation rather than going underground.

Your 30-day shadow AI containment plan

Note: This plan assumes you are starting from limited AI visibility. If partial controls are already in place, you can compress the timeline. The goal is a defensible baseline, not a perfect program on day one.

Days 1-7: Establish your baseline

Enable AI application detection across your environment. Identify your top 10 AI apps by usage volume and the top three departments by AI activity.

Define your "red data" categories: the data types that should never appear in an AI prompt or upload under any circumstances. Then set two baseline key performance indicators (KPIs) to measure against throughout the plan: total AI applications discovered across the environment, and volume of prompts and uploads containing sensitive data detected per week. Without these benchmarks, it is difficult to demonstrate progress or justify expanding controls.
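
A minimal sketch of computing those two KPIs from detection events, assuming a simple event shape; any SIEM or analytics tool can produce the same numbers.

```python
# Compute the two baseline KPIs from detection events.
# The event shape is an illustrative assumption.
from collections import Counter
from datetime import date

events = [
    {"app": "chatgpt.com", "week": date(2026, 4, 20), "sensitive": True},
    {"app": "claude.ai",   "week": date(2026, 4, 20), "sensitive": False},
    {"app": "otter.ai",    "week": date(2026, 4, 27), "sensitive": True},
]

# KPI 1: total distinct AI applications discovered.
apps_discovered = len({e["app"] for e in events})

# KPI 2: prompts/uploads containing sensitive data, per week.
sensitive_per_week = Counter(e["week"] for e in events if e["sensitive"])

print(apps_discovered)           # 3
print(dict(sensitive_per_week))  # one sensitive detection in each week
```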

Days 8-14: Put minimum viable guardrails in place

Block or warn on the highest-risk unsanctioned applications identified in Week 1. Enable prompt visibility and classification to track content flowing to AI systems.

Apply inline DLP starting with your highest-risk sensitive data detectors: credentials, source code, and PII. Add warn-and-coach workflows for flagged interactions. Do not just block. Explain what happened and why, and suggest a compliant alternative path.

Days 15-21: Close the exfiltration paths

Deploy browser isolation for high-risk AI categories. Restrict file uploads and downloads to unsanctioned tools.

Apply role-based policies targeting departments that handle particularly sensitive data. Finance, HR, Engineering, and Legal should be your first four. KPI checkpoint: what percentage of AI app usage is now under active policy?

Days 22-30: Sustain and scale

Publish the traffic light policy and tool request process. Stand up weekly reporting covering top applications, top violations, and usage trendlines.

Expand controls to cover privately deployed AI apps and models. Internally built AI carries the same data risk as public tools and is often subject to far less scrutiny. Deliver an executive dashboard covering AI adoption volume, blocked leak attempts, coached users, and overall policy coverage.

While organizational controls deploy, employees can take immediate steps:

  • Use temporary or incognito chat modes when AI tools offer them
  • Replace real identifiers with placeholders such as Client A or $X before including them in prompts (see the helper sketch after this list)
  • Pause before pasting any content containing credentials or sensitive identifiers
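
A tiny helper sketch of the placeholder substitution above; the mapping entries are hypothetical and would be built from your own client and deal names.

```python
# Swap real identifiers for placeholders before pasting into a prompt.
# The mapping entries are hypothetical examples.
PLACEHOLDERS = {
    "Acme Corp": "Client A",
    "$4.2M": "$X",
    "jane.doe@example.com": "employee@redacted",
}

def redact(text):
    for real, placeholder in PLACEHOLDERS.items():
        text = text.replace(real, placeholder)
    return text

print(redact("Draft a renewal note to Acme Corp about the $4.2M deal"))
# Draft a renewal note to Client A about the $X deal
```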


What a mature shadow AI program looks like

Your 30-day plan establishes the foundation. Sustaining it means shifting from reactive containment to continuous governance, and that requires the right architecture underneath it.

Organizations that get this right share a few things in common. Every AI application, prompt, response, and agent interaction is known and inventoried. Access decisions are based on user role, data sensitivity, and device status rather than blanket rules. Sensitive data is intercepted inline before it reaches unsanctioned systems. And usage logs map to compliance frameworks, so audits are tractable rather than painful.

The organizations that struggle are the ones managing this across five or six disconnected point tools. That fragmentation creates gaps, increases operational overhead, and makes it nearly impossible to report coherently on AI risk posture.

The Zero Trust Exchange™ from Zscaler brings it together on a single platform: AI asset discovery, access control, inline data protection, browser isolation, runtime guardrails, and governance alignment across the full AI lifecycle.

See how Zscaler gives you full visibility into your AI environment and the controls to govern it without slowing your teams down.

How Zscaler protects against shadow AI

Zscaler helps you contain shadow AI without turning productivity into an underground workaround, by making AI usage visible, governable, and defensible across the full AI lifecycle. Instead of relying on legacy controls that can’t see into modern AI sessions, Zscaler brings discovery, inline protection, and runtime enforcement together on one platform so “normal work” doesn’t become “silent exfiltration.” That means you can move from zero visibility to measurable control—while staying aligned with evolving AI governance frameworks and internal policy requirements:

  • Find and inventory shadow AI fast by discovering and classifying AI apps—and mapping the broader AI ecosystem (apps, services, models, and connected data) so newly seen tools don’t expand your blind spots.
  • Control access and reduce risky behavior with user- and group-based policies to allow, block, warn, or isolate AI app usage—so teams can keep working while you prevent the highest-risk interactions.
  • Stop sensitive data from leaking in prompts and uploads with high-performance inline inspection that detects and blocks regulated or confidential content (e.g., source code, PII/PHI/PCI) across AI channels before it leaves your environment.
  • Harden AI initiatives with continuous testing and governance alignment using automated AI red teaming and policy mapping to frameworks like NIST AI RMF and OWASP LLM Top 10—so your guardrails and compliance posture keep pace as AI usage scales.

Request a demo to see how Zscaler can help you get shadow AI under control in days—not quarters.

Frequently asked questions

What is shadow AI, and why does it create data risk?

Shadow AI is the use of AI tools, including chatbots, assistants, browser extensions, and embedded AI features, without security or IT approval. It creates data risk because sensitive information routinely moves through prompts and uploads to third-party systems that operate outside your organization's security controls.

Why don't existing security controls catch it?

Most security controls were not built to treat interacting with an AI tool as a data transfer event. But it is one. According to the Zscaler ThreatLabz 2026 AI Security Report, ChatGPT alone generated more than 410 million DLP policy violations in 2025, representing sensitive data that attempted to leave organizations through a single AI application. Without inline inspection and governance, those data flows are invisible to security teams.

Which leak vectors carry the highest risk?

The highest-risk vectors are prompts involving copy-paste of sensitive content; file and media uploads, including documents, spreadsheets, screenshots, and recordings; over-permissioned browser extensions that read all page content; and embedded AI features inside SaaS platforms that operate with user-level access. The most frequently detected data types include name and identity data, source code, medical information, and payment data.

How should organizations respond without killing productivity?

Start with visibility. You cannot govern what you cannot see. Then apply contextual controls: allow approved tools, warn or block unsanctioned ones, inspect prompts with inline DLP, and use browser isolation for high-risk interactions. The goal is to make AI usage governed and auditable, not to eliminate it.

How is shadow AI different from shadow IT?

Shadow IT was about unauthorized apps and devices. Shadow AI is about unauthorized data flows: sensitive information moving through natural work behavior such as typing, pasting, and uploading to third-party AI systems. The risk is harder to detect because it does not look like an exfiltration event. It looks like someone doing their job.

What are the first steps to contain shadow AI?

Enable AI app discovery, identify your highest-risk apps and departments, define what data is never acceptable in a prompt, turn on inline DLP for your most sensitive data types, and add warn-and-coach workflows before going straight to hard blocks. Then publish a simple traffic light use policy so employees know what is expected.
