LayerX’s 2025 Enterprise AI and SaaS Data Security Report gave the problem a number: 77% of employees have pasted company data into a generative AI tool. Half of them admitted to pasting sensitive business data; 18% to highly sensitive or proprietary material. A fifth of copy-pastes and two-fifths of file uploads to GenAI sites contained PII or payment data. The kicker: 82% of that activity came from unmanaged personal accounts. So even when the data is flowing out, most enterprises can’t see it. Harness’s State of AI-Native Application Security 2025 survey sharpens the picture: 62% of security practitioners say they have no visibility into where LLMs are in use across their organization. Shadow AI isn’t a side effect of adoption. It’s the default. The real work is building detection and governance that shrinks the shadow instead of pretending you can ban it away.
The visibility gap isn’t primarily a tooling gap
When two-thirds of security teams can’t say where LLMs are running, the cause isn’t only “we didn’t buy a product.” LLMs show up in sanctioned apps (Copilot in M365, AI in Notion, Zoom, Salesforce), in homegrown integrations, in developer experiments on company cloud accounts, and in the browser tab where someone is logged into ChatGPT on a personal account. Traditional asset and app discovery was built for installed software and SSO-connected SaaS. It doesn’t naturally see “this user is sending prompts to an API” or “this session is hitting an AI provider.” So visibility breaks down along several axes: which applications use or embed LLMs, which accounts (corporate vs personal) are in use, and which data is going in. LayerX’s 82% unmanaged-account figure is the signal: the bulk of high-risk behavior is happening in channels the organization never designed to observe. You can’t govern what you don’t see. So the first goal of a shadow AI program is to make the invisible visible—then decide what to do with it.
Detection that doesn’t depend on one silver bullet
No single control sees everything. Effective detection is a combination of signals.
Application and SaaS usage. CASB, SaaS security posture, or similar tooling that knows which cloud apps are in use can be extended with an “AI-enabled” or “GenAI” classification. Traffic to known AI provider domains and APIs (OpenAI, Anthropic, Google AI, Azure OpenAI, etc.) and to apps that embed or resell them gives you a list of touchpoints. That tells you where AI is being used from the corporate environment, even when the account is personal. You’re not reading prompts; you’re seeing that a user or device is talking to an AI endpoint. That’s enough to start a risk conversation.
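For concreteness, here’s a minimal sketch of that classification pass in Python, assuming you can export your CASB or SSPM app list as a CSV; the column names and the provider domain list are illustrative, not any vendor’s actual schema.

```python
# Sketch: tag a CASB/SSPM app export with a "GenAI" classification.
# The CSV columns and the provider domain list are illustrative assumptions.
import csv

# Known GenAI provider and AI-embedding SaaS domains (extend for your environment).
GENAI_DOMAINS = {
    "openai.com", "chatgpt.com", "anthropic.com", "claude.ai",
    "generativelanguage.googleapis.com", "gemini.google.com", "openai.azure.com",
}

def is_genai(domain: str) -> bool:
    """True if the domain is, or is a subdomain of, a known GenAI endpoint."""
    domain = domain.lower().rstrip(".")
    return any(domain == d or domain.endswith("." + d) for d in GENAI_DOMAINS)

with open("casb_app_export.csv") as f, open("apps_tagged.csv", "w", newline="") as out:
    reader = csv.DictReader(f)
    writer = csv.DictWriter(out, fieldnames=list(reader.fieldnames) + ["genai"])
    writer.writeheader()
    for row in reader:
        row["genai"] = "yes" if is_genai(row["primary_domain"]) else "no"
        writer.writerow(row)
```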
Identity and access. Where you enforce SSO, the IdP application catalog is a natural boundary. AI tools that support SAML/OIDC but aren’t in the catalog are shadow by definition. You won’t see every free-tier, no-SSO signup, but you’ll see team- or department-adopted tools that got connected to corporate identity. Those are prime candidates for “bring into governance or replace.”
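A minimal sketch of that comparison, assuming both the IdP catalog and your discovered AI tools can be exported as plain-text lists, one name per line; the file names are illustrative.

```python
# Sketch: AI tools seen in traffic but absent from the IdP application catalog
# are shadow by definition. Input files are assumed exports, one app per line.
def load_names(path: str) -> set[str]:
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

idp_catalog = load_names("idp_app_catalog.txt")        # apps behind SSO
ai_tools_seen = load_names("ai_tools_discovered.txt")  # from CASB/proxy discovery

shadow_ai = ai_tools_seen - idp_catalog
for tool in sorted(shadow_ai):
    print(f"shadow AI candidate: {tool} (not in IdP catalog)")
```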
Network and endpoint. Proxy, firewall, or SWG logs can show outbound calls to AI provider APIs. You need an allowlist of sanctioned destinations (e.g. your own API keys, approved SaaS that uses those providers) and then treat everything else as shadow until it’s reviewed. On the endpoint, browser or device telemetry can show repeated access to AI tool domains and AI-related extensions. That catches the long tail of web-only use that never shows up in SSO or as a formal “app.” Privacy and transparency matter: staff should know that corporate devices are monitored for security; frame it as “we need to see which AI tools touch our data so we can secure them,” not “we’re watching every tab.”
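A sketch of that allowlist pass over a proxy or SWG log export; the columns, the provider host list, and the sanctioned set are assumptions you’d swap for your own.

```python
# Sketch: flag outbound AI traffic that isn't on the sanctioned allowlist.
# Assumes a proxy/SWG log exported as CSV with "user" and "dest_host" columns;
# the columns, host list, and allowlist are illustrative.
import csv
from collections import Counter

AI_PROVIDER_HOSTS = ("openai.com", "anthropic.com", "claude.ai",
                     "generativelanguage.googleapis.com", "openai.azure.com")
SANCTIONED = {"api.openai.com"}  # e.g. traffic via your own enterprise keys/tenant

unsanctioned = Counter()
with open("proxy_log.csv") as f:
    for row in csv.DictReader(f):
        host = row["dest_host"].lower()
        if any(host == h or host.endswith("." + h) for h in AI_PROVIDER_HOSTS):
            if host not in SANCTIONED:
                unsanctioned[(row["user"], host)] += 1

# Review queue: who is talking to which unsanctioned AI endpoint, and how often.
for (user, host), hits in unsanctioned.most_common():
    print(f"{user} -> {host}: {hits} requests (shadow until reviewed)")
```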
Data in motion (where you have it). DLP or secure web gateways that inspect content can sometimes detect paste/upload behavior or sensitive data heading to known AI domains. Coverage is uneven and can be brittle, but where it’s already in place, it’s an extra signal. Prioritize high-sensitivity data and regulated domains.
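Where content inspection is already in place, even a crude pattern check adds signal. A sketch, with illustrative patterns and an illustrative event shape; a real DLP policy is far richer, and this will both miss things and false-positive.

```python
# Sketch: an extra, admittedly brittle signal where content inspection exists.
# Checks outbound payloads headed to AI domains for obvious PII/payment patterns.
# The patterns and the event structure are illustrative assumptions.
import re

PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def sensitive_hits(payload: str) -> list[str]:
    """Return the names of patterns found in an outbound payload."""
    return [name for name, rx in PATTERNS.items() if rx.search(payload)]

# Example event, roughly as a DLP/SWG integration might hand it to you.
event = {"user": "jdoe", "dest": "chat.openai.com",
         "payload": "Customer: Jane Roe, jane.roe@example.com, SSN 123-45-6789"}
hits = sensitive_hits(event["payload"])
if hits:
    print(f"{event['user']} -> {event['dest']}: possible {', '.join(hits)} in upload")
```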
None of these alone is sufficient. Together they give you a map: which AI touchpoints exist, which are sanctioned, which are personal or unmanaged, and where the biggest data-exposure risk likely is.
Governance that enables instead of only blocking
If the first response to “we found shadow AI” is “block it all,” usage moves to personal devices, home networks, and mobile—and you see even less. The goal is to bring as much as possible into the light and govern it, and to restrict only what can’t be safely allowed.
Triage by data and risk. Not every unsanctioned use is equal. Summarizing public content in a free chatbot is not the same as pasting customer PII or source code. Use detection to prioritize: which apps, which teams, which data types (where inferable). Focus first on high-sensitivity data and high-risk functions (finance, HR, healthcare, legal, product/IP). That’s where the 77% and the unmanaged-account stats turn into real incidents and regulatory exposure.
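A toy scoring pass makes the prioritization concrete. The weights and categories below are illustrative, not a standard; the point is that data type, team, and account type (corporate vs personal) drive the review queue.

```python
# Sketch: a crude triage score so review effort goes to the riskiest touchpoints first.
# Weightings and categories are illustrative assumptions.
DATA_WEIGHT = {"public": 0, "internal": 1, "source_code": 3, "pii": 3, "payment": 4, "unknown": 2}
TEAM_WEIGHT = {"finance": 2, "hr": 2, "healthcare": 3, "legal": 2, "engineering": 2, "marketing": 1}

def triage_score(touchpoint: dict) -> int:
    score = DATA_WEIGHT.get(touchpoint.get("data_type", "unknown"), 2)
    score += TEAM_WEIGHT.get(touchpoint.get("team", ""), 1)
    if touchpoint.get("account") == "personal":  # unmanaged accounts dominate the risk
        score += 2
    return score

touchpoints = [
    {"tool": "chatgpt.com", "team": "hr", "data_type": "pii", "account": "personal"},
    {"tool": "notion-ai", "team": "marketing", "data_type": "internal", "account": "corporate"},
]
for tp in sorted(touchpoints, key=triage_score, reverse=True):
    print(triage_score(tp), tp["tool"], tp["team"])
```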
Sanction where it fits. When a tool is widely used and the use case is legitimate and lower risk, consider bringing it in. Negotiate an enterprise agreement, add data and security terms, run it through SSO and acceptable-use boundaries, and put it on the approved list. Users keep working; you gain visibility and control. That’s the opposite of being the team that only says no.
Replace when you must. When a tool can’t meet your security or compliance bar, offer a sanctioned alternative and explain why the other one isn’t allowed. When there’s no alternative and the use case is business-critical, you’re in a risk decision: accept with explicit controls and approval, or restrict and document the impact. The decision should be informed by detection and risk, not by default denial.
Policy and communication. Publish a short list of approved AI tools and how to request one. Make it easy to ask “can I use X for Y?” and give clear, fast answers. If the first message from security is “stop,” the next round of shadow use will be harder to see. If the message is “we’re mapping what’s in use so we can secure it and approve good options,” you have a chance to align policy with reality. The 77% didn’t paste data because they wanted to cause a breach; they had a job to do and a tool that seemed to help. Governance that gives them a safe path is more effective than one that only forbids.
Close the loop with inventory and process
Treat shadow AI discovery as input to a living AI inventory. Each detected touchpoint is a candidate: add it, assign an owner, classify risk, and track whether it was sanctioned, replaced, or restricted. Over time, “we don’t know what’s out there” becomes “we know, and we’ve decided.” Detection feeds triage; triage feeds policy and approved lists; policy feeds communication and training. Re-run discovery periodically—new tools and new integrations show up constantly. The 62% visibility gap closes when detection and governance are a repeating process, not a one-off project.
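One way to keep that loop honest is a small record per touchpoint that forces an owner and a decision; the fields below are an illustrative sketch, not a schema you have to adopt.

```python
# Sketch: a minimal record for a living AI inventory, so every detected
# touchpoint ends up with an owner and a decision. Field names are illustrative.
from dataclasses import dataclass, field
from datetime import date
from enum import Enum

class Decision(Enum):
    PENDING = "pending"
    SANCTIONED = "sanctioned"
    REPLACED = "replaced"
    RESTRICTED = "restricted"

@dataclass
class AITouchpoint:
    tool: str
    source: str                 # which detection signal found it (CASB, IdP, proxy, DLP)
    owner: str = "unassigned"
    risk: str = "untriaged"     # e.g. low / medium / high after triage
    decision: Decision = Decision.PENDING
    last_reviewed: date | None = None
    notes: list[str] = field(default_factory=list)

inventory = [AITouchpoint(tool="claude.ai", source="proxy logs")]
inventory[0].owner = "security"
inventory[0].decision = Decision.SANCTIONED
inventory[0].last_reviewed = date.today()
```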
Seventy-seven percent of employees have already put company data into a chatbot. Sixty-two percent of security teams can’t see where LLMs run. The fix isn’t to ban AI until the problem goes away; it’s to see it, triage it, and govern it so that the next incident isn’t the first time you’re hearing about the tool.
Worried about shadow AI and uncontrolled LLM usage? We do independent AI risk assessments and governance program design. Get in touch.