AI Governance · Asset Inventory · Shadow AI · MCP · Risk Management

Building an AI Asset Inventory: Models, Agents, MCP Servers, Datasets, and Prompts You Don't Know About

Eighty-six percent. That's how many organizations lack adequate visibility into AI data flows and don't maintain a usable asset inventory (IBM's 2025 Cost of a Data Breach Report). Most practitioners already felt it: no real view of where AI runs, what data it touches, or what's been stood up in a corner nobody's looked at. You can't govern what you can't see. The gap isn't "we have a chatbot." It's models, agents, MCP servers, datasets, and prompts living in repos, vendor contracts, and ad hoc workflows. An inventory that reflects reality means expanding what you count as an asset, then putting discovery and classification in place so the list doesn't go stale the day after you publish it.

What most inventories miss

The typical first pass is a list of applications: the customer-facing chatbot, the internal copilot, maybe a vendor tool that "uses AI." That's a start. Also a small subset. AI assets span at least five layers that often get ignored or undercounted.

Models: foundation models, fine-tuned variants, embeddings, checkpoints. They live in model registries, object storage, vendor APIs. You need to know which are in use, where they're hosted (your infra vs. vendor), how they're versioned. A team that switched embedding models last quarter may have left the old one in the pipeline; both are assets. Same for models pulled from Hugging Face or other hubs and used in notebooks or internal tools. If you only track "we use GPT-4 for the chatbot," you're missing the model that powers search, the one in the data science pipeline, and the one behind the vendor's black box.

Agents. They don't just answer; they plan, call tools, take actions. Each agent is an asset: what it's allowed to do, which tools it can invoke, under what identity. Agents multiply your attack surface; a single agent can touch many downstream systems. An inventory that lists "applications" but not "agents" misses the automation making decisions or moving data without a human in the loop. Those need to be first-class entries: name, owner, tool allowlist, how they're monitored.

MCP servers. The Model Context Protocol is the standard way for agents and copilots to talk to external tools and data. Your teams are almost certainly using them. Cursor, Claude Code, GitHub Copilot, and internal agent frameworks all support MCP. A single environment can have dozens: filesystem, databases, Slack, internal APIs. MCP servers are AI assets. They define what your models and agents can read and do. Unlisted, they're invisible until something goes wrong. You need a registry of which MCP servers exist, who approved them, what they expose, which agents use them. Azure API Center and others are starting to offer MCP server inventory; if you're not there yet, start with a manual or script-driven list and tie it to your agent and app inventory.
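A "manual or script-driven list" can start from the client configs themselves. The sketch below is one hypothetical approach: most MCP clients (Cursor, Claude Code, and others) keep their servers under an "mcpServers" key in a JSON config, though file names and exact layout vary by client; the inventory field names here are illustrative, not a standard.

```python
import json
from pathlib import Path

def extract_mcp_servers(config_path: Path) -> list[dict]:
    """Return one inventory row per MCP server found in a client config file."""
    config = json.loads(config_path.read_text())
    rows = []
    for name, spec in config.get("mcpServers", {}).items():
        rows.append({
            "server": name,
            # stdio servers declare a command; remote servers declare a URL
            "target": spec.get("command") or spec.get("url", "unknown"),
            "args": spec.get("args", []),
            "source_file": str(config_path),
            "approved_by": None,  # filled in during review, not by the scan
        })
    return rows
```

Run this over every workstation or repo config you can reach and you have a first-pass registry to reconcile against approvals.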

Datasets: training sets, evaluation sets, RAG knowledge bases, vector stores. They drive model behavior and retrieval. The model is only part of the story; the data it was trained on or retrieves from is the other. Datasets belong in the inventory: what they contain (and sensitivity), where they live, who can change them, which models or applications depend on them. When a RAG index is updated, behavior can shift. When a training set is swapped, your fine-tuned model changes. No dataset inventory means you can't trace drift or compliance exposure back to a specific asset.

Prompts. System prompts, few-shot examples, prompt templates. Not "just text." They encode policy, constraints, sometimes sensitive context. A leaked system prompt can expose instructions or data you didn't intend to publish. Prompts that drift (e.g., copied and edited in a doc somewhere) create inconsistent behavior and compliance risk. Treat them as versioned assets: where they live, who can edit them, which systems use which version. That means knowing what's in production and how it's controlled, not locking everything down.
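"Versioned assets" can be as lightweight as a content hash per prompt file, compared against what the inventory last recorded. A minimal sketch, assuming prompts live as files; the function names are hypothetical:

```python
import hashlib
from pathlib import Path

def prompt_version(path: Path) -> str:
    """Content hash of a prompt file, usable as a lightweight version id."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:12]

def detect_drift(recorded: dict[str, str], prompt_paths: list[Path]) -> list[str]:
    """Return prompt files whose content no longer matches the recorded version.
    Files with no recorded version are skipped (they're new, not drifted)."""
    drifted = []
    for path in prompt_paths:
        current = prompt_version(path)
        if recorded.get(str(path)) not in (None, current):
            drifted.append(str(path))
    return drifted
```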

If your inventory only has "applications" and maybe "models," you're missing agents, MCP servers, datasets, and prompts. Each of those gaps is a place where risk and shadow AI hide.

Discovery: how you find what you didn't declare

Declared assets are the easy part. The rest show up when you run discovery that doesn't rely on people remembering to fill out a form.

Procurement and contracts: any new or renewed contract that might involve AI (SaaS, APIs, embedded features) should feed the inventory. Product name, vendor, what the AI does, what data it can access, who owns the relationship. If procurement doesn't have a step for "does this involve AI?" add one. Highest-leverage place to catch vendor-supplied models, agents, and data flows.

Code and repos. Scan for model references (API clients, model IDs, paths to checkpoints), MCP server configs, prompt files. Many teams keep prompts in code, config, or docs. Repo scanning won't find everything, but it will find a lot of what never made it onto a spreadsheet. Automate where you can; run it on a schedule and when new repos or major changes land.
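A repo scan can be a handful of regexes before it's a product. The patterns below are illustrative only (model-id strings, AI SDK imports, prompt-looking file paths); tune them for your stack rather than treating them as complete:

```python
import re
from pathlib import Path

# Illustrative patterns; each hit suggests a file touches an AI asset.
PATTERNS = {
    "model_id": re.compile(r"\b(gpt-[\w.-]+|claude-[\w.-]+|gemini-[\w.-]+)\b"),
    "sdk_import": re.compile(r"^\s*(?:import|from)\s+(openai|anthropic|boto3)\b", re.M),
    "prompt_file": re.compile(r"\b[\w/]+prompt[\w/]*\.(?:txt|md|ya?ml|json)\b"),
}

def scan_file(path: Path) -> list[tuple[str, str]]:
    """Return (finding_kind, matched_text) pairs for one source file."""
    text = path.read_text(errors="ignore")
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((kind, match))
    return hits
```

Each hit becomes an inventory candidate tagged with its repo and file, which is what you reconcile against the declared list.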

Runtime and network. Outbound traffic to known AI providers (OpenAI, Anthropic, Google, Azure OpenAI, Bedrock, etc.) is a signal. So is traffic to MCP endpoints if you can identify them. Correlate with your allowlist: sanctioned use vs. everything else. CASB, proxy, and firewall logs can show which apps and hosts are talking to AI services. You're not trying to block first. You're trying to see. What you find gets reconciled with the inventory and either sanctioned, replaced, or restricted with a clear reason.
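The correlate-with-allowlist step can be sketched as a small triage pass over exported logs. The log format and provider domains below are hypothetical examples; adapt the parser to whatever your proxy or CASB actually emits:

```python
# Assumed log line format: "timestamp source_host dest_domain" (one per line).
# Provider domains are examples, not an exhaustive list.
AI_PROVIDER_DOMAINS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}

def triage_ai_traffic(log_lines, allowlisted_hosts):
    """Split AI-bound traffic into sanctioned vs. needs-review, by source host."""
    sanctioned, review = set(), set()
    for line in log_lines:
        _, src, dest = line.split()
        if dest in AI_PROVIDER_DOMAINS:
            (sanctioned if src in allowlisted_hosts else review).add((src, dest))
    return sanctioned, review
```

The "review" bucket is the shadow-AI lead list: each entry either gets sanctioned, replaced, or restricted, and the decision goes in the inventory.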

Surveys and business-unit checks. Ask teams what they use for generation, summarization, coding help, or automation. Ask what they'd use if they needed to generate or classify something quickly. You'll get a mix of approved tools and shadow use. Use the answers to prioritize technical discovery and close the loop: add what you find, assign ownership, classify risk.

Discovery is iterative. The first pass will be messy. Shrink the unknown; don't chase perfection. Each cycle should add missing assets and flag ones that need assessment or remediation.

Classification: making the inventory actionable

An inventory that's only a list isn't enough. Each asset needs enough structure to support risk decisions and ongoing governance.

Ownership and source: who owns this asset (team or person)? How was it discovered (declared, procurement, scan, survey)? If everything came from self-report, you're missing things. Track source so you can weight and improve discovery.

Function and context. What does it do? Internal vs. customer-facing, human-in-the-loop vs. autonomous, decision-support vs. full automation. That drives risk tier and who cares. A model that suggests the next support ticket is different from one that denies a loan. One line of description plus deployment context is enough to triage.

Data and integration. What inputs does it use, and from where? What does it output, and who or what consumes it? Data lineage and integration points are where a lot of risk hides. A summarization tool with access to PII is a compliance and breach story. For MCP servers, document what they expose and which agents or apps call them.

Risk tier. Each asset should have a risk classification aligned with your framework (EU AI Act, NIST AI RMF, or your own taxonomy). The tier determines how much assessment and monitoring you do. Can be provisional at first. Make it explicit and revisable as you learn more.

Provenance and updates. Is the model or dataset fixed or updated? If updated, on what data and how often? For vendor systems, what do you know (or not know) about their model and refresh cycle? "We don't know" is valid if it's documented and triggers follow-up.

You don't need every field perfect on day one. You need enough to prioritize, assign accountability, and know what to do when something changes or a regulator asks.
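The fields above can live in a spreadsheet, but a typed record keeps them consistent across tools. A minimal sketch of one inventory entry; the field names and tier labels are assumptions (the tiers loosely echo EU AI Act language), not a standard schema:

```python
from dataclasses import dataclass, field
from enum import Enum

class RiskTier(Enum):
    HIGH = "high"
    LIMITED = "limited"
    MINIMAL = "minimal"
    UNASSESSED = "unassessed"  # provisional is fine; explicit and revisable

@dataclass
class AIAsset:
    name: str
    asset_type: str           # model | agent | mcp_server | dataset | prompt
    owner: str                # team or person accountable
    discovery_source: str     # declared | procurement | scan | survey
    description: str = ""     # one line: what it does, deployment context
    data_inputs: list[str] = field(default_factory=list)
    data_outputs: list[str] = field(default_factory=list)
    risk_tier: RiskTier = RiskTier.UNASSESSED
    provenance: str = "unknown"  # "we don't know" is valid if documented
```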

Keeping it current: triggers and loops

An inventory that's updated only when someone remembers is a snapshot from the past. It stays useful when updates are triggered by real events.

Tie inventory updates to procurement (new or renewed contracts that involve AI), release pipelines (new or changed models, agents, or prompts at deploy time), and periodic discovery (quarterly or so: scans, surveys, reconciliation). When a new regulation or policy lands, use it as a reason to refresh and reclassify. Treat compliance deadlines as inventory-hygiene deadlines.

For MCP servers and agents, add a trigger when new servers are registered or new agents are deployed. Whoever operates the MCP registry or agent framework should have a lightweight step: new asset → inventory entry (or link to existing). Over time, discovery plus these triggers turns "we don't know what's out there" into "we know, and we've decided what to do with it."
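The "new asset → inventory entry" step can be a single upsert called from the release pipeline or registry hook. A sketch under assumed event and inventory shapes (all field names here are hypothetical):

```python
def upsert_on_deploy(inventory: dict, event: dict) -> dict:
    """Deploy-time hook: every new or changed AI asset lands in the inventory,
    stamped with how and when it arrived."""
    key = (event["asset_type"], event["name"])
    entry = inventory.setdefault(key, {"first_seen": event["deployed_at"]})
    entry.update(
        owner=event["owner"],
        version=event.get("version"),
        last_seen=event["deployed_at"],
        discovery_source="release_pipeline",
    )
    return entry
```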

The cost of staying blind

Eighty-six percent blind to AI data flows isn't a club you want to join. Shadow AI already accounts for a material share of breaches; those incidents cost more when the tool was never in the inventory. Know what you have (models, agents, MCP servers, datasets, prompts), then classify and monitor so governance and risk decisions are based on reality, not a stale spreadsheet. Start with the five layers. Run discovery. Add structure. Plug in triggers. Repeat. The inventory that matters is the one still being updated in six months.


Building or tightening an AI asset inventory? Reach out for independent AI risk assessments and inventory-driven governance.
