Shadow AI Scenarios

Maria in HR uploads a hiring contract to ChatGPT

A routine task with devastating consequences for data protection.

Step 1

Maria receives a hiring contract

The document contains the candidate's full name, national ID (kennitala), proposed salary of €65,000, home address, and emergency contact details.

Step 2

She opens ChatGPT

Maria types: "Summarize this contract and check for any issues." She attaches the full document, all 12 pages of it.

Step 3

The data is transmitted to external servers

The entire document content is sent to OpenAI's servers for processing.

Data leaked: Full name, national ID (451090-2389), salary (€65,000), home address, emergency contact details

Step 4

Data is stored and potentially used for training

The personal data is now outside the organization's control. With no legal basis for this processing, the upload violates GDPR Article 6. The candidate was never informed that their data would be sent to a third-party AI provider.

With Sanitica

Full Clean mode prevents this

Sanitica's Full Clean mode intercepts the document and permanently removes the candidate's name, national ID, salary, address, and contact details at binary level. ChatGPT receives a fully sanitized copy. Maria gets her summary. Zero data exposure.
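To make the idea of cleaning text before it leaves the organization concrete, here is a minimal Python sketch. It is not Sanitica's implementation: the patterns, the `redact` function, and the sample name "Jane Doe" are assumptions made for illustration, and real detection needs far broader coverage than two regular expressions and a name list.

```python
import re

# Hypothetical patterns, for illustration only.
KENNITALA = re.compile(r"\b\d{6}-?\d{4}\b")          # 6 digits, optional hyphen, 4 digits
EURO_AMOUNT = re.compile(r"€\s?\d+(?:[.,]\d+)*")     # e.g. €65,000

def redact(text: str, known_names: list[str]) -> str:
    """Replace national IDs, euro amounts, and listed names with placeholders."""
    text = KENNITALA.sub("[NATIONAL_ID]", text)
    text = EURO_AMOUNT.sub("[AMOUNT]", text)
    for name in known_names:
        text = re.sub(re.escape(name), "[NAME]", text)
    return text

sample = "Candidate Jane Doe, ID 451090-2389, proposed salary €65,000."
print(redact(sample, ["Jane Doe"]))
# Candidate [NAME], ID [NATIONAL_ID], proposed salary [AMOUNT].
```

Only the redacted string would be passed to the AI tool; the original never leaves the machine.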

The sales team uses Copilot on their SharePoint folder

AI tools enabled on shared folders expose years of accumulated sensitive data.

Step 1

Client proposals stored in SharePoint

The sales team's shared folder contains 3 years of client proposals, contracts, pricing sheets, and negotiation notes. Over 2,000 documents with client names, deal values, and competitive analysis.

Step 2

IT enables Microsoft Copilot

Copilot is enabled on the SharePoint library. It begins indexing every document in the folder, including tracked changes, comments, and metadata.

Step 3

Confidential data is surfaced

Copilot processes all documents. Hidden comments reveal internal pricing discussions: "We can go as low as €1.2M." Document metadata exposes organizational structure.

Data exposed: Internal pricing floors, negotiation strategies, client contact details, competitive analysis from hidden comments and tracked changes

Step 4

Any employee can query this data

A question such as "What's the lowest price we offered Nordic Solutions?" is now answerable by anyone with Copilot access. Confidential negotiation strategies are exposed, and client NDAs are potentially breached.

With Sanitica

Pseudonymize mode protects identities

Sanitica's Pseudonymize mode processes documents before Copilot indexes them. Client names become consistent aliases (“Client-A7”), personal data is replaced, and confidential comments are removed. Copilot remains useful: it can reference “Client-A7” consistently across documents, but it cannot expose real identities or the internal pricing comments.
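Consistent aliasing can be sketched in a few lines of Python. This is an illustrative approach, not a description of how Sanitica generates aliases: the functions `alias_for` and `pseudonymize`, the `secret` key, and the alias format are all assumptions. A keyed hash (HMAC) maps the same client name to the same alias in every document without keeping a lookup table inside the documents themselves.

```python
import hashlib
import hmac
import re

def alias_for(name: str, secret: bytes) -> str:
    """Derive a stable alias from a keyed hash of the (case-folded) name."""
    digest = hmac.new(secret, name.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return "Client-" + digest[:4].upper()

def pseudonymize(text: str, client_names: list[str], secret: bytes) -> str:
    """Replace every listed client name with its alias, ignoring case."""
    for name in client_names:
        text = re.sub(re.escape(name), alias_for(name, secret), text, flags=re.IGNORECASE)
    return text

secret = b"keep-this-key-inside-the-organization"  # hypothetical key
doc_a = pseudonymize("Proposal for Nordic Solutions, 2023", ["Nordic Solutions"], secret)
doc_b = pseudonymize("Renewal notes: nordic solutions", ["Nordic Solutions"], secret)
print(doc_a)
print(doc_b)
```

Both documents end up carrying the same alias, so an assistant indexing them can still connect the two without ever seeing the real name.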

A developer pastes source code with API keys into an AI assistant

Debugging with AI can expose production credentials and customer data.

Step 1

Debugging a payment processing module

The developer is troubleshooting a bug in the payment integration. The code contains hardcoded API keys, database connection strings, and test fixtures with real customer data.

Step 2

200 lines pasted into AI coding assistant

The developer asks, "Why is this function failing?" and pastes the full source code, including production credentials and customer email addresses from test fixtures.

Step 3

Credentials are now on external servers

The AI processes the full input. Production API keys, database passwords, and customer PII are transmitted and potentially stored.

Data leaked: Production API key (sk_live_4eC39HqLy...), database credentials (host, user, password), 15 customer email addresses from test fixtures

Step 4

Production systems at risk

These credentials could be accessed by the AI provider's employees, included in training data, or exposed in a future breach. This is a potential PCI DSS violation on top of the GDPR breach.

With Sanitica

Full Clean mode replaces secrets

Sanitica's Full Clean mode scans the code before it reaches the AI assistant. API keys, credentials, and PII in test data are identified and permanently replaced with safe placeholders. The developer gets debugging help. No secrets leak.
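A rough Python sketch of this kind of pre-paste scrubbing is shown below. It is not Sanitica's scanner: the three patterns, the placeholder names, and the sample values (including the fake key and password) are assumptions, and a production tool would need many more rules. Note the ordering: credentials embedded in connection URLs are replaced first so that `password@host` is not mistaken for an email address.

```python
import re

# Hypothetical rules, applied in order.
RULES = [
    (re.compile(r"(?<=://)[^:/\s]+:[^@\s]+(?=@)"), "<USER>:<PASSWORD>"),  # user:pass in URLs
    (re.compile(r"\bsk_live_[0-9A-Za-z]+"), "<API_KEY>"),                 # sk_live_ style keys
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "<EMAIL>"),         # email addresses
]

def scrub(source: str) -> str:
    """Replace secrets and email addresses with placeholders."""
    for pattern, placeholder in RULES:
        source = pattern.sub(placeholder, source)
    return source

code = '''
STRIPE_KEY = "sk_live_EXAMPLE0000"
DB_URL = "postgres://admin:hunter2@db.internal:5432/pay"
TEST_USER = "anna@example.com"
'''
print(scrub(code))
```

The structure of the code survives, so the assistant can still reason about the failing function, while the values that matter are gone.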

Finance uploads expense reports with employee SSNs

Batch processing with AI tools can expose the personal data of your entire workforce.

Step 1

Monthly expense reports collected

The finance team collects expense reports from 200 employees. Each report contains the employee's name, social security number, bank account number, home address, and expense descriptions mentioning client names.

Step 2

Batch upload to AI tool

The finance manager uploads all 200 reports at once: "Categorize these expenses and flag any policy violations."

Step 3

200 employees' data exposed

Every employee's personal financial data is now on external servers.

Data leaked: 200 social security numbers, 200 bank account numbers, home addresses, personal spending patterns, client names from expense descriptions

Step 4

Massive breach affecting entire workforce

A breach at the AI provider could expose the financial identity of every employee. Under GDPR, the supervisory authority must be notified within 72 hours of the organization becoming aware of the breach. 200 data subjects affected. Potential identity theft.

With Sanitica

Full Clean mode strips personal data

Sanitica's Full Clean mode processes expense reports before the AI tool sees them. SSNs, bank account numbers, and addresses are removed at binary level. The AI can still categorize expenses and flag violations based on amounts and categories. No personal data leaves the organization.
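For tabular data, one simple way to picture this is an allow-list: keep only the columns the AI needs and drop everything else. The Python sketch below assumes the reports are available as CSV with hypothetical column names (`report_id`, `category`, `amount`, and so on); it illustrates the principle and is not how Sanitica processes files.

```python
import csv
import io

# Hypothetical allow-list: only these columns are forwarded to the AI tool.
KEEP = ["report_id", "category", "amount"]

def strip_personal_fields(csv_text: str) -> str:
    """Return a CSV containing only the allow-listed columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=KEEP, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # columns not in KEEP are silently dropped
    return out.getvalue()

reports = (
    "report_id,employee,ssn,bank_account,category,amount,description\n"
    "1,Jane Doe,000-00-0000,XX00 0000,travel,420.00,Dinner with Nordic Solutions\n"
)
print(strip_personal_fields(reports))
```

An allow-list fails safe: a new column added to the reports later is dropped by default rather than leaked by default.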

A lawyer sends a contract with tracked changes to a client

Hidden revision history reveals your entire negotiation strategy.

Step 1

Contract negotiated over 5 revisions

A partnership agreement goes through 5 rounds of internal revisions. Tracked changes record every edit: “Original price: €80M” → “Revised: €60M” → “Final: €50M.” Comments include: “They'll accept 50M, push for 45M first.”

Step 2

Lawyer exports as PDF and sends to client

The lawyer saves the Word document as PDF and emails it to the client. The visible text shows only the final terms: €50M.

Step 3

PDF retains full revision history

A PDF “saved from Word” can carry over the original document's metadata and, depending on the export settings, tracked changes and comments in its data structure.

Data exposed: Full negotiation history (80M → 60M → 50M), internal comments revealing strategy, author names and edit timestamps, internal file paths

Step 4

Client reads the negotiation strategy

A simple metadata reader or AI tool reveals the full revision history. The client now sees that earlier drafts stood at €80M and that your team planned to push for €45M before settling at €50M. Your leverage is gone. Future negotiations are compromised.

With Sanitica

Metadata Only mode strips hidden history

Sanitica's Metadata Only mode strips tracked changes, comments, author history, and hidden fields from the document. The visible contract text stays exactly as written. The client sees only the final terms: no negotiation history, no internal comments, no metadata.
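Before sending a Word file anywhere, it can help to check whether it carries hidden history at all. The Python sketch below inspects a .docx (which is a ZIP archive) for tracked-change markup, a comments part, and core properties. It only detects; it does not strip anything, and it is not Sanitica's method. The function name `hidden_history` is hypothetical, and the string checks assume the conventional `w:` namespace prefix that Word writes.

```python
import io
import zipfile

def hidden_history(docx_bytes: bytes) -> dict[str, bool]:
    """Report which kinds of hidden history a .docx file appears to contain."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        names = set(archive.namelist())
        body = archive.read("word/document.xml").decode("utf-8", errors="replace")
    return {
        "tracked_insertions": "<w:ins " in body,          # tracked insertions
        "tracked_deletions": "<w:del " in body,           # tracked deletions
        "comments": "word/comments.xml" in names,         # reviewer comments part
        "core_properties": "docProps/core.xml" in names,  # author, last modified by
    }

# Build a tiny stand-in archive to demonstrate the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "word/document.xml",
        '<w:document><w:ins w:id="1" w:author="A"><w:r/></w:ins></w:document>',
    )
    archive.writestr("word/comments.xml", "<w:comments/>")

print(hidden_history(buffer.getvalue()))
```

Any `True` in the result means the file says more than its visible text does.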

How exposed is your organization?