How to Audit ‘Incognito’ AI Privacy Claims

A practical audit framework for AI privacy claims: retention tests, ephemeral session checks, logs, and contract controls.

Vendors love the word incognito. It signals discretion, gives buyers a sense of safety, and implies that a chat disappears as if it never existed. But recent reporting on the Perplexity lawsuit, which suggests that “Incognito” chats may not be so private, is a useful reminder for security and compliance teams: branding is not a control. If your business is evaluating third-party AI assistants, you need a repeatable way to test privacy claims, validate retention behavior, and document contractual protections before sensitive data ever enters the system.

This guide gives you that framework. It translates vague vendor promises into an audit process you can run with procurement, legal, IT, and security stakeholders. You will learn how to perform data retention testing, verify whether a session is truly ephemeral, review logging behavior, and insist on contract controls that back up marketing claims. The goal is not to assume every vendor is deceptive. The goal is to avoid accidental exposure, unsupported assurances, and audit gaps when regulators, customers, or internal risk committees ask the hard questions.

For teams building AI governance programs, this article pairs well with practical procurement guidance like selecting an AI agent under outcome-based pricing and security-first integration planning such as integration patterns for data flows, middleware, and security. In other words: don’t evaluate AI privacy in isolation. Evaluate it as part of your broader control environment.

Why “Incognito” Is Not a Privacy Guarantee

Privacy language is marketing until it is measured

Security teams routinely see “private mode,” “temporary chat,” “ephemeral session,” and “no training on your data” in vendor materials. Those phrases may describe intended behavior, but they do not prove what the platform actually does under load, across subsystems, or after a policy change. In practice, a vendor can retain metadata, store logs for abuse detection, back up content in disaster recovery systems, or keep conversational traces in support tooling even if the user interface suggests deletion. That is why privacy verification has to focus on observable behavior and contractual evidence, not on labels.

The Perplexity lawsuit matters because it spotlights a familiar pattern: consumers and enterprises often infer stronger data protection than the product truly provides. For defenders, the lesson is broader than one company. Any subscription-less AI feature, copiloted workflow, or external LLM plugin can create hidden retention paths. If your policies allow staff to paste source code, credentials, customer data, or regulated data into an AI interface, then a misleading “incognito” claim becomes a governance problem, not a UX detail.

Ephemeral in the UI does not always mean ephemeral in the stack

Many AI services use multi-layered architectures: front-end session handling, application logs, analytics pipelines, abuse detection systems, content moderation stores, and backup systems. A session may disappear from the user dashboard while copies of the request persist in server logs or operational telemetry. Even when content is formally “deleted,” retention windows can vary by system and purpose. That is why you should separate user-visible deletion from backend retention during your review.

Pro tip: When a vendor says “temporary” or “incognito,” ask them to define the exact systems affected: UI history, application logs, object storage, backups, support tickets, analytics, and model-improvement pipelines. If they cannot map the flow, they probably have not governed it well either.

The real risk is not only privacy loss, but false assurance

False assurance is especially dangerous in regulated environments. Teams may route confidential documents into a tool because they believe the session is private, then later discover that logs were retained for 30, 90, or 180 days. That can trigger policy violations, breach notification obligations, contractual breaches, or audit findings. It can also undermine internal AI adoption if employees stop trusting the security team’s guidance. The better approach is to treat every vendor claim as a hypothesis to test.

Build a Vendor Audit Framework for AI Privacy

Start with a claim inventory

Before you do any technical validation, build a claim inventory. Capture every privacy statement from the website, documentation, terms of service, security whitepaper, DPA, and sales deck. Group claims into categories such as retention, deletion, training, access control, subprocessors, data residency, and logging. For each claim, assign an owner and define what evidence would confirm or refute it. This reduces the common problem where procurement, legal, and security review different parts of the vendor story but never reconcile them into one control set.
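A claim inventory can be kept in a spreadsheet, but a small data structure makes the required fields explicit. The sketch below is one possible shape, not a prescribed schema: the `Claim` class, its field names, and the status values are assumptions made for illustration, while the category list comes from the paragraph above.

```python
from collections import defaultdict
from dataclasses import dataclass

# Categories named in the text above.
CATEGORIES = {
    "retention", "deletion", "training", "access control",
    "subprocessors", "data residency", "logging",
}


@dataclass
class Claim:
    """One vendor privacy statement (hypothetical schema)."""
    text: str                  # the vendor's statement, quoted verbatim
    source: str                # e.g. "website", "DPA", "sales deck"
    category: str              # must be one of CATEGORIES
    owner: str                 # internal reviewer responsible for the claim
    evidence_needed: str       # what would confirm or refute the claim
    status: str = "untested"   # "untested", "confirmed", or "refuted"

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


def group_by_category(claims):
    """Return {category: [claims]} so overlapping statements sit side by side."""
    grouped = defaultdict(list)
    for claim in claims:
        grouped[claim.category].append(claim)
    return dict(grouped)


def needs_attention(claims):
    """Return claims that have no owner or have not been tested yet."""
    return [c for c in claims if not c.owner or c.status == "untested"]
```

Grouping by category puts the website, DPA, and sales-deck versions of the same promise next to each other, which is where contradictions tend to surface.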

Use the same discipline you would apply to other high-risk vendor evaluations. A practical reference point is protecting data with vendor contracts and portability checks, which shows why operational and legal controls must move together. Privacy claims are no different. If the vendor says data is deleted on request, your evidence should include both a technical deletion path and a contractual deletion commitment with timing, scope, and exceptions spelled out.

Define what “private” means for your use case

Not every environment needs the same control bar. A developer testing public code snippets may tolerate a different risk profile than a hospital, financial institution, or SaaS company handling customer records. Create a use-case matrix that specifies which data classes are allowed, prohibited, or conditionally allowed. For each class, determine whether the tool must support no-retention mode, tenant isolation, regional hosting, customer-managed keys, or admin access restrictions. This prevents vague statements like “we can probably use it for low-risk tasks” from becoming policy.
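One way to keep the use-case matrix from drifting back into vague statements is to encode it so that every data class has an explicit rule. The following is a minimal sketch under stated assumptions: the data classes and their required controls are hypothetical examples, and your own matrix will differ.

```python
from enum import Enum


class Rule(Enum):
    ALLOWED = "allowed"
    CONDITIONAL = "conditionally allowed"
    PROHIBITED = "prohibited"


# Hypothetical matrix: data class -> (rule, controls the tool must provide).
USE_CASE_MATRIX = {
    "public code snippets": (Rule.ALLOWED, set()),
    "internal documents": (
        Rule.CONDITIONAL,
        {"no-retention mode", "tenant isolation"},
    ),
    "customer records": (
        Rule.CONDITIONAL,
        {"no-retention mode", "tenant isolation", "regional hosting",
         "customer-managed keys"},
    ),
    "credentials": (Rule.PROHIBITED, set()),
}


def is_permitted(data_class, vendor_controls):
    """Return True only if this data class may be used with this vendor."""
    # Data classes missing from the matrix default to prohibited.
    rule, required = USE_CASE_MATRIX.get(data_class, (Rule.PROHIBITED, set()))
    if rule is Rule.PROHIBITED:
        return False
    if rule is Rule.ALLOWED:
        return True
    # Conditional: every required control must be present.
    return required <= set(vendor_controls)
```

Defaulting unknown data classes to prohibited is a deliberate choice here: it forces someone to classify new data before it reaches the tool.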

Teams often benefit from the same kind of evaluation discipline used in other “claim-heavy” categories, whether they are assessing product efficacy or performance. For a useful mindset, see how claims-driven shopper guides teach buyers to verify “what’s worth buying” by testing claims rather than repeating them. The analogy is simple: trust the evidence, not the packaging.

Score the vendor on control depth, not control count

A long checklist is not enough. Five weak, unenforceable controls can be worse than two strong ones. Score each vendor on whether it provides measurable control depth: configurable retention windows, exportable audit logs, customer-controlled deletion, documented subprocessors, SSO/SAML, SCIM, data residency options, and explicit training opt-outs. Then score whether those controls are actually available on your purchased tier. Many vendors reserve the strongest privacy features for enterprise plans or custom contracts.
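The two-step scoring described above (does the control exist, and is it on your tier) can be sketched as a small function. The control names come from the paragraph; the equal default weights and the function name `score_vendor` are assumptions, and a real program would likely weight controls by risk.

```python
# Controls listed in the text above.
CONTROLS = [
    "configurable retention windows",
    "exportable audit logs",
    "customer-controlled deletion",
    "documented subprocessors",
    "SSO/SAML",
    "SCIM",
    "data residency options",
    "explicit training opt-outs",
]


def score_vendor(offered, on_purchased_tier, weights=None):
    """Score only controls that are both offered and usable on your tier.

    Returns (score, tier_gaps), where tier_gaps lists controls the vendor
    advertises but that are not available on the purchased plan.
    """
    weights = weights or {control: 1 for control in CONTROLS}
    usable = set(offered) & set(on_purchased_tier)
    score = sum(weights.get(control, 0) for control in usable)
    tier_gaps = sorted(set(offered) - set(on_purchased_tier))
    return score, tier_gaps
```

The `tier_gaps` list is often the more useful output: it names the features that appear in the sales material but would require an upgrade or custom contract.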

| Audit Area | What the Vendor Says | What You Should Verify | Evidence to Collect | Risk if Missing |
| --- | --- | --- | --- | --- |
| Chat retention | “Incognito chats are private” | Backend retention duration and systems affected | Retention policy, test account screenshots, support confirmation | Regulated or sensitive data persists unexpectedly |
| Deletion | “Users can delete chats” | Whether deletion removes logs, backups, and analytics | Deletion workflow, DPA language, retention exception list | Data remains accessible after deletion |
| Training use | “We do not train on your data” | Whether prompts or outputs are used for model improvement or abuse detection | Terms of service, privacy policy, opt-out setting | Confidential content enters training pipelines |
| Access control | “Enterprise-grade security” | SSO, RBAC, audit logs, admin controls, MFA enforcement | Admin console screenshots, security docs | Overbroad employee access and weak traceability |
| Data residency | “Global infrastructure” | Actual region of storage, processing, and support access | Subprocessor list, regional architecture statement | Cross-border transfer and compliance issues |

How to Test Data Retention Claims in Practice

Run a controlled prompt test

One of the fastest ways to evaluate a vendor is to create a controlled test account and submit unique marker content. Use a string that is easy to identify later, such as a sentence containing a random token, a fake phone number, and a placeholder record ID. Then check whether the content appears in the session history, whether it remains retrievable after deletion, and whether it resurfaces in exports or support logs. If the tool has an “incognito” mode, repeat the same test there and compare behavior against a standard session.
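Marker content is easiest to trace when each session gets its own generated values. Here is a minimal sketch of a marker generator; the function name, sentence wording, and `REC-` identifier format are illustrative assumptions. The phone number is drawn from the 555-0100 to 555-0199 range, which is reserved for fictional use in North America, so it cannot belong to a real person.

```python
import secrets


def make_marker(label):
    """Build a unique, harmless marker sentence for one test session."""
    token = secrets.token_hex(8)  # 16 random hex characters
    # 555-01xx numbers are reserved for fictional use in North America.
    phone = f"555-01{secrets.randbelow(100):02d}"
    # Placeholder record ID in a made-up format.
    record_id = f"REC-{secrets.randbelow(10**6):06d}"
    sentence = (
        f"Audit marker {token} for {label}: "
        f"call {phone} about record {record_id}."
    )
    return {
        "label": label,
        "token": token,
        "phone": phone,
        "record_id": record_id,
        "sentence": sentence,
    }


if __name__ == "__main__":
    # One marker per mode, so later hits can be traced to the session type.
    for mode in ("standard", "incognito"):
        print(make_marker(mode)["sentence"])
```

Using a different token for the standard and incognito sessions means that if a marker resurfaces later, you know which mode leaked it.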

This is the privacy equivalent of a lab test. You are not trying to prove the vendor is malicious; you are trying to discover what the system actually retains. If the vendor is serious, it should be able to explain the specific retention path and show you the controls that reduce it. If the vendor is evasive, that is itself a signal. Teams used to evaluating performance claims in other domains can apply the same rigor they bring to debugging and testing local toolchains, because repeatable verification beats assumptions.

Test after deletion and after time passes

Retention bugs and policy mismatches often become visible only after the product has had time to process, index, replicate, or back up the data. Do not stop at immediate deletion. Return after 24 hours, then after 7 days, then after the advertised retention period, and attempt to recover the content through standard user interfaces, exports, API calls, or account recovery processes. If the vendor supports ticketed support, ask whether support agents can access deleted content for troubleshooting or abuse review.
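The re-check cadence above (24 hours, 7 days, then the advertised retention period) is simple date arithmetic, but writing it down avoids missed checkpoints. A sketch, assuming you record the deletion time in UTC and know the vendor's advertised retention period in days; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def recheck_schedule(deleted_at, advertised_retention_days):
    """Return [(due_time, label)] for each follow-up recovery attempt."""
    offsets = {
        "24 hours": timedelta(hours=24),
        "7 days": timedelta(days=7),
        "advertised retention period": timedelta(days=advertised_retention_days),
    }
    # Sort by due time so the list reads as a calendar.
    return sorted((deleted_at + delta, label) for label, delta in offsets.items())


if __name__ == "__main__":
    deleted = datetime.now(timezone.utc)
    for due, label in recheck_schedule(deleted, advertised_retention_days=30):
        print(f"{due.isoformat()}  re-test after {label}")
```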

Document the exact timestamps, screenshots, and account states. The goal is to have evidence that stands up in a review meeting or procurement exception process. This style of evidence collection mirrors the discipline of a real-world predictive AI risk review: you are checking whether the tool does what it claims under realistic conditions, not just in the demo.
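Evidence holds up better in review when each observation is recorded the same way. The sketch below builds one JSON line per observation and stores a SHA-256 digest of the screenshot so the file can later be shown to be unmodified. The field names and the function `evidence_entry` are assumptions, not a standard format.

```python
import hashlib
import json
from datetime import datetime, timezone


def evidence_entry(step, account_state, observation,
                   screenshot_bytes=None, now=None):
    """Return one JSON line describing a single test observation."""
    now = now or datetime.now(timezone.utc)
    entry = {
        "timestamp": now.isoformat(),    # exact time of the observation
        "step": step,                    # e.g. "7-day recheck"
        "account_state": account_state,  # e.g. "logged out, incognito"
        "observation": observation,      # what was actually seen
    }
    if screenshot_bytes is not None:
        # Digest lets reviewers confirm the screenshot file is unchanged.
        entry["screenshot_sha256"] = hashlib.sha256(screenshot_bytes).hexdigest()
    return json.dumps(entry, sort_keys=True)
```

Appending each returned line to a file gives a chronological log that can be attached to a procurement exception or review record.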

Validate exports, backups, and hidden persistence paths

A vendor can delete a conversation from the interface while still retaining it in export packages, compliance archives, or backups. Ask specifically whether deleted sessions are purged from all retrievable stores on the same schedule. Ask whether any systems preserve content for legal hold, incident response, fraud detection, or platform safety. If the answer is yes, ask how those retention reasons are separated from ordinary user content and how long the exception lasts. Many organizations get tripped up by “temporary” retention that silently extends because logs are not classified as content even though they contain it.
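If the vendor offers a data export, you can check it for your test markers after deletion. The following sketch assumes the export has already been unpacked into a local directory of text-based files (JSON, CSV, HTML); the function name is hypothetical, and binary or compressed formats would need extra handling.

```python
from pathlib import Path


def scan_export(export_dir, fragments):
    """Return {relative_path: [fragments found]} for files under export_dir."""
    hits = {}
    root = Path(export_dir)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Undecodable bytes are skipped rather than raising an error.
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = [fragment for fragment in fragments if fragment in text]
        if found:
            hits[str(path.relative_to(root))] = found
    return hits
```

Search for each marker component separately (token, phone number, record ID), because a pipeline that redacts one field may still preserve another.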

For security teams, this is where operational controls matter as much as policy language. A useful analogy is choosing the right security upgrade based on actual risk, not branding, as discussed in smart home security upgrade selection. In AI privacy, the equivalent question is: which systems actually see the data, and for how long?

Verify That Sessions Are Truly Ephemeral

Check whether a session ID survives logout and refresh

“Ephemeral session” should mean the user experience does not reconstruct the chat after logout, browser closure, or token refresh unless the vendor explicitly says it stores a history. Test this in a normal browser, private browser window, and mobile app if available. Observe whether the session ID changes, whether the content is reloaded from a backend store, and whether references to prior prompts remain accessible through browser history or application state. If the app uses embedded analytics or telemetry scripts, determine whether those scripts capture prompt fragments or identifiers.
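A simple way to make this check repeatable is to snapshot client-side state (cookies and local storage entries copied from the browser's developer tools) before and after logout, then compare. This sketch assumes you have captured each snapshot as a plain name-to-value dictionary; the helper names are hypothetical.

```python
def persistent_identifiers(before, after):
    """Return names whose values are identical before and after logout."""
    return sorted(
        name for name, value in before.items()
        if after.get(name) == value
    )


def entries_containing(snapshot, fragment):
    """Return names of stored entries whose value contains the fragment."""
    return sorted(
        name for name, value in snapshot.items()
        if fragment in str(value)
    )


if __name__ == "__main__":
    # Illustrative values only; real snapshots come from your own test run.
    before = {"session_id": "s-111", "device_id": "d-999", "draft": "marker abc"}
    after = {"session_id": "s-222", "device_id": "d-999"}
    print(persistent_identifiers(before, after))
    print(entries_containing(before, "abc"))
```

An identifier that survives logout is not proof of content retention, but it is a concrete item to raise with the vendor when asking how sessions are linked on the backend.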

Ephemeral claims also deserve operational scrutiny. If a tool claims to be temporary but still supports “continue where you left off,” “recent activity,” or cross-device sync, then the session is not truly ephemeral in the way many buyers assume. A good benchmark is whether an administrator can independently verify the storage path from request to deletion. If they cannot, then the term is doing more marketing than governance work.

Test cross-device and cross-browser persistence

Users often interact with AI tools on multiple endpoints. A session that seems ephemeral on desktop might persist on mobile due to account sync, push notifications, or cached conversation history. Test the same chat from different devices and browsers signed into the same account, then from different accounts on the same device. This reveals whether persistence is tied to the device, the identity provider, or the backend account record. If the vendor cannot explain this behavior cleanly, your privacy team should assume there is hidden persistence until proven otherwise.

Ask for architecture diagrams, not just product statements

Vendors that truly understand their privacy posture should be able to provide a high-level architecture diagram showing where messages enter, where they are stored, what telemetry is captured, and which components can access raw content. Request diagrams for production, support, analytics, backup, and abuse-detection workflows. This is especially important for companies that integrate with external model providers or multiple assistants. If a vendor uses a brokered architecture, the privacy story has to cover not only the front-end product but also every downstream model or processor.

For enterprise teams, the same logic applies when orchestrating multiple assistants across workflows. The legal and technical complexity described in bridging AI assistants in the enterprise becomes even more important when privacy promises depend on hidden routing behavior. If data leaves the product to reach another model, the “incognito” label may no longer mean what users think it means.

Audit Logs, Access Paths, and the Support Problem

Logs are data too

Many organizations focus so heavily on chat retention that they forget logs. Yet logs often contain the exact information security teams are trying to protect: prompt text, user IDs, timestamps, IP addresses, device fingerprints, response snippets, and error payloads. Ask whether logs are redacted, tokenized, or fully stored. Ask who can access them, under what approval process, and how long they are retained. If logs are needed for abuse detection, ask whether the vendor can isolate forensic content from normal operational logging.

Pro tip: If the vendor says “we only log metadata,” ask them to define metadata precisely. In many systems, “metadata” still includes enough context to reconstruct the user’s intent, the topic, and the confidential workflow.

Support access can defeat privacy mode

Even the best privacy settings can be weakened by customer support workflows. If support engineers can view prompt history, export sessions, or replay user interactions to troubleshoot issues, that access must be documented, limited, and logged. You should ask whether support access is case-based, role-based, time-bound, and approved through ticketing. You should also ask whether support personnel are in the same jurisdiction as your data or whether they can access content from global locations.

Strong vendor governance often mirrors the safeguards used in other sensitive systems. For instance, just as enterprise Apple security programs must limit admin sprawl and monitor privileged access, AI vendors need disciplined access controls around conversational content. Without that discipline, “private mode” is easy to promise and hard to trust.

Threat model the admin console and APIs

If the vendor offers an admin console or API, test what an administrator can see. Can an admin export conversations, search across users, recover deleted sessions, or alter retention settings? Can auditors review the logs? Can customer administrators segregate business units or legal entities? These capabilities are good when they are designed for governance, but they become risky when access is broader than documented. Your privacy review should therefore include not only end-user behavior, but also privileged access behavior.

Contract Controls That Turn Claims Into Commitments

Translate privacy promises into exact contract language

If a privacy feature matters to your business, it belongs in the contract, not just the marketing page. At minimum, your MSA, DPA, or order form should specify retention periods, deletion timelines, subprocessors, data use restrictions, and support access limits. If the vendor says it will not use your data for model training, the contract should state that clearly, with any narrow exceptions disclosed. If the vendor offers enterprise privacy options, name them explicitly so they cannot be removed later without notice.

Contract language should also align with the operational reality of the product. A claim of “immediate deletion” should not coexist with a seven-day backup retention exception unless that exception is disclosed and accepted. A claim of “no training” should not allow broad product-improvement usage. If the vendor expects to rely on legitimate interests as its legal basis for processing, contractual safeguards become even more important to reduce ambiguity and keep the audit trail clean.
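The deletion example above can be expressed as a consistency check between the headline claim and the retention exceptions you uncovered during testing and vendor Q&A. This is a sketch with a hypothetical record shape (`system`, `retention_days`, `disclosed`), not a legal test.

```python
def deletion_conflicts(claimed_deletion_days, exceptions):
    """Return systems that outlast the deletion claim without disclosure.

    exceptions: list of dicts with keys "system", "retention_days",
    and "disclosed" (True if the contract names the exception).
    """
    return [
        item["system"]
        for item in exceptions
        if item["retention_days"] > claimed_deletion_days
        and not item["disclosed"]
    ]


if __name__ == "__main__":
    found = deletion_conflicts(
        claimed_deletion_days=0,  # "immediate deletion"
        exceptions=[
            {"system": "backups", "retention_days": 7, "disclosed": False},
            {"system": "abuse detection", "retention_days": 30, "disclosed": True},
        ],
    )
    print(found)
```

Anything the function returns is a candidate for either a contract amendment or a documented risk acceptance.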

Require notice for material privacy changes

Vendors update terms, architecture, and data-use policies frequently. Require advance notice for material changes to retention, training, subprocessors, logging, or support-access procedures. Ideally, the contract should give you the right to object, terminate, or suspend use if privacy terms change materially. That is especially important for long-lived enterprise deployments, where a small policy shift can affect thousands of chats and multiple business units.

Map contractual controls to compliance obligations

For regulated teams, contract controls should map back to actual obligations such as GDPR data minimization, HIPAA safeguards, SOC 2 privacy commitments, contractual confidentiality terms, and internal policy requirements. Do not assume the vendor’s generic “enterprise security” language satisfies your legal standard. Document where each control lives: user settings, admin policy, DPA, security exhibit, subprocessors list, or audit report. This makes renewals and audits much easier because you can show exactly which promise supports which control.

When you manage these controls well, procurement becomes a defense function rather than a paperwork exercise. That is the same lesson behind vendor contract and portability checklists: if you cannot point to the clause, you do not really have the control.

Operational Guardrails for Security Teams

Create an AI data classification policy

Make the policy explicit about what employees may paste into external AI tools. Prohibit secrets, credentials, regulated data, customer PII, unreleased source code, legal advice, and incident-response details unless the vendor has been approved for that category. A strong policy usually has three levels: allowed, restricted, and prohibited. The more precise you are, the less likely employees will improvise their own judgment in the moment.

Restrict approved tools and route users through governance

Instead of banning AI entirely, define approved tools, approved tiers, and approved use cases. Require SSO, enforced MFA, vendor review, and logging before allowing any business data. If the product has a privacy mode, document exactly when employees must use it and what kinds of data are still forbidden. This is the practical version of safe AI adoption: not fear, but controlled usage.

Continuously re-test, because privacy claims drift

A vendor that passed review in January may fail by June after a product acquisition, architecture migration, or policy update. Re-test data retention, deletion, and session persistence on a schedule, and after any major vendor announcement. Keep a record of test artifacts so you can compare changes over time. For teams that want to stay ahead of product shifts, the discipline is similar to how analysts monitor evolving platform behavior in retention and monetization changes in AI products: capabilities and constraints do not stay static.
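Comparing test artifacts between review cycles is easier if each run is reduced to a set of named checks with a yes/no result. The sketch below assumes each run is stored as a dictionary mapping a check name to whether the marker content was found; the check names and the function `retention_drift` are illustrative.

```python
def retention_drift(previous, current):
    """Return {check: (old_result, new_result)} for every changed check.

    A value of None means the check did not exist in that run.
    """
    changes = {}
    for check in sorted(set(previous) | set(current)):
        old, new = previous.get(check), current.get(check)
        if old != new:
            changes[check] = (old, new)
    return changes


if __name__ == "__main__":
    january = {"ui history after delete": False, "export after delete": False}
    june = {"ui history after delete": False, "export after delete": True,
            "cross-device sync": True}
    for check, (old, new) in retention_drift(january, june).items():
        print(f"{check}: {old} -> {new}")
```

A check that flips from not found to found is the signal to reopen the vendor review, even if nothing in the vendor's published policy has changed.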

A Practical Vendor Audit Checklist You Can Reuse

Pre-contract questions

Ask whether the vendor trains on your prompts or outputs, whether it stores chat history by default, whether “incognito” disables only visible history or also backend storage, and whether support staff can access content. Ask for the subprocessor list, retention schedule, and data deletion path. Ask whether enterprise tenants can disable all non-essential logging. Then ask for the answers in writing.

Technical validation steps

Create a test account, submit unique markers, delete the chat, and verify whether the content persists in UI, export, support, analytics, and recovery paths. Repeat the test under standard and private sessions. Check cross-device behavior and run the test again after the retention period. Capture evidence in screenshots and timestamps so the results are defensible.

Contract and governance steps

Negotiate clauses that define retention, deletion, training restrictions, support access, breach notification, subprocessor notice, and the right to terminate if privacy terms materially change. Update your AI policy to reflect approved use cases and prohibited data classes. Assign an internal owner for periodic revalidation so the control does not die after procurement approval.

If you are building a broader control framework around AI and SaaS, it is worth aligning this process with adjacent governance patterns. The same principle appears in secure integration patterns and in procurement guides for outcome-based AI services. The common denominator is simple: if a vendor controls your data, your controls must reach into the vendor environment.

FAQ: Auditing “Incognito” AI Privacy Claims

What does “incognito” actually mean in AI tools?

Usually, it means the conversation is not shown in the user’s visible chat history, but it does not automatically mean no logs, no backups, no abuse-detection storage, or no training use. You have to verify the backend behavior.

How do I test data retention without breaking the vendor’s rules?

Use a test account, a harmless marker string, and normal product workflows. Avoid real personal data or secrets. The goal is to see whether the system retains or resurfaces a controlled test input.

What evidence should I collect during a vendor audit?

Collect screenshots, timestamps, policy links, DPA language, support responses, subprocessor lists, and results from deletion and recovery tests. Written evidence matters because privacy claims can change later.

Are logs always a problem?

No. Logs are often necessary for security and reliability. The issue is whether they are minimized, redacted, access-controlled, and retained for a documented reason. Logs become a problem when they quietly undermine “private mode” promises.

Should every AI vendor be treated the same?

No. Risk should be based on data class, deployment model, integration depth, and contractual protections. A public brainstorming tool has a different control requirement than a system handling customer records or code repositories.

What is the fastest way to reduce risk immediately?

Block sensitive data from external AI tools until approved, require SSO and enterprise contracts for any business use, and re-test any vendor privacy feature before trusting it. That combination prevents the most common accidental exposures.

Conclusion: Treat Privacy Claims Like Any Other Control Statement

The Perplexity lawsuit is not just a story about one product’s “incognito” mode. It is a reminder that privacy claims in AI are often broader than the evidence behind them. Security teams should respond by building a vendor audit framework that tests retention, verifies session ephemerality, reviews logging, and hardens contract controls. Once you make privacy measurable, you can compare vendors on facts instead of slogans.

If you are formalizing your AI governance program, build from the same playbook you use for other high-risk technology decisions. Evaluate the architecture, inspect the data paths, test the retention behavior, and lock down the contract. For additional context, see our guides on enterprise AI assistant governance, procurement questions that protect operations, and vendor contract and portability controls. The bottom line: in AI privacy, perfect secrecy is a myth, but verifiable controls are real.

Mac Malware Is Changing: What Jamf’s Trojan Spike Means for Enterprise Apple Security - A practical view of endpoint risk, admin controls, and enterprise detection.
Building Subscription-Less AI Features: Monetization and Retention Strategies for Offline Models - Useful context on how product architecture affects user data handling.
Selecting an AI Agent Under Outcome-Based Pricing: Procurement Questions That Protect Ops - A procurement lens for evaluating AI vendors before contract signature.
Bridging AI Assistants in the Enterprise: Technical and Legal Considerations for Multi-Assistant Workflows - Helps teams govern complex AI stacks without losing visibility.
Protecting Your Herd Data: A Practical Checklist for Vendor Contracts and Data Portability - A contract-first checklist that maps well to AI privacy controls.