20. Limits and risks of AI-driven due diligence

The deal team had 12 days to finish confirmatory diligence. The data room held 4,000 documents, including contracts, security policies, application inventories, board packs, cloud invoices, and customer reports. AI tools cut the first-pass review to two days. The team found vendor change-of-control clauses, summarized cyber policies, flagged missing uptime commitments, and produced a neat red-amber-green view for the investment committee.

Then the issue showed up. The AI summary said the target had a “standard SOC 2 Type II report with no major control exceptions.” The actual report covered only one product line, excluded the acquired European platform, and listed a carve-out around privileged access review. The investment thesis assumed the buyer could connect customer data into a shared analytics environment in the first 60 days. The control gap turned that into a security remediation workstream, a customer consent question, and a delayed revenue synergy gate.

The AI output was not useless. It was just treated as evidence when it was only a lead.

That is the core decision for deal teams: how far can AI-assisted diligence outputs be used before a human owner has verified the source, the logic, and the deal implication? The answer should change depending on materiality. AI can accelerate search, clustering, comparison, and anomaly detection. It should not set price, terms, Day-1 posture, or synergy timing without controls.

The real risk is not bad answers. It is unsupported confidence

Most teams think the risk of AI in diligence is hallucination. That is too narrow. The bigger issue is confidence without an audit trail.

In a deal, a weak conclusion rarely stays in the diligence report. It moves into the model, the debt case, the integration plan, the SPA, the TSA, or the Day-1 checklist. Once it moves, it hardens. The team starts to manage around a finding that may have been generated from stale policy text, partial contract pages, a copied management response, or a source that never supported the claim.

AI-assisted diligence usually fails in five ways.

It summarizes the wrong scope. A tool reads a policy, certificate, system inventory, or vendor contract and assumes it applies to the full company. In carve-outs, multi-product businesses, and roll-ups, that assumption is often wrong.
It collapses evidence levels. A management statement, a policy, a screenshot, and an operating log are treated as equivalent. They are not.
It hides missing context. The tool gives a clean answer because it was not given the document that would create doubt.
It creates weak attribution. The output says “contracts allow assignment” but cannot point to the exact clause, document version, entity, and counterparty.
It turns pattern detection into investment judgment. A model can flag high cloud spend variance. It cannot decide whether the variance is seasonality, waste, growth investment, migration cost, or a hidden run-rate reset without finance and technology review.

None of these problems means AI should be kept out of diligence. They mean the deal team needs a reliance model.

A practical reliance model: find, shape, decide

The cleanest way to use AI in diligence is to classify every output into one of three reliance levels.

Level 1: find

AI is used to locate, extract, cluster, and compare information. The output is a pointer, not a conclusion.

Use it for:

finding clauses across contracts
clustering applications by owner, vendor, or business process
extracting security controls from policies and reports
comparing data room versions
identifying unusual spend lines in vendor, cloud, or infrastructure data
building question lists from missing artifacts

This level is usually safe when the data stays inside the approved diligence environment and the output is clearly labeled as unverified.

Decision trigger: If the output only changes what the team asks next, AI can operate as a first-pass accelerator.

Level 2: shape

AI is used to create a structured hypothesis: likely cost exposure, likely separation complexity, likely cyber gap, likely contract constraint. The output can guide workstream priorities, but it cannot stand alone.

Use it for:

drafting a first view of mandatory vs discretionary IT spend
mapping likely Day-1 dependencies from application and interface inventories
ranking contracts by assignment, termination, renewal, and audit risk
identifying patterns in incident tickets or vulnerability exports
comparing target KPIs against source-system definitions

This level requires a named diligence owner, source citations, sampling, and contrary-evidence checks.

Decision trigger: If the output changes diligence scope, resource allocation, or the priority of management sessions, require source traceability and human review before it is circulated beyond the workstream.

Level 3: decide

AI-influenced output is used to support price, terms, timing, value plan, Day-1 posture, financing narrative, or board approval. At this level, the tool is not deciding, but its output has become part of a deal decision.

Use it only when controls are in place:

every material claim ties to source documents and line references
the source scope is explicit by entity, business unit, product, geography, and period
a human owner signs off the conclusion
the conclusion is translated into a deal metric or decision
uncertainty is shown as a range, condition, or open item, not a single confident point

Decision trigger: If the output can move valuation, SPA protection, TSA terms, synergy timing, debt assumptions, or Day-1 go/no-go, treat it as Level 3. No AI-only conclusion should enter the IC pack.
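
The three decision triggers can be written down as a simple classification rule. The sketch below is one way to do it, not a prescribed tool: the flag names (for example `synergy_timing` or `diligence_scope`) are hypothetical labels for the triggers named above, and a real team would choose its own vocabulary.

```python
from enum import Enum


class Reliance(Enum):
    FIND = 1
    SHAPE = 2
    DECIDE = 3


# Hypothetical flag names; each one stands for a decision trigger in the text.
LEVEL_3_TRIGGERS = {
    "valuation", "spa_protection", "tsa_terms",
    "synergy_timing", "debt_assumptions", "day1_go_no_go",
}
LEVEL_2_TRIGGERS = {
    "diligence_scope", "resource_allocation", "management_session_priority",
}


def reliance_level(affects: set[str]) -> Reliance:
    """Return the highest reliance level triggered by what the output affects."""
    if affects & LEVEL_3_TRIGGERS:
        return Reliance.DECIDE
    if affects & LEVEL_2_TRIGGERS:
        return Reliance.SHAPE
    # Nothing triggered: the output only changes what the team asks next.
    return Reliance.FIND
```

The rule always returns the highest level triggered. An output that reshapes diligence scope and also touches synergy timing is Level 3, not Level 2.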

What to ask for before AI outputs influence the deal

AI controls should not become a separate governance project. In diligence, they need to fit the clock. A strong evidence pack is enough if it forces the right discipline.

1. Source inventory

For every AI-assisted conclusion, ask: what source set was used?

The answer should name document types, dates, entities, and exclusions. “All contracts in the data room” is not enough. Better: “Top 40 vendor contracts by FY25 spend, uploaded through March 18, excluding local reseller agreements and any contracts under $250K annual spend.”

This matters because diligence often fails at the boundary. The target may provide corporate policies but not product-specific control evidence. It may provide group cloud invoices but not reseller pass-through spend. It may provide a customer master but not the CRM fields used by sales operations.

Evidence ask: request the document manifest, upload timestamps, version history for changed files, and a list of excluded folders or files the AI tool could not access.

2. Claim-level attribution

Every material output needs a citation that a reviewer can inspect quickly. A link to the source document is useful. The exact clause, page, table, ticket field, export row, or report section is better.

Weak attribution creates false speed. The team saves six hours on first-pass review and loses two days when legal, cyber, finance, and IT argue over where a statement came from.

Evidence ask: require a claim log for Level 2 and Level 3 outputs with four fields: claim, source, reviewer, deal implication.
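
A claim log with these four fields does not need special tooling. As a minimal sketch, assuming the log is kept as structured records rather than free text, the entry and a completeness check could look like this. The class name, the helper, and the example values are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    claim: str             # the statement the team wants to rely on
    source: str            # exact clause, page, table, row, or report section
    reviewer: str          # named human reviewer
    deal_implication: str  # what the claim changes in the deal, if anything


def missing_fields(entry: Claim) -> list[str]:
    """List the fields left blank so incomplete entries can be sent back."""
    return [name for name, value in vars(entry).items() if not value.strip()]


# Illustrative entry: a claim with no source and no reviewer yet.
entry = Claim(
    claim="Top vendor contracts permit assignment",
    source="",
    reviewer="",
    deal_implication="May require a vendor-consent closing condition",
)
print(missing_fields(entry))
```

An entry with blank fields is a lead, not a Level 2 or Level 3 output. The check makes that visible before the claim travels.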

3. Scope and applicability

AI tools often summarize what is present, not what is applicable. A SOC report may exclude a platform. A license agreement may apply only to one legal entity. A disaster recovery plan may cover production, not the data warehouse used for board reporting. A TSA schedule may provide access, but not change support.

Applicability is where deal economics sit.

Decision trigger: If an AI summary uses company-wide language, but the deal scope includes multiple legal entities, carved business units, regions, or acquired platforms, force a scope check before the output is used.

4. Sampling and replay

The fastest way to test an AI-assisted conclusion is to replay a sample manually.

For contract analysis, take 10 high-spend contracts and compare AI extraction against legal review. For cloud spend, take three months of billing detail and reconcile AI categories to finance GL and vendor invoices. For cyber evidence, take five claimed controls and ask for operating proof: logs, tickets, review records, exception lists.

Sampling should focus on deal-sensitive items, not random documents.

Evidence ask: require a sample test for each Level 3 conclusion. If the sample miss rate is more than 10% on material fields, downgrade the conclusion to a hypothesis and expand manual review.
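
The sample test is simple arithmetic: count the material fields where manual review disagrees with the AI extraction and divide by the sample size. A rough sketch follows, assuming both the AI output and the manual review are keyed by the same field identifier. The function names and the dictionary layout are assumptions for illustration; the 10% threshold is the one stated above.

```python
MISS_RATE_THRESHOLD = 0.10  # from the text: more than 10% on material fields


def miss_rate(ai_values: dict[str, str], reviewed_values: dict[str, str]) -> float:
    """Share of manually reviewed fields where the AI extraction differs."""
    if not reviewed_values:
        raise ValueError("sample is empty")
    misses = sum(
        1 for key, truth in reviewed_values.items()
        if ai_values.get(key) != truth  # a field the AI skipped counts as a miss
    )
    return misses / len(reviewed_values)


def should_downgrade(rate: float) -> bool:
    """True when the conclusion should drop back to a hypothesis."""
    return rate > MISS_RATE_THRESHOLD
```

With a 10-contract sample, one disagreement sits exactly at the threshold and passes. Two disagreements downgrade the conclusion and trigger wider manual review.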

5. Negative evidence

AI is good at summarizing documents that exist. It is weaker at proving what is missing. In diligence, missing evidence often matters more than present evidence.

Examples:

no privileged access review for the acquired platform
no interface inventory for parent-owned ERP feeds
no renewal notice terms for the top SaaS vendor
no source-system definition for the KPI used in the growth case
no audit trail from management reporting to the data warehouse

Decision trigger: If a conclusion depends on absence of risk, the team must prove the absence with an evidence request, not a clean AI summary.

The four controls that matter most

The control set should be short enough to run in a live process. Four controls do most of the work.

Control 1: no sensitive data outside the approved environment

Data leakage is a deal risk, not just an IT policy issue. Diligence data can include employee records, customer contracts, pricing terms, customer lists, source code, cyber findings, personal data, and non-public financials. Sending that content into an unmanaged AI tool can breach NDA terms, data protection rules, seller instructions, or internal policy.

The rule should be simple: no deal documents, customer data, code, credentials, logs, or personal data enter any AI tool unless the buyer’s legal, information security, and deal lead have approved the environment.

Decision trigger: If the tool trains on prompts, lacks enterprise retention controls, has unclear data residency, or cannot restrict access by deal team, do not use it for live diligence material.

The workaround is not to ban AI. Use approved environments, redacted inputs, synthetic examples, or tool outputs generated inside the buyer’s secure diligence setup.

Control 2: every material claim has an owner

AI outputs create ambiguity around accountability. If the cyber lead says a control gap is manageable, the cyber lead owns that judgment. If an AI summary says it, no one owns it unless the process assigns an owner.

For each Level 3 conclusion, assign a human owner by workstream:

legal owns assignment, termination, audit, and change-of-control interpretations
cyber owns security posture, control gaps, incident implications, and connectivity gates
IT owns application, infrastructure, identity, and integration or separation feasibility
finance owns run-rate, one-time cost, EBITDA bridge, and working capital implications
deal lead owns whether the finding changes price, terms, timing, or walk-away posture

Decision trigger: If no named owner will sign the claim, it cannot be used in the IC memo.

Control 3: outputs must carry uncertainty

AI summaries often read cleaner than the evidence. That is dangerous in a deal because neat language gets promoted quickly.

A good output should state its confidence driver:

source is complete and independently reconciled
source is partial but supported by management interview
source is current but scope-limited
source is stale and needs refresh
source is management-provided and not yet tested
source conflicts with another artifact

This does not slow the team down. It speeds the real decision by showing what can be used, what needs follow-up, and what should be priced as uncertainty.

Decision trigger: If the evidence base is incomplete and the finding can change economics, express the conclusion as a range or condition. Do not turn it into a point estimate.

Control 4: separate speed from authority

AI can make the work faster. That does not make the output authoritative. The process must keep those ideas separate.

A practical rule:

AI may draft the first-pass finding.
The workstream lead validates the fact pattern.
The deal lead decides the implication.
The IC pack shows only validated conclusions, open risks, and explicit assumptions.

This control prevents a common failure: the diligence team includes AI-generated summaries in appendix pages, someone copies them into the main memo, and the investment committee reads them as fact.

Decision trigger: If an AI-generated sentence appears in the main IC narrative, the underlying claim should already have a claim log, owner sign-off, and source citation.

How AI-specific mistakes change deal economics

The problem is not theoretical. The wrong AI reliance posture can move real deal value.

False precision changes price

An AI tool reviews cloud invoices and estimates “$2.8M annual savings from rightsizing.” The number looks specific, so the deal model takes it as EBITDA upside. Later, the infrastructure team finds that 60% of the spend supports customer-isolated environments with contractual uptime commitments. Rightsizing requires customer notification, architecture changes, and test cycles. Real first-year savings are $400K. The rest moves to year two or disappears.

The mechanism is precision without feasibility. Cost opportunities need a path: owner, workload, customer constraint, migration effort, timing, and one-time spend.

Decision trigger: If an AI output quantifies savings, require finance reconciliation and technology feasibility before the number enters EBITDA.

Weak attribution weakens protections

An AI contract review says top vendors permit assignment. Legal later finds that three contracts permit assignment only to affiliates under common control, not to the buyer, and one requires vendor consent for a carve-out. The SPA did not include a vendor-consent condition. The buyer inherits a delay and a negotiation after signing, when it has less negotiating power.

The mechanism is summary without clause-level review. Contract AI can triage volume. It cannot replace legal interpretation for deal terms.

Decision trigger: If a contract conclusion affects closing conditions, consent strategy, TSA scope, or price protection, require legal review of the exact clause.

Missing scope delays synergy timing

AI reads a target’s security documentation and summarizes “MFA enabled for all users.” The scope is the corporate identity platform. The acquired product line uses a separate identity store and has 40 privileged local accounts. The buyer’s Day-1 plan assumes network connectivity and shared customer support tooling. Cyber blocks connectivity until privileged access is remediated.

The mechanism is policy coverage mistaken for operating coverage.

Decision trigger: If Day-1 or Day-100 value depends on connectivity, require system-level identity and access evidence, not policy summaries.

Data leakage creates process risk

A junior team member uploads customer contracts and security findings into an unmanaged AI workspace to speed up summarization. The seller asks for confirmation that no deal data left approved systems. The buyer cannot give it cleanly. The process slows, the seller limits data room access, and management trust drops.

The mechanism is tool convenience overriding deal protocol.

Decision trigger: If the seller’s data room terms or NDA restrict extraction, processing, or third-party tool use, get written approval before using AI on source documents.

What the best teams do differently

Strong teams do not debate whether AI is good or bad. They decide which diligence jobs are allowed, which outputs can travel, and what proof is needed before a finding changes the deal.

They start with the investment thesis. If the thesis depends on pricing discipline, AI review focuses on CRM data quality, quote-to-cash workflows, customer segmentation, and KPI definitions. If the thesis depends on cost-out, AI review focuses on vendor spend, cloud usage, license counts, duplicate applications, and contract exit terms. If the thesis depends on fast integration, AI review focuses on identity, interfaces, security controls, data access, and Day-1 dependencies.

They also make the data boundary explicit on day one. The deal lead, legal, and information security agree which tools are approved, which data classes are blocked, and how outputs must be stored. They do this before the team starts experimenting.

They build the claim log as they work, not after the readout. Each material AI-assisted finding gets a source, reviewer, and implication. This sounds administrative. It is not. It is what lets the team defend the conclusion when price, terms, or timing are challenged.

They test the model where it matters. They do not manually recheck 1,000 documents to prove a tool is accurate. They sample the 20 documents or datasets that could move the deal. Top vendor contracts. Identity exports. Cloud invoices. Cyber incident tickets. Board KPI packs. ERP interface lists. If the tool fails there, the team knows where reliance must stop.

They translate uncertainty into deal posture. If evidence is strong, the deal proceeds with a validated assumption. If evidence is partial, the team prices a range, writes a closing condition, adds an escrow, extends a TSA, delays a synergy line, or creates a Day-1 remediation gate. If evidence is weak and the issue is material, the team does not bury it in a risk appendix. It changes the recommendation.

A deal-team checklist for AI reliance

Use this checklist before AI-assisted diligence outputs influence price, terms, or timing.

What decision could this output affect? Price, SPA protection, TSA scope, synergy timing, Day-1 connectivity, debt case, or walk-away posture.
What source set was used? Name documents, systems, dates, entities, product lines, geographies, and exclusions.
What is the evidence level? Management statement, policy, contract clause, system export, operating log, invoice, ticket, audit report, or reconciled dataset.
Can the claim be traced? Exact source reference, document version, clause, page, row, ticket, or field.
Who owns the conclusion? Legal, cyber, IT, finance, tax, HR, commercial, or deal lead.
What was manually sampled? Material items first, not random files.
What could disprove the conclusion? Missing document, conflicting artifact, stale source, excluded scope, management interview, or system export.
What is the deal implication? No change, open item, price adjustment, term protection, Day-1 gate, TSA ask, one-time cost, delayed synergy, or stop/go issue.
What uncertainty remains? Express it as a range, condition, owner, and date for resolution.
Can the output be shown externally? Confirm NDA, data room, seller, and internal policy constraints before sharing.

If the team cannot answer these questions for a material AI-assisted conclusion, the conclusion is not ready for the deal pack.

Decision tree: when to rely, when to verify, when to stop

If AI is being used to search, summarize, cluster, or generate questions:
Proceed inside the approved environment. Label outputs as unverified and use them to focus the next evidence pull.

If AI is ranking risks or shaping hypotheses:
Proceed with workstream-owner review, source citations, and a sample check of deal-sensitive items.

If AI output supports a value number, cost estimate, synergy date, or remediation budget:
Require finance reconciliation, technical feasibility review, and an explicit range. Do not use a single-point estimate unless the source data is reconciled.

If AI output supports legal terms, consent strategy, customer obligations, regulatory exposure, or security posture:
Require specialist review of the source. AI can speed clause finding and control mapping, but the accountable expert must own the interpretation.

If the tool cannot show sources, cannot protect deal data, or was run outside approved controls:
Stop using the output. Re-run the analysis in an approved setup or rebuild the conclusion manually.

If the conclusion is material and evidence is still missing before signing:
Do not call it a risk to monitor. Convert it into a deal action: price adjustment, closing condition, escrow, indemnity, TSA term, Day-1 gate, or delayed value-plan assumption.
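
The decision tree above can also be read as an ordered set of checks. The sketch below is one possible encoding, not a definitive one. The field names are hypothetical, and the order of the checks is an assumption: the text does not rank the branches, so this version puts the stop condition first and the missing-evidence branch second.

```python
from dataclasses import dataclass


@dataclass
class Output:
    approved_environment: bool               # run inside approved controls
    sources_traceable: bool                  # the tool can show its sources
    supports_number: bool = False            # value, cost, synergy date, budget
    supports_specialist_topic: bool = False  # legal, consent, regulatory, security
    shapes_hypothesis: bool = False          # ranks risks or shapes hypotheses
    material: bool = False
    evidence_missing: bool = False


def next_step(o: Output) -> str:
    # Assumed ordering: stop conditions first, then missing material evidence.
    if not (o.approved_environment and o.sources_traceable):
        return "stop: re-run in an approved setup or rebuild manually"
    if o.material and o.evidence_missing:
        return "convert to a deal action"
    if o.supports_number:
        return "finance reconciliation, feasibility review, explicit range"
    if o.supports_specialist_topic:
        return "specialist review of the source"
    if o.shapes_hypothesis:
        return "workstream-owner review, citations, sample check"
    return "proceed, labeled unverified"
```

Putting the stop condition first reflects the point made earlier: an output that cannot show its sources, or was produced outside approved controls, should not be rescued by later review steps.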

Monday morning: put AI controls into the diligence rhythm

If you are using AI in a live diligence process, do this in the next five business days.

Set the reliance rules. Deal lead owns it with legal and information security. Define approved tools, blocked data types, and the three reliance levels: find, shape, decide.
Create the claim log. Diligence workstream leads own it. Every material AI-assisted finding needs source, reviewer, and deal implication.
Map AI work to the thesis. Deal lead and tech diligence lead identify the 5-7 AI-assisted analyses that can actually move price, terms, timing, or Day-1 posture.
Sample the material outputs. Workstream owners manually replay the high-risk claims: top contracts, top cost lines, security controls, identity scope, KPI definitions, and interface dependencies.
Rewrite the IC output. Remove AI-generated language that lacks source traceability. Replace it with validated conclusions, open assumptions, and explicit decisions.

AI can make diligence faster. It can also make weak evidence look finished. The difference is process discipline. Use AI to compress the path to evidence, not to skip the judgment that protects the deal.