Enterprise AI Is Failing the Same Way Enterprise IT Always Did

By Richard Owen & Maurice FitzGerald

Field Notes on Customer AI · Edition 010 · June 30, 2026

Each Tuesday, Field Notes surfaces what we're seeing in the field: patterns from implementations, ideas worth stress-testing, and the occasional inconvenient truth about how Customer AI programs succeed or stall. No abstractions. No product pitches. Just the working knowledge that tends to matter.

This week we cover one of the most common failure points in enterprise IT programs, including those that involve AI: a poor choice of pilot projects.

The Field Read

Enterprise AI Is Failing the Same Way Enterprise IT Always Did - Richard Owen

A research team at MIT spent two years studying AI deployments across major enterprises and concluded that 95% of generative AI pilots fail to produce measurable P&L impact. A Stanford study published last month examined 51 enterprise AI deployments across nine industries. Both arrived at roughly the same conclusion: the reason most enterprise AI programmes fail has almost nothing to do with AI. The vocabulary is new. The pattern is not.

The structural reason is worth spelling out. Most enterprise AI pilots are measured at a single node in the production chain. The pilot delivers a productivity uplift inside one team. The team reports the uplift. Nobody traces whether the uplift survives the trip through every downstream constraint that determines whether the firm's actual output changes. In most cases, it does not. The pilot is technically successful and financially invisible. This is what the ninety-five percent number actually looks like at close range.

There is a specific failure mode that deserves attention because organisations are building it at scale and calling it best practice. The Center of Excellence model places AI decisions in the hands of people who understand the technology and not the domain in which it will be deployed. The team picking the deployment targets does not own the production bottleneck and is not measured against the bottleneck. They are measured against the number of pilots launched, the number of business functions touched, and the performance metrics that have been selected for the pilot. Each of these is a measurement at a point selected for pilot purposes. The points selected are almost never the firm's actual problem area.

The MIT research produced one finding the enterprise market has been slow to absorb: vendor-deployed AI solutions succeed at roughly twice the rate of internally built ones. The scarcest variable in enterprise AI is not compute. It is domain expertise. And domain expertise does not live in the corporate center of AI expertise.

Read the full article: "Enterprise AI Is Failing the Same Way Enterprise IT Always Did"

The Practitioner's Take

The pilot that proved everything and changed nothing – by Maurice FitzGerald

At HP, we ran a pilot using predictive analytics to identify accounts at risk of non-renewal. The pilot worked. The model flagged about a dozen accounts that the customer success team had not yet identified as problematic. Eight of the twelve did not renew. The pilot's predictions were accurate. The business impact was zero.

The reason was simple. The pilot lived inside my CX team. (Yes, this was my fault.) The CX team had no authority over the renewal conversation, no budget for intervention, and no mechanism to escalate a predictive signal into the account management workflow. I suppose I did not think building those links was worth the effort for a pilot. In any case, the insight reached a dead end several steps before the point where it could have changed an outcome. We had a prediction with no path to action.

I have watched this pattern repeat in every large organisation I have worked in. The technology performs. The organisation does not reorganise around it. The pilot is declared a success in the quarterly review and then quietly shelved because it had no immediate impact.

So: before you approve the next AI pilot, ask one question. Does the team running the pilot have the authority to act on what it finds? If the answer is no, the pilot will succeed in proving the technology works and will probably fail at everything else.

The Field Tactic

Three ways to get past pilot purgatory

Trace the path to the P&L. Before approving any AI pilot, map the production chain from the pilot's output to the financial outcome it is supposed to improve. Identify every node between the two. If any node along the path is left unchanged, the improvement will be absorbed there before it reaches the income statement. The map, not the pilot results, tells you where to invest next.
Give the pilot team authority, not just intelligence. If the pilot identifies accounts at risk, the team running it needs the ability to trigger an intervention, not just a report. Intelligence without authority produces dashboards. Intelligence with authority produces outcomes.
Measure the firm's output, not the pilot's. Replace pilot-level metrics with financial outcomes: retention revenue protected, expansion revenue generated, cost avoided. If the pilot cannot demonstrate a connection to one of these within two quarters, the bottleneck is somewhere else in the chain. Move the investment to where it changes the number.
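
For readers who like to see the logic written down, here is a minimal sketch of the first tactic, assuming the production chain can be treated as a simple series of steps, each with a capacity. The node names and capacity figures are hypothetical and purely illustrative; none of them come from the MIT or Stanford research. The point is only that a serial chain delivers what its slowest step can handle, so an uplift anywhere else never reaches the P&L.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One step in the production chain (names and numbers are hypothetical)."""
    name: str
    capacity: float  # e.g. at-risk accounts this step can handle per quarter


def firm_output(chain):
    """A serial chain can deliver no more than its slowest node allows."""
    return min(node.capacity for node in chain)


def bottleneck(chain):
    """Return the node that currently caps the firm's output."""
    return min(chain, key=lambda node: node.capacity)


def with_uplift(chain, name, pct):
    """Return a copy of the chain with one node's capacity raised by pct."""
    return [
        Node(n.name, n.capacity * (1 + pct)) if n.name == name else n
        for n in chain
    ]


# Illustrative chain: prediction is fast, escalation is the constraint.
chain = [
    Node("risk_scoring", 120.0),
    Node("escalation", 40.0),
    Node("account_intervention", 60.0),
]

baseline = firm_output(chain)
pilot_at_scoring = firm_output(with_uplift(chain, "risk_scoring", 0.25))
fix_at_bottleneck = firm_output(with_uplift(chain, "escalation", 0.25))

print("bottleneck:", bottleneck(chain).name)
print("baseline output:", baseline)
print("after pilot at risk_scoring:", pilot_at_scoring)
print("after same uplift at escalation:", fix_at_bottleneck)
```

In this toy chain, a 25% uplift at the scoring step leaves the firm's output unchanged at 40, while the same uplift at the escalation step lifts it to 50. Real chains are rarely this tidy, but the exercise of writing down each node and its limit is the map the tactic asks for.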

The Data Point

The number:

The 5% That Matter

That is the share of generative AI pilots that produce measurable P&L impact, according to MIT research published in 2026. Eighty-eight percent of companies now use AI in at least one function. Only thirty-nine percent report any EBIT impact. The gap between adoption and impact is not closing. It is widening.

A related finding from the same research: vendor-deployed AI solutions succeed roughly sixty-seven percent of the time, compared to thirty-three percent for internally built ones. Domain expertise, pooled across multiple deployments, is the variable that separates the two.

Sources: MIT Enterprise AI Research, 2026; McKinsey State of AI, 2025; cited in Richard Owen, "Enterprise AI Is Failing the Same Way Enterprise IT Always Did" (see link above).

The Iconoclast Question

The Bottleneck Question

Your last AI pilot reported a productivity improvement inside one team. Did that improvement change your company's revenue, retention, or cost structure? If you cannot trace a line from the pilot to a financial outcome, the bottleneck is somewhere else, and the pilot proved nothing that matters.

The Field Bridge

The Customer AI Masterclass is the certification program Richard built for CX, CS, and RevOps leaders who need to move from survey-dependent reporting to predictive account intelligence. Eight units. Self-paced. Built for practitioners, not data scientists.

[ Explore the Customer AI Masterclass →]

Coming in Future Editions

Why NPS was never enough, and what replaces it.
The Executive Sponsorship issue.

Get the Field Guide

If you've been reading Field Notes, you know the problem isn't awareness - it's execution. Knowing that AI can improve retention or accelerate revenue doesn't tell you how to make it happen in your organisation. That's exactly the gap The Customer AI Field Guide was written to close. Authored by Richard Owen and Maurice FitzGerald (that's us), it's a practical execution guide for CX, CS, and RevOps leaders, covering how to identify at-risk accounts before they signal churn, convert customer insights into frontline action, build the financial case that gets CFO sign-off, and design Customer AI systems your teams will actually adopt. Theory optional. Results required.

[ Get the Customer AI Field Guide → Now on Amazon]

Field Notes publishes every Tuesday. Each edition focuses on one topic - a trap, a framework, a field observation, or a pattern worth examining. If something in here resonates, or if you're seeing something different in your own programs, we'd like to hear about it.

- Richard Owen & Maurice FitzGerald