What the One Number Doesn't Tell You

There's a particular intellectual honesty required when work you helped develop and promote turns out to have been built on imperfect foundations. Not the theatrical kind that requires disowning everything (the original ideas were sound for their time and genuinely moved the field forward) but the kind that requires you to look clearly at what the evidence now shows, and say so.

In 2003, Fred Reichheld published "The One Number You Need to Grow" in the Harvard Business Review, proposing that a single question, "would you recommend us to a friend?", captured more useful signal about customer motivation and future business performance than the entire existing landscape of customer measurement. Net Promoter Score was built on a chain of causality that seemed both intuitive and empirically grounded: operational excellence created loyal customers, loyal customers generated better financial outcomes, and the survey score was the instrument that made that chain visible. I was part of the Satmetrix team that helped develop the methodology, and with Laura Brooks I co-authored Answering the Ultimate Question, which tried to codify what we understood at the time to be best practices for NPS implementation. That body of work reflected the state of the art. It also rested on a number of assumptions that, with hindsight and better analytical tools, are not as well-supported as we believed.

Three of those assumptions are worth examining directly.

Operations, sentiment, and finance don't run in a straight line

The underlying logic of NPS rested on a causal chain: how you treat customers through day-to-day operations drives their sentiment; their sentiment is captured in their survey response; and that score predicts their future purchasing behavior. It's an appealing model. It's also considerably more complicated than the theory suggested.

The problem is that operational performance and purchasing behavior represent fundamentally different dimensions of customer response, and while they overlap and influence each other, they do not operate on a single axis of causation. A customer can have a very good operational experience with your product while simultaneously reducing their spend for reasons that have nothing to do with that experience. Conversely, a customer can have a disappointing operational experience and still expand their relationship with you because switching costs are high, the alternative is worse, or their internal sponsor has staked their credibility on the partnership. Sentiment and propensity to buy are related, but they are not the same variable, and they are not reliably predictable from each other.

This is, in hindsight, less surprising than it ought to have been. The NPS literature was always clear that sentiment is one input into purchasing decisions, not the only one. What was underestimated was how many other inputs there are, how much weight they can carry, and how non-linear the relationship between sentiment and behavior can be in practice. With genuine predictive analytics, meaning models validated against actual customer behavior across large data sets, what you find is that the correlation between NPS scores and customer financial outcomes is real, but the causal mechanism is neither clean nor consistent. Sentiment and financial behavior are better thought of as two distinct dimensions that must be read in relation to each other, not as a single chain where improving one automatically improves the other.

The absence of a clean linear relationship does not, however, mean that sentiment is irrelevant. Some churn modellers and health scorecard designers have swung to exactly this conclusion, building predictive models that treat sentiment as noise, effectively discardable once you have enough behavioral data. The implicit claim in those models is that a deeply unhappy customer is just as likely to keep buying as a delighted one. Nobody with real-world experience in either consumer or B2B markets believes this, because it isn't true. Switching costs provide temporary insulation, not permanent immunity. Sooner or later, a sufficiently unhappy customer finds a way out, and the unhappier they are, the harder they look for one.

Health scorecards that include sentiment as one factor among many are conceptually on the right track. But the practical question is how much of customer behavior that one factor actually accounts for, and here's where the architecture breaks down. Most of those same scorecards don't have accurate sentiment data for most of their customers. If you only have survey responses from a fraction of your base, and the responses you do have are biased toward satisfied respondents, then the sentiment variable in your model is either missing or misleading for the majority of accounts you're trying to score. You're not factoring sentiment in. You're selectively factoring it in for the customers least likely to need attention, and calling the result comprehensive.

What you can't control explains more than you'd like

The second assumption is more obvious, which is probably why it was overlooked so consistently.

The NPS framework, and most of the customer experience apparatus built around it, proceeds from the implicit premise that customer outcomes are primarily a function of what the company does. Improve operations, improve sentiment, improve financial performance. The company is the independent variable. Everything else is noise.

This assumption ignores a substantial fraction of what actually drives customer behavior, particularly in business-to-business relationships. Whether a key contact at a client is about to leave their role, whether that organization is underperforming against its targets, whether it's in the middle of a leadership change or a cost restructuring driven by consultants brought in to find savings, whether a new owner has a pre-existing relationship with a competitor: none of these are factors you can observe from a survey, and none of them are within your span of control. All of them materially affect whether that customer renews, expands, or leaves.

The consumer context has its own version of the same truth. Divorce, illness, changes in household income, shifts in lifestyle and priorities: any of these can reset a customer's purchasing behavior independently of how well you've served them. These aren't rare events that can be safely treated as statistical noise. They are, in aggregate, a substantial portion of the variance in customer outcomes. And yet the standard NPS model places them entirely outside the frame.

The implication is uncomfortable but important. It means that customer outcomes will never be fully predictable from operational variables alone. It means that the mechanisms we have for influencing customer behavior are always going to be more limited than we'd prefer. And it means that there is a category of churn and contraction that no amount of operational excellence can prevent, not because the company did anything wrong, but because the customer's circumstances changed in ways that were outside anyone's control. The job of a sensible customer management system is to capture as many of those external signals as possible, distinguish what's within your control from what isn't, and set the sails accordingly, while remaining clear-eyed that you don't control the direction of the wind.

Survey NPS is not an accurate portrait of your business

The third assumption concerns the measurement instrument itself.

The entire edifice of NPS analysis rests on the premise that a survey sample accurately represents the health of your customer base. In practice, this assumption fails in several specific and predictable ways.

In business-to-business markets, the customer base is rarely homogeneous in any commercially meaningful sense. A small business spending a modest amount on a standard product configuration and a large enterprise with complex, customized requirements are not two ends of a spectrum but effectively different categories of relationship. What constitutes good value, good service, and a good outcome varies so substantially across this range that summarizing it in a single NPS number obscures more than it reveals. Who responds to a survey, and whether they are the right people to be asking within a given account, introduces further uncertainty. And a sample drawn from a heterogeneous population tells you surprisingly little about the health of the whole.

The consumer context offers different problems, not a different conclusion. Businesses naturally want to segment their customers, by economic profile, by product usage, by behavioral attributes, which means that a reasonable sample of the total population becomes, once you apply the segmentation logic you actually need, a very small sample of each sub-population you care about. Statistical reliability degrades rapidly, and businesses make operational decisions on that basis anyway.
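To put rough numbers on how quickly reliability degrades, here is a minimal sketch. Everything in it is an illustrative assumption rather than data from any real program: 1,200 responses, a 50/20 promoter/detractor split, three segmentation dimensions producing 36 cells, responses spread evenly across cells, and simple random sampling. It uses the standard variance formula for a difference of two multinomial proportions.

```python
import math


def nps_margin_of_error(promoter_share, detractor_share, n, z=1.96):
    """Approximate 95% margin of error for NPS expressed as a proportion.

    NPS = promoter_share - detractor_share, and for a simple random sample
    its variance is (p_pro + p_det - NPS**2) / n.
    """
    nps = promoter_share - detractor_share
    variance = promoter_share + detractor_share - nps ** 2
    return z * math.sqrt(variance / n)


# Hypothetical survey: all figures below are assumptions for illustration.
total_responses = 1200
segment_cells = 3 * 4 * 3  # e.g. economic profile x product usage x behavior
responses_per_cell = total_responses // segment_cells

overall_moe = nps_margin_of_error(0.50, 0.20, total_responses)
cell_moe = nps_margin_of_error(0.50, 0.20, responses_per_cell)

print(f"Responses per segment cell: {responses_per_cell}")
print(f"Overall margin of error: +/- {overall_moe * 100:.1f} NPS points")
print(f"Per-cell margin of error: +/- {cell_moe * 100:.1f} NPS points")
```

Under these assumptions the overall score is good to within about 4 points, while each segment cell carries a margin of error of more than 25 points. Real response counts are rarely spread evenly, so the smallest segments usually fare worse.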

And across both contexts, there is now good evidence for something that is probably the most fundamental problem of all: a systematic bias in who responds to surveys. The evidence increasingly points to the conclusion that customers who are satisfied with a product or service are meaningfully more likely to respond than those who are not. This isn't a minor technical issue that can be corrected at the margins. If the survey population systematically over-represents your promoters, then the NPS you are measuring is not an accurate representation of the sentiment distribution in your actual customer base. Reichheld and Rob Markey's most recent book points in exactly the same direction: the gap between survey-derived NPS and the actual state of the customer base is real, and in the direction of overstatement. Predictive analytics, applied to the full customer portfolio, tends to confirm the same thing. The score you are tracking is almost certainly more optimistic than the truth.
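The arithmetic behind that overstatement is simple enough to sketch. The customer mix and response rates below are invented for illustration only; the point is the mechanism, not the magnitudes. If promoters respond at a higher rate than detractors, the surveyed mix is a reweighted version of the true mix, and the score moves with it.

```python
def nps(mix):
    """NPS in points from a dict of category shares that sum to 1."""
    return 100 * (mix["promoter"] - mix["detractor"])


# Hypothetical true sentiment mix across the whole customer base.
true_mix = {"promoter": 0.30, "passive": 0.40, "detractor": 0.30}

# Hypothetical response rates: happier customers answer more often.
response_rate = {"promoter": 0.30, "passive": 0.20, "detractor": 0.10}

# Share of the base that both falls in a category and responds.
responding = {k: true_mix[k] * response_rate[k] for k in true_mix}
total_responding = sum(responding.values())

# The mix the survey actually sees.
observed_mix = {k: v / total_responding for k, v in responding.items()}

print(f"True NPS:     {nps(true_mix):+.0f}")
print(f"Observed NPS: {nps(observed_mix):+.0f}")
print(f"Overall response rate: {total_responding:.0%}")
```

With these assumed numbers, a base whose true NPS is 0 reports a survey NPS of +30 on a 20% overall response rate. Nothing about the survey looks broken from the inside, which is part of why the bias is easy to live with.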

What makes this particularly difficult to address is that very few people in the industry actually disagree with any of it. The practitioners who continue to rely exclusively on survey data will, if pressed, acknowledge that their data is incomplete. After all, they only hear from a fraction of their customers and they know it. Many are intellectually willing to admit that the data is flawed, that it may represent the wrong individuals, that in B2B the person filling out the survey is often not the person making the renewal decision. Nobody seriously disputes that the data is difficult to collect, infrequent, and decays between collection points.

The evidence, in other words, is not in question. What's missing is the behavioral change that ought to follow from it. People nod along, acknowledge the limitations, and then keep running the same program with a bit of a shrug. You could think of it as a kind of flat-earth problem.

The evidence that better alternatives now exist, that behavioral data and predictive analytics can provide a more complete, more accurate, and more timely picture of customer health across the full portfolio, is not ambiguous. It's just inconvenient, because acting on it means admitting that the measurement infrastructure you've spent years building is no longer fit for purpose. That's a hard conversation to have with yourself, and an even harder one to have with your board. So instead, most people don't have it at all.

Two dimensions, not one line

If these three assumptions no longer hold in the form they were originally stated, what should replace them?

The most useful reframe, and the one that makes analytical sense of what the data actually shows, is to stop treating sentiment and financial behavior as a single causal chain and start treating them as two distinct dimensions that need to be read in relation to each other. This is the structural logic of a 2x2 framework, with financial performance on one axis and customer loyalty on the other. The point of that structure is not to eliminate the connection between sentiment and outcomes (that connection is real) but to acknowledge that the connection is not deterministic, that the causality doesn't run in a single direction, and that organizations which conflate the two measures will systematically misread what's happening in their customer base. I wrote about how Rupert Soames applied exactly this logic when he ran Aggreko: financial performance and customer loyalty as parallel and genuinely independent measures, each telling you something the other cannot. The framework survives because the underlying insight survives: a tension you can see is one you can manage.

In practice, this means running two different models of your customer data rather than one. The first looks at the full set of operational and experiential signals that drive sentiment: product experience, service quality, relationship health, value realization. This tells you something important about what you're doing and how customers are experiencing it. The second looks at the signals that predict customer financial behavior: buying patterns, engagement signals, organizational changes at the customer, external factors that affect their propensity to spend. These models are not competing alternatives. They're complementary lenses, and the interesting information often sits in the gap between them: the account that feels good but is showing behavioral warning signs, or the account with a lukewarm survey score that keeps expanding its relationship despite it.
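One way to picture reading the two models against each other is a simple quadrant assignment. This is only a sketch under stated assumptions: the `Account` type, the 0-to-1 scores, the 0.5 threshold, and the quadrant labels are all hypothetical, and the scores stand in for the outputs of whatever sentiment and financial-behavior models you actually run.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    sentiment: float  # hypothetical output of the sentiment model, 0 to 1
    financial: float  # hypothetical output of the financial-behavior model, 0 to 1


def quadrant(account, threshold=0.5):
    """Place an account in a 2x2 of sentiment against financial behavior."""
    positive_sentiment = account.sentiment >= threshold
    positive_behavior = account.financial >= threshold
    if positive_sentiment and positive_behavior:
        return "aligned: healthy"
    if positive_sentiment:
        return "gap: feels good, behavioral warning signs"
    if positive_behavior:
        return "gap: lukewarm sentiment, still expanding"
    return "aligned: at risk"


# Illustrative accounts with invented scores.
accounts = [
    Account("Account A", sentiment=0.8, financial=0.7),
    Account("Account B", sentiment=0.8, financial=0.3),
    Account("Account C", sentiment=0.4, financial=0.9),
    Account("Account D", sentiment=0.2, financial=0.2),
]

for account in accounts:
    print(f"{account.name}: {quadrant(account)}")
```

The two "gap" quadrants are where the models disagree, and they are the accounts a single combined score would tend to hide. A fixed threshold is the crudest possible cut; in practice the boundaries would need to be set per segment.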

The practical implication for measurement programs is significant. A survey-based NPS program, implemented with the best available practices, tells you something real. It's just not the whole picture, and the sample biases mean it's probably a more flattering picture than reality. Supplementing it with a full-portfolio view of customer behavior, drawn from behavioral data rather than survey responses, gives you the second dimension that NPS alone can't provide. Neither measure is sufficient on its own. Together they create a considerably more accurate account of where your business actually stands, and a more actionable framework for deciding where to focus.

The original work on Net Promoter Score was not wrong. It was a significant step forward in how the field thought about customer measurement, and the intuition at its core, that customer loyalty connects to business outcomes in ways that matter, remains sound. What we now have that we didn't have in 2003 is the data, the analytical tools, and the accumulated experience to test the specific assumptions the framework rested on. Some of those assumptions held. Others were considerably less reliable than the theory suggested. The responsible thing to do with that information is to use it.

I'm Richard Owen, co-founder and CEO of OCX Cognition. We build predictive customer analytics for companies who'd prefer to know which customers are at risk before those customers have already decided to leave.