Net Promoter Score under fire?


By: Stefan Kolle 

Reichheld’s Net Promoter Score (NPS) has taken the world of market research by surprise, and by storm, over the past couple of years. Not being a researcher or statistician myself, I have taken the whole theory at face value, particularly because I cannot remember ever having had such an ‘aha’ moment as when I first saw the NPS explained. From a gut-feeling point of view, it makes perfect sense.

Now there seems to be some trouble brewing for the NPS. 

There have always been criticisms of the NPS, but to my mind most of those came from traditional market research agencies, which stand to lose a lot of business when all you need is one question instead of the ridiculously bloated questionnaires so often used.
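For readers new to the metric, the arithmetic behind that one question is simple: respondents rate their likelihood to recommend on a 0-10 scale; those scoring 9-10 count as promoters, 0-6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. A minimal sketch in Python (the function name is mine, not from any official library):

```python
def net_promoter_score(ratings):
    """Compute NPS from 0-10 'likelihood to recommend' ratings.

    Promoters score 9-10, passives 7-8, detractors 0-6.
    Returns promoters% minus detractors%, in the range -100..100.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    n = len(ratings)
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / n

# Example: 5 promoters, 3 passives, 2 detractors out of 10 respondents
print(net_promoter_score([10, 9, 9, 10, 9, 8, 7, 7, 3, 5]))  # 30.0
```

Note that passives (7-8) drop out of the score entirely, which is one of the statistical properties the critics take issue with.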

In the past couple of weeks, however, something more serious has been brewing, and it was brought to my attention over the weekend by Tim Keiningham, author of probably the most damning report. (Download the report here). To summarize: he questions the scientific validity of the research Reichheld built his theory on.
What follows is an unusually long post for this blog, but given the importance the NPS has started playing in many organizations, I think it deserves some attention.

To give the executive summary: Tim makes some very valid points, which have at least planted some doubts in my mind. On the other hand, I still have a strong feeling that the NPS is a very useful tool, and during the discussion you will find out that I may hold a dissident view on how to actually apply the NPS. I will follow this post up with a second article to start a discussion on that matter.

Let me make you part of the discussion I had with Tim (lightly edited). I strongly advise you to read the article by Tim and his colleagues first, as otherwise the discussion may be hard to follow.

Original mail by Tim:

I thought you might like to know that a paper I co-authored concerning Net Promoter is now available at the Journal of Marketing. The Journal of Marketing is allowing the article to be downloaded for free, and I have confirmed that it is fine to publish a link to the article should you wish to do so.
There are two key findings from the research:  
1) We did not find Net Promoter to be a good predictor of growth at all.
2) We found very strong evidence of research bias in the research reported by Reichheld in support of Net Promoter.  In particular, we were able to replicate a subset of Reichheld's reported data for his best case scenarios and compare it to a metric he claimed was examined and found to have a 0.00 correlation to growth, the ACSI.  Our findings clearly show that when using Reichheld’s own data, Net Promoter wasn’t superior to the ACSI. It is difficult to imagine a scenario other than research bias as the cause of this finding. This is a serious problem. We expect published research to be free of bias in management science, just as we do in all other fields of study. Managers have adopted Net Promoter based upon the belief that solid science underpinned the claims attributed to the metric. In fact, there would have been no HBR paper introducing Net Promoter without the research.
Also, I thought that the attached article that just came out in RESEARCH magazine might be of interest.
My initial reply:
This is a very interesting article, prompting me to do some further research myself before deciding whether or not to blog about it. I have a very specific question for you, and as the whole discussion centres in part on the apples-vs-apples debate, I want to make sure that we ARE in fact talking about apples.
One thing that immediately struck me as a potential conflict between the research methods used by Satmetrix and the NCSB is that the NCSB narrowed its sample down to actual purchasers, and then asked the question specifically about the company they had purchased from, whereas Satmetrix (and other research companies and brands that I know of using the NPS metric) ask about companies that respondents are aware of.
Firstly, this difference could explain the strong difference between your results and Reichheld’s. Secondly, in my view, that is the whole point of the exercise: to compare brands against each other on likelihood to recommend, and to capture the non-purchasers’ position. It is this wider net being cast that explains the results.
An example I use when explaining the NPS (and why I am, as yet, still a ‘believer’): let’s assume I have been driving a Renault Clio ever since I graduated. Fantastic little car, very happy with it. Two years ago I got a promotion and my manager thought I should visit customers in a more representative car, so I got a BMW. Using the NCSB approach (or typical market research) I would not fall within the sample, as I haven’t bought a Renault in the past two years and have no intention to do so in the next two. However, if my nephew, just graduated, asks me which car to buy, I will recommend the Renault Clio very, very strongly, creating value for the company.
The underlying idea is that we can capture the complex universe of a customer’s preferences a little better. I might rate Renault 10, VW 9, BMW 8 and Citroën 5. It is this relationship that provides insight, not the simple measure of what I recently bought and how I feel about that one specific product.


Tim’s Reply:
Stefan —
The NCSB data is made up of purchasers (actual customers).  Your distinction, however, between the Reichheld/Satmetrix data used to support Net Promoter and the NCSB is erroneous.  Satmetrix specifically states that customers were surveyed (see attached):
To identify the ‘right’ loyalty question that links to real behaviors (i.e., purchases and referrals made), survey data was collected from customers of targeted companies within six industries, including financial services, cable and telephony, ecommerce, auto insurance, Internet service providers and computer hardware.
In fact, few would take Net Promoter seriously as a "Loyalty" metric if the survey were executed using the methodology you describe (i.e., "awareness" of the brand qualifies a respondent to the survey); it would instead be a "Brand Image" metric.  This is also why efforts to improve Net Promoter scores focus on improving customers' experiences (as opposed to focusing on improving brand equity).  
It is troubling to me, however, that you ignore the big issue uncovered in our investigation: research bias.  Our research found very strong evidence of research bias. In particular, Reichheld states that Net Promoter is superior to the ACSI. He reports that their research found the ACSI to have a 0.00 correlation to growth, whereas Net Promoter linked strongly to growth. Besides that being a ridiculous claim about the ACSI (as there are numerous scientific studies showing otherwise), our findings clearly show that when using Reichheld's own data for his best case scenarios, Net Promoter wasn't superior to the ACSI. It is difficult to imagine a scenario other than research bias as the cause of this finding. [Given that we are using Reichheld's own data, it could not be a more apples-to-apples, unbiased comparison than this.]  
We expect research published in our most prestigious journals to be free of bias in management science, just as we do in all other fields of study. Managers have adopted Net Promoter based upon the belief that solid science underpinned the claims attributed to the metric. In fact, there would have been no HBR paper introducing Net Promoter without the research. Our research clearly challenges the claims of Reichheld's research.

My reply:

There are several issues here, only one of which I addressed in my email. Of course research bias is an issue that needs to be looked into; I am, however, taking a very pragmatic view right now, leaving the moral issue aside. Being located in Europe, the ACSI has no relevance for me. The question that does have relevance for me and the readers of this blog right now is whether or not the NPS in itself is a useful metric.

My quote of the word ‘awareness’ comes straight out of the research methodology as described in the HBR article. A quick look through several related publications doesn’t completely clear up which exact methodology was used. The NCSB description you use in your paper, on the other hand, is very clear.
For instance, even with the ‘customers of targeted companies’ distinction given in the Satmetrix white paper, it is still not clear whether these customers were asked only about the one company they are a customer of, or about all of the sampled companies within the industry.

As always, there is no one answer, no one size fits all. Personally, I do think that it is exactly the wider definition, i.e. what you call the brand image effect, that reveals the true value of the question, as it aggregates all word of mouth and captures the influencers as well as the customers. As we know, an influencer does not have to be an actual customer of the company, but he can generate massive value for it (I do not actually own an iPod, but I have sold at least 5 iPods by telling my older relatives that it was the only solution answering their need for something easy to use).

I guess we have three discussions here:
1. What exact method was used, and can we compare apples with apples? I’m still not sure we can.
2. Is there research bias? Probably, but this is of less relevance for me right now, as it does not answer the question of the usefulness of the NPS in itself.
3. Where does the true value of the NPS lie? I think we can safely state we disagree on that one, and it probably warrants an extensive discussion too.


And Tim again:
Dear Stefan –
The best reference with regard to the methodology used by Reichheld is contained in the Satmetrix White Paper that I forwarded to you earlier. Reichheld makes a number of statements in the December 2003 Harvard Business Review article introducing Net Promoter that are more precisely clarified in this White Paper. For example, Reichheld states that the research involved “more than 400 companies in more than a dozen industries.” Satmetrix later provided more relevant detail on the analysis. While data from customers of 400+ companies was collected, inclusion in the actual analysis was limited to firms meeting specific criteria. As a result, “over 50 companies were included across a dozen targeted industries.” Therefore, I would recommend relying on the White Paper when drawing conclusions regarding the methodology used by Reichheld/Satmetrix.
With regard to your specific issues, your argument that Reichheld/Satmetrix may not have conducted their research among customers is clearly inaccurate based upon a simple examination of the total research they conducted in support of Net Promoter.  It is important to recognize that their research was conducted in two phases: a micro-level (customer-level) analysis, and a macro-level (firm-level) analysis.
The micro-level analysis surveyed customers to gauge their responses to various satisfaction-loyalty questions.  They then resurveyed these same customers six-to-twelve months later to uncover their purchasing and referral behaviors.  The results of their research indicated that one question, “likelihood to recommend,” was overwhelmingly superior to other questions examined in explaining customers’ purchasing and referral behaviors.  Without question, we are talking about surveying “customers,” as linking to future purchase and referral behaviors of non-customers makes little sense.
The Reichheld/Satmetrix data was then aggregated to the firm level, so that Net Promoter scores could be analyzed within industry and correlated to their respective growth rates.  Therefore, this data represents customer information as well.
With regard to whether customers answered for multiple firms for which they conducted business, it is unclear why you would expect this to affect the outcome with regard to the relative performance of firms within industries.  As our examination of Net Promoter using the NCSB data concerns relative performance within industry, it is clearly a fair comparison of the Reichheld/Satmetrix methodology.  And the results conclusively show that Net Promoter was not superior to other metrics (or even a good measure in linking to growth when applied using the Reichheld/Satmetrix methodology).
Also, it is imperative to note that we conducted two tests reported in our Journal of Marketing paper.  In the second test, we replicated a subset of Reichheld’s data for his best case scenarios and compared it to a metric that he claimed was examined and found to have absolutely no correlation to growth.  Again, the reported superiority was clearly disproved.
As for whether or not Net Promoter is useful, there are clearly some serious statistical flaws with the metric. I recommend that you refer to the studies conducted by Mark Molenaar and Larry Freed. Rather than rehash the issues that they have addressed so well, I will simply allow their findings to speak for themselves. For your convenience, they are attached. Additionally, my co-authors and I have another paper scheduled for publication in the forthcoming issue of Managing Service Quality that tests the robustness of the micro-level analysis conducted by Reichheld/Satmetrix. We find no support for the micro-level claims made by Reichheld either. The citation for the paper is:
Keiningham, Timothy L., Bruce Cooil, Lerzan Aksoy, Tor Wallin Andreassen, and Jay Weiner (2007), “The Value of Different Customer Satisfaction and Loyalty Metrics in Predicting Customer Retention, Recommendation and Share-of-Wallet,” Managing Service Quality, vol. 17, no. 4, forthcoming.
Finally, your acknowledgement that the research used to support Net Promoter was “probably” biased speaks for itself. In fact, it is virtually impossible to imagine any cause other than bias for Reichheld’s own data for his best-case scenarios failing to be superior to the ACSI, which he reported had a 0.00 correlation to growth. This is NOT just a moral issue, as you suggest in your email. It disqualifies the research. Period.
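Stepping outside the email exchange for a moment: the macro-level test Tim describes boils down to correlating firm-level scores with firm-level growth rates within an industry. The sketch below uses entirely made-up numbers (not data from either study) just to show the mechanics of such a comparison:

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical firm-level data within one industry (illustrative only):
nps_scores   = [45, 30, 10, -5, 25]       # per-firm Net Promoter scores
growth_rates = [8.0, 5.5, 2.0, 1.0, 4.0]  # per-firm revenue growth, %
print(round(pearson(nps_scores, growth_rates), 2))  # 0.98
```

The dispute, in these terms, is over what that correlation actually comes out to when the same test is run on the same firms with NPS versus other metrics such as the ACSI.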

Final comment:
Now, dear readers, having made you part of the discussion, I leave it to each of you to draw your own conclusions on whether or not the NPS metric is discredited to the point of invalidity. I still have my doubts, as I am still not sure whether we are comparing apples to apples.
And getting back to the final statement in Tim’s last mail: even though a bias driving the reported results of the NPS versus the ACSI would be very grave, by definition it has no impact on the question of whether the NPS in itself is valid or useful.

Let me know what you think – and please come back for my other article and participate in the discussion.