the ethics of product suggestions

In one of today’s lectures, we briefly discussed association analysis: the relatively simple idea behind grocery store loyalty cards and product suggestions like Amazon’s “customers who bought A also purchased B, C, and/or D”. The idea is to look at a customer’s entire order and use that information to increase sales through strategic suggestions. The discussion sparked a 9-year-old memory of a philosophy class I took in college, particularly the section on Kant and ethics/morality.
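To make that concrete, here’s a minimal sketch of the “confidence” metric that underlies this kind of suggestion. The order log and item names are entirely made up for illustration:

```python
# Toy transaction log: each set is one customer's order (made-up data).
orders = [
    {"pen", "notebook"},
    {"pen", "notebook", "ink"},
    {"pen", "ink"},
    {"notebook"},
    {"pen", "notebook"},
]

def confidence(orders, a, b):
    """Confidence of the rule 'customers who bought a also bought b':
    the share of orders containing a that also contain b."""
    with_a = [order for order in orders if a in order]
    if not with_a:
        return 0.0
    return sum(1 for order in with_a if b in order) / len(with_a)

print(confidence(orders, "pen", "notebook"))  # 0.75 -> suggest notebooks to pen buyers
```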

Immanuel Kant developed a set of decision rules for determining whether or not an act was moral. His first rule, the Formula of Universal Law, is the one that interests me today. It basically requires you to look at an act from a perspective of cold logic and ask yourself: “if everyone did this, would it still be possible?” If the answer is no, the act is immoral.

So what does this have to do with product suggestions?

Let’s imagine that I buy a pen on Amazon, but I don’t know what type of notebook to buy to go with it. I decide that I will purchase whichever notebook is listed first on the product suggestion list. Was that a moral decision?

According to Kant’s Universal Law, no. If every customer in the market for both pens and notebooks used the same decision method, no one would ever make an independent notebook choice, and without that purchase data there would never be any notebooks suggested in the first place.

Of course, this is an extreme and silly example. Customers do make their own purchasing decisions and so suggestions are nearly always available. But just know that every time you follow the advice of the Amazon suggestions, Immanuel Kant judges you a little.

the angry customer index

If you were creating a statistical model and were setting a percentage goal for how good you wanted it to be, what would you pick? 90%? 95%? 99%?

It turns out that you should decide based on how much data you’re working with. In economics, we wanted to be at least 90% confident (statistically speaking) that a variable in our model was meaningful. Other disciplines use 99%. However, as I’m learning in school, those seemingly strict levels are not strict enough when you get into larger data sets. As your data grows, standard errors shrink, so even tiny, practically meaningless effects will clear a fixed confidence bar. According to research, you actually want a confidence level of at least 99.7% for 1,000 observations, or 99.98% for 100,000 observations.
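Here’s a rough sketch of that shrinking-standard-error effect. All the numbers are invented, and the critical z values are standard two-sided approximations, but it shows how a fixed, negligible effect sails past the usual cutoffs once the sample gets big enough:

```python
import math

# A tiny, practically meaningless effect: the sample mean sits just
# 0.02 standard deviations away from the null value (invented number).
effect_in_sd = 0.02

# Approximate two-sided critical z values for common confidence levels.
critical = {"90%": 1.645, "95%": 1.960, "99%": 2.576, "99.7%": 2.97, "99.98%": 3.72}

for n in (100, 1_000, 10_000, 100_000, 1_000_000):
    z = effect_in_sd * math.sqrt(n)  # z = effect / (sd / sqrt(n))
    passed = [level for level, cutoff in critical.items() if z > cutoff]
    print(f"n={n:>9,}  z={z:5.2f}  clears: {', '.join(passed) or 'none'}")
```

The effect never changes, only the sample size does, yet by n = 100,000 it clears every bar on the list.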

The problem is that this is when people’s eyes start to glaze over. It’s like when I read a soap bottle that says it kills 99.7% of germs. Is that much better than the one that kills 99.5% of germs? (Actually, maybe I don’t want either of those.) So how do you explain to a manager that a model you’re 99.00% confident in might not be good enough?

I suggest the “Angry Customer Index”.

Imagine you built a model based on 100 customers out of your entire global customer base. You are 99% confident that the model is correct, which in the context of your model means that on average 99 customers will be happier than they were before. Conversely, you might have up to 1 angry customer.

Now let’s say that you build the same model based on 100,000 customers, and you are similarly 99% confident the model is correct. On average you will have 99,000 happy customers, but potentially 1,000 angry customers.
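If you want to hand a manager the actual arithmetic, it’s one line, using the same deliberately oversimplified framing as above:

```python
def angry_customer_index(n_customers, confidence_level):
    """Worst-case number of unhappy customers implied by a confidence
    level, under the deliberately oversimplified framing above."""
    return round(n_customers * (1 - confidence_level))

print(angry_customer_index(100, 0.99))      # 1
print(angry_customer_index(100_000, 0.99))  # 1000
```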

This is an oversimplification, of course. But it’s a heck of a lot easier to explain than pointing to research discussing p-values and sample sizes. When you’re dealing with large numbers, you don’t have the luxury of a 1% margin of error. When 1 customer is dissatisfied with a product, it’s not really a big deal. But 1,000 customers upset in a short span of time could make headlines, and no executive will want that.