Why is K-means clustering inadequate for 1-1 consumer segmentation?

By Ravi Evani. Filed under Machine Learning

Segmentation helps marketers identify different categories of consumers so that they can design and target relevant marketing strategies for each category. The concept of segmentation rests on the assumption that people can be grouped into such categories, which in my view is a premise that doesn’t work if marketing messages are to be relevant to consumers on a 1-1 basis. But I will cover that in a separate post. For now, let’s assume that segmentation can deliver 1-1 relevance and look at how effective one popular technique is in this scenario.

Marketing analysts use many algorithms (rule-based or otherwise) for segmentation; the most popular approach these days is a statistical model built on a clustering algorithm such as K-means. Here is a great example of how K-means is implemented.

With K-means you pick a number of clusters (K), start with K initial cluster centers (often chosen at random), and assign each consumer, based on the selected attributes, to the nearest center. You then recompute each center as the mean of the consumers assigned to it. In effect you partition N consumers into K clusters, with each consumer belonging to the cluster with the nearest mean. You repeat these two steps, re-assigning consumers to the new nearest centers, until the assignments stop changing or the centers stop moving. At that point the distances between the consumers and their cluster centers have reached a (local) minimum.
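The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function name `kmeans`, the two attributes, and the six consumers are all made up for the example, and squared Euclidean distance is assumed as the measure of "nearest".

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Minimal K-means sketch on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Assumption: start from k distinct observations picked at random.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each consumer goes to the nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # no consumer changed cluster, so we stop
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical consumers: (visits per week, average basket value).
consumers = [(1, 10), (2, 12), (1, 11), (8, 50), (9, 52), (10, 49)]
labels, centers = kmeans(consumers, k=2)
```

With data this cleanly separated, the first three consumers end up in one cluster and the last three in the other. Note that both K and the attributes had to be supplied by hand, which is exactly where the problems start.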

There are two problems with this approach when aiming for 1-1 relevance with potentially hundreds of data points per consumer.

Problem #1: You need to decide which attributes to look at when clustering. Which attributes do you pick? Well, you could say things like “location is more important than age”. But you would be saying that because you have some background information supporting that point of view. If you were weighing attributes like “visits on weekends” against “actions on weekdays”, how would you decide?
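To make this concrete, here is a small sketch of how the choice of attributes changes who counts as "similar". K-means assigns consumers by distance, so the example just compares distances under different attribute selections. The three consumers, their values, and the helper names `sq_dist` and `nearest_to` are hypothetical.

```python
# Hypothetical consumers with two behavioural attributes.
consumers = {
    "ann": {"weekend_visits": 1, "weekday_actions": 9},
    "bob": {"weekend_visits": 1, "weekday_actions": 2},
    "cat": {"weekend_visits": 6, "weekday_actions": 9},
}

def sq_dist(a, b, attrs):
    """Squared Euclidean distance using only the selected attributes."""
    return sum((a[x] - b[x]) ** 2 for x in attrs)

def nearest_to(name, attrs):
    """Return the other consumer closest to `name` under the chosen attributes."""
    others = [n for n in consumers if n != name]
    return min(others, key=lambda n: sq_dist(consumers[name], consumers[n], attrs))

by_weekend = nearest_to("ann", ["weekend_visits"])    # "bob"
by_weekday = nearest_to("ann", ["weekday_actions"])   # "cat"
by_both = nearest_to("ann", ["weekend_visits", "weekday_actions"])  # "cat"
```

Ann looks like Bob if you cluster on weekend visits and like Cat if you cluster on weekday actions. Nothing in the algorithm tells you which answer is the relevant one.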

Problem #2: You need to decide on the number of clusters. When you are dealing with a few attributes, 3-5 clusters may be sufficient. But let’s say you are looking at hundreds of attributes: what’s the right number to start with? Do you go up or down from there? This can get very subjective and error-prone.
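One way to see why the choice of K is hard to settle objectively is to run K-means for several values of K and watch the total squared distance between consumers and their cluster centers. The sketch below does that on a single made-up attribute. The function name `kmeans_1d`, the data, and the deterministic starting centers are assumptions chosen to keep the example reproducible.

```python
def kmeans_1d(values, k, max_iter=100):
    """K-means on one attribute; returns the within-cluster squared error."""
    values = sorted(values)
    n = len(values)
    # Assumption for reproducibility: start from the middle observation of
    # each of k equal slices of the sorted data instead of random picks.
    centers = [float(values[(2 * i + 1) * n // (2 * k)]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center for every value.
        new_labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2)
                      for v in values]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))

# Hypothetical attribute, e.g. purchases per year for nine consumers.
purchases = [1, 2, 3, 10, 11, 12, 20, 21, 22]
error_by_k = {k: kmeans_1d(purchases, k) for k in range(1, 5)}
```

On this data the error falls from roughly 548 (K=1) to 127.5 (K=2), 6 (K=3) and 4.5 (K=4). It keeps shrinking as K grows, so the error alone never says "stop here". With one attribute and three obvious groups you can eyeball the answer; with hundreds of attributes you cannot.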

The results of K-means vary dramatically depending on how #1 and #2 are resolved. So, in my view, K-means needs to be bolstered with additional techniques that address these two problems when it comes to 1-1 relevance.

In a subsequent post I will cover a couple of options that others have used to address these problems outside the context of consumer segmentation. Those techniques can potentially be used in segmentation scenarios as well.