K-means clustering is not a free lunch

I came across this awesome blog post and had to share it here for posterity: http://varianceexplained.org/r/kmeans-free-lunch/

It makes great points about the assumptions underlying k-means clustering and suggests other methods that you should perhaps be using instead. I know I’ve previously fallen into the trap of just throwing data at k-means and hoping for the best.
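To make the "hoping for the best" trap concrete, here is a minimal sketch of my own (not code from the linked post). It assumes a toy dataset of two concentric rings and a bare-bones Lloyd's-algorithm k-means written from scratch; the helper names `make_rings`, `kmeans` and `agreement` are hypothetical and exist only for this illustration.

```python
import math
import random


def make_rings(n_per_ring, radii=(1.0, 5.0), noise=0.1, seed=0):
    """Toy data (an assumption for this sketch): two noisy concentric rings."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, radius in enumerate(radii):
        for _ in range(n_per_ring):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = radius + rng.gauss(0.0, noise)
            points.append((r * math.cos(angle), r * math.sin(angle)))
            labels.append(label)
    return points, labels


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    assignment = [None] * len(points)
    for _ in range(n_iter):
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assignment, centroids


def agreement(truth, predicted):
    """Fraction of matching labels for two groups, ignoring which id is which."""
    same = sum(t == p for t, p in zip(truth, predicted)) / len(truth)
    return max(same, 1.0 - same)


points, truth = make_rings(200)
predicted, centroids = kmeans(points, k=2)
score = agreement(truth, predicted)
print(f"agreement with the true rings: {score:.2f}")
```

With two centroids, k-means can only cut the plane with a straight line, so it cannot put one ring inside the other. The agreement score should therefore stay well short of 1.0 on data like this, even though the two groups are obvious to the eye.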

Enjoy!

[Figure: plot_kmeans-1] Example of a wrong thing!