Nice overview. I remember using this in one of my coursework projects during my studies.
One of the tricky aspects when I tried it was how to initialise the cluster centres.
When I picked random data points as the cluster centres, the results could be quite poor if several of them happened to land close to each other.
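
To make the problem concrete, here is a rough sketch, assuming the method in question is k-means and that each data point is a plain tuple of coordinates. The function names (`random_init`, `kmeans_pp_init`, `min_centre_gap`) are made up for illustration. The second function uses k-means++ style seeding, a swapped-in alternative to purely random picking that tends to spread the initial centres out; it is only one of several possible fixes.

```python
import math
import random


def random_init(points, k, rng):
    """Pick k distinct data points uniformly at random as initial centres."""
    return rng.sample(points, k)


def kmeans_pp_init(points, k, rng):
    """k-means++ style seeding: favour points far from the centres chosen so far.

    Assumes the data contains at least k distinct points.
    """
    centres = [rng.choice(points)]
    while len(centres) < k:
        # Squared distance from each point to its nearest existing centre.
        d2 = [min(math.dist(p, c) ** 2 for c in centres) for p in points]
        # Sample the next centre with probability proportional to that distance,
        # so points sitting on top of an existing centre are never picked.
        centres.append(rng.choices(points, weights=d2, k=1)[0])
    return centres


def min_centre_gap(centres):
    """Smallest pairwise distance between centres; a small value means crowding."""
    return min(
        math.dist(a, b)
        for i, a in enumerate(centres)
        for b in centres[i + 1:]
    )
```

Calling `min_centre_gap(random_init(points, k, random.Random(seed)))` over a few seeds should show how often random picking puts two centres in the same clump of data, and the same check on `kmeans_pp_init` gives a rough comparison.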