In the winter term 2017/2018, I was a substitute professor at Heidelberg University, where I gave the lecture “Knowledge Discovery in Databases”, i.e., the data mining lecture.

While I won’t make all my slides available, I decided to make the chapter on cluster analysis available, largely because there do not appear to be good current books on this topic. Many of the books on data mining barely cover the basics, and I am constantly surprised to see how little people know beyond k-means. But clustering is much broader than k-means!

As I hope to give this lecture regularly at some point, I would appreciate feedback to further improve the slides. This year, I almost completely reworked them, so there are a lot of things to fine-tune.

There exist three versions of the slides:

These slides took me about 9 sessions of 90 minutes each.
Admittedly, I was not very fast this year, and I probably need to cut down on the extra blackboard material, too. Next time, I would try to use at most 8 sessions for this, to be able to cover other important topics, such as outlier detection, in more detail; these were a bit too short this time.

I hope the slides will be interesting and useful, and I would appreciate it if you gave me credit, e.g., by citing my work appropriately.