Cluster analysis lecture notes
In Winter Term 2017/2018 I was a substitute professor at Heidelberg University, where I gave the lecture “Knowledge Discovery in Databases”, i.e., the data mining lecture.
While I won’t make all my slides available, I decided to make the chapter on cluster analysis available, largely because there do not appear to be good current books on this topic. Many of the books on data mining barely cover the basics, and I am constantly surprised to see how little people know beyond k-means. But clustering is much broader than k-means!
As I hope to give this lecture regularly in the future, I appreciate feedback that helps me further improve the slides. This year, I almost completely reworked them, so there are a lot of things to fine-tune.
There exist three versions of the slides:
- the screen version, 433 overlays, 6 MB
- the print version, 53 pages, with 3 slides per page, 4 MB
- the lecturer's version, 80 pages, 2 slides each, with additional private notes on what I explain on the blackboard only
These slides took me about 9 sessions of 90 minutes each.
Admittedly, I was not very fast this year, and I probably need to cut down on
the extra blackboard material, too. Next time, I would try to use at most 8 sessions for this,
to be able to cover other important topics, such as outlier detection, in more detail; these were
a bit too short this time.
I hope the slides will be interesting and useful, and I would appreciate it if you gave me credit, e.g., by citing my work appropriately.