# Cluster analysis lecture notes

In Winter Term 2017/2018 I was *substitute* professor at Heidelberg University,
and gave the lecture “Knowledge Discovery in Databases”, i.e., the data mining lecture.

While I won’t make all my slides available, I decided to make the **chapter on
cluster analysis** available, largely because there do not appear to be good
current books on this topic. Many of the books on data mining barely cover the basics.
And I am constantly surprised to see how little people know beyond k-means.
But clustering is much broader than k-means!

As I hope to give this lecture frequently at some point, I would appreciate feedback to further improve the slides. This year, I almost completely reworked them, so there are a lot of things to fine-tune.

There exist three versions of the slides:

- the screen version, 433 overlays, 6 MB
- the print version, 53 pages, with 3 slides per page, 4 MB
- the lecturer’s version, 80 pages, 2 slides each, with additional (private) notes on what I explain on the blackboard only

These slides took me about 9 sessions of 90 minutes each.

I was not very fast this year, and I probably need to cut down on
the extra blackboard material, too. Next time, I would try to use at most 8 sessions for this,
so that I can cover other important topics in more detail, such as outlier detection, which
were a bit too short this time.

I hope the slides will be interesting and useful, and I would appreciate it if you gave me credit, e.g., by citing my work appropriately.