Unsupervised learning

What does this tutorial cover?

  • Dimensionality reduction:
    • Linear: PCA
    • Non-linear: t-SNE, UMAP
  • Clustering:
    • K-means
    • Hierarchical clustering

We split the two 90-minute sessions into a lecture and a workshop. That is not much time to cover an area as vast as unsupervised learning, so our goal was to introduce the methods, explain their applications, and build some intuition for them. To understand them fully, we recommend exploring each method in more detail in the materials linked in the repository.

The tutorial was written in R using the learnr package, which makes it self-contained, so you should be able to run it at home as well. It includes questions and exercises, as well as points to ponder - look out for 🛁.

The materials for the lecture and workshop are available on GitHub, together with instructions on how to work with them.

Kasia Kedzierska
DPhil Candidate in Genomic Medicine and Statistics

I’m a computational biologist (i.e. a data scientist for genomic data). My research interests include ML for computational biology, epigenomics, and tumor evolution and heterogeneity. I like plotting readable figures to illustrate the point I’m making.