The exercises will be solved in Python, using popular libraries that are commonly used in machine learning projects.
All tasks are tutorial-based, and every exercise is associated with at least one “TODO” in the code. The tasks can be found in the exercises package, but we recommend reading through the entire skeleton code for a better understanding of the concepts presented in this laboratory class. Each piece of functionality is documented, and for some exercises there are also hints placed in the code.
In this exercise, you will learn how to properly evaluate a clustering model. We chose the K-means clustering algorithm for this example, but feel free to explore other alternatives. You can find out more about K-means clustering algorithms here. None of the associated tasks requires an input file, because the clusters are generated in the skeleton. The model must learn how to group points in a 2D space.
Compute the silhouette score of the model using the scikit-learn function found in the sklearn.metrics package.
⚠️⚠️ NON-DEMO TASK
Solve the tasks marked with TODO - TASK A.
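As a rough illustration of what this evaluation involves, the sketch below generates 2D blobs, fits a K-means model and computes the silhouette score with scikit-learn. It is not the skeleton's code: the constants N_SAMPLES, N_CLUSTERS and RANDOM_STATE are hypothetical names chosen for this example, and the data generation with make_blobs is an assumption about how the skeleton produces its clusters.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Hypothetical settings; the skeleton may use different names and values.
N_SAMPLES = 300
N_CLUSTERS = 3
CLUSTERS_STD = 2
RANDOM_STATE = 0

# Generate 2D points grouped around N_CLUSTERS centres (assumed setup).
X, _ = make_blobs(
    n_samples=N_SAMPLES,
    centers=N_CLUSTERS,
    cluster_std=CLUSTERS_STD,
    n_features=2,
    random_state=RANDOM_STATE,
)

# Fit K-means and get the cluster index assigned to every point.
model = KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=RANDOM_STATE)
labels = model.fit_predict(X)

# The silhouette score lies in [-1, 1]; values closer to 1 indicate
# compact, well-separated clusters.
score = silhouette_score(X, labels)
print(f"Silhouette score: {score:.3f}")
```

The score is computed from the data and the predicted labels only, so no ground-truth labels are needed to evaluate the clustering.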
Fetch the centres of the clusters (the fitted model already exposes them) and plot them together with a colour-coded 2D representation of the data groups. Your plot should look similar to the one below:
You can also play around with the standard deviation of the generated blobs and observe the different outcomes of the clustering algorithm:
CLUSTERS_STD = 2
You should be able to discuss these observations with the assistant.
⚠️⚠️ NON-DEMO TASK
Look at the hint above and solve the tasks marked with TODO - TASK B. Try at least 3 different values for the standard deviation and generate one plot for each, so that you end up with at least 3 plots. Save each plot in a separate file.
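One possible way to organise this experiment is sketched below, assuming Matplotlib for plotting and make_blobs for data generation. The helper save_cluster_plot, the file names and the standard deviation values are hypothetical and only serve as an illustration; the skeleton may structure this differently.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def save_cluster_plot(clusters_std, out_path, n_clusters=3):
    """Cluster generated blobs and save a plot of the groups and centres.

    Hypothetical helper written for this example, not part of the skeleton.
    """
    X, _ = make_blobs(
        n_samples=300,
        centers=n_clusters,
        cluster_std=clusters_std,
        random_state=0,
    )
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(X)

    # After fitting, the model exposes the centres as an array of
    # shape (n_clusters, 2).
    centres = model.cluster_centers_

    fig, ax = plt.subplots()
    # Colour each point according to its assigned cluster.
    ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=15)
    # Draw the cluster centres on top of the data points.
    ax.scatter(centres[:, 0], centres[:, 1], c="red", marker="x", s=120,
               label="centres")
    ax.set_title(f"K-means, cluster std = {clusters_std}")
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
    return centres


if __name__ == "__main__":
    # Three example standard deviations, one output file per value.
    for std in (0.5, 2, 4):
        save_cluster_plot(std, f"clusters_std_{std}.png")
```

With a small standard deviation the groups tend to be clearly separated, while larger values make the blobs overlap, which is the kind of difference worth discussing with the assistant.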
⚠️⚠️ NON-DEMO TASK
Please take a minute to fill in the feedback form for this lab.