Differences

This shows you the differences between two versions of the page.

ewis:laboratoare:09 [2023/05/10 18:00]
alexandru.predescu [K-Means Clustering]
ewis:laboratoare:09 [2023/05/10 18:02] (current)
alexandru.predescu [K-Means Clustering]
Line 112: Line 112:
   * $x_i$ = data point $i$
   * $\bar{x_j}$ = cluster centroid $j$
- 
-== The Elbow Method == 
- 
-Below is a plot of the within-cluster sum of squared distances (WCSS) against the number of clusters k. If the plot looks like an arm, the elbow of the arm marks the optimal k. In this example, the optimal number of clusters is 4.
-
-{{ :ewis:laboratoare:lab9:elbow_method.png?400 |}}
  
 The WCSS (inertia) is already provided in the result.
Line 125: Line 119:
 print(inertia)
 </code>
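To make the inertia value concrete, here is a minimal from-scratch sketch of the WCSS computation, using only the Python standard library. The function name ''wcss'' and the toy data are assumptions made for illustration and are not part of the lab code. The result matches the inertia reported by a clustering library only when each centroid is the mean of the points assigned to it.

```python
from collections import defaultdict


def wcss(points, labels):
    """Sum of squared distances from each point x_i to its cluster centroid."""
    # Group the points by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        # Centroid = coordinate-wise mean of the points in the cluster.
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        for point in members:
            total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total


# Hypothetical toy data: two tight pairs of points, labelled 0 and 1.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
print(wcss(points, labels))  # 4.0
```

Each point lies at distance 1 from its centroid, so the four squared distances add up to 4.0.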
 +
 +== The Elbow Method ==
 +
 +Below is a plot of the within-cluster sum of squared distances (WCSS) against the number of clusters k. If the plot looks like an arm, the elbow of the arm marks the optimal k. In this example, the optimal number of clusters is 4.
 +
 +{{ :ewis:laboratoare:lab9:elbow_method.png?400 |}}
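The elbow is normally read off the plot by eye. As a rough sketch of one way to automate that reading, the snippet below picks the k at which the WCSS curve bends most sharply (the largest second difference). This heuristic, the function name ''elbow_k'' and the WCSS values are all assumptions made for illustration; the values are invented and are not the data behind the figure. The heuristic assumes evenly spaced values of k.

```python
def elbow_k(ks, wcss_values):
    """Return the k where the WCSS curve bends most sharply.

    The bend at position i is the drop before it minus the drop after it,
    i.e. the discrete second difference of the curve.
    """
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(ks) - 1):
        drop_before = wcss_values[i - 1] - wcss_values[i]
        drop_after = wcss_values[i] - wcss_values[i + 1]
        bend = drop_before - drop_after
        if bend > best_bend:
            best_k, best_bend = ks[i], bend
    return best_k


# Hypothetical WCSS values for k = 1..8 (made up for this example).
ks = [1, 2, 3, 4, 5, 6, 7, 8]
wcss_values = [1000, 600, 350, 150, 130, 115, 105, 98]
print(elbow_k(ks, wcss_values))  # 4
```

On noisy curves the largest bend can be misleading, so it is worth checking the result against the plot.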
  
  
Line 136: Line 136:
   * -1 – the sample is assigned to the wrong cluster
  
-The clustering evaluation using both the Elbow Method and the Silhouette Coefficient is shown below. In this example, both methods indicate that the optimal number of clusters is 4: the WCSS curve bends like an elbow and the silhouette coefficient peaks at k=4.
-
-{{ :ewis:laboratoare:lab9:clustering_evaluation.png?400 |}}
-
-The Silhouette Score is then calculated using the scikit-learn provided function //silhouette_score//.
 +The Silhouette Score is calculated using the scikit-learn provided function //silhouette_score//.
  
 <code python>
Line 146: Line 142:
 print(s)
 </code>
 +
 +The clustering evaluation using both the Elbow Method and the Silhouette Coefficient is shown below. In this example, both methods indicate that the optimal number of clusters is 4: the WCSS curve bends like an elbow and the silhouette coefficient peaks at k=4.
 +
 +{{ :ewis:laboratoare:lab9:clustering_evaluation.png?400 |}}
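To show what the silhouette coefficient measures, here is a minimal from-scratch sketch using only the Python standard library (Python 3.8+ for ''math.dist''). It is not the scikit-learn implementation; the function name ''silhouette'' and the toy data are assumptions for illustration. For each sample, ''a'' is the mean distance to the other points in its own cluster, ''b'' is the mean distance to the points of the nearest other cluster, and the score is ''(b - a) / max(a, b)''. At least two clusters are assumed.

```python
from math import dist


def silhouette(points, labels):
    """Mean silhouette coefficient over all samples (needs >= 2 clusters)."""
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Distances to the other members of the sample's own cluster.
        same = [dist(p, q) for j, (q, lab) in enumerate(zip(points, labels))
                if lab == own and j != i]
        if not same:
            # A cluster with a single sample gets a score of 0 by convention.
            scores.append(0.0)
            continue
        a = sum(same) / len(same)

        # Mean distance to each other cluster; keep the nearest one.
        b = float("inf")
        for other in set(labels) - {own}:
            d = [dist(p, q) for q, lab in zip(points, labels) if lab == other]
            b = min(b, sum(d) / len(d))

        # Guard against a == b == 0 (all points identical).
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette(points, [0, 0, 1, 1])  # each pair in its own cluster
bad = silhouette(points, [0, 1, 0, 1])   # pairs split across clusters
print(good, bad)
```

With the sensible labelling the score is close to 1, and with the pairs split across clusters it turns negative, which corresponds to samples assigned to the wrong cluster.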
  
 <note tip>
ewis/laboratoare/09.1683730829.txt.gz · Last modified: 2023/05/10 18:00 by alexandru.predescu
CC Attribution-Share Alike 3.0 Unported