Random initialization trap
When the centroids are initialized randomly, each run of k-means can produce a different WCSS (within-cluster sum of squares). A poor choice of initial centroids leads to suboptimal clustering. To address this, we use K-means++, which selects initial centroids that tend to be far apart from one another. The idea is that well-separated starting centroids produce distinct cluster centers, which gives better clustering and faster convergence.

Let's explain that with an example. We have a dataset, shown in the scatterplot below, that we have to group into three clusters.

Figure: Data set for grouping into 3 clusters

Depending on the random initialization of the centroids, we get clustering 1 or clustering 2, shown below.

Figure: Different clusters based on different initialization of centroids

This shows that the clustering differs depending on how the centroids are initialized. The circled point displays how data points are grouped differently based on different initialization for centr...
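To make the effect of initialization concrete, here is a minimal pure-Python sketch that runs k-means from several random starts and from K-means++ style starts, then prints the resulting WCSS. It is an illustration under stated assumptions, not the article's own code: the three synthetic blobs stand in for the dataset in the scatterplot, and the helper names (`make_blobs`, `random_init`, `kmeans_pp_init`, `lloyd`, `wcss`) are hypothetical. The seeding shown is the usual K-means++ rule, which samples each new centroid with probability proportional to its squared distance from the nearest centroid already chosen, rather than always taking the single farthest point.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_blobs(rng, centers=((0, 0), (5, 5), (0, 5)), n=50, spread=0.6):
    """Synthetic 2-D data: n Gaussian points around each center (an assumption)."""
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n)]


def random_init(points, k, rng):
    """Plain random initialization: k data points chosen uniformly."""
    return rng.sample(points, k)


def kmeans_pp_init(points, k, rng):
    """K-means++ style seeding: favor points far from the chosen centroids."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Squared distance from each point to its nearest chosen centroid.
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        # Assumes at least one point is distinct from the chosen centroids.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids


def wcss(points, centroids):
    """Within-cluster sum of squares, each point assigned to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def lloyd(points, centroids, iters=100):
    """Standard k-means iterations starting from the given centroids."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        new = [tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
               for cl, c in zip(clusters, centroids)]
        if new == centroids:
            break
        centroids = new
    return centroids


points = make_blobs(random.Random(0))
for seed in range(5):
    r = wcss(points, lloyd(points, random_init(points, 3, random.Random(seed))))
    p = wcss(points, lloyd(points, kmeans_pp_init(points, 3, random.Random(seed))))
    print(f"seed={seed}  random init WCSS={r:.1f}  k-means++ WCSS={p:.1f}")
```

Comparing the printed WCSS values across seeds shows how much the final clustering depends on the starting centroids. In practice, libraries usually combine a K-means++ style start with several restarts and keep the run with the lowest WCSS.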