8.9 k-means to create clusters

Now we are ready to perform clustering. Because we are using k-means, we will use the kmeans() function from base R’s stats package. Recall that k-means is a greedy algorithm that starts from randomly chosen cluster centers, so a single run can converge to a suboptimal solution. To reduce this risk, kmeans() lets us try several random starting configurations and keeps the one with the lowest total within-cluster sum of squares. The nstart argument specifies how many starts to try; here we ask for 4 clusters and use 25 starts.

set.seed(309)

km_cluster <- kmeans(cluster_data_pro,
                     centers = 4,
                     nstart = 25)
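To see what nstart does, the following is a minimal sketch on simulated data rather than the chapter's data. The names toy_data, blob_centers, fit_single, and fit_multi are hypothetical and exist only for this illustration. It fits k-means once with a single random start and once with 25 starts, from the same seed, and compares the total within-cluster sum of squares.

```r
# Stand-in data: 4 Gaussian blobs in 2 dimensions (not the chapter's data)
set.seed(309)
blob_centers <- matrix(c(0, 0,  5, 0,  0, 5,  5, 5), ncol = 2, byrow = TRUE)
toy_data <- do.call(rbind, lapply(seq_len(nrow(blob_centers)), function(i) {
  cbind(rnorm(50, mean = blob_centers[i, 1]),
        rnorm(50, mean = blob_centers[i, 2]))
}))

# One random start versus 25 random starts, from the same seed
set.seed(1)
fit_single <- kmeans(toy_data, centers = 4, nstart = 1)
set.seed(1)
fit_multi <- kmeans(toy_data, centers = 4, nstart = 25)

# Lower total within-cluster sum of squares means tighter clusters
c(single = fit_single$tot.withinss, multi = fit_multi$tot.withinss)
```

Because kmeans() keeps the best of its starts, the multi-start fit should not have a larger tot.withinss than the single-start fit from the same seed. On well-separated data like this the two values often coincide; the gap tends to matter more on messier data.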

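km_cluster is a list of 9 items. We are interested in the cluster membership of each point, which is stored in km_cluster$cluster. We add these cluster labels to the data as a new column. This assumes the rows of cluster_data are in the same order as the rows of cluster_data_pro.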

cluster_data$km_cluster <- km_cluster$cluster
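As a quick check on a fitted object, the sketch below inspects a few of its components and summarises the data by cluster. It uses the built-in iris data, scaled, as a stand-in for the chapter's data; toy_raw, toy_scaled, and toy_km are hypothetical names used only here.

```r
# Stand-in for the chapter's data: numeric columns of iris, scaled
toy_raw <- iris[, 1:4]
toy_scaled <- scale(toy_raw)

set.seed(309)
toy_km <- kmeans(toy_scaled, centers = 4, nstart = 25)

names(toy_km)      # the 9 components of the fitted object
toy_km$size        # number of points in each cluster
toy_km$centers     # cluster centers, in the scaled units

# Attach the labels to the unscaled data; rows must be in the same order
toy_raw$km_cluster <- toy_km$cluster
table(toy_raw$km_cluster)

# Mean of each original variable within each cluster
aggregate(. ~ km_cluster, data = toy_raw, FUN = mean)
```

The counts from table() should match the size component, which is one simple way to confirm that the labels were attached to the right rows.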