8.9 k-means to create clusters
Now we are ready to perform clustering. Since we are using k-means, we will use the kmeans() function from base R’s stats package. Recall that k-means is a greedy algorithm that starts from randomly chosen cluster centers, so a poor set of starting points can lead it to a suboptimal solution. To guard against this, kmeans() lets us specify how many random starting configurations to try and keeps the best result. The nstart argument specifies this number.
set.seed(309)
km_cluster <- kmeans(cluster_data_pro,
                     centers = 4,   # number of clusters
                     nstart = 25)   # number of random starts to try
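To see why nstart matters, the following minimal sketch compares a single random start with 25 starts. It uses the built-in iris data as a stand-in, since cluster_data_pro is not reproduced here; the names demo_scaled, km_single, and km_multi are made up for this illustration.

```r
# Stand-in data (assumption): scaled numeric columns of the built-in iris data
demo_scaled <- scale(iris[, 1:4])

# One random start
set.seed(309)
km_single <- kmeans(demo_scaled, centers = 4, nstart = 1)

# 25 random starts; kmeans() keeps the run with the lowest
# total within-cluster sum of squares
set.seed(309)
km_multi <- kmeans(demo_scaled, centers = 4, nstart = 25)

# Compare the total within-cluster sum of squares (lower is better)
km_single$tot.withinss
km_multi$tot.withinss
```

Because both calls begin from the same seed, the multi-start result should be at least as good as the single-start one. How much it improves depends on the data.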
km_cluster is a list of 9 items. We are interested in the cluster membership of each point, which is stored in km_cluster$cluster. Add these clusters to the data.
cluster_data$km_cluster <- km_cluster$cluster
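After attaching the labels, it can help to check that they line up with the data and to look at the cluster sizes. The sketch below does this on the built-in iris data as a stand-in. The names demo_data, demo_scaled, and km_demo are hypothetical, and it assumes (as the assignment above does) that the clustered matrix has the same rows, in the same order, as the data frame receiving the labels.

```r
# Stand-in data (assumption): iris plays the role of cluster_data,
# and its scaled version plays the role of cluster_data_pro
demo_data   <- iris[, 1:4]
demo_scaled <- scale(demo_data)

set.seed(309)
km_demo <- kmeans(demo_scaled, centers = 4, nstart = 25)

# Components of the returned list
names(km_demo)

# One label per row, so the vector can be attached directly
stopifnot(length(km_demo$cluster) == nrow(demo_data))
demo_data$km_cluster <- km_demo$cluster

# Number of observations in each cluster
table(demo_data$km_cluster)

# Mean of each variable by cluster, on the original scale
aggregate(. ~ km_cluster, data = demo_data, FUN = mean)
```

The same checks apply to cluster_data by swapping in the real object names.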