19 Cluster Analysis


20200902 Cluster analysis (or clustering) is widely used in data mining to identify groups of similar data. It is well supported in R (R Core Team 2021), with many packages available for preparing data for cluster analysis, identifying a good number of clusters, performing the clustering, and evaluating the result. A variety of cluster analysis algorithms are available, each generating a cluster index for each data item as the representation of the clustering. Performance is often measured using the distances between points within a cluster and the distances between clusters.
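The following is a minimal sketch of what a cluster index and a distance-based measure of performance look like in R. It is an illustration under stated assumptions rather than a recommended workflow: the built-in iris dataset, the kmeans() function from the stats package, and the choice of three clusters are all assumptions made here for the example.

```r
# Assumption: iris and kmeans() are used purely for illustration.
set.seed(42)                      # kmeans() starts from randomly chosen centres

# Standardise the four numeric columns so no single variable dominates
# the distance calculations.
ds <- scale(iris[, 1:4])

# Assumption: three clusters, with 25 random starts to reduce the
# chance of reporting a poor local optimum.
model <- kmeans(ds, centers = 3, nstart = 25)

# The clustering is represented as one cluster index per data item.
head(model$cluster)
table(model$cluster)

# Distance-based measures of performance.
model$tot.withinss                # within-cluster sum of squares (smaller is tighter)
model$betweenss                   # between-cluster sum of squares (larger is more separated)
model$betweenss / model$totss     # share of total variation captured by the clustering
```

The within-cluster and between-cluster sums of squares add up to the total sum of squares, so the final ratio gives a rough single-number summary that can be compared across different numbers of clusters.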

We briefly introduced the KMeans clustering algorithm in Section ??.



Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.