19 Cluster Analysis


20200902 Cluster analysis (or clustering) is widely used in data mining to identify groups of similar observations. It is well supported in R (R Core Team 2021), with many packages available for preparing data for cluster analysis, choosing a good number of clusters, performing the clustering, and evaluating the result. A variety of cluster analysis algorithms are available, each assigning a cluster index to every data item as the representation of the clustering. Performance is often measured using the distances between points within a cluster and the distances between clusters.
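As a minimal sketch of these ideas, the following uses kmeans() from R's base stats package on the built-in iris dataset. The choice of dataset, the three clusters, the seed, and the variable names ds and model are assumptions made for illustration only; they are not taken from the text.

```r
# Illustrative sketch only: dataset, k, and seed are assumptions.

set.seed(42)                   # kmeans() starts from random centres.

# Use the four numeric columns and standardise them so that no single
# variable dominates the distance calculation.

ds <- scale(iris[, 1:4])

# Build a clustering with 3 clusters, keeping the best of 10 random starts.

model <- kmeans(ds, centers=3, nstart=10)

# The clustering is represented as one cluster index per observation.

head(model$cluster)
table(model$cluster)

# Distance-based measures of the clustering.

model$tot.withinss             # Total within-cluster sum of squares.
model$betweenss                # Between-cluster sum of squares.
model$betweenss / model$totss  # Share of total variation between clusters.
```

A smaller within-cluster sum of squares relative to the between-cluster sum of squares suggests tighter, better separated clusters, though this ratio always improves as the number of clusters grows and so should not be used alone to choose that number.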

We briefly introduced the KMeans clustering algorithm in the discussion of decision trees in Section ??.

References

R Core Team. 2021. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.


Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.