10.6 Binning

20240818

A binning function is provided by Rattle, coded by Daniele Medri. The Rattle interface provides an option to choose between Quantile binning, KMeans binning, and Equal Width binning. For each option the default number of bins is 4, and we can change this to suit our needs. The generated variables are prefixed with either BQU_, BKM_, or BEW_ respectively, with a suffix of n for the number of bins. Thus, we can create multiple binnings for any variable.
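To make the idea concrete, here is a minimal sketch of equal width binning in plain Python. It is not Rattle's implementation (Rattle is an R package), and the `equal_width_bins` and `binned_name` helpers, the sample values, and the exact layout of the generated variable name are all assumptions made for illustration.

```python
# Equal width binning sketch (illustrative only, not Rattle's code).

def equal_width_bins(values, n_bins=4):
    """Assign each value a bin index 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would compute to index n_bins, so clamp to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def binned_name(prefix, variable, n_bins):
    """Hypothetical naming helper: the exact layout Rattle uses is an assumption."""
    return f"{prefix}{variable}_{n_bins}"


# Made-up hours of sunshine, used only to exercise the function.
sunshine = [0.0, 1.2, 3.5, 5.0, 6.1, 7.4, 8.8, 9.9, 11.3, 13.6]
bins = equal_width_bins(sunshine, n_bins=4)
print(binned_name("BEW_", "Sunshine", 4), bins)
```

Because the number of bins is part of the generated name, binning the same variable again with a different number of bins produces a new variable rather than overwriting the first.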

An example of why we might want to do this is to visualise data. A mosaic plot, for example, is only useful for categoric data, and so we could turn Sunshine into a categoric variable by binning. To do: discuss binning to show box plots for different targets.

Note that quantile binning is the same as equal count binning.
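The equal count property can be illustrated with a small rank-based sketch in plain Python. This is an assumption about one simple way to realise quantile binning, not a description of how Rattle computes its bins; the `quantile_bins` function and the data are made up for the example.

```python
from collections import Counter


def quantile_bins(values, n_bins=4):
    """Rank-based quantile binning: sort the values, then split the ranks
    into n_bins groups of (as near as possible) equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


# Made-up, heavily skewed data: one value is far larger than the rest.
skewed = [1, 2, 3, 4, 5, 6, 7, 100]
assigned = quantile_bins(skewed, n_bins=4)
counts = Counter(assigned)
print(sorted(counts.items()))
```

Each of the four bins receives two observations, despite the outlier. An equal width split of the same data would place seven of the eight values in the first bin. Note that this rank-based sketch can separate tied values into different bins, which an approach based on quantile cut points would not do.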



Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0