10.30 Rescale Data using Recenter in Rattle

20240802

A common normalisation is to recenter our data. The simplest approach is to subtract the mean of a variable from each observation’s value (recentering the variable) and then to divide the result by the variable's standard deviation (rescaling it). The transformed variable has a mean of zero and a standard deviation of 1.
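As a formula, the transformation can be written as follows. The symbols are notation introduced here for illustration: x_i is an observed value, n the number of observations, and s the sample standard deviation (the n - 1 form that R's sd() computes).

```latex
z_i = \frac{x_i - \bar{x}}{s}, \qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}
```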

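Rattle relies on the scale() function from the base package to perform the recentering. The command also uses mutate() from dplyr and the %<>% assignment pipe from magrittr, and scale() returns a one-column matrix, so [,1] extracts the values as a plain vector: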

ds %<>% mutate(RRC_evaporation = scale(evaporation)[,1])

Note that the resulting mean is typically not exactly zero because of floating-point rounding in the calculations, but it is extremely close. The resulting standard deviation is 1, again to within floating-point precision.
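The following is a minimal sketch, using base R only, that checks these properties. The small data frame and its evaporation values are made up for illustration and are not from the dataset Rattle loads; the name manual is likewise hypothetical.

```r
# Hypothetical sample values standing in for the evaporation variable.
ds <- data.frame(evaporation = c(3.4, 4.4, 5.8, 7.2, 5.6, 6.2, 2.0, 4.8))

# Recenter and rescale by hand: subtract the mean, then divide by the
# sample standard deviation.
manual <- (ds$evaporation - mean(ds$evaporation)) / sd(ds$evaporation)

# The same transformation using base::scale(), which returns a one-column
# matrix; [, 1] extracts that column as a plain numeric vector.
ds$RRC_evaporation <- scale(ds$evaporation)[, 1]

# Compare the two approaches within floating-point tolerance.
print(all.equal(manual, ds$RRC_evaporation))

# Inspect the mean (zero up to rounding error) and the standard deviation.
print(mean(ds$RRC_evaporation))
print(sd(ds$RRC_evaporation))
```

Comparing with all.equal() rather than == is deliberate: it allows for the small rounding differences that floating-point arithmetic introduces.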



Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0