8.3 Normalise Variable Names
Variable names are normalised into a consistent form so that we can refer to them reliably when working with the data. The convenience function janitor::clean_names() does this for us.
# Review the variables before normalising their names.
names(ds)
## [1] "Date" "Location" "MinTemp" "MaxTemp"
## [5] "Rainfall" "Evaporation" "Sunshine" "WindGustDir"
## [9] "WindGustSpeed" "WindDir9am" "WindDir3pm" "WindSpeed9am"
## [13] "WindSpeed3pm" "Humidity9am" "Humidity3pm" "Pressure9am"
## [17] "Pressure3pm" "Cloud9am" "Cloud3pm" "Temp9am"
## [21] "Temp3pm" "RainToday" "RISK_MM" "RainTomorrow"
# Capture the original variable names for use in plots.
vnames <- names(ds)
# Normalise the variable names.
ds %<>% clean_names(numerals="right")
# Confirm the results are as expected.
names(ds)
## [1] "date" "location" "min_temp" "max_temp"
## [5] "rainfall" "evaporation" "sunshine" "wind_gust_dir"
## [9] "wind_gust_speed" "wind_dir_9am" "wind_dir_3pm" "wind_speed_9am"
## [13] "wind_speed_3pm" "humidity_9am" "humidity_3pm" "pressure_9am"
## [17] "pressure_3pm" "cloud_9am" "cloud_3pm" "temp_9am"
## [21] "temp_3pm" "rain_today" "risk_mm" "rain_tomorrow"
# Index the original variable names by the new names.
names(vnames) <- names(ds)
vnames
## date location min_temp max_temp rainfall
## "Date" "Location" "MinTemp" "MaxTemp" "Rainfall"
## evaporation sunshine wind_gust_dir wind_gust_speed wind_dir_9am
## "Evaporation" "Sunshine" "WindGustDir" "WindGustSpeed" "WindDir9am"
## wind_dir_3pm wind_speed_9am wind_speed_3pm humidity_9am humidity_3pm
## "WindDir3pm" "WindSpeed9am" "WindSpeed3pm" "Humidity9am" "Humidity3pm"
## pressure_9am pressure_3pm cloud_9am cloud_3pm temp_9am
## "Pressure9am" "Pressure3pm" "Cloud9am" "Cloud3pm" "Temp9am"
## temp_3pm rain_today risk_mm rain_tomorrow
## "Temp3pm" "RainToday" "RISK_MM" "RainTomorrow"
Notice that we capture the original variable names in vnames, indexed by the new names, for later reference, particularly when generating plots that should be labelled with the original names.
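As an illustration of how the lookup might be used, the following minimal sketch labels a plot with the original names. It assumes base R graphics rather than any particular plotting package, and both the two-entry vnames and the toy data frame are hypothetical stand-ins for the full objects built above.

```r
# Hypothetical two-entry subset of the lookup built above:
# new names index the original names.
vnames <- c(min_temp="MinTemp", max_temp="MaxTemp")

# Look up the original names for use as axis labels.
xlab <- unname(vnames["min_temp"])
ylab <- unname(vnames["max_temp"])

# Toy data standing in for ds (values are made up for illustration).
toy <- data.frame(min_temp=c(8.0, 14.0, 13.7),
                  max_temp=c(24.3, 26.9, 23.4))

# The axes are labelled with the original names, not the normalised ones.
plot(toy$min_temp, toy$max_temp, xlab=xlab, ylab=ylab)
```

Indexing vnames by a normalised name returns the matching original name, so the code can use one naming style while the plots present another.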
The variable names now conform to our expectations and accord with our chosen style, as documented in Chapter 23.
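One way to confirm this programmatically is sketched below. It assumes the chosen style is lower case words separated by single underscores, as the normalised names above suggest. The helper conforms() is hypothetical and is not part of janitor.

```r
# Hypothetical helper: TRUE where a name is lower case words
# (letters and digits) separated by single underscores.
conforms <- function(x) grepl("^[a-z][a-z0-9]*(_[a-z0-9]+)*$", x)

# Normalised names pass the check.
conforms(c("min_temp", "wind_dir_9am", "risk_mm"))
## [1] TRUE TRUE TRUE

# Original names do not.
conforms(c("MinTemp", "RISK_MM"))
## [1] FALSE FALSE
```

With the full dataset, all(conforms(names(ds))) would be expected to return TRUE after normalising.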
Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.