Data Science Desktop Survival Guide
by Graham Williams
Reviewing Variable Names
The variable names in a dataset as supplied to us may follow no particular form and may mix conventions. For example, they may mix upper and lower case letters (TempToday9AM), be very long (Temperature_Recorded_Today_9am), use sequential numbers to identify each variable (V004 or V004_rainToday), or use codes (XVn34_rain), among any number of other conventions. We often prefer to simplify the variable names to ease our processing and thinking, and to enforce a standard, consistent naming convention for ourselves.
We use base::names() to list the names of the variables within a dataset.
# Review the variables to consider normalising their names.
names(ds)
Notice that the names here use a scheme whereby the initial letter of each word within the variable name is capitalised. That is a reasonable naming scheme and is preferred by some.
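If we do decide to normalise the names, one possible approach is sketched below in base R. This is only an illustration under stated assumptions: the small data frame stands in for ds and borrows its column names from the examples earlier in this section, and normalise_names() is a hypothetical helper written for this sketch, not a function from any package. It converts each name to lower case with underscores separating the words.

```r
# Hypothetical stand-in for ds, using example names from the text above.
ds <- data.frame(TempToday9AM                   = c(12.1, 15.3),
                 Temperature_Recorded_Today_9am = c(12.1, 15.3),
                 V004_rainToday                 = c("No", "Yes"))

# Hypothetical helper (an assumption for this sketch, not a package function):
# convert variable names to lower case words separated by underscores.
normalise_names <- function(x) {
  # Insert an underscore where a lower case letter is followed directly
  # by an upper case letter or a digit (e.g. TempToday9AM -> Temp_Today_9AM).
  x <- gsub("([a-z])([A-Z0-9])", "\\1_\\2", x, perl = TRUE)
  # Collapse any run of non-alphanumeric characters into a single underscore.
  x <- gsub("[^A-Za-z0-9]+", "_", x, perl = TRUE)
  # Finally, convert everything to lower case.
  tolower(x)
}

# Review the original names, normalise them, then review the result.
names(ds)
names(ds) <- normalise_names(names(ds))
names(ds)
```

With these example names the result is temp_today_9am, temperature_recorded_today_9am and v004_rain_today. The rules are deliberately simple, so names that follow other conventions may need extra handling.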