Data Science Desktop Survival Guide
by Graham Williams
Factors
20180908 Datasets that we load into R will not always contain examples of every possible level of a factor, so the full set of levels cannot always be determined automatically from the data. By default the tidyverse ingests these variables as character, so that we can take specific action to convert them to factor as required.
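To illustrate why levels inferred from the data can be incomplete, here is a minimal base R sketch. The vector `wind` and its values are hypothetical, chosen only for illustration.

```r
# Hypothetical character vector: only two of three possible values are observed.
wind <- c("N", "S", "N")

# Levels inferred from the data alone miss the unobserved value "E".
inferred <- factor(wind)
levels(inferred)

# Supplying the levels explicitly records every possible value.
declared <- factor(wind, levels = c("E", "N", "S"))
levels(declared)

# The unobserved level appears in the table with a count of zero.
table(declared)
```

Declaring the levels explicitly is one reason to convert character variables deliberately rather than automatically on load.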
We first review the unique values of each of the character variables, which would become the levels of the corresponding factors.
# Observe the unique levels.
ds[charc] %>% sapply(unique)
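Where the number of distinct values matters more than the values themselves, a count per variable is a compact alternative. The sketch below uses base R only. The data frame is hypothetical, and `charc` is assumed to hold the names of the character variables.

```r
# Hypothetical data frame standing in for ds.
ds <- data.frame(
  wind_dir   = c("N", "S", "N", "E"),
  rain_today = c("No", "Yes", "No", "No"),
  temp       = c(21.5, 19.0, 23.1, 20.4),
  stringsAsFactors = FALSE
)

# Assumption: charc holds the names of the character variables.
charc <- names(ds)[sapply(ds, is.character)]

# Count the distinct values of each character variable.
nlev <- sapply(ds[charc], function(x) length(unique(x)))
nlev
```

A variable with very many distinct values may be better left as character, or may turn out to be a numeric or date variable that was read as text.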
If we decide to convert all of these variables from character into factor, then we can do so using base::factor().
# Convert all character variables to be factors.
ds[charc] %<>% map(factor)
We do not actually do so here. Instead we consider each character variable in turn to decide how to handle it, particularly since we might observe that evaporation and sunshine appear to be numeric. A one-liner to do the conversion:
ds %<>% mutate_if(is.character, as.factor)
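Handling each variable in turn might look like the following sketch, which assumes dplyr 1.0.0 or later for `across()` and `where()`. The data frame, its column names, and the `numc` vector are hypothetical and serve only to show the pattern: convert the numeric-looking character variables first, then convert the remaining character variables to factor.

```r
library(dplyr)

# Hypothetical data; column names and values are illustrative only.
ds <- tibble(
  wind_dir    = c("N", "S", "N"),
  rain_today  = c("No", "Yes", "No"),
  evaporation = c("4.2", "5.0", "3.8"),
  temp        = c(21.5, 19.0, 23.1)
)

# Assumption: numc names the character variables that actually hold numbers.
numc <- c("evaporation")

ds <- ds %>%
  mutate(across(all_of(numc), as.numeric)) %>%     # numeric-looking text to numeric
  mutate(across(where(is.character), as.factor))   # remaining text to factor

# Check the resulting class of each variable.
sapply(ds, class)
```

Converting the numeric-looking variables first keeps them from being turned into factors by the second step.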