Data Science Desktop Survival Guide
by Graham Williams

Factors

20180908 For datasets that we load into R we will not always have examples of all possible levels of a factor. Consequently it is not always possible to list all of the levels automatically. By default the tidyverse ingests these variables as character so that we can take specific action to convert them to factor as required.
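As a minimal sketch of why this matters, the following compares a factor built only from the observed values with one whose levels are declared explicitly. The vector wind_dir and the list of compass points are hypothetical illustrations, not taken from the dataset used here.

```r
# Hypothetical sample: only three of the eight compass points were observed.
wind_dir <- c("N", "SE", "N", "W")

# Assumed full set of levels, supplied from our own knowledge of the domain.
compass <- c("N", "NE", "E", "SE", "S", "SW", "W", "NW")

# Levels inferred from the data cover only the values that were seen.
observed <- factor(wind_dir)

# Declaring the levels keeps the unobserved ones available.
declared <- factor(wind_dir, levels = compass)

nlevels(observed)
nlevels(declared)

# Unobserved levels are reported with a count of zero.
table(declared)
```

Declaring the levels ourselves means that later data containing, say, "NE" will map onto an existing level rather than becoming a missing value.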

We first review the unique values of each of the character variables.

# Observe the unique levels.

ds[charc] %>% sapply(unique)
##       location          
##  [1,] "Albury"          
##  [2,] "BadgerysCreek"   
##  [3,] "Cobar"           
##  [4,] "CoffsHarbour"    
##  [5,] "Moree"           
##  [6,] "Newcastle"       
##  [7,] "NorahHead"       
##  [8,] "NorfolkIsland"   
##  [9,] "Penrith"         
## [10,] "Richmond"        
## [11,] "Sydney"          
## [12,] "SydneyAirport"   
## [13,] "WaggaWagga"      
## [14,] "Williamtown"     
## [15,] "Wollongong"      
## [16,] "Canberra"        
## [17,] "Tuggeranong"     
## [18,] "MountGinini"     
## [19,] "Ballarat"        
## [20,] "Bendigo"         
## [21,] "Sale"            
## [22,] "MelbourneAirport"
## [23,] "Melbourne"       
## [24,] "Mildura"         
....

If we decide to convert all of these variables from character into factor, then we can do so using base::factor().

# Convert all character variables to be factors.

ds[charc] %<>% map(factor)

We do not actually do so here. Instead we consider each character variable in turn to decide how to handle it, especially since we might observe that evaporation and sunshine appear to be numeric.
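One way to spot character variables that appear to be numeric is sketched below. The helper looks_numeric() and the toy data frame are hypothetical, introduced only for illustration, and the check is a simple heuristic rather than a definitive test.

```r
# Hypothetical toy data standing in for ds, with numbers ingested as text.
toy <- data.frame(location    = c("Albury", "Cobar", "Sydney"),
                  evaporation = c("4.2", NA, "6.0"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: TRUE if every non-missing value parses as a number.
looks_numeric <- function(x)
{
  x <- x[!is.na(x)]
  length(x) > 0 && !any(is.na(suppressWarnings(as.numeric(x))))
}

# Check each variable in turn.
sapply(toy, looks_numeric)

# Convert the variable that qualifies to numeric rather than to factor.
toy$evaporation <- as.numeric(toy$evaporation)
```

Variables flagged in this way are better converted with as.numeric() than turned into factors with one level per distinct number.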

A one-liner to do the conversion:

ds %<>% mutate_if(is.character, as.factor)
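The same conversion can also be sketched with dplyr::across(), which is available in dplyr 1.0.0 and later. The toy tibble below is a hypothetical stand-in for ds, assuming only that it mixes character and numeric columns.

```r
library(dplyr)

# Hypothetical toy data standing in for ds.
toy <- tibble(location   = c("Albury", "Cobar", "Albury"),
              rain_today = c("No", "Yes", "No"),
              min_temp   = c(13.4, 7.4, 12.9))

# Convert every character column to factor, leaving other columns untouched.
toy <- toy %>% mutate(across(where(is.character), as.factor))

# Confirm the resulting class of each column.
sapply(toy, class)
```

Whichever form is used, it converts every character variable at once, so it suits datasets where we have already confirmed that none of them should be numeric.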


Copyright © 2000-2020 Togaware Pty Ltd. Creative Commons ShareAlike V4.