10.51 Ordered Factor

20180723 Compass directions have an obvious order, so we convert the wind direction variables into ordered factors. We do so by passing ordered=TRUE to base::factor(), with levels= supplying the order of the levels (here taken from the vector compass).

# Note the names of the wind direction variables.

ds %>% 
  select(contains("_dir")) %>% 
  names() %T>%
  print() ->
vnames
## [1] "wind_gust_dir" "wind_dir_9am"  "wind_dir_3pm"
# Convert these variables from character to ordered factor.

ds[vnames] %<>% 
  lapply(factor, levels=compass, ordered=TRUE) %>% 
  data.frame() %>% 
  as_tibble()
# Confirm they are now ordered factors.

ds[vnames] %>% sapply(class)
##      wind_gust_dir wind_dir_9am wind_dir_3pm
## [1,] "ordered"     "ordered"    "ordered"   
## [2,] "factor"      "factor"     "factor"
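
As a minimal sketch of what the ordering provides, the following builds an ordered factor from a small toy vector. The compass vector below is an assumption: it lists the 16 compass points clockwise from north, which matches the order of the rows visible in the output of this section, but it may differ from the compass vector actually used for ds. The vector x is hypothetical data, not taken from ds.

```r
# Assumed level order: 16 compass points clockwise from north.
compass <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")

# Hypothetical toy data, including a missing value.
x <- c("SW", "N", "E", NA, "NNE")

# Same call pattern as used for ds[vnames] above.
f <- factor(x, levels=compass, ordered=TRUE)

class(f)       # Both "ordered" and "factor".
is.ordered(f)  # TRUE for an ordered factor.

# Comparisons follow the level order, not alphabetical order.
f[1] > f[3]    # Is SW later than E in the level order?
f < "E"        # Element-wise; the missing value stays NA.

# Sorting also follows the level order (NA is dropped by default).
sort(f)
```

With an unordered factor the comparison operators are not meaningful and return NA with a warning, which is the practical difference that ordered=TRUE makes.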

We again tabulate the variables to confirm that the conversion has changed only the data type and not the values.

# Verify the distribution has not changed.

ds %>%
  select(contains("_dir")) %>%
  sapply(table)
##     wind_gust_dir wind_dir_9am wind_dir_3pm
## N           13806        17553        13369
## NNE         10358        12645        10551
## NE          11402        11887        12981
## ENE         12884        12351        12356
## E           14449        14595        12999
## ESE         11947        12532        13414
## SE          14670        14647        16406
## SSE         13997        14323        14179
## S           14195        13436        15062
## SSW         13992        11958        12973
## SW          13731        13000        14086
....
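
The tabulation can be complemented by a programmatic check. The sketch below, on hypothetical toy data, relies on the fact that factor() silently maps any value not listed in levels= to NA, so an increase in the count of missing values after conversion would reveal a misspelt or unexpected direction. The compass vector is again an assumed 16-point ordering and the names before, after and lost are illustrative only.

```r
# Assumed level order: 16 compass points clockwise from north.
compass <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")

# Hypothetical character data; "North" is deliberately not a valid level.
before <- c("SW", "N", "E", NA, "NNE", "North")

after <- factor(before, levels=compass, ordered=TRUE)

# Count missing values before and after the conversion.
sum(is.na(before))
sum(is.na(after))

# Values that were present before but became NA in the conversion.
lost <- before[!is.na(before) & is.na(after)]
lost
```

If the two counts agree and lost is empty, the conversion has not discarded any observed values.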


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0