10.50 Wind Directions

20180723 The three wind direction variables (wind_gust_dir, wind_dir_9am, wind_dir_3pm) are also identified as character. We review the distribution of their values by using dplyr::select() with tidyselect::contains() to pick every variable whose name contains the string _dir, and then building a base::table() over each of those variables.

# Review the distribution of observations across levels.

ds %>%
  select(contains("_dir")) %>%
  sapply(table)
##     wind_gust_dir wind_dir_9am wind_dir_3pm
## N           12894        16321        12402
## NNE          9520        11693         9663
## NE          10340        10906        11925
## ENE         11713        11255        11306
## E           13074        13304        11832
## ESE         10792        11423        12195
## SE          13364        13367        15053
## SSE         12815        13065        13001
## S           13019        12295        13797
## SSW         12819        11003        11829
## SW          12625        11990        12976
## WSW         12959         9801        13654
## W           14122        11834        14361
## WNW         11707        10751        12701
## NW          11512        12187        12084
## NNW          9454        11165        11048

Observe that all 16 compass directions are represented, so it makes sense to convert these variables into factors. By default, conversion to a factor orders the levels alphabetically (E, ENE, ESE, and so on), which loses the natural sequence of the compass. Instead we can construct an ordered factor that captures the compass order (N, NNE, NE, through to NW and NNW). We note the ordering of the directions here.

# Levels of wind direction are ordered compass directions.

compass <- c("N", "NNE", "NE", "ENE",
             "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW",
             "W", "WNW", "NW", "NNW")
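
One way the conversion to an ordered factor might then be done is sketched below. This is a minimal illustration and not necessarily the approach taken in the rest of the book: the data frame toy and its values are hypothetical stand-ins for ds, and the use of dplyr::across() assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Compass order, repeated here so the sketch is self-contained.
compass <- c("N", "NNE", "NE", "ENE",
             "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW",
             "W", "WNW", "NW", "NNW")

# Hypothetical toy data standing in for ds.
toy <- tibble(wind_gust_dir = c("W", "NE", "S", "E"),
              wind_dir_9am  = c("SE", "N", "WSW", "E"),
              rainfall      = c(0.0, 1.2, 0.4, 3.6))

# A plain factor() sorts the levels alphabetically.
levels(factor(toy$wind_gust_dir))
## [1] "E"  "NE" "S"  "W"

# Convert every variable containing _dir to an ordered factor
# whose levels follow the compass order.
toy <- toy %>%
  mutate(across(contains("_dir"),
                ~ factor(.x, levels = compass, ordered = TRUE)))

# All 16 levels are now present, in compass order.
levels(toy$wind_gust_dir)

# Ordered factors support comparisons: W comes after NE.
toy$wind_gust_dir[1] > toy$wind_gust_dir[2]
## [1] TRUE
```

Any value not listed in compass would become NA under this conversion, so it is worth checking the table of values first, as was done above.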


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0