Data Science Desktop Survival Guide by Graham Williams

## Wind Directions

20180723 The three wind direction variables (`wind_gust_dir`, `wind_dir_9am`, `wind_dir_3pm`) are also identified as character. We review the distribution of their values by using dplyr::select() to pick any variable whose name tidyselect::contains() the string `_dir`, and then building a base::table() over each of those variables.

```r
# Review the distribution of observations across levels.

ds %>%
  select(contains("_dir")) %>%
  sapply(table)
```

```
##     wind_gust_dir wind_dir_9am wind_dir_3pm
## N           10989        13978        10475
## NNE          7937         9782         8002
## NE           8715         9335        10092
## ENE          9965         9592         9605
## E           11071        11237        10123
## ESE          9055         9536        10290
## SE          11331        11398        12919
## SSE         10946        10954        11089
## S           11043        10519        11788
## SSW         10809         9272         9902
## SW          10793        10135        11166
## WSW         11136         8392        11700
## W           12122        10183        12411
## WNW         10045         9067        10846
## NW           9705        10488        10315
## NNW          7954         9468         9358
```

Observe that all 16 compass directions are represented, so it makes sense to convert these variables into factors. By default, converting a character vector to a factor orders the levels alphabetically, which does not reflect the order of the compass. Instead we can construct an ordered factor to capture the compass order (from N and NNE through to NW and NNW). We note the ordering of the directions here.

```r
# Levels of wind direction are ordered compass directions.

compass <- c("N", "NNE", "NE", "ENE",
             "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW",
             "W", "WNW", "NW", "NNW")
```
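
As a minimal sketch of how this ordering might then be applied, the following converts every `_dir` variable to an ordered factor with base::factor(). It is illustrative only: `toy` is a hypothetical stand-in for `ds`, and the use of dplyr::across() assumes dplyr 1.0.0 or later. The compass vector is repeated so that the example runs on its own.

```r
library(dplyr)

# Compass order, repeated here so the example is self-contained.
compass <- c("N", "NNE", "NE", "ENE",
             "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW",
             "W", "WNW", "NW", "NNW")

# Hypothetical toy data standing in for ds (an assumption, not the real dataset).
toy <- tibble(
  wind_gust_dir = c("W", "NNE", "S", "E"),
  wind_dir_9am  = c("N", "SW", "SSE", "W"),
  wind_dir_3pm  = c("NW", "E", "N", "ENE")
)

# A default conversion sorts the levels alphabetically.
levels(factor(toy$wind_gust_dir))

# Convert each *_dir variable to an ordered factor using compass order.
toy <- toy %>%
  mutate(across(contains("_dir"),
                ~ factor(.x, levels = compass, ordered = TRUE)))

# The levels now follow the compass, including directions absent from the data.
levels(toy$wind_gust_dir)

# Ordered factors support comparison by level position: W comes after NNE.
toy$wind_gust_dir[1] > toy$wind_gust_dir[2]
```

Supplying `levels = compass` keeps all 16 directions as levels even when some do not occur in the data, so tables and plots list them in compass order. Since compass directions are circular, the ordering is best read as a display convention starting at N rather than a true ranking.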