Data Science Desktop Survival Guide by Graham Williams

## Algorithms Data Review

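20210103 We can review a random sample of the dataset. With its default size of 1, sample_frac() returns every row in random order, so the rows printed first are a random selection from the whole dataset.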

```r
ds %>% sample_frac()
```

```
## # A tibble: 176,747 x 24
##   date       location min_temp max_temp rainfall evaporation sunshine
##
## 1 2014-12-04 Walpole      12.8     20.3      1.4         4.8      8.5
## 2 2018-10-31 CoffsHa~     16.3     28.3      0           4.8      8.5
## 3 2018-04-25 Norfolk~     17.8     22.2     38.6         5.2      8.5
## 4 2018-06-13 PerthAi~      6.5     17.7      6.2         1.6      1.6
## 5 2019-10-07 Townsvi~     16.5     29        0           8        8.5
## 6 2009-04-08 Nurioot~      9.9     21.8      0           4        9.7
## 7 2020-04-08 Perth        17.1     32.6      0           7       10.5
....
```
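Because the rows are shuffled at random, each run of the sample_frac() call above prints different rows. The following is a minimal sketch of one way to make the review repeatable, assuming dplyr 1.0.0 or later is installed. The tibble `toy` and its values are hypothetical stand-ins for ds, not taken from the real dataset.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for ds: made-up values for illustration only.
toy <- tibble(
  location = c("A", "B", "C", "D", "E", "F"),
  min_temp = c(13.4, 7.4, 12.9, 9.2, 17.5, 14.6),
  max_temp = c(22.9, 25.1, 25.7, 28.0, 32.3, 29.7)
)

# Fixing the seed makes the random order repeatable between runs.
set.seed(42)
shuffled <- toy %>% sample_frac()   # default size = 1: all rows, random order

# slice_sample() supersedes sample_n() and sample_frac() in recent dplyr
# and can return a fixed number of rows instead of a fraction.
set.seed(42)
sampled <- toy %>% slice_sample(n = 3)

nrow(shuffled)   # same number of rows as toy
nrow(sampled)    # only the three sampled rows
```

The choice between shuffling all rows and drawing a small sample matters little for a quick look, since the printed tibble is truncated either way, but a small sample is cheaper to pass on to later steps.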

We can also glimpse all of the variables, listing each one with its first few values.

```r
glimpse(ds)
```

```
## Rows: 176,747
## Columns: 24
## $ date            2008-12-01, 2008-12-02, 2008-12-03, 2008-12-04,...
## $ location        "Albury", "Albury", "Albury", "Albury", "Albury"...
## $ min_temp        13.4, 7.4, 12.9, 9.2, 17.5, 14.6, 14.3, 7.7, 9.7...
## $ max_temp        22.9, 25.1, 25.7, 28.0, 32.3, 29.7, 25.0, 26.7, ...
## $ rainfall        0.6, 0.0, 0.0, 0.0, 1.0, 0.2, 0.0, 0.0, 0.0, 1.4...
## $ evaporation     4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8...
## $ sunshine        8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5...
## $ wind_gust_dir   W, WNW, WSW, NE, W, WNW, W, W, NNW, W, N, NNE, W...
## $ wind_gust_speed 44, 44, 46, 24, 41, 56, 50, 35, 80, 28, 30, 31, ...
## $ wind_dir_9am    W, NNW, W, SE, ENE, W, SW, SSE, SE, S, SSE, NE, ...
## $ wind_dir_3pm    WNW, WSW, WSW, E, NW, W, W, W, NW, SSE, ESE, ENE...
## $ wind_speed_9am  20, 4, 19, 11, 7, 19, 20, 6, 7, 15, 17, 15, 28, ...
## $ wind_speed_3pm  24, 22, 26, 9, 20, 24, 24, 17, 28, 11, 6, 13, 28...
## $ humidity_9am    71, 44, 38, 45, 82, 55, 49, 48, 42, 58, 48, 89, ...
## $ humidity_3pm    22, 25, 30, 16, 33, 23, 19, 19, 9, 27, 22, 91, 9...
## $ pressure_9am    1007.7, 1010.6, 1007.6, 1017.6, 1010.8, 1009.2, ...
## $ pressure_3pm    1007.1, 1007.8, 1008.7, 1012.8, 1006.0, 1005.4, ...
## $ cloud_9am       8, 5, 5, 5, 7, 5, 1, 5, 5, 5, 5, 8, 8, 5, 5, 0, ...
## $ cloud_3pm       5, 5, 2, 5, 8, 5, 5, 5, 5, 5, 5, 8, 8, 7, 5, 5, ...
## $ temp_9am        16.9, 17.2, 21.0, 18.1, 17.8, 20.6, 18.1, 16.3, ...
## $ temp_3pm        21.8, 24.3, 23.2, 26.5, 29.7, 28.9, 24.6, 25.5, ...
## $ rain_today      No, No, No, No, No, No, No, No, No, Yes, No, Yes...
## $ risk_mm         0.0, 0.0, 0.0, 1.0, 0.2, 0.0, 0.0, 0.0, 1.4, 0.0...
## $ rain_tomorrow   No, No, No, No, No, No, No, No, Yes, No, Yes, Ye...
```
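Since glimpse() returns its input invisibly, it can also sit in the middle of a pipeline as a checkpoint. The sketch below illustrates this on a small hypothetical tibble, `toy`, whose column names follow ds but whose values are made up for illustration. The sapply() call is a base R alternative for listing just the class of each variable.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for ds: made-up values for illustration only.
toy <- tibble(
  date       = as.Date(c("2008-12-01", "2008-12-02", "2008-12-03")),
  location   = c("Albury", "Albury", "Albury"),
  min_temp   = c(13.4, 7.4, 12.9),
  rain_today = c("No", "No", "Yes")
)

# glimpse() prints one line per variable, then passes the data along
# unchanged, so the pipeline can continue after it.
wet_days <- toy %>%
  glimpse() %>%
  filter(rain_today == "Yes")

# Base R view of the variable types only: a named character vector.
classes <- sapply(toy, class)
classes
```

Placing glimpse() between steps is a cheap way to confirm column names and types before a filter or a model fit, without breaking the pipeline into separate assignments.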