14.2 ML Data and Variables
20210104
The rattle::weatherAUS dataset is loaded into the template
variable ds, and further template variables are set up as
introduced by Williams (2017). See
Chapter 8 for details.
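library(rattle)    # Provides the weatherAUS dataset.
library(magrittr)  # Provides the %>% and %<>% pipes.
library(janitor)   # Provides clean_names().

dsname <- "weatherAUS"
ds <- get(dsname)                      # Load the dataset by name.
nobs <- nrow(ds)                       # Number of observations.
vnames <- names(ds)                    # Remember the original names.
ds %<>% clean_names(numerals="right")  # Normalise the variable names.
names(vnames) <- names(ds)             # Map normalised names to originals.
vars <- names(ds)
target <- "rain_tomorrow"
vars <- c(target, vars) %>% unique() %>% rev()  # Move the target to the end.

It is always useful to remind ourselves of the dataset with a random sample: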
## # A tibble: 208,495 × 7
## date location pressure_3pm evaporation humidity_9am wind_speed_9am
## <date> <chr> <dbl> <dbl> <int> <dbl>
## 1 2021-08-06 Witchcliffe 1021. NA 96 11
## 2 2019-03-24 Darwin 1007. 5.8 91 20
## 3 2021-10-10 Watsonia 1012. 5.4 89 13
## 4 2018-10-28 Mildura 1017. 8.9 58 15
## 5 2020-09-10 WaggaWagga 1026. 4.6 65 28
## 6 2021-08-10 MelbourneAir… 1018 2.4 80 37
## 7 2017-04-12 Melbourne 1024. 2.8 79 9
## 8 2021-09-19 MountGinini NA NA 95 26
....
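The effect of the final template step, which moves the target to the end of vars, can be checked on a small example. The following is a minimal sketch in base R, assuming a hypothetical five-column data frame toy standing in for weatherAUS. Nested calls replace the pipe so that no packages are needed. It also shows one way a random sample of rows might be drawn with base sample(), which is not necessarily the call that produced the output above.

```r
# Hypothetical stand-in for weatherAUS, with names already normalised.
toy <- data.frame(
  date          = as.Date("2021-01-01") + 0:5,
  location      = c("Darwin", "Mildura", "Melbourne",
                    "Darwin", "Mildura", "Melbourne"),
  humidity_9am  = c(91L, 58L, 79L, 88L, 60L, 75L),
  rain_today    = c("No", "No", "Yes", "Yes", "No", "No"),
  rain_tomorrow = c("No", "Yes", "Yes", "No", "No", "Yes")
)

vars   <- names(toy)
target <- "rain_tomorrow"

# Same logic as c(target, vars) %>% unique() %>% rev():
# put the target first, drop its duplicate, then reverse so it ends up last.
vars <- rev(unique(c(target, vars)))
print(vars)

# Draw a reproducible random sample of three rows to eyeball the data.
set.seed(42)
smpl <- toy[sample(nrow(toy), 3), vars]
print(smpl)
```

Note that rev() reverses the order of the remaining variables as well, so the columns of smpl appear in the opposite order to those of toy, with the target last.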
References
Williams, Graham J. 2017. The Essentials of Data Science: Knowledge Discovery Using R. The R Series. CRC Press.
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0