14.2 ML Data and Variables
20210104
The rattle::weatherAUS dataset is loaded into the template
variable ds, and further template variables are set up as
introduced by Williams (2017). See Chapter 8 for details.
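library(rattle)    # Provides the weatherAUS dataset.
library(magrittr)  # Provides the %>% and %<>% pipes.
library(janitor)   # Provides clean_names().

dsname <- "weatherAUS"
ds     <- get(dsname)

nobs   <- nrow(ds)
vnames <- names(ds)                       # Record the original names.
ds    %<>% clean_names(numerals="right")  # Normalise the variable names.
names(vnames) <- names(ds)                # Map normalised names to originals.
vars   <- names(ds)
target <- "rain_tomorrow"
vars   <- c(target, vars) %>% unique() %>% rev()  # Target is moved to the end.

It is always useful to remind ourselves of the dataset with a random sample: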
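The code that produced the sample is not shown. A minimal sketch of one way to obtain such a view follows, assuming the dplyr package is available. The helper name sample_rows and the seed value are hypothetical and introduced only for illustration. The row count in the output matches the full dataset, which suggests the rows were shuffled rather than subset, so the sketch uses prop = 1.

```r
library(dplyr)  # Provides select(), all_of(), slice_sample() and %>%.

# Hypothetical helper: keep a few columns and return all rows in random order.
sample_rows <- function(data, cols, seed = 42)
{
  set.seed(seed)              # Assumed seed, so the shuffle is repeatable.
  data %>%
    select(all_of(cols)) %>%  # Keep only the named columns, in this order.
    slice_sample(prop = 1)    # prop = 1 retains every row, shuffled.
}

# Usage, assuming ds has been prepared by the template code:
# sample_rows(ds, c("date", "location", "wind_dir_9am", "rainfall",
#                   "pressure_9am", "rain_tomorrow", "sunshine"))
```

With ds prepared by the template code, such a call would print a view of the same shape as the one that follows, though the particular rows shown depend on the seed.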
## # A tibble: 226,868 × 7
##    date       location wind_dir_9am rainfall pressure_9am rain_tomorrow sunshine
##    <date>     <chr>    <ord>           <dbl>        <dbl> <fct>            <dbl>
##  1 2009-08-03 Sydney   WNW               0          1020. No                 8.8
##  2 2011-11-02 Watsonia WSW               0.4        1016. Yes                0  
##  3 2015-04-26 Badgery… WSW              18          1008. No                NA  
##  4 2011-12-21 Brisbane W                 0          1015. Yes                9.3
##  5 2019-02-01 Penrith  SSW               0.4          NA  Yes               NA  
##  6 2021-02-18 Norfolk… E                 0          1018. Yes               NA  
##  7 2013-08-07 Katheri… SE                0          1017. No                NA  
##  8 2018-09-11 Adelaide NNE               0          1011. No                NA  
....

References
Williams, Graham J. 2017. The Essentials of Data Science: Knowledge Discovery Using R. The R Series. CRC Press.
