14.2 ML Data and Variables

20210104

The rattle::weatherAUS dataset is loaded into the template variable ds, and the remaining template variables are set up as introduced by Graham J. Williams (2017). See Chapter 8 for details.

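library(rattle)    # Provides the weatherAUS dataset.
library(magrittr)  # Provides the %>% and %<>% pipes.
library(janitor)   # Provides clean_names().
library(dplyr)     # Provides sample_frac() and select().

# Load the dataset into the template variable ds.
dsname <- "weatherAUS"
ds     <- get(dsname)

nobs   <- nrow(ds)

# Keep the original names, then normalise the names in ds.
vnames <- names(ds)
ds    %<>% clean_names(numerals="right")
names(vnames) <- names(ds)

# List all variables, with the target placed last.
vars   <- names(ds)
target <- "rain_tomorrow"
vars   <- c(target, vars) %>% unique() %>% rev()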
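
To see what the template variables end up holding, the following is a minimal base R sketch on a small made-up data frame. The three columns and their values are hypothetical stand-ins, and the renaming is done by hand in place of clean_names() so that the sketch runs without any packages.

```r
# Hypothetical three-column stand-in for the real dataset.
ds <- data.frame(
  Date         = as.Date("2020-01-01") + 0:2,
  MinTemp      = c(10.1, 12.3, 9.8),
  RainTomorrow = c("No", "Yes", "No")
)

# Remember the original names before renaming.
vnames <- names(ds)

# Manual renaming, standing in for clean_names() (an assumption of this sketch).
names(ds) <- c("date", "min_temp", "rain_tomorrow")

# Index the original names by the new names.
names(vnames) <- names(ds)

# Put the target first, drop its duplicate, then reverse so it ends up last.
target <- "rain_tomorrow"
vars   <- rev(unique(c(target, names(ds))))

print(vars)              # "min_temp" "date" "rain_tomorrow"
print(vnames[[target]])  # "RainTomorrow"
```

The named vector vnames thus acts as a lookup from each normalised name back to the original name, and the reversal leaves the target as the final element of vars, with the remaining variables in reverse column order.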

It is always useful to remind ourselves of the dataset with a random sample:

ds  %>% sample_frac() %>% select(date, location, sample(3:length(vars), 5))
## # A tibble: 191,431 x 7
##    date       location  humidity_3pm sunshine humidity_9am temp_9am pressure_9am
##    <date>     <chr>            <int>    <dbl>        <int>    <dbl>        <dbl>
##  1 2018-01-26 Uluru               15     NA             44     32.2        1003.
##  2 2008-10-20 Canberra            20     12.1           60     16.4        1021.
##  3 2020-11-06 SalmonGu…           37     NA             61     15.6          NA 
##  4 2011-01-01 Badgerys…           27     NA             74     24.4        1014.
##  5 2010-11-23 Ballarat            38     NA             55     21.2        1020.
##  6 2010-06-17 Canberra            69      2.4           91      9.3        1014.
##  7 2017-06-07 Cobar               32     NA             84      5.3        1027.
##  8 2015-12-18 Richmond            27     NA             75     21.9        1017.
##  9 2008-12-18 Tuggeran…           39     NA             72     17.5        1004.
## 10 2015-06-15 Launcest…           88     NA             98      7.9          NA 
## # … with 191,421 more rows
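
The pipeline above does two things: sample_frac() with its default fraction of 1 returns every row in random order, and select() keeps the first two columns together with five columns chosen at random by position. A rough base R equivalent on a small made-up data frame is sketched below. The data values and the seed are hypothetical, and only two random columns are drawn because the toy data has few columns.

```r
# Hypothetical seed, used only to make the sketch repeatable.
set.seed(42)

# Hypothetical toy data with the same leading columns as the real dataset.
ds <- data.frame(
  date         = as.Date("2020-01-01") + 0:5,
  location     = rep(c("Canberra", "Cobar"), 3),
  temp_9am     = c(16.4, 5.3, 21.2, 9.3, 17.5, 7.9),
  humidity_9am = c(60L, 84L, 55L, 91L, 72L, 98L),
  humidity_3pm = c(20L, 32L, 38L, 69L, 39L, 88L),
  stringsAsFactors = FALSE
)

# Shuffle all rows, as sample_frac() does with its default size of 1.
shuffled <- ds[sample(nrow(ds)), ]

# Pick two of the remaining columns at random, by position.
extra <- sample(3:ncol(ds), 2)

# Keep date and location plus the randomly chosen columns.
preview <- shuffled[, c(1, 2, extra)]
print(preview)
```

Each run without a fixed seed shows different rows and columns at the top, which is the point of the exercise: over repeated runs we get to see more of the dataset than its first few rows.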


Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.