10.1 Wrangling Setup

Packages are loaded into the running R session from the local library directories on disk. Missing packages can be installed from within R using utils::install.packages(). On Ubuntu, for example, many R packages can alternatively be installed from the operating system's repositories using $ wajig install r-cran-<pkgname>.
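
As a minimal sketch of the install step, the following checks which packages are missing before calling utils::install.packages(). The helper missing_pkgs() is a hypothetical name introduced here for illustration, and the interactive() guard is an assumption made to avoid unexpected installs when a script runs in batch mode.

```r
# Hypothetical helper (not from any package): which of `pkgs` are not installed?
missing_pkgs <- function(pkgs) {
  setdiff(pkgs, rownames(utils::installed.packages()))
}

# A subset of the packages used in this chapter.
wanted <- c("rattle", "readr", "dplyr", "janitor")
todo   <- missing_pkgs(wanted)

# Install only what is missing. The interactive() guard is an assumption
# of this sketch, so that batch runs never trigger an install.
if (interactive() && length(todo) > 0) utils::install.packages(todo)
```

Checking first avoids reinstalling packages that are already present in the local library.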

# Load required packages from local library into the R session.

library(rattle)       # Dataset: weatherAUS.
library(readr)        # Efficient reading of CSV data.
library(dplyr)        # Wrangling: glimpse().
library(lobstr)       # Inspect R data structures.
library(tidyr)        # Prepare a tidy dataset, gather().
library(magrittr)     # Pipes %>%, %T>%, %<>% and equals().
library(glue)         # Format strings.
library(janitor)      # Cleanup: clean_names().
library(lubridate)    # Dates and time.
library(FSelector)    # Feature selection, information.gain().
library(stringi)      # String concat operator %s+%.
library(stringr)      # String operations.
library(randomForest) # Impute missing values with na.roughfix().
library(ggplot2)      # Visualise data.
library(purrr)        # Functional helpers: simplify(), set_names().

The rattle::weatherAUS dataset is loaded into the template variable ds, and further template variables are set up as introduced by Graham J. Williams (2017). See Chapter 8 for details.

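# Name the dataset and load it into the generic template variable ds.

dsname <- "weatherAUS"
ds     <- get(dsname)
nobs   <- nrow(ds)

# Normalise the variable names, keeping the originals for reference.

vnames <- names(ds)
ds    %<>% clean_names(numerals="right")
names(vnames) <- names(ds)

# Identify the target variable and list it last among the variables.

vars   <- names(ds)
target <- "rain_tomorrow"
vars   <- c(target, vars) %>% unique() %>% rev()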
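
To illustrate what the template variables end up holding, here is a small base R sketch using four of the weatherAUS column names. The vectors orig and clean are stand-ins introduced for this example. The cleaned names are written out by hand as the assumed result of clean_names(), so the sketch runs without any packages.

```r
# Stand-in column names: originals and their assumed cleaned forms.
orig  <- c("Date", "MinTemp", "RainTomorrow", "RainToday")
clean <- c("date", "min_temp", "rain_tomorrow", "rain_today")

# vnames maps each cleaned name back to its original name.
vnames        <- orig
names(vnames) <- clean
vnames["rain_tomorrow"]   # original name of the target variable

# Put the target first, drop its duplicate, then reverse the vector
# so that the target is listed last.
target <- "rain_tomorrow"
vars   <- rev(unique(c(target, clean)))
print(vars)
```

Note that rev() reverses the whole vector, so the remaining variables also appear in the reverse of their original order, with the target at the end.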

