Data Science Desktop Survival Guide
by Graham Williams
Create Data Setup
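Missing packages can be installed from within R using install.packages() or, on Debian and Ubuntu, from the command line using wajig install r-cran-<pkgname>.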
# Load required packages from local library into the R session.
library(dplyr) # Wrangling: select() sample_frac().
library(magrittr) # Data pipelines: %>% %<>% %T>% equals().
library(randomForest) # Model: randomForest() na.roughfix() for missing data.
library(readxl) # Read Excel spreadsheets: read_excel().
library(wakefield) # Generate random datasets.
library(xlsx) # Write Excel spreadsheets: write.xlsx() saveWorkbook().
library(rattle) # Dataset: weatherAUS.
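Before loading, it can help to confirm which of these packages are actually present in the local library. The following is a minimal sketch rather than part of the template: the variable names pkgs, installed and absent are chosen here for illustration, and the install step is left commented out on the assumption that you may prefer your system's package manager.

```r
# Packages named in the library() calls above.
pkgs <- c("dplyr", "magrittr", "randomForest", "readxl",
          "wakefield", "xlsx", "rattle")

# requireNamespace() returns FALSE, rather than stopping with an
# error, when a package cannot be loaded from the local library.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report any packages that still need to be installed.
absent <- pkgs[!installed]
if (length(absent) > 0) {
  message("Not installed: ", paste(absent, collapse = ", "))
  # install.packages(absent)  # Uncomment to install from CRAN.
}
```

Note that requireNamespace() loads a package's namespace without attaching it, so the library() calls are still needed to use the functions by their bare names.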
After loading the required packages from the library into the R session, we access the rattle::weatherAUS dataset and save it as the template dataset named ds, following the template-based approach introduced in Williams (2017). The dataset is modestly large and is used extensively in this book to illustrate the capabilities of R for the Data Scientist.
# Initialise the dataset as per the template.
ds <- weatherAUS
# Review the observations in random order (size defaults to all rows).
ds %>% sample_frac()
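Because the dataset is modestly large, it is often convenient to explore a smaller random subset. The sketch below assumes dplyr and rattle are installed. The name ds_sample, the 10% fraction and the seed value are illustrative choices, not part of the template.

```r
library(dplyr)   # Wrangling: sample_frac() and the %>% pipe.
library(rattle)  # Dataset: weatherAUS.

# Initialise the dataset as per the template.
ds <- weatherAUS

# Fix the random seed so the same rows are drawn on each run.
# The seed value 42 is arbitrary.
set.seed(42)

# Draw a random 10% of the observations, without replacement.
ds_sample <- ds %>% sample_frac(0.1)

dim(ds)         # Rows and columns in the full dataset.
dim(ds_sample)  # About one tenth of the rows, same columns.
```

Setting the seed first makes the sample repeatable, which is useful when results need to be compared across sessions.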