Missing Targets

20180726 Sometimes there are further operations to perform on the dataset prior to modelling. A common task is dealing with missing values. Here we remove observations with a missing target. As with any missing data, we should also analyse whether there is any pattern to the missing targets. A pattern may indicate a systemic data issue rather than values that are simply missing at random.

# Check the dimensions to start with.

dim(ds)

## [1] 176747     24

# Identify observations with a missing target.

ds %>%
  pull(target) %>%
  is.na() ->
missing.target

# Check how many are found.

sum(missing.target)

## [1] 4317
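
Before removing these observations, we might look for a pattern in the missing targets, as suggested above. The following is a minimal sketch only, built on a small made-up dataset: the names toy, location, and rain_tomorrow are hypothetical and stand in for the real dataset, a grouping variable of interest, and the target. It tabulates the count and proportion of missing targets within each group.

```r
library(dplyr)

# Hypothetical toy data standing in for the real dataset.
# `location` and `rain_tomorrow` are assumed names for illustration.
toy <- tibble(
  location      = c("A", "A", "A", "A", "B", "B", "B", "B"),
  rain_tomorrow = c("yes", NA, "no", NA, "no", "no", "yes", "no")
)

# Count and proportion of missing targets within each group.
toy %>%
  group_by(location) %>%
  summarise(n            = n(),
            n_missing    = sum(is.na(rain_tomorrow)),
            prop_missing = mean(is.na(rain_tomorrow))) ->
missing.by.group

missing.by.group
```

A markedly higher proportion in one group (here all of the missing targets fall in location A) would suggest the values are not missing at random and that the data collection for that group deserves a closer look before the observations are dropped.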

# Remove observations with a missing target.

ds %<>% filter(!missing.target)

# Confirm the filter delivered the expected dataset.

dim(ds)

## [1] 172430     24

