10.10 Filter Rows Having Missing Values

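20201202 To select the rows of a dataset that have a missing value in any column we use dplyr::filter() together with dplyr::across() and tidyselect::everything() to apply base::is.na() to every column. This yields one logical column per variable, which we combine row-wise with purrr::reduce() and the or operator (|) so that dplyr::filter() keeps a row if any of its values is missing. In the example we shuffle the matching rows with dplyr::sample_frac() and display date and location together with four randomly chosen columns.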

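# Keep the rows with at least one missing value, shuffle them,
# then show date and location plus four randomly chosen columns.
ds %>%
  filter(across(everything(), is.na) %>% reduce(`|`)) %>%
  sample_frac() %>%
  select(date, location, sample(3:length(vars), 4))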
## # A tibble: 127,202 x 6
##    date       location    wind_speed_3pm wind_gust_speed pressure_9am cloud_3pm
##    <date>     <chr>                <dbl>           <dbl>        <dbl>     <int>
##  1 2013-09-17 Albury                  22              35        1002.         8
##  2 2012-10-23 Tuggeranong             13              35        1028.        NA
##  3 2017-10-08 SalmonGums              15              31          NA         NA
##  4 2018-11-22 GoldCoast               31              83        1004.        NA
##  5 2014-04-18 GoldCoast               17              24        1017         NA
##  6 2020-08-12 Brisbane                 7              19        1021.         2
##  7 2011-02-03 Canberra                17              46        1010.         4
##  8 2020-08-08 Launceston               4              22          NA         NA
##  9 2014-01-13 Ballarat                15              43        1017.        NA
## 10 2015-02-18 Witchcliffe             15              33        1014.        NA
## # … with 127,192 more rows
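
The same filter can be tried on a small table where the result is easy to check by eye. The following is a minimal sketch, assuming a made-up tibble named toy (it is not the book's ds, and its values are invented for illustration). It repeats the across() and reduce() pattern and compares it with two alternatives that should select the same rows: dplyr::if_any(), available from dplyr 1.0.4, and stats::complete.cases() from base R.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data standing in for ds; the values are invented.
toy <- tibble(
  location  = c("Albury", "Canberra", "Brisbane", "Hobart"),
  rainfall  = c(0.2, NA, 1.4, 0),
  cloud_3pm = c(8L, 4L, NA, 2L)
)

# Pattern from the text: one logical column per variable,
# combined row-wise with the or operator.
with_reduce <- toy %>%
  filter(across(everything(), is.na) %>% reduce(`|`))

# Alternative: if_any() expresses "any column is missing" directly.
with_if_any <- toy %>%
  filter(if_any(everything(), is.na))

# Base R alternative: rows that are not complete cases.
with_base <- toy[!complete.cases(toy), ]

# Each approach keeps the Canberra and Brisbane rows.
with_reduce$location
with_if_any$location
with_base$location
```

The if_any() form avoids the explicit purrr::reduce() step, so it may be the easier one to read once the installed dplyr supports it.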


Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.