Data Science Desktop Survival Guide
by Graham Williams

Filter Rows Having Missing Values

20201202 To select the rows of a dataset that have a missing value in any column, we use dplyr::filter() together with dplyr::across() and tidyselect::everything() to apply base::is.na() to every column. This yields one logical column per variable, which we combine into a single logical vector within dplyr::filter() using purrr::reduce() and the or operator. In the example we shuffle the selected rows and randomly sample a few columns to show the result.

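ds %>%
  # Keep rows where is.na() is TRUE for at least one column.
  filter(across(everything(), is.na) %>% reduce(`|`)) %>%
  # Shuffle the rows, then show date, location and four random columns.
  sample_frac() %>%
  select(date, location, sample(3:length(vars), 4))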
## # A tibble: 114,944 x 6
##    date       location      wind_gust_speed temp_9am sunshine temp_3pm
##    <date>     <chr>                   <dbl>    <dbl>    <dbl>    <dbl>
##  1 2015-01-20 NorahHead                  48     20.2     NA       23.1
##  2 2009-10-02 Canberra                   NA     12.6     NA       16.9
##  3 2014-07-13 Williamtown                59     10.3     NA       14.4
##  4 2014-12-05 BadgerysCreek              59     21.2     NA       30.5
##  5 2018-01-30 BadgerysCreek              41     24.1     NA       35.3
##  6 2009-02-21 Albury                     28     19.3     NA       29.8
##  7 2012-01-28 Woomera                    65     26.8     NA       31.8
##  8 2015-01-18 Williamtown                37     23.8     NA       26.8
##  9 2010-03-28 MountGinini                41     13.4     NA       18.1
## 10 2009-08-10 Albany                     NA     11.5      2.3     17.1
## # ... with 114,934 more rows
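
The same row selection can be tried on a small dataset. The following is a minimal sketch and not taken from the guide: the tibble toy and its columns are hypothetical stand-ins for ds, and dplyr::if_any() is swapped in for the across() and reduce() combination, assuming dplyr 1.0.4 or later is installed. A base R equivalent using stats::complete.cases() is included for comparison.

```r
library(dplyr)

# Hypothetical toy dataset standing in for ds (not from the guide).
toy <- tibble(
  id       = 1:4,
  temp_9am = c(20.2, NA, 10.3, 21.2),
  sunshine = c(NA, 5.1, 7.4, 2.3)
)

# Keep the rows with a missing value in at least one column.
# if_any() applies is.na() to each selected column and combines
# the results with a logical or (assumes dplyr >= 1.0.4).
missing_any <- toy %>%
  filter(if_any(everything(), is.na))

# Base R equivalent: rows that are not complete cases.
missing_base <- toy[!complete.cases(toy), ]

print(missing_any)
```

Here rows 1 and 2 of toy are retained because each has an NA in one column, while rows 3 and 4 are complete and are dropped.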


Copyright © 2000-2020 Togaware Pty Ltd. Creative Commons ShareAlike V4.