10.10 Filter Rows Having Missing Values
20201202 To select the rows of a dataset that have a missing value in any column, we use dplyr::filter() with dplyr::across() and tidyr::everything() to apply base::is.na() to every column, and then combine the resulting logical columns within the filter() using purrr::reduce() and the or operator. In the example we then shuffle the rows with dplyr::sample_frac() and select four randomly chosen columns, alongside date and location, to show the result.
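library(dplyr)  # filter(), across(), everything(), sample_frac(), select()
library(purrr)  # reduce()

ds %>%
  filter(across(everything(), is.na) %>% reduce(`|`)) %>%  # Rows with any NA.
  sample_frac() %>%                                        # Shuffle the rows.
  select(date, location, sample(3:length(vars), 4))        # Four random columns.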
## # A tibble: 127,202 x 6
##    date       location    wind_speed_3pm wind_gust_speed pressure_9am cloud_3pm
##    <date>     <chr>                <dbl>           <dbl>        <dbl>     <int>
##  1 2013-09-17 Albury                  22              35        1002.         8
##  2 2012-10-23 Tuggeranong             13              35        1028.        NA
##  3 2017-10-08 SalmonGums              15              31          NA         NA
##  4 2018-11-22 GoldCoast               31              83        1004.        NA
##  5 2014-04-18 GoldCoast               17              24        1017         NA
##  6 2020-08-12 Brisbane                 7              19        1021.         2
##  7 2011-02-03 Canberra                17              46        1010.         4
##  8 2020-08-08 Launceston               4              22          NA         NA
##  9 2014-01-13 Ballarat                15              43        1017.        NA
## 10 2015-02-18 Witchcliffe             15              33        1014.        NA
## # … with 127,192 more rows
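The same idea can be checked on a small table where the answer is obvious by inspection. The following is a minimal sketch rather than part of the original example: the tibble toy and its columns are hypothetical stand-ins for ds, and it assumes dplyr 1.0.4 or later, where dplyr::if_any() is available to express the any-column test without the explicit reduce. A base R test using stats::complete.cases() is included as a cross-check.

```r
library(dplyr)

# Hypothetical toy data standing in for ds: rows 2 and 3 contain an NA.
toy <- tibble(
  id   = 1:4,
  temp = c(21.5, NA, 18.0, 25.1),
  rain = c(0.0, 1.2, NA, 3.4),
  site = c("A", "B", NA, "D")
)

# Keep a row if is.na() is TRUE for any of its columns.
toy_na <- toy %>%
  filter(if_any(everything(), is.na))

# Base R cross-check: rows that are not complete cases.
toy_na_base <- toy[!complete.cases(toy), ]

print(toy_na)
```

Both approaches should keep only the rows with id 2 and 3. Replacing if_any() with if_all() would instead keep only rows where every column is missing.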
Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.