9.8 Selecting Rows
20200419 Rows can be selected from a data frame with dplyr::filter() using specific conditions. The resulting data frame contains only those rows for which the condition is TRUE; rows for which the condition is FALSE or evaluates to NA are dropped.
## # A tibble: 104,775 × 24
## date location min_temp max_temp rainfall evaporation sunshine
## <date> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2008-12-02 Albury 7.4 25.1 0 NA NA
## 2 2008-12-03 Albury 12.9 25.7 0 NA NA
## 3 2008-12-04 Albury 9.2 28 0 NA NA
## 4 2008-12-05 Albury 17.5 32.3 1 NA NA
## 5 2008-12-06 Albury 14.6 29.7 0.2 NA NA
## 6 2008-12-07 Albury 14.3 25 0 NA NA
## 7 2008-12-08 Albury 7.7 26.7 0 NA NA
## 8 2008-12-09 Albury 9.7 31.9 0 NA NA
## 9 2008-12-10 Albury 13.1 30.1 1.4 NA NA
## 10 2008-12-11 Albury 13.4 30.4 0 NA NA
## # ℹ 104,765 more rows
## # ℹ 17 more variables: wind_gust_dir <ord>, wind_gust_speed <dbl>,
## # wind_dir_9am <ord>, wind_dir_3pm <ord>, wind_speed_9am <dbl>,
## # wind_speed_3pm <dbl>, humidity_9am <int>, humidity_3pm <int>,
## # pressure_9am <dbl>, pressure_3pm <dbl>, cloud_9am <int>, cloud_3pm <int>,
## # temp_9am <dbl>, temp_3pm <dbl>, rain_today <fct>, risk_mm <dbl>,
## # rain_tomorrow <fct>
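As a minimal sketch of filtering on a condition, assuming a small made-up tibble (`toy` and its values are hypothetical and are not the weather dataset shown above), the following keeps only the rows with more than 1mm of rainfall:

```r
library(dplyr)

# Hypothetical toy data; the column names simply mirror the weather dataset.
toy <- tibble(
  location = c("Albury", "Albury", "Canberra", "Sydney"),
  rainfall = c(0, 1.4, NA, 3.2)
)

# Keep rows where the condition is TRUE. The row with a missing
# rainfall value is dropped because the condition evaluates to NA.
wet <- toy %>% filter(rainfall > 1)

wet
```

Note that the Canberra row is not retained: a comparison against a missing value is itself missing, and dplyr::filter() keeps only rows where the condition is TRUE.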
To select rows that have missing values, for example, use dplyr::filter() with purrr::pmap_lgl() to test, row by row, whether base::any() column in the row is missing according to base::is.na(). In the following example we count the number of rows with missing values using base::nrow(), formatted nicely using scales::comma():
ds %>%
  select(-date) %>%
  filter(purrr::pmap_lgl(., ~any(is.na(c(...))))) %>%
  nrow() %>%
  scales::comma() %>%
  cat("rows have missing values.\n")
## 156,731 rows have missing values.
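The purrr::pmap_lgl() approach calls its function once per row, which can be slow on a large data frame. As a hedged sketch of two vectorised alternatives, assuming dplyr 1.0.4 or later for dplyr::if_any() and using a small hypothetical tibble (`toy` is made up for illustration), the same count can be obtained as follows:

```r
library(dplyr)

# Hypothetical toy data with missing values in rows 2 and 3.
toy <- tibble(
  min_temp   = c(7.4, NA, 9.2, 17.5),
  max_temp   = c(25.1, 25.7, NA, 32.3),
  rain_today = factor(c("No", "No", NA, "Yes"))
)

# Row-wise test with pmap_lgl(), as in the text.
n_pmap <- toy %>%
  filter(purrr::pmap_lgl(., ~any(is.na(c(...))))) %>%
  nrow()

# Vectorised alternative: keep rows where any column is missing.
n_if_any <- toy %>%
  filter(if_any(everything(), is.na)) %>%
  nrow()

# Base R alternative: count the rows that are not complete cases.
n_base <- sum(!complete.cases(toy))

c(n_pmap, n_if_any, n_base)
```

All three counts should agree. The vectorised forms test each column as a whole rather than building a vector per row, which also avoids the type coercion that c(...) performs when a row mixes numeric and factor columns.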
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0