3.14 Pipes: Tee Pipe

20210103 Another useful operation is the tee-pipe %T>% from magrittr, which runs the command that follows as a side branch: the same data is piped into that command and also into the command after it, and so on through the rest of the pipeline. The output of the side command is discarded and only its side-effect matters. That side-effect might be to base::print() the intermediate results, as below, or to store the intermediate results before further processing, as in Section @ref(sec:pipe_tee_save).
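The difference from the ordinary pipe can be seen in a minimal sketch using the built-in mtcars dataset. The helper log_rows() is hypothetical and written only for this illustration: it reports a row count as its side-effect and returns NULL, so anything piped onward from it with %>% would be lost.

```r
library(magrittr)

# Hypothetical helper: reports the row count as a side-effect and
# deliberately returns NULL, so its return value is of no use downstream.
log_rows <- function(df) {
  message("rows: ", nrow(df))
  invisible(NULL)
}

# Tee-pipe: log_rows() runs for its side-effect and mtcars is passed on.
teed <- mtcars %T>% log_rows()

# Ordinary pipe: the NULL returned by log_rows() is passed on instead.
piped <- mtcars %>% log_rows()

identical(teed, mtcars)  # TRUE: the data survives the tee
is.null(piped)           # TRUE: the data is lost with the ordinary pipe
```

Note that base::print() returns its argument invisibly, so the distinction is most visible with side-effect functions that return something other than their input.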

A common use case is to base::print() the result of some data processing steps whilst continuing on to assign the dataset itself to a variable. We will often see the following example.

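# Demonstrate usage of a tee-pipe. The %T>% operator is not attached
# by dplyr, so load magrittr explicitly.

library(magrittr)
library(dplyr)

no_rain <-
  ds %>%
  filter(rainfall == 0) %T>%
  print()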
## # A tibble: 121,167 x 24
##    date       location min_temp max_temp rainfall evaporation sunshine
##    <date>     <chr>       <dbl>    <dbl>    <dbl>       <dbl>    <dbl>
##  1 2008-12-02 Albury        7.4     25.1        0          NA       NA
##  2 2008-12-03 Albury       12.9     25.7        0          NA       NA
##  3 2008-12-04 Albury        9.2     28          0          NA       NA
##  4 2008-12-07 Albury       14.3     25          0          NA       NA
##  5 2008-12-08 Albury        7.7     26.7        0          NA       NA
##  6 2008-12-09 Albury        9.7     31.9        0          NA       NA
##  7 2008-12-11 Albury       13.4     30.4        0          NA       NA
##  8 2008-12-15 Albury        8.4     24.6        0          NA       NA
##  9 2008-12-17 Albury       14.1     20.9        0          NA       NA
## 10 2008-12-20 Albury        9.8     25.6        0          NA       NA
## # … with 121,157 more rows, and 17 more variables: wind_gust_dir <ord>,
## #   wind_gust_speed <dbl>, wind_dir_9am <ord>, wind_dir_3pm <ord>,
## #   wind_speed_9am <dbl>, wind_speed_3pm <dbl>, humidity_9am <int>,
## #   humidity_3pm <int>, pressure_9am <dbl>, pressure_3pm <dbl>,
## #   cloud_9am <int>, cloud_3pm <int>, temp_9am <dbl>, temp_3pm <dbl>,
## #   rain_today <fct>, risk_mm <dbl>, rain_tomorrow <fct>

In the next example the tee-pipe processes the filtered dataset in two ways: once with base::print(), and then continuing on with dplyr::select() and base::summary(). The tee-pipe thus splits the flow in two directions. The first is the side branch that prints, and the second continues the sequence of the pipeline.

ds %>%
  select(rainfall, min_temp, max_temp, sunshine) %>%
  filter(rainfall == 0) %T>%
  print() %>%
  select(min_temp, max_temp, sunshine) %>%
  summary()
## # A tibble: 121,167 x 4
##    rainfall min_temp max_temp sunshine
##       <dbl>    <dbl>    <dbl>    <dbl>
##  1        0      7.4     25.1       NA
##  2        0     12.9     25.7       NA
##  3        0      9.2     28         NA
##  4        0     14.3     25         NA
##  5        0      7.7     26.7       NA
##  6        0      9.7     31.9       NA
##  7        0     13.4     30.4       NA
##  8        0      8.4     24.6       NA
##  9        0     14.1     20.9       NA
## 10        0      9.8     25.6       NA
## # … with 121,157 more rows
##     min_temp       max_temp       sunshine    
##  Min.   :-8.7   Min.   :-2.1   Min.   : 0.00  
##  1st Qu.: 7.4   1st Qu.:19.7   1st Qu.: 6.90  
##  Median :12.1   Median :24.5   Median : 9.70  
##  Mean   :12.1   Mean   :24.9   Mean   : 8.72  
##  3rd Qu.:16.9   3rd Qu.:29.9   3rd Qu.:11.10  
##  Max.   :33.9   Max.   :48.9   Max.   :14.50  
##  NA's   :567    NA's   :614    NA's   :66012
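The other side-effect mentioned at the start of this section, storing intermediate results before further processing, can be sketched in the same way. This is an illustration only, assuming the built-in airquality dataset in place of ds and a temporary file as the destination. The names tmp, n_hot, and saved are introduced here and do not come from the book.

```r
library(magrittr)

# Temporary file standing in for wherever the intermediate data is kept.
tmp <- tempfile(fileext = ".rds")

# Filter, save the filtered rows as a side-effect, then carry on and
# count them. saveRDS() returns NULL, so the tee-pipe is needed here to
# keep the data flowing to nrow().
n_hot <- airquality %>%
  subset(Temp > 90) %T>%
  saveRDS(file = tmp) %>%
  nrow()

# The saved intermediate result can be reloaded later.
saved <- readRDS(tmp)
nrow(saved) == n_hot
```

With an ordinary %>% in place of %T>%, base::nrow() would receive the NULL returned by base::saveRDS() rather than the filtered data.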


Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.