Data Science Desktop Survival Guide
by Graham Williams

Pipeline Syntactic Sugar

For the technically minded, we note that what is actually happening here is that the syntax, or how we write the sentences, is changed by R from the normal embedded functional approach to make the code easier to read. This is an important goal: we always need to keep in mind that we write our code for others (and for ourselves later on) to read. Below we see a pipeline version of a series of commands operating on the dataset, and note how it is mapped by R into the functional construct. For many of us it takes quite a bit of effort to parse the traditional functional form of the expression to understand what it is doing. The pipeline alternative provides a clearer narrative.
# Summarise observations with little or no rainfall.

weatherAUS %>%
  select(MinTemp, MaxTemp, Rainfall, Sunshine) %>%
  filter(Rainfall < 1) %>%
  summary()
##     MinTemp         MaxTemp        Rainfall          Sunshine    
##  Min.   :-8.70   Min.   :-2.1   Min.   :0.00000   Min.   : 0.00  
##  1st Qu.: 7.20   1st Qu.:19.1   1st Qu.:0.00000   1st Qu.: 6.20  
##  Median :11.90   Median :23.8   Median :0.00000   Median : 9.30  
##  Mean   :11.97   Mean   :24.3   Mean   :0.05825   Mean   : 8.37  
##  3rd Qu.:16.70   3rd Qu.:29.3   3rd Qu.:0.00000   3rd Qu.:11.00  
##  Max.   :33.90   Max.   :48.9   Max.   :0.90000   Max.   :14.50  
....

# Functional form equivalent to the pipeline above.

summary(filter(select(weatherAUS,
                      MinTemp, MaxTemp, Rainfall, Sunshine),
               Rainfall < 1))
##     MinTemp         MaxTemp        Rainfall          Sunshine    
##  Min.   :-8.70   Min.   :-2.1   Min.   :0.00000   Min.   : 0.00  
##  1st Qu.: 7.20   1st Qu.:19.1   1st Qu.:0.00000   1st Qu.: 6.20  
##  Median :11.90   Median :23.8   Median :0.00000   Median : 9.30  
##  Mean   :11.97   Mean   :24.3   Mean   :0.05825   Mean   : 8.37  
##  3rd Qu.:16.70   3rd Qu.:29.3   3rd Qu.:0.00000   3rd Qu.:11.00  
##  Max.   :33.90   Max.   :48.9   Max.   :0.90000   Max.   :14.50  
....
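
One way to convince ourselves that the two forms really are equivalent is to ask R to compare their results. The sketch below is illustrative only: it assumes the dplyr package is installed, and it uses the built-in mtcars dataset (with the columns mpg, cyl and hp) as a stand-in for weatherAUS so that it runs without any further data.

```r
library(dplyr)  # Assumed installed; provides %>%, select() and filter().

# Pipeline form, using the built-in mtcars data as a stand-in.

piped <- mtcars %>%
  select(mpg, cyl, hp) %>%
  filter(hp < 100) %>%
  summary()

# Functional form of the same sequence of operations.

nested <- summary(filter(select(mtcars, mpg, cyl, hp), hp < 100))

# Compare the two results.

same <- identical(piped, nested)
print(same)
```

Only the way the sentence is written differs between the two forms. The operations, and the order in which they are applied, are the same.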

Anything that improves the readability of our code is useful. Computers are quite capable of doing the hard work of transforming a simpler sentence into this much more complex-looking sentence.
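
To get a feel for the kind of transformation involved, we can write a toy pipe operator of our own in base R. This is a minimal sketch under stated assumptions and not how the `%>%` operator is actually implemented: the name `%then%` is hypothetical, and the operator simply inserts the left-hand side as the first argument of the call on the right-hand side before evaluating it.

```r
# A toy pipe operator. The name %then% is hypothetical and for
# illustration only; it is not part of any package.

`%then%` <- function(lhs, rhs) {
  caller <- parent.frame()      # Where the pipeline was written.
  call <- substitute(rhs)       # Capture the right-hand side unevaluated.

  # Allow a bare function name, as in x %then% sum.
  if (is.name(call)) call <- as.call(list(call))

  # Rebuild the call with lhs inserted as the first argument,
  # so that x %then% f(y) becomes f(lhs, y).
  new_call <- as.call(c(list(call[[1]], quote(lhs)), as.list(call)[-1]))

  # Evaluate with lhs bound to its value; everything else is
  # looked up where the pipeline was written.
  eval(new_call, envir = list(lhs = lhs), enclos = caller)
}

# The toy pipeline and its functional form.

toy_piped  <- c(1, 4, 9) %then% sqrt() %then% sum()
toy_nested <- sum(sqrt(c(1, 4, 9)))

print(identical(toy_piped, toy_nested))
```

The sentence we write reads left to right, while the expression that is evaluated is the nested functional form.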


Copyright © 2000-2020 Togaware Pty Ltd. Creative Commons ShareAlike V4.