Data Science Desktop Survival Guide by Graham Williams

## Rules Using Rpart Plot

We can also use rpart.plot::rpart.rules() from Stephen Milborrow's excellent rpart.plot package. This will convert the decision tree into rules:

```
rpart.rules(model)
## rain_tomorrow ...
## 0.14 when humidity_3pm < 73 ...
## 0.34 when humidity_3pm is 73 to 84 & rainfall < 2.7 & wind_gu...
## 0.58 when humidity_3pm is 73 to 84 & rainfall < 2.7 & wind_gu...
## 0.65 when humidity_3pm is 73 to 84 & rainfall >= 2.7 ...
## 0.79 when humidity_3pm >= 84
```
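For readers without the weather dataset at hand, the following is a minimal, self-contained sketch of the same call. It assumes the ptitanic data that ships with rpart.plot as a stand-in, so the variables and rules it prints will differ from the weather rules shown here.

```r
library(rpart)
library(rpart.plot)

# Stand-in data (an assumption): ptitanic ships with rpart.plot.
# It is not the weather dataset used in this chapter.
data(ptitanic)

# Fit a classification tree predicting survival from the other columns.
model <- rpart(survived ~ ., data = ptitanic, method = "class")

# Convert the fitted tree into rules, one per leaf node.
rules <- rpart.rules(model)
print(rules)
```

The result is a data frame with one row per leaf, so it can be stored or filtered as well as printed.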

Some variations appear below. The first, cover=TRUE, adds the coverage of each rule: the percentage of training observations that fall into the corresponding leaf node. The second, extra=4, adds the class proportions (No/Yes) of the observations within each leaf node.

```
rpart.rules(model, cover=TRUE)
## rain_tomorrow ...
## 0.14 when humidity_3pm < 73 ...
## 0.34 when humidity_3pm is 73 to 84 & rainfall < 2.7 & wind_gu...
## 0.58 when humidity_3pm is 73 to 84 & rainfall < 2.7 & wind_gu...
## 0.65 when humidity_3pm is 73 to 84 & rainfall >= 2.7 ...
## 0.79 when humidity_3pm >= 84 ...
```
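To make the coverage figure concrete, the sketch below recomputes it by hand from the fitted tree's node table, model$frame. It assumes an unweighted tree built on the ptitanic stand-in data shipped with rpart.plot rather than the weather data, so the numbers are illustrative only.

```r
library(rpart)
library(rpart.plot)

data(ptitanic)  # stand-in data (an assumption), shipped with rpart.plot
model <- rpart(survived ~ ., data = ptitanic, method = "class")

# model$frame has one row per node; leaf rows have var == "<leaf>".
frame  <- model$frame
leaves <- frame[frame$var == "<leaf>", ]

# Coverage: observations in the leaf as a percentage of those at the root,
# which is the first row of the frame.
cover <- 100 * leaves$n / frame$n[1]
print(round(cover))

# Each training observation lands in exactly one leaf,
# so the coverages sum to 100.
print(sum(cover))
```

On this stand-in tree, rpart.rules(model, cover=TRUE) should report the same percentages up to rounding.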

```
rpart.rules(model, extra=4)
## rain_tomorrow No Yes ...
## No [.86 .14] when humidity_3pm < 73 ...
## No [.66 .34] when humidity_3pm is 73 to 84 & rainfall < 2.7...
## Yes [.42 .58] when humidity_3pm is 73 to 84 & rainfall < 2.7...
## Yes [.35 .65] when humidity_3pm is 73 to 84 & rainfall >= 2.7...
## Yes [.21 .79] when humidity_3pm >= 84
```
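The bracketed proportions can be reproduced by tabulating the training observations by leaf and class. The sketch below is one way to do this, assuming the ptitanic stand-in data from rpart.plot (classes died/survived rather than No/Yes) and a tree fitted with default priors and no weights.

```r
library(rpart)
library(rpart.plot)

data(ptitanic)  # stand-in data (an assumption), shipped with rpart.plot
model <- rpart(survived ~ ., data = ptitanic, method = "class")

# model$where gives, for each training observation, the row of model$frame
# holding the leaf that the observation falls into.
leaf <- model$where

# Cross-tabulate leaf against the observed class and convert each row
# to proportions, so that every leaf's row sums to 1.
props <- prop.table(table(leaf = leaf, class = ptitanic$survived), margin = 1)
print(round(props, 2))
```

With default settings these row proportions are also the class probabilities that the tree predicts for observations reaching each leaf.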

We can also obtain the rule behind each individual prediction. Here we select three observations and obtain, for each, the predicted class probabilities and the rule that produced them.

```
rpart.predict(model, newdata=ds[c(10, 12, 17), vars], rules=TRUE)
##          No       Yes
## 1 0.8589349 0.1410651
## 2 0.2125031 0.7874969
## 3 0.6621727 0.3378273
## ...
## 1 because humidity_3pm < 73 ...
## 2 because humidity_3pm >= 84 ...
## 3 because humidity_3pm is 73 to 84 & rainfall < 2.7 & wind_gust_speed <...
```
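A self-contained variant of this call is sketched below. It assumes the ptitanic stand-in data from rpart.plot in place of ds and vars, which are defined elsewhere in the book, and the three row indices are arbitrary.

```r
library(rpart)
library(rpart.plot)

data(ptitanic)  # stand-in data (an assumption), shipped with rpart.plot
model <- rpart(survived ~ ., data = ptitanic, method = "class")

# Pick three observations to explain; the indices are arbitrary.
obs <- ptitanic[c(10, 12, 17), ]

# rules=TRUE appends the rule that led to each prediction
# to the predicted class probabilities.
pred <- rpart.predict(model, newdata = obs, rules = TRUE)
print(pred)
```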
