Data Science Desktop Survival Guide
by Graham Williams




Rules Using Rpart Plot

We can also use rpart.plot::rpart.rules() from Stephen Milborrow's excellent rpart.plot package. It converts the decision tree into a set of rules, one per leaf node, each prefixed by the fitted probability of rain_tomorrow:

rpart.rules(model)
##  rain_tomorrow                                                          ...
##           0.14 when humidity_3pm <  73                                  ...
##           0.34 when humidity_3pm is 73 to 84 & rainfall <  2.7 & wind_gu...
##           0.58 when humidity_3pm is 73 to 84 & rainfall <  2.7 & wind_gu...
##           0.65 when humidity_3pm is 73 to 84 & rainfall >= 2.7          ...
##           0.79 when humidity_3pm >=       84
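The model used above is built elsewhere in the guide from a weather dataset that is not reproduced in this section. As a self-contained sketch, the following fits a small classification tree on the ptitanic data that ships with rpart.plot and prints its rules. The dataset, the formula, and the name fit are stand-ins chosen for illustration, not the model used in this guide.

```r
library(rpart)       # rpart() fits the decision tree
library(rpart.plot)  # rpart.rules() and the ptitanic example data

data(ptitanic)

# Stand-in model (assumption): predict survival from all other columns.
fit <- rpart(survived ~ ., data = ptitanic, method = "class")

# One rule per leaf. For a two-class model the leading number is the
# fitted probability of the second class level (here "survived").
rules <- rpart.rules(fit)
print(rules)
```

The result is a data frame with a dedicated print method, so it can be stored and inspected like any other data frame.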

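Some variations appear below. The first adds the coverage of each rule, which is the percentage of the training observations that fall into that leaf of the decision tree (the cover column is printed at the far right, so it is cut off in the truncated output shown here). The second shows the predicted class and, in brackets, the proportions of No and Yes observations within each leaf node.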

rpart.rules(model, cover=TRUE)
##  rain_tomorrow                                                          ...
##           0.14 when humidity_3pm <  73                                  ...
##           0.34 when humidity_3pm is 73 to 84 & rainfall <  2.7 & wind_gu...
##           0.58 when humidity_3pm is 73 to 84 & rainfall <  2.7 & wind_gu...
##           0.65 when humidity_3pm is 73 to 84 & rainfall >= 2.7          ...
##           0.79 when humidity_3pm >=       84                            ...

rpart.rules(model, extra=4)
##  rain_tomorrow   No Yes                                                 ...
##             No [.86 .14] when humidity_3pm <  73                        ...
##             No [.66 .34] when humidity_3pm is 73 to 84 & rainfall <  2.7...
##            Yes [.42 .58] when humidity_3pm is 73 to 84 & rainfall <  2.7...
##            Yes [.35 .65] when humidity_3pm is 73 to 84 & rainfall >= 2.7...
##            Yes [.21 .79] when humidity_3pm >=       84
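To make the meaning of coverage concrete, the percentages can be recomputed from the fitted tree itself. This is a minimal sketch on a stand-in model (the ptitanic data from rpart.plot, with the hypothetical name fit), assuming unweighted observations; it is meant to illustrate the definition, not to reproduce how rpart.plot computes the column internally.

```r
library(rpart)       # rpart() and the fitted tree's frame
library(rpart.plot)  # rpart.rules() and the ptitanic example data

data(ptitanic)

# Stand-in model (assumption), not the weather model from the guide.
fit <- rpart(survived ~ ., data = ptitanic, method = "class")

# fit$frame has one row per node; leaf nodes are marked "<leaf>".
frame  <- fit$frame
leaves <- frame[frame$var == "<leaf>", ]

# Coverage: share of training observations that reach each leaf.
# The first row of the frame is the root, which holds all of them.
cover <- 100 * leaves$n / frame$n[1]
print(round(cover))

# For comparison, the rules with their cover column.
print(rpart.rules(fit, cover = TRUE))
```

Because every training observation ends up in exactly one leaf, the coverage values sum to 100 before rounding.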

We can also have each prediction explained by the rule that produced it. Here we select three observations and obtain, for each, the predicted class probabilities together with the rule that was applied.

rpart.predict(model, newdata=ds[c(10, 12, 17), vars], rules=TRUE)
##          No       Yes
## 1 0.8589349 0.1410651
## 2 0.2125031 0.7874969
## 3 0.6621727 0.3378273
##                                                                         ...
## 1 because humidity_3pm <  73                                            ...
## 2 because humidity_3pm >=       84                                      ...
## 3 because humidity_3pm is 73 to 84 & rainfall <  2.7 & wind_gust_speed <...
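The same call can be tried without the guide's dataset. The sketch below again uses the ptitanic data from rpart.plot as a stand-in; the model name fit and the choice of rows are arbitrary assumptions made for illustration.

```r
library(rpart)       # rpart() fits the decision tree
library(rpart.plot)  # rpart.predict() and the ptitanic example data

data(ptitanic)

# Stand-in model (assumption), not the weather model from the guide.
fit <- rpart(survived ~ ., data = ptitanic, method = "class")

# Pick a few observations to explain (arbitrary choice).
rows <- ptitanic[c(1, 2, 3), ]

# With rules=TRUE the result is a data frame: the predicted class
# probabilities plus text giving the rule applied to each row.
explained <- rpart.predict(fit, newdata = rows, rules = TRUE)
print(explained)
```

Each row of the result pairs a prediction with the path through the tree that led to it, which is convenient when a single prediction has to be justified.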


Hosted by Togaware, a pioneer of free and open source software since 1984.
Copyright © 2000-2020 Togaware Pty Ltd. Creative Commons ShareAlike V4.