18.6 Decision Trees


Representation   Method                   Measure
Tree             Recursive Partitioning   Information Gain
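As a pointer for the measure named in the table, information gain compares a node's entropy with the weighted entropy of its children. This is the standard textbook formulation rather than anything specific to rpart (which for classification defaults to the closely related Gini index):

$$
\mathrm{IG}(D, s) = H(D) - \sum_{v} \frac{|D_v|}{|D|}\, H(D_v),
\qquad
H(D) = -\sum_{k} p_k \log_2 p_k,
$$

where D is the data at a node, s is a candidate split producing subsets D_v, and p_k is the proportion of observations in class k. A split is preferred when it yields a larger gain, that is, purer children.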

To build a decision tree we typically use rpart::rpart().

library(dplyr)       # Data wrangling: select(), slice().
library(magrittr)    # Pipes, including the tee pipe %T>%.
library(rpart)       # Recursive partitioning for decision trees.

mtype <- "rpart"
mdesc <- "decision tree"

# Fit a shallow classification tree over the training rows (tr), using
# the dataset ds, variables vars, and formula form prepared by the model
# template. The tee pipe prints the model whilst capturing it into model.
ds %>%
  select(all_of(vars)) %>%
  slice(tr) %>%
  rpart(form, ., method="class", control=rpart.control(maxdepth=3)) %T>%
  print() ->
model
## n= 134001 
## 
## node), split, n, loss, yval, (yprob)
##       * denotes terminal node
## 
##  1) root 134001 28183 No (0.7896807 0.2103193)  
##    2) humidity_3pm< 71.5 112779 15543 No (0.8621818 0.1378182) *
##    3) humidity_3pm>=71.5 21222  8582 Yes (0.4043917 0.5956083)  
##      6) humidity_3pm< 83.5 12393  5755 No (0.5356249 0.4643751)  
##       12) rainfall< 2.15 8069  3005 No (0.6275871 0.3724129) *
##       13) rainfall>=2.15 4324  1574 Yes (0.3640148 0.6359852) *
##      7) humidity_3pm>=83.5 8829  1944 Yes (0.2201835 0.7798165) *
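To gauge how well the tree generalises, the fitted model can score previously unseen observations with predict(). The following is a minimal sketch, assuming the model template also provides a validation index va and the name of the target variable as target:

va_ds <- ds %>%
  select(all_of(vars)) %>%
  slice(va)              # Validation rows (va assumed from the template).

# Class predictions from the tree for the validation observations.
va_ds %>%
  predict(model, newdata=., type="class") ->
va_class

# A simple accuracy estimate against the observed target
# (target is assumed to hold the target variable's name).
mean(va_class == va_ds[[target]])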

Chapter 21 covers decision trees in detail, whilst Chapter 14 uses decision trees as the model builder to demonstrate the model template. Examples of decision tree induction are available through the rain, iris, and pyiris packages from MLHub.


