Decision Trees

20210103

Representation: Tree
Method: Recursive Partitioning
Measure: Information Gain
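
Information gain measures how much a split reduces the entropy, the impurity of the class labels, at a node. Below is a minimal sketch of the calculation; entropy() and info_gain() are illustrative helpers, not part of rpart, and the example call assumes the rain dataset's target rain_tomorrow. Note that rpart() itself defaults to the Gini index for classification and uses information gain when passed parms=list(split="information").

# Entropy of a collection of class labels.

entropy <- function(y)
{
  p <- prop.table(table(y))
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Information gain of a candidate split: the entropy before the split
# less the weighted average entropy of the resulting partitions.

info_gain <- function(y, split)
{
  w <- prop.table(table(split))
  entropy(y) - sum(w * tapply(y, split, entropy))
}

# For example, to score the root split found below (assuming the
# target rain_tomorrow, as in the rain dataset):

info_gain(ds$rain_tomorrow[tr], ds$humidity_3pm[tr] >= 72.5)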

To build a decision tree we typically use rpart::rpart().

library(rpart)      # Model builder: rpart(), rpart.control().
library(dplyr)      # Data wrangling: select(), slice().
library(magrittr)   # Pipes: %>%, %T>%.

mtype <- "rpart"
mdesc <- "decision tree"

# Build a depth-limited tree from the training rows tr. The dataset ds,
# variables vars, and formula form come from the model template
# (Chapter 12). The tee pipe %T>% prints the model as a side effect.

ds %>%
  select(all_of(vars)) %>%
  slice(tr) %>%
  rpart(form, ., method="class", control=rpart.control(maxdepth=3)) %T>%
  print() ->
model
## n= 123722 
## 
## node), split, n, loss, yval, (yprob)
##       * denotes terminal node
## 
##  1) root 123722 25822 No (0.7912902 0.2087098)  
##    2) humidity_3pm< 72.5 105610 14774 No (0.8601079 0.1398921) *
##    3) humidity_3pm>=72.5 18112  7064 Yes (0.3900177 0.6099823)  
##      6) humidity_3pm< 82.5 9469  4412 No (0.5340585 0.4659415)  
##       12) rainfall< 0.85 5291  1909 No (0.6391986 0.3608014) *
##       13) rainfall>=0.85 4178  1675 Yes (0.4009095 0.5990905) *
##      7) humidity_3pm>=82.5 8643  2007 Yes (0.2322110 0.7677890) *
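
Once built, the model can be visualised and assessed on observations not seen during training. The sketch below assumes the validation row indices va and the target variable name target from the model template setup, and uses rattle's fancyRpartPlot() to draw the tree.

library(rattle)     # Tree visualisation: fancyRpartPlot().

# Plot the tree with colour-coded nodes and labelled splits.

fancyRpartPlot(model)

# Predict the class of each validation observation (va holds the
# validation row indices from the template) and tabulate the
# predictions against the actual outcomes.

va_pred <- ds %>%
  select(all_of(vars)) %>%
  slice(va) %>%
  predict(model, newdata=., type="class")

table(Predicted=va_pred, Actual=ds[[target]][va])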

Chapter 18 covers decision trees in detail whilst Chapter 12 uses decision trees as the model builder to demonstrate the model template. Examples of decision tree induction are available through the rain, iris, and pyiris packages from MLHub.

