14.11 ROC Chart

Another measure of the quality of a model is the ROC curve, and in particular the area under the ROC curve (AUC). This area can be calculated using ROCR::prediction() and ROCR::performance() from ROCR (Sing et al. 2020). These functions work from the predicted probability of a class rather than from the predicted class itself.

In the following code block we obtain the predicted probabilities from the model, predicting over the te dataset. The result from calling stats::predict() is a matrix with one column for each possible class value, recording the probability of each class for each observation. The second column is the one of interest: the probability that it will rain tomorrow (rain_tomorrow == "yes"). These probabilities are passed on to ROCR::prediction() to compare them with the actual target values. The result is then passed on to ROCR::performance(), from which we obtain the y.values attribute using base::attr() and then use magrittr::extract2() to pull out the first value as the area under the curve.
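The pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the book's own code: the data frame ds is synthetic, the variables humidity and pressure are hypothetical, the model is assumed to be an rpart decision tree (the text does not name the model type), and te is taken to be a vector of test row indices.

```r
library(rpart)     # Assumed model type; the text does not name the model.
library(ROCR)      # prediction() and performance().
library(magrittr)  # The pipe operator and extract2().

set.seed(42)

# Hypothetical stand-in data; the book's own dataset is not reproduced here.
n  <- 500
ds <- data.frame(humidity = runif(n, 0, 100),
                 pressure = rnorm(n, 1015, 7))
ds$rain_tomorrow <- factor(ifelse(ds$humidity + rnorm(n, 0, 20) > 60,
                                  "yes", "no"),
                           levels = c("no", "yes"))

tr <- sample(n, 0.7 * n)        # Training row indices (assumed 70% split).
te <- setdiff(seq_len(n), tr)   # Test row indices.

model <- rpart(rain_tomorrow ~ ., data = ds[tr, ])

# Matrix of class probabilities: one column per class, "no" then "yes".
pr <- predict(model, newdata = ds[te, ], type = "prob")

# Take the probability of "yes", compare it with the actual target values,
# ask for the AUC measure, and extract the single value from y.values.
auc <- pr[, 2] %>%
  prediction(ds$rain_tomorrow[te]) %>%
  performance("auc") %>%
  attr("y.values") %>%
  extract2(1)

print(auc)
```

An AUC near 0.5 suggests the model ranks observations no better than chance, while a value approaching 1 suggests it ranks rainy days above dry days almost perfectly. Selecting the column by name, as in pr[, "yes"], would be a slightly more defensive alternative to pr[, 2].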



Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.