Data Science Desktop Survival Guide
by Graham Williams
ROC Chart
Another measure of the quality of a model is the ROC curve and, in particular, the area under the ROC curve (AUC). This area can be calculated using ROCR::prediction() and ROCR::performance(). These functions work with the predicted probability of a class rather than the predicted class itself.
In the following code block we obtain the predicted probabilities from
the model, predicting over the te dataset. The result from calling
stats::predict() is a matrix with one column for each possible class
value, recording the probability of each class for each observation.
The second column is the one of interest: the probability that it will
rain tomorrow (rain_tomorrow==yes). These probabilities are passed on
to ROCR::prediction() to compare them with the actual target values.
The result is then passed on to ROCR::performance(), from which we use
base::attr() to obtain the attribute y.values and then
magrittr::extract2() to extract the first value as the area under the
curve.
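The steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the guide's own code: the data frame ds is simulated as a stand-in for the weather dataset, the columns humidity and pressure are hypothetical, the tr/te split is arbitrary, and rpart::rpart() is used only as an example of a model whose predict() method returns a matrix of class probabilities.

```r
library(rpart)     # example classifier; an assumption, any probabilistic model would do
library(ROCR)      # prediction() and performance()
library(magrittr)  # %>% and extract2()

set.seed(42)

# Simulated stand-in data (hypothetical): the real weather dataset is not used here.
n  <- 400
ds <- data.frame(humidity = runif(n, 0, 100),
                 pressure = rnorm(n, 1015, 8))
p_rain <- plogis(0.06 * (ds$humidity - 50) - 0.15 * (ds$pressure - 1015))
ds$rain_tomorrow <- factor(ifelse(runif(n) < p_rain, "yes", "no"),
                           levels = c("no", "yes"))

# Hypothetical split: tr indexes the training rows, te the test rows.
tr <- sample(n, round(0.7 * n))
te <- setdiff(seq_len(n), tr)

model <- rpart(rain_tomorrow ~ ., data = ds[tr, ])

# predict() returns a matrix with one probability column per class value.
pr <- predict(model, newdata = ds[te, ], type = "prob")

# Take the "yes" column (the second one), compare it with the actual
# target values, request the "auc" measure, and pull out the single value.
auc <- pr[, "yes"] %>%
  prediction(ds$rain_tomorrow[te]) %>%
  performance("auc") %>%
  attr("y.values") %>%
  extract2(1)

print(auc)
```

Selecting the column by name ("yes") rather than by position guards against a different ordering of the factor levels. The value returned lies between 0 and 1, with 0.5 corresponding to a model that ranks observations no better than chance.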