Data Science Desktop Survival Guide
by Graham Williams




Formula to Describe the Goal

20200607 To support analytic modelling tasks we identify a formula that describes the model to be built. Typically we model the target variable on the input variables, so that given a new set of values for the input variables the resulting model can predict the value of the target variable.

Using stats::formula() we can automatically construct the formula from the dataset itself, provided the first column of the dataset is the target variable and the remaining columns are the input variables. Our usual ordering of columns within a dataset places the target variable last rather than first. Selecting the columns named in vars in reverse order, using base::rev(), moves the target to the first column and so leads to the right formula automatically.

library(magrittr)  # Provides the tee pipe %T>%.

form <- formula(ds[rev(vars)]) %T>% print()
## rain_tomorrow ~ min_temp + max_temp + rainfall + evaporation + 
##     sunshine + wind_gust_dir + wind_gust_speed + wind_dir_9am + 
##     wind_dir_3pm + wind_speed_9am + wind_speed_3pm + humidity_9am + 
##     humidity_3pm + pressure_9am + pressure_3pm + cloud_9am + 
##     cloud_3pm + temp_9am + temp_3pm + rain_today

The notation used to express the formula begins with the name of the target (rain_tomorrow) followed by a tilde (~) followed by the variables that will be used to model the target, each separated by a plus (+). The formula indicates that we will fit a model to predict rain_tomorrow from the remaining input variables.
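As a minimal sketch of the same construction on a small scale, the following builds a formula from a toy data frame and then pulls the target and input names back out of it. The data frame and its values are invented for illustration and are not the weather dataset used above. Here vars is simply taken to be the column names with the target last.

```r
# Hypothetical three-column dataset with the target as the last column.
ds <- data.frame(
  min_temp      = c(8.0, 14.1, 13.7, 9.3),
  rainfall      = c(0.0, 3.6, 0.2, 0.0),
  rain_tomorrow = factor(c("no", "yes", "no", "no"))
)

# Assumption: vars lists the columns of interest with the target last.
vars <- names(ds)

# Reversing the columns puts the target first, which is what
# stats::formula() expects when given a data frame.
form <- formula(ds[rev(vars)])
print(form)

# The first variable in the formula is the target, the rest are inputs.
target <- all.vars(form)[1]
inputs <- all.vars(form)[-1]
```

Because base::rev() reverses every column, the inputs also appear in reverse order on the right-hand side. That ordering does not change which variables the model uses.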

A shorthand for this same formulation is:

rain_tomorrow ~ .
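The dot has no meaning on its own: it is expanded against a dataset to mean every column other than the target. One way to see the expansion, sketched here with an invented toy data frame rather than the weather dataset, is to pass the formula and the data to stats::terms() and read the resulting term labels.

```r
# Hypothetical dataset, used only to illustrate the expansion.
ds <- data.frame(
  min_temp      = c(8.0, 14.1, 13.7, 9.3),
  rainfall      = c(0.0, 3.6, 0.2, 0.0),
  rain_tomorrow = factor(c("no", "yes", "no", "no"))
)

# The shorthand formula: model the target on everything else.
form <- rain_tomorrow ~ .

# terms() resolves the dot against the columns of ds.
expanded <- attr(terms(form, data = ds), "term.labels")
print(expanded)
```

Modelling functions that accept a formula together with a data argument perform the same expansion, so the shorthand only makes sense when the dataset contains just the target and the intended inputs.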


Hosted by Togaware, a pioneer of free and open source software since 1984.
Copyright © 2000-2020 Togaware Pty Ltd. Creative Commons ShareAlike V4.