Data Science Desktop Survival Guide
by Graham Williams
Random Sample
20200317
A common task is to randomly sample rows from a dataset. The dplyr::sample_frac() function will randomly choose a specified fraction (e.g. 20%) of the rows of the dataset:
ds %>% sample_frac(0.2)
Because the rows are chosen at random, sampling the dataset again will generally return a different set of rows:
ds %>% sample_frac(0.2)
To ensure the same random sample each time, set the seed of the random number generator with base::set.seed() before sampling:
set.seed(72346)
ds %>% sample_frac(0.2)
set.seed(72346)
ds %>% sample_frac(0.2)
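The effect of the seed can be checked directly. The following is a minimal sketch, assuming the built-in iris data frame stands in for ds and that the dplyr package is installed. The names first, second and third are illustrative only.

```r
library(dplyr)

# Assumption: the built-in iris data frame (150 rows) stands in for ds.
ds <- iris

# Two samples, each drawn immediately after setting the same seed.
set.seed(72346)
first <- ds %>% sample_frac(0.2)

set.seed(72346)
second <- ds %>% sample_frac(0.2)

# A third sample drawn without resetting the seed.
third <- ds %>% sample_frac(0.2)

# Number of rows sampled: 20% of the rows of ds.
nrow(first)

# The same seed gives the same rows in the same order.
identical(first, second)

# Compare against the sample drawn without resetting the seed.
identical(first, third)
```

The seed must be set immediately before each call that is to be reproduced, since every random draw advances the state of the generator. In dplyr 1.0.0 and later, sample_frac() is superseded by slice_sample(), so ds %>% slice_sample(prop = 0.2) is the equivalent call in newer code.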