12.1 Analysis of Variance (ANOVA)

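library(dplyr)

# ds is assumed to hold a grouping variable x and a numeric response y.
ds %>%
  group_by(x) %>%
  # Per-group size, mean, and within-group sum of squares.
  summarise(
    n    = n(),
    sm   = mean(y),
    rssq = sum((y - sm)^2)
  ) %>%
  # Combine the per-group summaries into the one-way ANOVA quantities.
  summarise(
    om   = sum(n * sm) / sum(n),       # overall mean
    ssq  = sum(n * (sm - om)^2),       # between-group sum of squares
    rssq = sum(rssq),                  # residual (within-group) sum of squares
    df1  = n_distinct(x) - 1,          # between-group degrees of freedom
    mse  = ssq / df1,                  # between-group mean square
    df2  = sum(n) - n_distinct(x),     # residual degrees of freedom
    rmse = rssq / df2,                 # residual mean square
    F    = mse / rmse,                 # F statistic
    P    = pf(F, df1, df2, lower.tail = FALSE)  # upper-tail p-value
  )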
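
The hand calculation above can be cross-checked against base R. The following is a minimal sketch, not part of the original listing: the toy data frame is hypothetical and made up purely for illustration, and `oneway.test()` with `var.equal = TRUE` is used here as one convenient way to obtain the classical one-way ANOVA F test for comparison.

```r
# Hypothetical toy data (an assumption for illustration only):
# three groups of three observations each.
ds <- data.frame(
  x = rep(c("a", "b", "c"), each = 3),
  y = c(1, 2, 3, 2, 3, 4, 5, 6, 7)
)

# Classical one-way ANOVA F test from base R (package stats).
# var.equal = TRUE requests the pooled-variance test rather than
# the Welch correction that oneway.test() applies by default.
res <- oneway.test(y ~ x, data = ds, var.equal = TRUE)

f_stat  <- unname(res$statistic)   # compare with F
dfs     <- unname(res$parameter)   # compare with df1 and df2
p_value <- res$p.value             # compare with P

print(c(F = f_stat, df1 = dfs[1], df2 = dfs[2], P = p_value))
```

Running the pipeline on the same `ds` should give matching values for `F`, `df1`, `df2`, and `P`. For this toy data the group means are 2, 3, and 6, so the between-group sum of squares is 26 and the residual sum of squares is 6, giving F = 13 on 2 and 6 degrees of freedom.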


Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.