The idea behind ANOVA

When you're testing a hypothesis with a categorical explanatory variable and a quantitative response variable, the tool you should use is Analysis of Variance, also called ANOVA.
The question we need to answer with the ANOVA F Test is, are the differences among the sample means due to true differences among the population means, or merely due to sampling variability?
In order to answer this question, using our data, we obviously need to look at the variation among the sample means. But that's not enough.
We also need to look at the variation among the sample means relative to the variation within the groups.
So the F statistic is the variation among the sample means divided by the variation within groups. This ratio measures the extent to which the differences among the sample group means dominate over the usual variation within the sample groups, which reflects the differences between individuals that are typical in random samples.
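Written out, the ratio looks like this. The notation is an assumption added here for illustration, not something defined in the post: k groups, n_i observations in group i, N observations in total, \bar{y}_i for the mean of group i, and \bar{y} for the overall mean.

```latex
F = \frac{\text{variation among sample means}}{\text{variation within groups}}
  = \frac{\displaystyle \sum_{i=1}^{k} n_i \left(\bar{y}_i - \bar{y}\right)^2 \Big/ (k - 1)}
         {\displaystyle \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_i\right)^2 \Big/ (N - k)}
```

The numerator grows when the group means are far apart, and the denominator grows when individuals within the same group differ a lot from each other.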

When the variation within the groups is large, the differences among the sample means can look negligible by comparison, and the data provide very little evidence against the null hypothesis. When the variation within the groups is small, the variation among the sample means dominates, and the data provide stronger evidence against the null hypothesis.
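To make this concrete, here is a minimal sketch in plain Python. The function name `f_statistic` and both toy data sets are made up for illustration. The two data sets have exactly the same group means (10, 15 and 20), and only the spread inside each group differs.

```python
def f_statistic(groups):
    """One-way ANOVA F: variation among group means over variation within groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation among the sample means (between-group sum of squares).
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation within the groups (within-group sum of squares).
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical data: same group means, small vs. large spread within groups.
tight = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]
spread = [[0, 10, 20], [5, 15, 25], [10, 20, 30]]

f_tight = f_statistic(tight)
f_spread = f_statistic(spread)
print(f_tight)   # 75.0
print(f_spread)  # 0.75
```

The differences among the means are identical in both cases, but F drops from 75 to 0.75 once the within-group variation becomes large.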
The P value of the ANOVA F Test is the probability of getting an F statistic as large as the one we got, or even larger, had the null hypothesis been true, that is, had the population means been equal. In other words, it tells us how surprising it is to find data like those observed, assuming that there is no difference among the population means.
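One way to get a feel for this definition is to simulate it. The sketch below does not use the F distribution that the ANOVA F Test itself relies on. Instead it swaps in a permutation approximation: if the population means really are equal, the group labels carry no information, so we shuffle the observations across groups many times and count how often the shuffled F is at least as large as the observed one. The function names, the toy data, the number of shuffles and the seed are all assumptions chosen for illustration.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F: variation among group means over variation within groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Approximate P(F >= observed F) when group labels are interchangeable."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]

    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Cut the shuffled values back into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_permutations


# Hypothetical data: same group means, small vs. large spread within groups.
tight = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]
spread = [[0, 10, 20], [5, 15, 25], [10, 20, 30]]

p_tight = permutation_p_value(tight)
p_spread = permutation_p_value(spread)
print(p_tight, p_spread)
```

For the tightly clustered data, almost no shuffle produces an F as large as the observed one, so the approximate P value is very small. For the widely spread data, many shuffles do, so the P value is large and the data give little evidence against equal population means.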
