Analysis of Variance
What is ANOVA
ANOVA stands for Analysis of Variance.
It is a statistical test for detecting differences in group means when there is one parametric dependent variable and one or more independent variables.
We are interested in determining whether differences exist between the population means.
Types of ANOVA
1-way and 2-way
1-way:
- 1 dependent variable and 1 independent variable
2-way:
- 1 dependent variable and 2 independent variables
Key terms
Null Hypothesis H0
- General statement that there is no relationship between two measured phenomena, or no association among groups. In ANOVA: all group means are equal.
Alternative Hypothesis HA
- Contrary to the null hypothesis, it states that something is happening, so a new theory is preferred over the old one. In ANOVA: at least one group mean differs from the others.
P Value
- The probability of obtaining the observed results, or more extreme ones, when the null hypothesis of a study question is true.
Alpha Value
- Criterion for determining whether a test statistic is statistically significant. The null hypothesis is rejected when the p-value is below alpha.
F-statistic
- Ratio of the variation between the group means to the variation within the groups.
Sum of Squares
- Sum of squared deviations from a mean; it measures the variation in the results, for example across different medical trials.
Mean
- Average of all the results, for example from medical trials.
How we can use ANOVA
ANOVA determines whether the groups created by the levels of the independent variable are statistically different by calculating whether the means of the different samples differ from the overall mean of the dependent variable.
If any group mean is significantly different from the overall mean, then the null hypothesis is rejected.
F-statistic
The value you get when you run the ANOVA test to find out whether the means of two or more populations are significantly different.
If the between-group variance is large relative to the within-group variance, the F-statistic will be large. When it exceeds the critical value, the group means are significantly different.
ANOVA formula
The ANOVA formula is made up of numerous parts. The best way to tackle an ANOVA test problem is to organize the formulae inside an ANOVA table. Below are the ANOVA formulae.
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F Value |
---|
Between Groups | SSB = Σ nj(X̄j − X̄)² | df1 = k − 1 | MSB = SSB / (k − 1) | F = MSB / MSE (also written F = MST / MSE) |
Error | SSE = ΣΣ(X − X̄j)² | df2 = N − k | MSE = SSE / (N − k) | |
Total | SST = SSB + SSE | df3 = N − 1 | | |
where,
- F = ANOVA coefficient
- MSB = Mean sum of squares between the groups (also written MST, for treatments)
- MSW = Mean sum of squares within the groups (the same quantity as MSE)
- MSE = Mean sum of squares due to error
- SST = Total sum of squares
- p = Total number of populations (written k in the table)
- n = Number of observations in a group (nj for group j)
- SSW = Sum of squares within the groups (the same quantity as SSE)
- SSB = Sum of squares between the groups
- SSE = Sum of squares due to error
- s = Standard deviation of the samples
- N = Total number of observations
- X̄j = Mean of group j
- X̄ = Overall mean of all observations
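The table translates almost line by line into code. The following is a minimal sketch in plain Python, not a reference implementation; the function name `one_way_anova` and the dictionary keys are illustrative choices, and the input is assumed to be a list of groups, each a non-empty list of numbers.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    k = len(groups)                              # number of groups
    N = sum(len(g) for g in groups)              # total number of observations
    grand_mean = sum(sum(g) for g in groups) / N
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: SSB = sum over groups of n_j * (mean_j - grand_mean)^2
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Error: SSE = sum over all observations of (x - mean of its own group)^2
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)

    df1, df2 = k - 1, N - k
    msb, mse = ssb / df1, sse / df2
    return {"SSB": ssb, "SSE": sse, "SST": ssb + sse,
            "df1": df1, "df2": df2, "MSB": msb, "MSE": mse, "F": msb / mse}


# Fertilizer data from Example 4 (one inner list per fertilizer)
table = one_way_anova([[6, 8, 4, 5, 3, 4],
                       [8, 12, 9, 11, 6, 8],
                       [13, 9, 11, 8, 7, 12]])
for name, value in table.items():
    print(name, round(value, 3))
```

For the fertilizer data this should reproduce SSB = 84 and SSE = 68 with 2 and 15 degrees of freedom.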
Worked Examples
Example 1: Three different kinds of food are tested on three groups of rats for 5 weeks. The objective is to check whether the mean weight (in grams) of the rats per week differs between the groups. Apply one-way ANOVA using a 0.05 significance level to the following data:
Food I | Food II | Food III |
---|
8 | 4 | 11 |
12 | 5 | 8 |
19 | 4 | 7 |
8 | 6 | 13 |
6 | 9 | 7 |
11 | 7 | 9 |
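Solution:
H0: μ1 = μ2 = μ3
H1: The means are not all equal
Since X̄1 = 64/6 = 10.67, X̄2 = 35/6 = 5.83, X̄3 = 55/6 = 9.17
Total mean = X̄ = 154/18 = 8.56
SSB = 6(10.67 − 8.56)² + 6(5.83 − 8.56)² + 6(9.17 − 8.56)² = 73.44 (using unrounded means)
SSE = 107.33 + 18.83 + 28.83 = 155
df1 = k − 1 = 2, df2 = N − k = 15
MSB = SSB/df1 = 73.44/2 = 36.72
MSE = SSE/df2 = 155/15 = 10.33
f = MSB/MSE = 36.72/10.33 = 3.55
The critical value is F(0.05, 2, 15) = 3.68. Since f < F, the null hypothesis cannot be rejected at the 0.05 level.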
Example 2: Calculate the ANOVA coefficient for the following data:
Plant | Number | Average span | s |
---|
Hibiscus | 5 | 12 | 2 |
Marigold | 5 | 16 | 1 |
Rose | 5 | 20 | 4 |
Solution:
Plant | n | x | s | s² |
---|
Hibiscus | 5 | 12 | 2 | 4 |
Marigold | 5 | 16 | 1 | 1 |
Rose | 5 | 20 | 4 | 16 |
p = 3
n = 5
N = 15
x̄ = 16
SST = Σn(x − x̄)², here the sum of squares between the groups (treatments)
SST = 5(12 − 16)² + 5(16 − 16)² + 5(20 − 16)² = 160
MST = SST/(p − 1) = 160/(3 − 1) = 80
SSE = Σ(n − 1)s² = 4(4) + 4(1) + 4(16) = 84
MSE = SSE/(N − p) = 84/12 = 7
F = MST/MSE = 80/7
F = 11.429
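When only the group sizes, means and standard deviations are available, as in Example 2, the same quantities can be computed without the raw data. A small sketch, assuming `s` is the sample standard deviation (n − 1 denominator); the function name `anova_from_summary` is made up for illustration.

```python
def anova_from_summary(ns, means, sds):
    """F statistic from per-group sizes, means and sample standard deviations."""
    k = len(ns)                      # number of groups (p in the example)
    N = sum(ns)                      # total number of observations
    grand_mean = sum(n * m for n, m in zip(ns, means)) / N

    # Between groups: sum of n * (group mean - grand mean)^2
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Error: each group contributes (n - 1) * s^2
    ss_error = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))

    ms_between = ss_between / (k - 1)
    ms_error = ss_error / (N - k)
    return ss_between, ss_error, ms_between, ms_error, ms_between / ms_error


# Hibiscus, Marigold, Rose
ss_b, ss_e, ms_b, ms_e, f = anova_from_summary([5, 5, 5], [12, 16, 20], [2, 1, 4])
print(ss_b, ss_e, ms_b, ms_e, round(f, 3))
```

The call on the plant data should give the sums of squares 160 and 84 and F ≈ 11.429, matching the hand calculation.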
Example 3: The following data show the number of worms isolated from the GI tracts of four groups of muskrats in a carbon tetrachloride anthelmintic study. The data contain a single grouping factor, so conduct a one-way ANOVA test.
I | II | III | IV |
---|
338 | 412 | 124 | 389 |
324 | 387 | 353 | 432 |
268 | 400 | 469 | 255 |
147 | 233 | 222 | 133 |
309 | 212 | 111 | 265 |
Solution:
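Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square |
---|
Between the groups | 14295.35 | 3 | 4765.12 |
Within the groups | 212865.2 | 16 | 13304.08 |
Total | 227160.55 | 19 | |
Since F = MSB / MSE
= 4765.12 / 13304.08 = 0.36
Taking α = 0.05 as in the other examples, the critical value is F(0.05, 3, 16) = 3.24. Since 0.36 < 3.24, the null hypothesis of equal group means is not rejected.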
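With values this large, the sums of squares are easier to obtain from column totals than from individual deviations. The sketch below recomputes the between-group and within-group entries directly from the four columns of data, using the usual shortcut form (sum of Tj²/nj minus T²/N); the variable names are illustrative.

```python
# Worm counts for groups I to IV, read column by column from the data table
groups = [
    [338, 324, 268, 147, 309],   # I
    [412, 387, 400, 233, 212],   # II
    [124, 353, 469, 222, 111],   # III
    [389, 432, 255, 133, 265],   # IV
]

k = len(groups)
N = sum(len(g) for g in groups)

grand_total = sum(sum(g) for g in groups)                  # T
correction = grand_total ** 2 / N                          # T^2 / N
totals_term = sum(sum(g) ** 2 / len(g) for g in groups)    # sum of T_j^2 / n_j
raw_squares = sum(x * x for g in groups for x in g)        # sum of x^2

ss_between = totals_term - correction
ss_within = raw_squares - totals_term
ss_total = raw_squares - correction

ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
f_stat = ms_between / ms_within

print("Between:", round(ss_between, 2), k - 1, round(ms_between, 2))
print("Within: ", round(ss_within, 2), N - k, round(ms_within, 2))
print("Total:  ", round(ss_total, 2), N - 1)
print("F =", round(f_stat, 3))
```

The shortcut form is algebraically the same as summing squared deviations, so it should agree with the table formulas up to floating-point rounding.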
Example 4: Three types of fertilizers are used on three groups of plants for 5 weeks. We want to check if there is a difference in the mean growth of each group. Using the data given below, apply a one-way ANOVA test at the 0.05 significance level.
Fertilizer 1 | Fertilizer 2 | Fertilizer 3 |
---|
6 | 8 | 13 |
8 | 12 | 9 |
4 | 9 | 11 |
5 | 11 | 8 |
3 | 6 | 7 |
4 | 8 | 12 |
Solution:
H0: μ1 = μ2 = μ3
H1: The means are not equal
Fertilizer 1 | Fertilizer 2 | Fertilizer 3 |
---|
6 | 8 | 13 |
8 | 12 | 9 |
4 | 9 | 11 |
5 | 11 | 8 |
3 | 6 | 7 |
4 | 8 | 12 |
X̄1 = 5 | X̄2 = 9 | X̄3 = 10 |
Total mean, X̄ = 8
n1 = n2 = n3 = 6, k = 3
SSB = 6(5 - 8)² + 6(9 - 8)² + 6(10 - 8)²
= 84
df1 = k - 1 = 2
Fertilizer 1 | (X - 5)² | Fertilizer 2 | (X - 9)² | Fertilizer 3 | (X - 10)² |
---|
6 | 1 | 8 | 1 | 13 | 9 |
8 | 9 | 12 | 9 | 9 | 1 |
4 | 1 | 9 | 0 | 11 | 1 |
5 | 0 | 11 | 4 | 8 | 4 |
3 | 4 | 6 | 9 | 7 | 9 |
4 | 1 | 8 | 1 | 12 | 4 |
X̄1 = 5 | Total = 16 | X̄2 = 9 | Total = 24 | X̄3 = 10 | Total = 28 |
SSE = 16 + 24 + 28 = 68
N = 18
df2 = N - k = 18 - 3 = 15
MSB = SSB / df1 = 84 / 2 = 42
MSE = SSE / df2 = 68 / 15 = 4.533
ANOVA test statistic, f = MSB / MSE = 42 / 4.533 = 9.26
Using the F table at α = 0.05, the critical value is F(0.05, 2, 15) = 3.68
As f > F, the null hypothesis is rejected, and it can be concluded that there is a difference in the mean growth of the plants.
Answer: Reject the null hypothesis
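The table lookup can also be replaced by a direct calculation. The sketch below assumes the numerator degrees of freedom are exactly 2, as in any three-group comparison; in that special case the upper tail of the F distribution has the closed form P(F ≥ f) = (1 + 2f/df2)^(−df2/2). For other values of df1 this shortcut does not apply and a statistics library would be needed. Both function names are illustrative.

```python
def f_pvalue_df1_2(f, df2):
    """Upper-tail probability P(F >= f) for an F(2, df2) distribution."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Critical value with upper-tail probability alpha for F(2, df2)."""
    # Obtained by solving (1 + 2f/df2)^(-df2/2) = alpha for f
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


# Example 4: MSB = 42, MSE = 68 / 15, df2 = 15
f_value = 42 / (68 / 15)
p_value = f_pvalue_df1_2(f_value, 15)
critical = f_critical_df1_2(0.05, 15)

print("f =", round(f_value, 2))
print("critical value =", round(critical, 2))
print("p-value =", round(p_value, 4))
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

Comparing the p-value with alpha and comparing f with the critical value are two views of the same decision, so both should lead to rejecting the null hypothesis here.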