Monday, 8 April 2024

ANOVA - Primer

Analysis of  Variance 


What is ANOVA

ANOVA stands for Analysis of Variance.

It is a statistical test for detecting differences in group means when there is one parametric dependent variable and one or more independent variables.

We are interested in determining whether differences exist between the population means.


Types of ANOVA

1-way and 2-way
1-way:
 - 1 dependent and 1 independent variable
2-way:
 - 1 dependent and 2 independent variables


Key terms

Null Hypothesis H0
 - General statement that there is no relationship between two measured phenomena, or no difference among groups.
Alternative Hypothesis HA
 - Contrary to the null hypothesis, it states that something is happening (an effect or difference exists), so a new theory is preferred over the old one.
P Value
  - The probability of finding the observed, or more extreme, results when the null hypothesis of a study question is true.
Alpha Value
  - Criterion for determining whether a test statistic is statistically significant: the threshold the P value is compared against (0.05 in the examples below).
F statistic
  - Ratio of the variation between the group means to the variation within the groups; it measures the extent of difference between the means of different trials.
Sum of Squares
  - Sum of the squared deviations from the mean, for example across different medical trials.
Mean
  - Average of all the results from evidence such as medical trials.
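
To make the P value definition concrete, here is a rough sketch that estimates it by a permutation test. This is a different technique from the F-table lookup used in the worked examples: the group labels are shuffled many times, and the P value is the share of shuffles whose F statistic is at least as large as the observed one. The function names, the seed and the number of permutations are illustrative assumptions; the data are the fertilizer measurements from Example 4.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (sse / (n_total - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Estimate P(F >= observed F) when group labels carry no information."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself.
    return (hits + 1) / (n_permutations + 1)


fertilizer = [[6, 8, 4, 5, 3, 4], [8, 12, 9, 11, 6, 8], [13, 9, 11, 8, 7, 12]]
p_value = permutation_p_value(fertilizer)
print("reject H0 at alpha = 0.05:", p_value < 0.05)
```

A small P value means that shuffled (null) data rarely produce an F as large as the observed one, which is the situation in which the null hypothesis is rejected.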


How we can use ANOVA

ANOVA determines whether the groups created by the levels of the independent variable are statistically different by calculating whether the means of the different samples differ from the overall mean of the dependent variable.

If any of the group means is significantly different from the overall mean, then the null hypothesis is rejected.

F Statistic

The value you get when you run the ANOVA test to find out whether the means of two or more populations are significantly different.

If the between-group variance is large relative to the within-group variance, the F statistic will be large. When it exceeds the critical value, the group means are significantly different.

ANOVA formula

The ANOVA calculation is made up of several parts. The best way to tackle an ANOVA test problem is to organize the formulae inside an ANOVA table. Below are the ANOVA formulae.

Source of Variation   Sum of Squares            Degrees of Freedom   Mean Squares          F Value
Between Groups        SSB = Σ nj (X̄j - X̄)²      df1 = k - 1          MSB = SSB / (k - 1)   f = MSB / MSE (or, F = MST / MSE)
Error                 SSE = Σ Σ (X - X̄j)²       df2 = N - k          MSE = SSE / (N - k)
Total                 SST = SSB + SSE           df3 = N - 1

where,

  • F = ANOVA coefficient (the F statistic)
  • MSB = Mean sum of squares between the groups (also written MST)
  • MSW = Mean sum of squares within the groups (the same quantity as MSE)
  • MSE = Mean sum of squares due to error
  • SST = Total sum of squares
  • p = Total number of populations (written k in the table)
  • n = The total number of samples in a population (nj for group j)
  • SSW = Sum of squares within the groups (the same quantity as SSE)
  • SSB = Sum of squares between the groups
  • SSE = Sum of squares due to error
  • s = Standard deviation of the samples
  • N = Total number of observations
  • X̄j = Mean of group j
  • X̄ = Overall mean of all observations
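
The ANOVA table can also be filled in programmatically. The following is a minimal sketch, using only the Python standard language features, of how the quantities defined above could be computed from raw group data. The function name `one_way_anova` and the dictionary keys are illustrative choices, not part of any library.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table quantities for a list of groups."""
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # N, total observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]     # group means

    # Between-group sum of squares: nj * (group mean - overall mean)^2
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Error sum of squares: (observation - its group mean)^2
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df1, df2 = k - 1, n_total - k
    msb, mse = ssb / df1, sse / df2
    return {
        "SSB": ssb, "SSE": sse, "SST": ssb + sse,
        "df1": df1, "df2": df2, "df3": n_total - 1,
        "MSB": msb, "MSE": mse, "F": msb / mse,
    }


# Fertilizer data from Example 4
table = one_way_anova([[6, 8, 4, 5, 3, 4],
                       [8, 12, 9, 11, 6, 8],
                       [13, 9, 11, 8, 7, 12]])
for name, value in table.items():
    print(f"{name}: {value:.2f}")
```

The resulting F still has to be compared with the critical value from an F table for (df1, df2) at the chosen alpha.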


Worked Examples


Example 1: Three different kinds of food are tested on three groups of rats for 5 weeks. The objective is to check the difference in mean weight (in grams) of the rats per week. Apply one-way ANOVA using a 0.05 significance level to the following data:

Food I   Food II   Food III
8        4         11
12       5         8
19       4         7
8        6         13
6        9         7
11       7         9

Solution:

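H0: μ1 = μ2 = μ3

H1: The means are not equal

The column sums are 64, 35 and 55, so X̄1 = 64/6 ≈ 10.67, X̄2 = 35/6 ≈ 5.83, X̄3 = 55/6 ≈ 9.17

Total mean = X̄ = 154/18 ≈ 8.56

SSB = 6(10.67 - 8.56)² + 6(5.83 - 8.56)² + 6(9.17 - 8.56)² ≈ 73.44

SSE = 107.33 + 18.83 + 28.83 = 155

MSB = SSB/df1 = 73.44/2 ≈ 36.72

MSE = SSE/df2 = 155/15 ≈ 10.33

f = MSB/MSE = 36.72/10.33 ≈ 3.55

The critical value is F(0.05, 2, 15) = 3.68. Since f < F, the null hypothesis is not rejected at the 0.05 level.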

Example 2: Calculate the ANOVA coefficient for the following data:

Plant      Number   Average span   s
Hibiscus   5        12             2
Marigold   5        16             1
Rose       5        20             4

Solution:

Plant      n   x    s   s²
Hibiscus   5   12   2   4
Marigold   5   16   1   1
Rose       5   20   4   16

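p = 3
n = 5
N = 15
x̄ = 16
SST = Σn(x - x̄)²

Here SST denotes the sum of squares for treatments (between the groups), not the total sum of squares.

SST = 5(12 - 16)² + 5(16 - 16)² + 5(20 - 16)² = 160

MST = SST/(p - 1) = 160/(3 - 1) = 80

SSE = Σ(n - 1)s² = 4(4) + 4(1) + 4(16) = 84

MSE = SSE/(N - p) = 84/12 = 7

F = MST/MSE = 80/7

F = 11.429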
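
When only the group sizes, means and standard deviations are known, as in this example, the same calculation can be sketched in a few lines of Python. The function name `f_from_summary` is a hypothetical helper introduced here for illustration.

```python
def f_from_summary(sizes, means, sds):
    """F statistic from per-group sample size, mean and standard deviation."""
    p = len(sizes)                                   # number of populations
    n_total = sum(sizes)                             # N
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    # Treatment (between-group) sum of squares
    sst = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    # Error sum of squares from the sample variances s^2
    sse = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
    mst = sst / (p - 1)
    mse = sse / (n_total - p)
    return mst / mse


# Hibiscus, Marigold, Rose
f_value = f_from_summary(sizes=[5, 5, 5], means=[12, 16, 20], sds=[2, 1, 4])
print(round(f_value, 3))
```

The weighted grand mean in the sketch also covers unequal group sizes, which the hand calculation above does not need because every group has n = 5.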

 

Example 3: The following data show the number of worms isolated from the GI tracts of four groups of muskrats in a carbon tetrachloride anthelmintic study. Conduct a one-way ANOVA test.

I     II    III   IV
338   412   124   389
324   387   353   432
268   400   469   255
147   233   222   133
309   212   111   265

Solution:

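The data contain a single factor (the group), so the variation splits into a between-group part and a within-group part. The group means are 277.2, 328.8, 255.8 and 294.8, and the overall mean is 289.15.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square
Between the groups    14295.35         3                    4765.12
Within the groups     212865.20        16                   13304.08
Total                 227160.55        19

Since F = MST / MSE

           = 4765.12 / 13304.08 = 0.36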

Example 4: Three types of fertilizers are used on three groups of plants for 5 weeks. We want to check if there is a difference in the mean growth of each group. Using the data given below, apply a one-way ANOVA test at the 0.05 significance level.

Fertilizer 1   Fertilizer 2   Fertilizer 3
6              8              13
8              12             9
4              9              11
5              11             8
3              6              7
4              8              12

Solution:

H0: μ1 = μ2 = μ3

H1: The means are not equal

The group means are X̄1 = 5, X̄2 = 9, X̄3 = 10

Total mean, X̄ = 8

n1 = n2 = n3 = 6, k = 3

SSB = 6(5 - 8)² + 6(9 - 8)² + 6(10 - 8)²

= 84

df1 = k - 1 = 2

Fertilizer 1   (X - 5)²     Fertilizer 2   (X - 9)²     Fertilizer 3   (X - 10)²
6              1            8              1            13             9
8              9            12             9            9              1
4              1            9              0            11             1
5              0            11             4            8              4
3              4            6              9            7              9
4              1            8              1            12             4
X̄1 = 5         Total = 16   X̄2 = 9         Total = 24   X̄3 = 10        Total = 28

SSE = 16 + 24 + 28 = 68

N = 18

df2 = N - k = 18 - 3 = 15

MSB = SSB / df1 = 84 / 2 = 42

MSE = SSE / df2 = 68 / 15 = 4.53

ANOVA test statistic, f = MSB / MSE = 42 / (68 / 15) = 9.26

Using the f table at α = 0.05, the critical value is given as F(0.05, 2, 15) = 3.68

As f > F, the null hypothesis is rejected and it can be concluded that there is a difference in the mean growth of the plants.

Answer: Reject the null hypothesis
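
Because all three groups in this example have the same size, the result can be cross-checked with a shortcut: MSB equals n times the sample variance of the group means, and MSE equals the average of the group sample variances. The sketch below uses Python's standard `statistics` module; `balanced_f_statistic` is a hypothetical helper name, and the shortcut is only valid for equal group sizes.

```python
from statistics import mean, variance


def balanced_f_statistic(groups):
    """F statistic for equal-sized groups via variances of means and groups."""
    sizes = {len(g) for g in groups}
    if len(sizes) != 1:
        raise ValueError("shortcut is only valid when all groups have the same size")
    n = sizes.pop()
    group_means = [mean(g) for g in groups]
    msb = n * variance(group_means)              # between-group mean square
    mse = mean([variance(g) for g in groups])    # pooled within-group mean square
    return msb / mse


fertilizer = [[6, 8, 4, 5, 3, 4], [8, 12, 9, 11, 6, 8], [13, 9, 11, 8, 7, 12]]
f_value = balanced_f_statistic(fertilizer)
print(f"f = {f_value:.2f}, exceeds 3.68: {f_value > 3.68}")
```

Here the variance of the means (5, 9, 10) is 7, so MSB = 6 × 7 = 42, and the group variances 3.2, 4.8 and 5.6 average to 4.53, matching the table-based calculation.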

 
