Analysis of Variance Lecture 46

 

Introduction to ANOVA


Analysis of variance (abbreviated ANOVA) is a statistical technique used to compare k (k > 2) independent population means, or research groups (technically called treatments), simultaneously. The k population means are compared through their variability: the total variation in the response variable Yij is partitioned into component parts, each associated with a different source of variation. The F statistic compares the estimates of these components of variance so that hypotheses about the equality of the k population means can be tested:

F = (mean square between samples) / (mean square within samples)
Under the null hypothesis, the F statistic follows an F distribution with k − 1 and n − k degrees of freedom, where n is the total number of observations.

A large value of F indicates that at least one population mean differs significantly from the other population means.

Assumptions on ANOVA

The ANOVA technique rests on certain assumptions that should be checked before it is applied. These assumptions are given below:

The samples are selected randomly and independently.

The sampled populations are normally distributed.

All sampled populations have identical variances.

The effects are additive, that is:

Yij = μ + τi + ρj + ϵij

Where:

Yij is the yield (the response).

μ: Overall population mean.

τi: Criterion 1 effect.

ρj: Criterion 2 effect.

ϵij: Random error.
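The additivity assumption can be read as each observation being an overall mean plus a criterion 1 effect plus a criterion 2 effect plus a random error. The sketch below generates data of that form; the function name `simulate_two_way` and all numeric values are hypothetical and chosen only for illustration.

```python
import random


def simulate_two_way(mu, tau, rho, sigma, seed=0):
    """Generate one observation per (i, j) cell as mu + tau_i + rho_j + error."""
    rng = random.Random(seed)
    return [[mu + t + r + rng.gauss(0.0, sigma) for r in rho] for t in tau]


# Hypothetical values, not taken from the lecture.
mu = 20.0
tau = [-2.0, 0.0, 2.0]   # criterion 1 effects
rho = [-1.0, 1.0]        # criterion 2 effects

# With the error switched off, the additive structure is visible directly:
# moving from one criterion 1 level to the next shifts every column equally.
noise_free = simulate_two_way(mu, tau, rho, sigma=0.0)
for row in noise_free:
    print(row)
```

With `sigma` set to a positive value the same structure is present but blurred by the random error term.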

Types of ANOVA

There are two types of ANOVA, depending on the number of sources of variation (classification criteria).

One-way ANOVA

The one-way ANOVA is used when the data are classified into k groups on the basis of a single criterion.

Two-way ANOVA

The two-way ANOVA is used when the data are cross-classified on the basis of two criteria.


One-way ANOVA

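The one-way ANOVA is represented by the following linear statistical model:

Yij = μ + τj + ϵij,  i = 1, 2, ..., nj;  j = 1, 2, ..., k

where τj is the effect of the jth treatment and ϵij is the random error.

Let k independent random samples of sizes n₁, n₂, ..., nₖ be selected at random from k normal populations having means μ₁, μ₂, ..., μₖ and a common standard deviation σ.
Suppose it is desired to test the following null hypothesis.
H₀: μ₁ = μ₂ = ... = μₖ
Let Yij be the ith observation of the jth sample or treatment; then the data can be arranged as:

            Sample 1    Sample 2    ...    Sample k
            Y11         Y12         ...    Y1k
            Y21         Y22         ...    Y2k
            ...         ...         ...    ...
Total       T.1         T.2         ...    T.k

Here T.j is the total of the jth sample and T.. = T.1 + T.2 + ... + T.k is the grand total.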
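The total variation in the response variable is called the total sum of squares (TSS). It is partitioned into two components: the sum of squares between samples, or sum of squares for treatments (SST), and the sum of squares within samples, or sum of squares for error (SSE).
That is:

TSS = SST + SSE

These can be calculated as:

Correction factor: CF = T..² / n
TSS = Σj Σi Yij² − CF
SST = (Σj T.j²) / r − CF
SSE = TSS − SST

where T.j is the total of the jth sample, T.. is the grand total, n is the total number of observations and r is the number of observations (rows) in each sample.

When the numbers of rows are unequal, SST can be calculated as:

SST = Σj (T.j² / nj) − CF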
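The partition of the total sum of squares can be checked numerically. The sketch below uses the deviation (definitional) form of each sum of squares rather than the shortcut totals; the function name `sum_of_squares` and the data are hypothetical, with deliberately unequal sample sizes.

```python
def sum_of_squares(groups):
    """Return (total, between, within) sums of squares for a list of samples."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)

    # Total: squared deviations of every observation from the grand mean.
    total = sum((y - grand_mean) ** 2 for y in all_values)
    # Between: squared deviations of each sample mean from the grand mean,
    # weighted by the sample size.
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within: squared deviations of each observation from its own sample mean.
    within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return total, between, within


# Hypothetical data with unequal sample sizes.
samples = [[4, 6, 8], [5, 7], [10, 12, 14, 16]]
total, between, within = sum_of_squares(samples)
print(total, between + within)  # the two values agree up to rounding error
```

Because the within component is computed independently here, the agreement of `total` with `between + within` is a genuine check rather than a consequence of subtraction.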

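Analysis of variance table:

Source of variation             d.f.     Sum of squares    Mean square                 F
Between samples (treatments)    k − 1    SS treatments     SS treatments / (k − 1)     MS treatments / MS error
Within samples (error)          n − k    SS error          SS error / (n − k)
Total                           n − 1    SS total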
Testing Procedure:
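i. State null and alternative hypotheses
H₀: μ₁ = μ₂ = ... = μₖ vs. H₁: μ₁ ≠ μ₂ ≠ ... ≠ μₖ (that is, at least two of the means differ)

ii. The significance level: α
iii. The test statistic:
F = (mean square between samples) / (mean square within samples), which follows the F distribution with (k − 1, n − k) degrees of freedom under H₀.
iv. Critical Region:
Reject H₀ when F ≥ Fα(k − 1, n − k)
v. Computation
vi. Remarks.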
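Steps iii to vi of the procedure can be sketched as a small helper that turns the two sums of squares into mean squares, the F ratio and a decision. The function name `anova_table` and the input values are hypothetical; the critical value is assumed to be read from an F table rather than computed.

```python
def anova_table(ss_treat, ss_error, k, n, f_critical):
    """Build the one-way ANOVA table entries and the test decision.

    ss_treat, ss_error: sums of squares between and within samples
    k: number of samples, n: total number of observations
    f_critical: tabulated value F_alpha(k - 1, n - k)
    """
    df_treat, df_error = k - 1, n - k
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    f_value = ms_treat / ms_error
    return {
        "df": (df_treat, df_error),
        "MS": (ms_treat, ms_error),
        "F": f_value,
        "reject_H0": f_value >= f_critical,
    }


# Hypothetical inputs: k = 3 samples, n = 9 observations in total,
# and the tabulated 5% critical value F(2, 6) = 5.14.
result = anova_table(ss_treat=114.0, ss_error=6.0, k=3, n=9, f_critical=5.14)
print(result)
```

Keeping the critical value as an argument mirrors the hand procedure, where it is looked up in a table for the chosen α and degrees of freedom.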

Example 12.1: A large-scale agricultural farm wants to know which of three different fertilisers maximises the crop yield. Each fertiliser is applied to four distinct fields, and the total yield is measured at the end of the growing season.

Fertiliser A    Fertiliser B    Fertiliser C
23              18              16
26              28              25
20              17              12
17              21              14

Test, at the 5% significance level, whether the three fertilisers differ in mean yield.

Solution:

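i. State null and alternative hypotheses
H₀: μ₁ = μ₂ = μ₃ vs. H₁: μ₁ ≠ μ₂ ≠ μ₃ (that is, at least two of the means differ)

ii. The significance level: α = 0.05

iii. The test statistic:
F = (mean square between samples) / (mean square within samples)
iv. Critical Region:
With k − 1 = 3 − 1 = 2 and n − k = 12 − 3 = 9 degrees of freedom, reject H₀ when F ≥ F₀.₀₅(2, 9) = 4.26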
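v. Computation

Sample totals: T.1 = 86, T.2 = 84, T.3 = 67; grand total T.. = 237; n = 12
Correction factor: CF = T..² / n = (237)² / 12 = 4680.75
Total SS = ΣΣ Yij² − CF = 4953 − 4680.75 = 272.25
Treatment SS = (86² + 84² + 67²) / 4 − CF = 4735.25 − 4680.75 = 54.50
Error SS = Total SS − Treatment SS = 272.25 − 54.50 = 217.75

ANOVA table

Source of variation     d.f.    SS        MS       F
Between fertilisers     2       54.50     27.25    1.13
Within fertilisers      9       217.75    24.19
Total                   11      272.25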
vi. Remarks: As the calculated value F = 1.13 is less than the critical value 4.26, it falls in the acceptance region and we do not have sufficient evidence to reject H₀. We therefore conclude that there is no significant difference in mean yield among the three fertilisers.
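The arithmetic of Example 12.1 can be reproduced with the correction-factor formulas for equal sample sizes. This is a minimal sketch that assumes the data table is read row by row, so that each fertiliser has the four yields listed in the dictionary below; the variable names are illustrative.

```python
# Yields per fertiliser, assuming the table in Example 12.1 is read row-wise.
fertiliser = {
    "A": [23, 26, 20, 17],
    "B": [18, 28, 17, 21],
    "C": [16, 25, 12, 14],
}

groups = list(fertiliser.values())
k = len(groups)          # number of treatments
r = len(groups[0])       # observations per treatment (equal sizes)
n = k * r                # total number of observations

grand_total = sum(sum(g) for g in groups)
cf = grand_total ** 2 / n                                  # correction factor

ss_total = sum(y * y for g in groups for y in g) - cf      # total SS
ss_treat = sum(sum(g) ** 2 for g in groups) / r - cf       # treatment SS
ss_error = ss_total - ss_treat                             # error SS

f_value = (ss_treat / (k - 1)) / (ss_error / (n - k))
f_critical = 4.26        # tabulated F(0.05; 2, 9), as used in the example

print(f"Total SS = {ss_total:.2f}, Treatment SS = {ss_treat:.2f}, "
      f"Error SS = {ss_error:.2f}, F = {f_value:.2f}")
print("Reject H0" if f_value >= f_critical else "Do not reject H0")
```

The computed F is well below the critical value, in line with the remark that the sample gives no grounds to reject H₀.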

Example 12.2: An educationist is interested in knowing which teaching method improves students' scores. The scores of 13 students taught by three different teaching methods are arranged in the following table:

Teaching Method 1    Teaching Method 2    Teaching Method 3
47                   55                   41
53                   46                   56
49                   52                   41
60                   47                   45
                     56

Do the teaching methods differ significantly at the 5% level of significance?

Solution:

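i. State null and alternative hypotheses
H₀: μ₁ = μ₂ = μ₃ vs. H₁: μ₁ ≠ μ₂ ≠ μ₃ (that is, at least two of the means differ)

ii. The significance level: α = 0.05

iii. The test statistic:
F = (mean square between samples) / (mean square within samples)
iv. Critical Region:
With k − 1 = 3 − 1 = 2 and n − k = 13 − 3 = 10 degrees of freedom, reject H₀ when F ≥ F₀.₀₅(2, 10) = 4.10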

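v. Computation:

Sample totals: T.1 = 209 (n₁ = 4), T.2 = 256 (n₂ = 5), T.3 = 183 (n₃ = 4); grand total T.. = 648; n = 13
Correction factor: CF = T..² / n = (648)² / 13 = 32300.31
Total SS = ΣΣ Yij² − CF = 32732 − 32300.31 = 431.69
Treatment SS = 209²/4 + 256²/5 + 183²/4 − CF = 32399.70 − 32300.31 = 99.39
Error SS = Total SS − Treatment SS = 431.69 − 99.39 = 332.30

ANOVA Table

Source of variation    d.f.    SS        MS       F
Between methods        2       99.39     49.70    1.50
Within methods         10      332.30    33.23
Total                  12      431.69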
vi. Remarks: The calculated value of F falls in the acceptance region, so the sample data do not provide sufficient evidence to reject the null hypothesis. Thus, it is concluded that the three teaching methods are not significantly different.
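Example 12.2 has unequal sample sizes, so it is a useful case for a cross-check by a different route: the treatment sum of squares from the sample means and the error sum of squares from the sample variances. The sketch below assumes the score table is read row by row (4, 5 and 4 students per method) and uses only Python's standard `statistics` module.

```python
from statistics import mean, variance

# Scores per teaching method, assuming the table in Example 12.2 is read row-wise.
methods = [
    [47, 53, 49, 60],        # Teaching Method 1
    [55, 46, 52, 47, 56],    # Teaching Method 2
    [41, 56, 41, 45],        # Teaching Method 3
]

k = len(methods)
n = sum(len(g) for g in methods)
grand_mean = sum(sum(g) for g in methods) / n

# Treatment SS: each sample mean's squared deviation from the grand mean,
# weighted by that sample's own size (this handles unequal sizes).
ss_treat = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in methods)

# Error SS: (n_j - 1) times the sample variance of each method, summed.
ss_error = sum((len(g) - 1) * variance(g) for g in methods)

f_value = (ss_treat / (k - 1)) / (ss_error / (n - k))
f_critical = 4.10            # tabulated F(0.05; 2, 10), as used in the example

print(f"Treatment SS = {ss_treat:.2f}, Error SS = {ss_error:.2f}, F = {f_value:.2f}")
print("Reject H0" if f_value >= f_critical else "Do not reject H0")
```

The error mean square obtained this way is the pooled sample variance, which is why the equal-variance assumption matters for the F test.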


