Yates Continuity Correction & Fisher Exact Test Lecture 39

 

Yates Correction for Continuity

Yates correction is a statistical technique for improving the accuracy of the chi-square test of independence between two classification variables presented in a contingency table. In the usual chi-square approximation, cells with small frequencies (less than 5) are pooled with adjacent larger cells, which reduces the degrees of freedom of the chi-square statistic. In a 2 x 2 contingency table, however, a small cell cannot be pooled with a larger one, because pooling would leave zero degrees of freedom, for which no chi-square table value exists.

To handle this situation, Frank Yates proposed a continuity correction for the 2 x 2 table that improves the chi-square approximation: 0.5 is subtracted from the absolute difference between each observed frequency O and its expected frequency E before squaring.

χ² = Σ (|O − E| − 0.5)² / E

This modification of the chi-square statistic is known as Yates continuity correction and is applicable when there is a single degree of freedom.

For a 2 x 2 contingency table laid out as follows, the Yates-corrected statistic is computed from the cell frequencies a, b, c, d and the total n:

 

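|       | B1  | B2  | Total |
|-------|-----|-----|-------|
| A1    | a   | b   | a+b   |
| A2    | c   | d   | c+d   |
| Total | a+c | b+d | n     |

χ² = n(|ad − bc| − n/2)² / [(a+b)(c+d)(a+c)(b+d)]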
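As a minimal sketch, the Yates-corrected statistic for a 2 x 2 table with cells a, b, c, d can be computed in Python using the standard closed form n(|ad − bc| − n/2)² / [(a+b)(c+d)(a+c)(b+d)]. The function names and the sample counts are illustrative assumptions, not part of the lecture. Clamping the corrected difference at zero is a further assumption, added so that the correction never overshoots.

```python
def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Continuity correction: shrink |ad - bc| by n/2 before squaring.
    # Clamping at zero (an assumption here) keeps the correction from overshooting.
    corrected_diff = max(abs(a * d - b * c) - n / 2, 0.0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected_diff ** 2 / denominator


def pearson_chi_square(a, b, c, d):
    """Uncorrected chi-square, shown only for comparison."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts chosen only to illustrate the call.
chi2_yates = yates_chi_square(12, 3, 5, 10)
chi2_plain = pearson_chi_square(12, 3, 5, 10)
print(f"corrected = {chi2_yates:.4f}, uncorrected = {chi2_plain:.4f}")
```

The corrected value is never larger than the uncorrected one, which is why the correction makes the test more conservative. Both functions assume that no marginal total is zero.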


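Example 9.20: A study examined the relationship between blood group and disease severity. The results are displayed in the following 2 x 2 contingency table:

| Blood Group | Severity of Disease: Normal | Severity of Disease: Severe |
|-------------|-----------------------------|-----------------------------|
| A (+)       | 50                          | 4                           |
| A (-)       | 36                          | 10                          |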

Is there an association between blood group and the severity of the disease? Would you recommend applying Yates continuity correction?

Solution: One cell frequency (4) is less than 5, so Yates continuity correction is recommended.

i. The null and alternative hypotheses may be stated as:

H0: The blood group and the severity of the disease are not associated.

Vs.

H1: The blood group and the severity of the disease are associated.

ii. The significance level: α = 0.05

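iii. The test statistic:

χ² = n(|ad − bc| − n/2)² / [(a+b)(c+d)(a+c)(b+d)]

iv. Critical Region:

Reject H₀ when χ² ≥ χ²₀.₀₅(1) = 3.841

v. Computation:

With a = 50, b = 4, c = 36, d = 10 and n = 100:

χ² = 100(|50 x 10 − 4 x 36| − 50)² / (54 x 46 x 86 x 14)
χ² = 100(306)² / 2,990,736
χ² = 9,363,600 / 2,990,736
χ² = 3.131

vi. Remarks: The calculated chi-square value (3.131) is smaller than the critical value (3.841), so it falls in the acceptance region; the sample data do not provide sufficient evidence to reject the null hypothesis. Thus, there is no significant evidence that the blood group and the severity of the disease are associated.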
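The arithmetic of Example 9.20 can be checked with a short Python sketch. It uses the equivalent expected-frequency form Σ(|O − E| − 0.5)²/E rather than the closed form; the variable names are illustrative, and 3.841 is the tabulated value of χ²₀.₀₅(1).

```python
# Observed frequencies from Example 9.20: rows are A (+) and A (-),
# columns are Normal and Severe.
observed = [[50, 4], [36, 10]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell under independence: row total x column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Yates-corrected chi-square: subtract 0.5 from each |O - E| before squaring.
chi2 = sum(
    (abs(o - e) - 0.5) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

critical_value = 3.841  # tabulated chi-square value at alpha = 0.05 with 1 df
reject_h0 = chi2 >= critical_value

print(f"chi-square = {chi2:.3f}, reject H0: {reject_h0}")
```

Because the corrected statistic stays below the critical value, the sketch reaches the same decision as the hand calculation: H₀ is not rejected.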

Fisher's Exact Test

In a 2 x 2 contingency table where the cell frequencies are small, the validity of the chi-square approximation is questionable. For these circumstances, R.A. Fisher, J.O. Irwin, and Frank Yates developed the Fisher exact test, a method for testing the hypothesis of independence in a contingency table with fairly small cell frequencies.

Procedure: First, identify the smallest cell frequency, and then alter the cell frequencies subject to the restriction that the marginal frequencies remain fixed.

Suppose we wish to test the null hypothesis that there is no association between the two classification variables in the following table.

| A / B | B1    | B2    | Total |
|-------|-------|-------|-------|
| A1    | a     | b     | (a+b) |
| A2    | c     | d     | (c+d) |
| Total | (a+c) | (b+d) | n     |

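With the marginal frequencies fixed, the probability of observing a particular table is given by

P = (a+b)! (c+d)! (a+c)! (b+d)! / (n! a! b! c! d!)

This is a hypergeometric probability with parameters n, (a+b), and (a+c), where:
n is the population size;
(a+b) is the number of successes in the population;
(a+c) is the sample size;
a is the number of successes in the sample.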
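A minimal Python sketch of this hypergeometric probability follows. It writes the probability once with binomial coefficients, C(a+b, a)·C(c+d, c) / C(n, a+c), and once with factorials of the margins and cells; the two forms are algebraically equal. The function names and the sample table are assumptions made for illustration.

```python
from math import comb, factorial


def table_probability(a, b, c, d):
    """Hypergeometric probability of the 2 x 2 table [[a, b], [c, d]] with fixed margins."""
    n = a + b + c + d
    # Choose a successes from the (a+b) in the population and c from the
    # remaining (c+d), out of all ways to draw a sample of size (a+c) from n.
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def table_probability_factorial(a, b, c, d):
    """Same probability written with factorials of the margins and cells."""
    n = a + b + c + d
    margins = factorial(a + b) * factorial(c + d) * factorial(a + c) * factorial(b + d)
    cells = factorial(n) * factorial(a) * factorial(b) * factorial(c) * factorial(d)
    return margins / cells


# Hypothetical table used only to show that both forms agree.
p_binomial = table_probability(1, 3, 3, 1)
p_factorial = table_probability_factorial(1, 3, 3, 1)
print(p_binomial, p_factorial)
```

The binomial-coefficient form is usually preferable in code because it avoids building very large factorials for big tables.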

Assuming that d is the smallest frequency, the other possible tables are obtained by reducing d by one, adjusting the other cell frequencies so that the marginal totals are unchanged, and repeating the procedure until d becomes zero. Then compute the probability of the observed table and of each of the other possible tables.

The total probability is then P = P_d + P_(d-1) + P_(d-2) + ⋯ + P_0, where the subscript is the value of the smallest cell.

The decision rule for two-tailed tests:

p-value = 2P
Reject H0 if 2P ≤ α

The decision rule for one-tailed tests:

p-value = P
Reject H0 if P ≤ α

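The step of generating the more extreme tables can be sketched in Python as follows. The function name `more_extreme_tables` and the sample counts are hypothetical. The sketch relies on the fact that, with fixed margins, lowering a (or d) by one forces b and c up by one and d (or a) down by one, and the reverse holds when b or c is the smallest cell.

```python
def more_extreme_tables(a, b, c, d):
    """Return the observed table plus every table reached by lowering the smallest cell to 0.

    Each table is a tuple (a, b, c, d); all of them share the same marginal totals.
    """
    cells = (a, b, c, d)
    smallest = min(cells)
    # Lowering a or d by 1 raises b and c by 1; lowering b or c by 1 raises a and d by 1.
    step = (-1, 1, 1, -1) if smallest in (a, d) else (1, -1, -1, 1)
    return [
        tuple(cell + k * delta for cell, delta in zip(cells, step))
        for k in range(smallest + 1)
    ]


# Hypothetical counts: the smallest cell is b = 2, so three tables are produced.
tables = more_extreme_tables(5, 2, 3, 6)
for table in tables:
    print(table)
```

Summing the hypergeometric probabilities of the returned tables gives the tail probability P used in the decision rule.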
Example 9.21: A researcher wants to investigate whether political party choice is associated with gender. Eighteen voters are selected at random and asked which political party they favour. The survey's results are displayed in the following table:

 

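| Gender | Political Party A | Political Party B | Total |
|--------|-------------------|-------------------|-------|
| Male   | 2                 | 9                 | 11    |
| Female | 4                 | 3                 | 7     |
| Total  | 6                 | 12                | 18    |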

Solution:
i. State the null and alternative hypotheses
H0: Gender and political party affiliation are not associated.
Vs.
H1: Gender and political party affiliation are associated.
ii. The significance level: α = 0.05
iii. The test statistic: Fisher's exact test (two-tailed probability 2P)
iv. Critical Region:
Reject H₀ when 2P ≤ 0.05
v. Computation:

 

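Observed table (Male, A = 2):

| Gender | Political Party A | Political Party B | Total |
|--------|-------------------|-------------------|-------|
| Male   | 2                 | 9                 | 11    |
| Female | 4                 | 3                 | 7     |
| Total  | 6                 | 12                | 18    |

Smallest cell reduced to 1:

| Gender | Political Party A | Political Party B | Total |
|--------|-------------------|-------------------|-------|
| Male   | 1                 | 10                | 11    |
| Female | 5                 | 2                 | 7     |
| Total  | 6                 | 12                | 18    |

Smallest cell reduced to 0:

| Gender | Political Party A | Political Party B | Total |
|--------|-------------------|-------------------|-------|
| Male   | 0                 | 11                | 11    |
| Female | 6                 | 1                 | 7     |
| Total  | 6                 | 12                | 18    |
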

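Writing P₂, P₁ and P₀ for the probabilities of the tables in which the Male, A cell equals 2, 1 and 0:

P₂ = 11! 7! 6! 12! / (18! 2! 9! 4! 3!) = 0.1037
P₁ = 11! 7! 6! 12! / (18! 1! 10! 5! 2!) = 0.0124
P₀ = 11! 7! 6! 12! / (18! 0! 11! 6! 1!) = 0.0004
P = P₂ + P₁ + P₀
P = 0.1037 + 0.0124 + 0.0004
P = 0.1165
Two-tailed probability = 2P
2P = 2 x 0.1165
2P = 0.2330

vi. Remarks: The two-tailed probability 0.2330 is greater than α = 0.05, so the calculated value does not fall in the rejection region; the sample data do not provide sufficient evidence to reject the null hypothesis. Thus, there is no significant evidence that gender and political party affiliation are associated.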
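The computation in Example 9.21 can be reproduced with a short Python sketch that uses exact fractions, so that no rounding enters until the final print. The helper name `table_probability` is an assumption; the doubling rule for the two-tailed probability follows the lecture's procedure. Last digits may differ slightly from a hand calculation that rounds or truncates each probability first.

```python
from fractions import Fraction
from math import comb


def table_probability(a, b, c, d):
    """Exact hypergeometric probability of the table [[a, b], [c, d]] with fixed margins."""
    n = a + b + c + d
    return Fraction(comb(a + b, a) * comb(c + d, c), comb(n, a + c))


# Observed table and the two more extreme tables (Male/A cell = 2, 1, 0).
tables = [(2, 9, 4, 3), (1, 10, 5, 2), (0, 11, 6, 1)]
probabilities = [table_probability(*table) for table in tables]

one_tailed = sum(probabilities)   # P = P2 + P1 + P0
two_tailed = 2 * one_tailed       # doubling rule used in the lecture
alpha = 0.05
reject_h0 = two_tailed <= alpha

for table, p in zip(tables, probabilities):
    print(table, f"{float(p):.5f}")
print(f"P = {float(one_tailed):.4f}, 2P = {float(two_tailed):.4f}, reject H0: {reject_h0}")
```

Other software may define the two-tailed probability differently, for example by summing the probabilities of all tables that are no more likely than the observed one, so its two-sided result need not equal 2P.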
