F Distribution & their Statistical Inferences Lecture 45

 

Introduction

The F distribution, sometimes referred to as the Fisher–Snedecor distribution, is a continuous, positively skewed probability distribution that is widely used in testing the equality of two variances, in analysis of variance (ANOVA), and in regression analysis. It arises as the distribution of the ratio of two independent unbiased estimates of population variances, each divided by the variance it estimates.

Definition: A random variable F has an F distribution if it is the ratio of two independent chi-squared random variables, each divided by its degrees of freedom.

For two independent random samples of sizes n1 and n2 drawn from two normal populations with means μ1 and μ2 and variances σ1^2 and σ2^2, let s1^2 and s2^2 be the unbiased estimates of the population variances, respectively. Then F = (s1^2 / σ1^2) / (s2^2 / σ2^2) has an F distribution with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom.

Properties:

i. The shape of the F distribution is positively skewed.

ii. The range of the F distribution is from 0 to infinity.

iii. The mean of the F distribution with v1 and v2 degrees of freedom is v2 / (v2 − 2), provided v2 > 2.

iv. If F has an F distribution with v1 and v2 degrees of freedom, then 1/F has an F distribution with v2 and v1 degrees of freedom.

v. The F table value with 1 and n degrees of freedom is equal to the square of the two-tailed t table value at n degrees of freedom: F_α(1, n) = [t_{α/2}(n)]^2.
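Properties iii and iv can be checked numerically. The following is a minimal simulation sketch, not part of the original lecture: it builds F variates from the definition (ratio of two independent chi-squared draws, each divided by its degrees of freedom) using only the Python standard library. The helper name `f_variate`, the seed, the sample size, and the choice v1 = 12, v2 = 10 are illustrative assumptions.

```python
import random

def f_variate(v1, v2, rng):
    """Draw one F(v1, v2) value as a ratio of scaled chi-squared draws."""
    # A chi-squared variable with v degrees of freedom is a
    # Gamma variable with shape v/2 and scale 2.
    chi_sq_1 = rng.gammavariate(v1 / 2, 2)
    chi_sq_2 = rng.gammavariate(v2 / 2, 2)
    return (chi_sq_1 / v1) / (chi_sq_2 / v2)

rng = random.Random(45)   # fixed seed so the run is repeatable (assumed value)
v1, v2 = 12, 10           # illustrative degrees of freedom (assumed values)
draws = [f_variate(v1, v2, rng) for _ in range(100_000)]

# Property iii: the mean of F(v1, v2) is v2 / (v2 - 2) when v2 > 2.
mean_f = sum(draws) / len(draws)

# Property iv: 1/F follows F(v2, v1), whose mean is v1 / (v1 - 2).
mean_reciprocal = sum(1 / f for f in draws) / len(draws)

print(f"simulated mean of F:   {mean_f:.3f} (theory {v2 / (v2 - 2):.3f})")
print(f"simulated mean of 1/F: {mean_reciprocal:.3f} (theory {v1 / (v1 - 2):.3f})")
```

The simulated means should land close to 1.25 and 1.20, and every draw is positive, in line with property ii.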

Confidence Interval for the Ratio of two Variances

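Let s1^2 and s2^2 be the unbiased estimates of the population variances σ1^2 and σ2^2, computed from two independent random samples of sizes n1 and n2. The following statistic is obtained:

F = (s1^2 / σ1^2) / (s2^2 / σ2^2), with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom.

Now, to construct a 100(1 − α)% confidence interval for the ratio of two variances, choose two values from the F table and make the following probability statement:

P[ F_{1−α/2}(v1, v2) < F < F_{α/2}(v1, v2) ] = 1 − α

Substituting F and using F_{1−α/2}(v1, v2) = 1 / F_{α/2}(v2, v1) gives the interval

(s1^2 / s2^2) · 1 / F_{α/2}(v1, v2) < σ1^2 / σ2^2 < (s1^2 / s2^2) · F_{α/2}(v2, v1)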


Example 11.1: Research is conducted to study the reaction of plants to a stimulus. The reaction times, in seconds, of two plants in an experiment are given below:

X1      X2
0.41    0.32
0.38    0.36
0.37    0.38
0.42    0.33
0.35    0.38
0.38

Construct a 95% confidence interval for the ratio of variances of reaction times of plant 1 and plant 2.

Solution:

1 − α = 0.95

α = 0.05, so α/2 = 0.025

v1 = n1 − 1 = 5, v2 = n2 − 1 = 4

F0.025 (v1, v2) = F0.025 (5, 4) = 9.36

F0.025 (v2, v1) = F0.025 (4, 5) = 7.39

X1      X1^2      X2      X2^2
0.41    0.1681    0.32    0.1024
0.38    0.1444    0.36    0.1296
0.37    0.1369    0.38    0.1444
0.42    0.1764    0.33    0.1089
0.35    0.1225    0.38    0.1444
0.38    0.1444

Totals: ΣX1 = 2.31, ΣX1^2 = 0.8927, ΣX2 = 1.77, ΣX2^2 = 0.6297
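The remaining arithmetic of Example 11.1 can be sketched in Python. This is an illustrative sketch, not the lecture's own worked solution: it computes the unbiased sample variances with the standard library and then applies the interval (s1^2 / s2^2) / F_{α/2}(v1, v2) to (s1^2 / s2^2) · F_{α/2}(v2, v1). The two F values are hard-coded as assumed table readings for α/2 = 0.025, which is the tail area a two-sided 95% interval calls for; the variable names are hypothetical.

```python
from statistics import variance

# Reaction times (seconds) from Example 11.1
x1 = [0.41, 0.38, 0.37, 0.42, 0.35, 0.38]   # plant 1, n1 = 6
x2 = [0.32, 0.36, 0.38, 0.33, 0.38]         # plant 2, n2 = 5

# Unbiased sample variances (divisor n - 1)
s1_sq = variance(x1)
s2_sq = variance(x2)
ratio = s1_sq / s2_sq

# Assumed F table readings for a two-sided 95% interval (alpha/2 = 0.025)
F_025_5_4 = 9.36   # F0.025(5, 4)
F_025_4_5 = 7.39   # F0.025(4, 5)

lower = ratio / F_025_5_4
upper = ratio * F_025_4_5

print(f"s1^2 = {s1_sq:.5f}, s2^2 = {s2_sq:.5f}, ratio = {ratio:.3f}")
print(f"95% CI for sigma1^2 / sigma2^2: ({lower:.3f}, {upper:.3f})")
```

With these table readings the interval runs from roughly 0.09 to 6.35. Because it contains 1, the data are consistent with equal variances of the two reaction times.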



Hypothesis Testing about the Equality of Variances
Suppose two independent random samples of sizes n1 and n2 are selected from normal populations having means μ1 and μ2 and variances σ1^2 and σ2^2; let s1^2 and s2^2 be the unbiased estimates of the population variances, respectively.
The statistic (s1^2 / σ1^2) / (s2^2 / σ2^2) follows the F distribution with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom.

If H0 is σ1^2 = σ2^2, then the statistic reduces to F = s1^2 / s2^2.
Testing Procedure:
i. State the null and alternative hypotheses
 H0: σ1^2 = σ2^2 vs H1: σ1^2 ≠ σ2^2
 H0: σ1^2 ≤ σ2^2 vs H1: σ1^2 > σ2^2
 H0: σ1^2 ≥ σ2^2 vs H1: σ1^2 < σ2^2

ii. The significance level: α

iii. The test statistic:

F = s1^2 / s2^2, with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom. For H1: σ1^2 < σ2^2, use F = s2^2 / s1^2 with v2 and v1 degrees of freedom, so that the critical region stays in the upper tail.
iv. Critical Region:
Reject H0 if Fcalculated ≥ Ftabulated, where Ftabulated is F_α for a one-sided alternative and F_{α/2} for the two-sided alternative with the larger sample variance in the numerator.
v. Computation
vi. Remarks.
Example 11.2: Random samples of 7 women and 12 men, selected from a working population, have mean daily earnings of 1200 and 1500 with standard deviations of 10 and 13, respectively. Test the hypothesis of equal variances at the 5% significance level.

Solution:
Testing Procedure:
i. State the null and alternative hypotheses
 H0: σ1^2 = σ2^2 vs H1: σ1^2 ≠ σ2^2
ii. The significance level: α = 0.05

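iii. The test statistic:
F = s1^2 / s2^2, the ratio of the unbiased variance estimates for women (sample 1) and men (sample 2), with v1 = n1 − 1 = 6 and v2 = n2 − 1 = 11 degrees of freedom.
iv. Critical Region:
Reject H0 if Fcalculated ≥ F0.025 (6, 11) = 3.88 or Fcalculated ≤ 1 / F0.025 (11, 6) = 1 / 5.41 = 0.18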
v. Computation:
vi. Remarks: The computed F value falls in the acceptance region; the sample data do not provide sufficient evidence to reject H0. Thus, it is concluded that the variances of the earnings of women and men do not differ significantly.
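The computation for Example 11.2 can be sketched as follows. This is a sketch under stated assumptions rather than the lecture's own working: the quoted standard deviations are taken to be the unbiased (divisor n − 1) estimates, and the two critical values are assumed table readings for a two-sided test at α = 0.05, namely F0.025(6, 11) = 3.88 and F0.025(11, 6) = 5.41. The variable names are hypothetical.

```python
# Summary statistics from Example 11.2
n_women, sd_women = 7, 10
n_men, sd_men = 12, 13

# Assumption: the quoted standard deviations are unbiased (n - 1) estimates.
f_stat = sd_women ** 2 / sd_men ** 2
v1, v2 = n_women - 1, n_men - 1

# Assumed table readings for a two-sided test at alpha = 0.05
upper_crit = 3.88        # F0.025(6, 11)
lower_crit = 1 / 5.41    # F0.975(6, 11) = 1 / F0.025(11, 6)

reject_h0 = f_stat >= upper_crit or f_stat <= lower_crit

print(f"F = {f_stat:.3f} with ({v1}, {v2}) degrees of freedom")
print(f"acceptance region: ({lower_crit:.3f}, {upper_crit:.2f})")
print("reject H0" if reject_h0 else "do not reject H0")
```

Here F = 100 / 169, about 0.59, which lies inside the acceptance region, so the hypothesis of equal variances is not rejected.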

Example 11.3: A mathematics test was given to the male and female students of the 10th class of a school. The following statistics were obtained:

Gender    n     Mean
Male      50    82      20
Female    60    94      39

From the sample observations, it is clear that the average grade of the female students is higher than that of the male students. Test, at the 5% level of significance, the hypothesis that the variance of the female students' marks is greater than that of the male students' marks.

Solution:
i. State the null and alternative hypotheses
 H0: σ2^2 ≤ σ1^2 vs H1: σ2^2 > σ1^2

ii. The significance level: α = 0.05
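iii. The test statistic:
F = s2^2 / s1^2, the ratio of the unbiased variance estimates for female (sample 2) and male (sample 1) students, with v2 = n2 − 1 = 59 and v1 = n1 − 1 = 49 degrees of freedom.
iv. Critical Region:
Reject H0 when F > F0.05 (59, 49) ≈ 1.58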
v. Computation:
vi. Remarks: The computed F value falls in the rejection region; the sample data provide sufficient evidence to reject H0. Thus, it is concluded that the variance of the female students' marks is larger than the variance of the male students' marks.
Example 11.4: Two types of strip, A and B, are used to measure the blood glucose level in a blood sample. A strip's consistency is measured by the standard deviation of its readings in repeated testing. Two random samples of 15 type A and 20 type B strips gave means of 20.40 and 25.60 and variances of 2.50 and 1.96, respectively. Test the hypothesis, at the 5% level of significance, that type A strips have better consistency than type B.

Solution:
i. State the null and alternative hypotheses
 H0: σ1^2 ≤ σ2^2 vs H1: σ1^2 > σ2^2

ii. The significance level: α = 0.05
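iii. The test statistic:
F = s1^2 / s2^2, the ratio of the unbiased variance estimates for type A (sample 1) and type B (sample 2), with v1 = n1 − 1 = 14 and v2 = n2 − 1 = 19 degrees of freedom.
iv. Critical Region:
Reject H0 if F > F0.05 (14, 19) = 2.26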
v. Computation:
vi. Remarks: The computed F value falls in the acceptance region; the sample data do not provide sufficient evidence to reject H0. Thus, it is concluded that there is no evidence of a difference in consistency between strip types A and B.
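The upper-tail test of Example 11.4 can be sketched with a small helper. The function name `upper_tail_variance_test` is hypothetical, the quoted variances 2.50 and 1.96 are assumed to be the unbiased estimates, and the critical value 2.26 is an assumed table reading of F0.05(14, 19).

```python
def upper_tail_variance_test(num_var, den_var, critical_value):
    """Return (F, reject) for H1: numerator variance > denominator variance."""
    f_stat = num_var / den_var
    return f_stat, f_stat > critical_value

# Summary statistics from Example 11.4 (variances assumed unbiased)
n_a, var_a = 15, 2.50   # type A strips
n_b, var_b = 20, 1.96   # type B strips

v1, v2 = n_a - 1, n_b - 1   # 14 and 19 degrees of freedom
F_05_14_19 = 2.26           # assumed table reading of F0.05(14, 19)

f_stat, reject_h0 = upper_tail_variance_test(var_a, var_b, F_05_14_19)

print(f"F = {f_stat:.3f} with ({v1}, {v2}) degrees of freedom")
print("reject H0" if reject_h0 else "do not reject H0")
```

The ratio is about 1.28, below the critical value, so H0 is not rejected. The same helper covers any one-sided variance test once the larger hypothesised variance is placed in the numerator.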





