Inferences about Regression Coefficients

 

Hypothesis Testing in Regression Analysis

Consider the simple linear regression model

Y=α+βX+ϵ

The primary null hypothesis is H0: β = 0, with the alternative H1: β ≠ 0. If the null hypothesis is true, the predictor has no effect on the response variable, and the mean value of Y is the same for every value of X. Both one-tailed and two-tailed alternatives can be used; the choice depends on whether the researcher intends to examine a positive or a negative relationship between the response variable and the predictor variable. In some cases, the null hypothesis is written as H0: β = β0, where β0 is a specified numerical value of β. The alternative hypothesis is then some form of negation of the null hypothesis.

The procedure for testing the null hypothesis about the slope of a regression line is as follows:

Hypothesis Testing about Slope

Let β^ be the estimate of β computed from a random sample of size n drawn from a bivariate normal population. When the error variance is estimated from the sample, the sampling distribution of β^ leads to the t-distribution with (n - 2) degrees of freedom. That is,

t = (β^ - β)/SE(β^)

Where: SE(β^) = s/√Sxx,  s² = ∑(Y - Y^)²/(n - 2)  and  Sxx = ∑(X - X¯)².

Testing Procedure:
i. State the null and alternative hypotheses (one of the following pairs):

H0: β = 0 vs. H1: β ≠ 0
H0: β ≤ 0 vs. H1: β > 0
H0: β ≥ 0 vs. H1: β < 0

ii. The significance level: α
iii. The test statistic: If the sample size is small, then

t = (β^ - β0)/SE(β^),  where SE(β^) = s/√Sxx,

which follows the t-distribution with (n - 2) degrees of freedom under H0 (see the code sketch after the remarks).
iv. Critical Region:
Reject H0 when |t| ≥ tα/2(n - 2)  (two-tailed test)
Reject H0 when t ≥ tα(n - 2)  (right-tailed test)
Reject H0 when t ≤ -tα(n - 2)  (left-tailed test)

v. Computation:
vi. Remarks:
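
As a concrete illustration of steps i–vi, the following is a minimal sketch in Python; the function name slope_t_test, the two-tailed alternative, and the use of numpy and scipy are assumptions made for the example, not part of the procedure above.

```python
import numpy as np
from scipy import stats

def slope_t_test(x, y, beta0=0.0, alpha=0.05):
    """Two-tailed t-test of H0: beta = beta0 in simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    sxy = np.sum((x - x.mean()) * (y - y.mean()))
    beta_hat = sxy / sxx                         # estimated slope
    alpha_hat = y.mean() - beta_hat * x.mean()   # estimated intercept
    resid = y - (alpha_hat + beta_hat * x)
    s2 = np.sum(resid ** 2) / (n - 2)            # error variance estimate
    se_beta = np.sqrt(s2 / sxx)                  # standard error of the slope
    t = (beta_hat - beta0) / se_beta
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return beta_hat, t, t_crit, abs(t) >= t_crit  # last entry: reject H0?
```

For the one-tailed alternatives, the critical value tα(n - 2) and the direction of the comparison change accordingly.
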
Hypothesis Testing about Intercept

The intercept is the mean value of the response variable when the predictor is zero and should always be interpreted in terms of the actual variables of the study. Unlike the slope, a test about the intercept says nothing about the relationship between the variables; it only examines whether the mean value of the response variable takes some specified value when the predictor is zero, i.e., H0: α = α0 versus a suitable alternative.

The test statistic for a small sample size is

t = (α^ - α0)/SE(α^),  where SE(α^) = s√(1/n + X¯²/Sxx),

which follows the t-distribution with (n - 2) degrees of freedom under H0.
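
A similar sketch for the intercept test, again assuming hypothetical arrays x and y and a hypothesised value alpha0 for α0:

```python
import numpy as np
from scipy import stats

def intercept_t_test(x, y, alpha0=0.0, alpha=0.05):
    """Two-tailed t-test of H0: intercept = alpha0 in simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    alpha_hat = y.mean() - beta_hat * x.mean()
    resid = y - (alpha_hat + beta_hat * x)
    s2 = np.sum(resid ** 2) / (n - 2)
    se_alpha = np.sqrt(s2 * (1.0 / n + x.mean() ** 2 / sxx))  # SE of the intercept
    t = (alpha_hat - alpha0) / se_alpha
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return alpha_hat, t, t_crit, abs(t) >= t_crit
```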


Practice Question 

The dosage of a stimulant drug and the reaction time to a stimulus are recorded for each of several participants injected with the drug.

Dosage (grams):            4     4     6     6     8     8    10    10
Reaction Time (seconds):  7.5   6.8   4.0   4.4   3.9   3.1   1.4   1.7

Find the regression of reaction time on dosage and test the hypothesis that reaction time and dosage are independent at 5%.

Solution: Let reaction time and dosage be denoted by Y and X, respectively.

Y        X       XY       X²       Y²
7.5      4       30.0     16       56.25
6.8      4       27.2     16       46.24
4.0      6       24.0     36       16.00
4.4      6       26.4     36       19.36
3.9      8       31.2     64       15.21
3.1      8       24.8     64        9.61
1.4     10       14.0    100        1.96
1.7     10       17.0    100        2.89
∑Y = 32.8   ∑X = 56   ∑XY = 194.6   ∑X² = 432   ∑Y² = 167.52


The slope of the regression line is estimated as

β^ = (n∑XY - ∑X∑Y)/(n∑X² - (∑X)²) = (8(194.6) - (56)(32.8))/(8(432) - (56)²) = -280/320 = -0.875

The intercept of the regression line is estimated as

α^ = Y¯ - β^X¯ = 4.1 - (-0.875)(7) = 10.225

The estimated regression model of Y on X is therefore

Y^ = α^ + β^X

Y^ = 10.225 - 0.875X

Since the sign of the slope is negative, the relationship between reaction time and dosage is negative: reaction time decreases as dosage increases.
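
As a quick check of these estimates, the least-squares coefficients can be recomputed from the data with numpy; np.polyfit is used here only as one convenient way to obtain them.

```python
import numpy as np

dosage = np.array([4, 4, 6, 6, 8, 8, 10, 10], dtype=float)      # X
reaction = np.array([7.5, 6.8, 4.0, 4.4, 3.9, 3.1, 1.4, 1.7])   # Y

# polyfit with deg=1 returns [slope, intercept] of the least-squares line
beta_hat, alpha_hat = np.polyfit(dosage, reaction, deg=1)
print(f"Y^ = {alpha_hat:.3f} + ({beta_hat:.3f})X")   # Y^ = 10.225 + (-0.875)X
```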

Next, we test the hypothesis that reaction time and dosage are independent. The test is set up as follows:

i. State the null and alternative hypotheses:
H0: β = 0 (reaction time and dosage are independent) vs. H1: β ≠ 0

ii. The significance level: α = 0.05
iii. The test statistic: Since the sample size is small, the following statistic is used:

t = β^/SE(β^),  where SE(β^) = s/√Sxx  and  s² = ∑(Y - Y^)²/(n - 2),

which follows the t-distribution with (n - 2) degrees of freedom, assuming H0 is true.
iv. Critical Region:
Reject H0 when |t| ≥ t0.025(6) = 2.447
v. Computation:

Sxx = ∑X² - (∑X)²/n = 432 - 392 = 40
Sxy = ∑XY - (∑X)(∑Y)/n = 194.6 - 229.6 = -35
Syy = ∑Y² - (∑Y)²/n = 167.52 - 134.48 = 33.04
s² = (Syy - β^Sxy)/(n - 2) = (33.04 - 30.625)/6 = 0.4025
SE(β^) = √(0.4025/40) ≈ 0.1003
t = -0.875/0.1003 ≈ -8.72

vi. Remarks:

Since the computed |t| value (about 8.72) exceeds the critical value (2.447), H0 is rejected. The sample data do not support the claim that reaction time and dosage are independent; we conclude that dosage has a significant effect on reaction time.
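
The same test can be reproduced with scipy.stats.linregress, which returns the estimated slope, its standard error, and a two-sided p-value; this is a sketch, and the numbers it prints agree with the hand computation above up to rounding.

```python
import numpy as np
from scipy import stats

dosage = np.array([4, 4, 6, 6, 8, 8, 10, 10], dtype=float)
reaction = np.array([7.5, 6.8, 4.0, 4.4, 3.9, 3.1, 1.4, 1.7])

res = stats.linregress(dosage, reaction)
t_stat = res.slope / res.stderr                    # test statistic for H0: beta = 0
t_crit = stats.t.ppf(0.975, df=len(dosage) - 2)    # two-tailed critical value at 5%
print(t_stat, t_crit, res.pvalue)                  # reject H0 when |t_stat| >= t_crit
```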

Hypothesis Testing about the Slope of a Regression by ANOVA Table

The ANOVA table can also be used to test the hypothesis H0: β = 0. This method requires less labour and saves time in multiple regression, where it tests the null hypothesis that all slopes are simultaneously equal to zero.
In this method, the total variation in the response variable is partitioned into two components: the part explained by the regression line and the part left unexplained by the regression line.
That is,

∑(Y - Y¯)² = ∑(Y - Y^ + Y^ - Y¯)²

∑(Y - Y¯)² = ∑(Y - Y^)² + ∑(Y^ - Y¯)² + 2∑(Y - Y^)(Y^ - Y¯)

The cross-product term vanishes because the residuals sum to zero and are uncorrelated with the fitted values, so

∑(Y - Y¯)² = ∑(Y - Y^)² + ∑(Y^ - Y¯)²

SST = SSR + SSE
Presentation in ANOVA Table:

SV            df        SS      MS      F
Regression    1         SSR     MSR     MSR/MSE
Error         n - 2     SSE     MSE
Total         n - 1     SST



The F statistic, F = MSR/MSE, is used to test H0: β = 0; under H0 it follows the F distribution with 1 and (n - 2) degrees of freedom.
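
A minimal sketch of the ANOVA approach for a simple regression, under the same assumptions as the earlier sketches (hypothetical arrays x and y, numpy and scipy only):

```python
import numpy as np
from scipy import stats

def regression_anova(x, y, alpha=0.05):
    """ANOVA quantities and the F test of H0: beta = 0 for simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    beta_hat, alpha_hat = np.polyfit(x, y, deg=1)
    y_hat = alpha_hat + beta_hat * x
    sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
    sse = np.sum((y - y_hat) ** 2)         # error (residual) sum of squares
    ssr = sst - sse                        # regression sum of squares
    msr, mse = ssr / 1, sse / (n - 2)      # mean squares
    f_stat = msr / mse
    f_crit = stats.f.ppf(1 - alpha, dfn=1, dfd=n - 2)
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "F": f_stat, "F_crit": f_crit, "reject_H0": f_stat >= f_crit}
```

Applied to the dosage data above, this gives F ≈ 76.1, and √F ≈ 8.72 matches the t value obtained earlier.
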

The coefficient of determination is defined as

R² = SSR/SST

OR

R² = 1 - SSE/SST

If the coefficient of determination is available, then the null hypothesis about the slope of the regression line can be tested using

F = (n - 2)R²/(1 - R²),

which is the same F = MSR/MSE statistic written in terms of R².
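
A short sketch of the same relationship in code; r_squared and n are placeholders for the available coefficient of determination and the sample size.

```python
def f_from_r_squared(r_squared, n):
    """F statistic for H0: beta = 0, computed from the coefficient of determination."""
    return (n - 2) * r_squared / (1.0 - r_squared)

# Example with the dosage data: R² = SSR/SST = 30.625/33.04 ≈ 0.927,
# so f_from_r_squared(0.927, 8) ≈ 76.2 and its square root ≈ 8.73,
# in line with the t value obtained earlier (differences are rounding).
```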


Practice Question

The following data give the wage (Rs.) of daily workers and their experience (weeks); a code sketch for the computations follows the data.

Wage (Rs.)            600   900   1100   1300   1500   2000
Experience (weeks)      1     3      5      9     12     15
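
One possible way to carry out the computations for this question, as a sketch only; regressing wage on experience and testing the slope at the 5% level are assumptions, since the question does not state the task explicitly.

```python
import numpy as np
from scipy import stats

experience = np.array([1, 3, 5, 9, 12, 15], dtype=float)           # X
wage = np.array([600, 900, 1100, 1300, 1500, 2000], dtype=float)   # Y

res = stats.linregress(experience, wage)
print(f"wage^ = {res.intercept:.2f} + {res.slope:.2f} * experience")
t_stat = res.slope / res.stderr                 # test statistic for H0: beta = 0
t_crit = stats.t.ppf(0.975, df=len(wage) - 2)   # two-tailed critical value at 5%
print(t_stat, t_crit, res.pvalue)
```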














