Confidence Interval Estimation of Regression Parameters
A confidence interval estimate for a simple linear regression line is based on sample statistics and their sampling distributions, together with a statement of how confident we are, in terms of probability, that the interval contains the population linear regression line. The probability associated with a confidence interval is 1 − α, usually quoted as the percentage 100(1 − α)%. Graphically, the confidence interval is the band between two curves (dotted lines) drawn on either side of the fitted line, and there is a 1 − α chance that the population linear regression line lies within that band.
Confidence Level
The estimates are based on sample data, so they vary from one sample to another drawn from the same population, and each sample produces a slightly different interval. The confidence coefficient, or confidence level, is the proportion (probability) of these intervals that contain the population linear regression line or the parameters of the model.
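The idea that a fixed share of intervals covers the true parameter can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the true intercept, slope, error standard deviation, and X values are hypothetical, X is held fixed across samples, and the critical value 2.048 is the tabulated t0.025(28) for samples of size 30.

```python
import math
import random


def slope_interval(xs, ys, t_crit):
    """Return (lower, upper) limits of the confidence interval for the slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    sse = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))          # residual standard deviation
    half = t_crit * s / math.sqrt(sxx)    # half-width of the interval
    return b_hat - half, b_hat + half


def coverage(true_a=100.0, true_b=0.5, sigma=15.0, n=30,
             t_crit=2.048, reps=2000, seed=1):
    """Share of simulated 95% intervals that contain the true slope.

    All default values are hypothetical; t_crit must match n - 2 degrees
    of freedom (2.048 corresponds to 28 degrees of freedom).
    """
    rng = random.Random(seed)
    xs = [20 + 2 * i for i in range(n)]   # fixed, made-up X values
    hits = 0
    for _ in range(reps):
        ys = [true_a + true_b * x + rng.gauss(0, sigma) for x in xs]
        lower, upper = slope_interval(xs, ys, t_crit)
        hits += lower <= true_b <= upper
    return hits / reps


print(f"Share of intervals covering the true slope: {coverage():.3f}")
```

The printed share should be close to the nominal 0.95, with some simulation noise.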
Confidence Interval for Intercept Parameter
Let α^ be the estimate of the intercept α computed from the values of a small random sample of size n selected from a bivariate normal population. The sampling distribution of α^ has mean μα^ = α and standard deviation σα^. The population standard deviation is unknown, so it is replaced by its estimate sα^.
The standardized estimate then follows the t-distribution with n − 2 degrees of freedom:
(α^ − α) / sα^ ~ t(n − 2)
Where:
sα^ = s √[ ∑X² / (n ∑(X − X̄)²) ] and s = √[ ∑(Y − Y^)² / (n − 2) ]
Thus, a 100(1 − α)% confidence interval estimate for the intercept is given by
α^ ± tα/2(n − 2) sα^
Note that the α in 1 − α and in tα/2 is the significance level, not the intercept.
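As a rough sketch, the intercept interval can be computed from raw (X, Y) pairs as shown below. The function name `intercept_ci`, the five data points, and the critical value 3.182 (the tabulated t0.025(3)) are illustrative assumptions and do not come from the text above.

```python
import math


def intercept_ci(xs, ys, t_crit):
    """Return (a_hat, se_a, lower, upper) for the intercept of Y on X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx                      # OLS slope
    a_hat = y_bar - b_hat * x_bar          # OLS intercept
    sse = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))           # residual standard deviation
    se_a = s * math.sqrt(sum(x * x for x in xs) / (n * sxx))
    half = t_crit * se_a
    return a_hat, se_a, a_hat - half, a_hat + half


# Hypothetical toy data: n = 5, so n - 2 = 3 degrees of freedom.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a_hat, se_a, lower, upper = intercept_ci(xs, ys, t_crit=3.182)
print(f"intercept = {a_hat:.3f}, se = {se_a:.4f}, "
      f"95% CI = ({lower:.3f}, {upper:.3f})")
```

The critical value is passed in explicitly so that the sketch stays within the standard library; it has to be looked up for the appropriate n − 2 degrees of freedom.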
Confidence Interval for Slope Parameter
Let β^ be the estimate of the slope β computed from the values of a small random sample of size n selected from a bivariate normal population. The sampling distribution of β^ has mean μβ^ = β and standard deviation σβ^. The population standard deviation is unknown, so it is replaced by its estimate sβ^.
The standardized estimate then follows the t-distribution with n − 2 degrees of freedom:
(β^ − β) / sβ^ ~ t(n − 2)
Where:
sβ^ = s / √[ ∑(X − X̄)² ] and s = √[ ∑(Y − Y^)² / (n − 2) ]
Thus, a 100(1 − α)% confidence interval estimate for β is given by
β^ ± tα/2(n − 2) sβ^
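When only summary sums are available, the slope interval can be obtained without the raw data. The following is a minimal sketch: the function name `slope_ci_from_sums`, the toy sums (for five made-up points), and the critical value 3.182 for 3 degrees of freedom are assumptions chosen for illustration.

```python
import math


def slope_ci_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2, t_crit):
    """Return (b_hat, se_b, lower, upper) for the slope from summary sums."""
    sxx = sum_x2 - sum_x ** 2 / n          # sum of (X - X_bar)^2
    syy = sum_y2 - sum_y ** 2 / n          # sum of (Y - Y_bar)^2
    sxy = sum_xy - sum_x * sum_y / n       # sum of (X - X_bar)(Y - Y_bar)
    b_hat = sxy / sxx
    sse = syy - b_hat * sxy                # residual sum of squares
    s = math.sqrt(sse / (n - 2))
    se_b = s / math.sqrt(sxx)
    half = t_crit * se_b
    return b_hat, se_b, b_hat - half, b_hat + half


# Hypothetical sums for five points; 3.182 is the tabulated t value for 3 df.
b_hat, se_b, lower, upper = slope_ci_from_sums(
    n=5, sum_x=15, sum_y=30.1, sum_xy=110.2, sum_x2=55, sum_y2=220.91,
    t_crit=3.182,
)
print(f"slope = {b_hat:.3f}, se = {se_b:.4f}, "
      f"95% CI = ({lower:.3f}, {upper:.3f})")
```

The shortcut for the residual sum of squares, ∑(Y − Ȳ)² − β^ ∑(X − X̄)(Y − Ȳ), avoids computing each fitted value separately.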
Confidence Interval for the Mean Value of Response Variable
Let Y^0 = α^ + β^ X0 be the estimate of the mean response μY.X0 = α + β X0 at X = X0, computed from the values of a small random sample of size n. The sampling distribution of Y^0 has mean μY.X0 = α + β X0 and standard deviation σY^.
Where:
σY^ = σy.x √[ 1/n + (X0 − X̄)² / ∑(X − X̄)² ]
The population standard deviation σy.x is unknown, so replace it with its estimate given below:
s = √[ ∑(Y − Y^)² / (n − 2) ], so that sY^ = s √[ 1/n + (X0 − X̄)² / ∑(X − X̄)² ]
The test statistic:
t = (Y^0 − μY.X0) / sY^, which follows the t-distribution with n − 2 degrees of freedom.
A 100(1 − α)% confidence interval is given by:
Y^0 ± tα/2(n − 2) sY^
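The interval for the mean response is narrowest at X0 = X̄ and widens as X0 moves away from it. The sketch below makes this visible; the function name `mean_response_ci` and every numeric input (a fitted line, s, n, X̄, and ∑(X − X̄)² for a made-up five-point sample) are hypothetical.

```python
import math


def mean_response_ci(a_hat, b_hat, s, n, x_bar, sxx, x0, t_crit):
    """Return (y0_hat, se_mean, lower, upper) for the mean response at x0."""
    y0_hat = a_hat + b_hat * x0
    se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    half = t_crit * se_mean
    return y0_hat, se_mean, y0_hat - half, y0_hat + half


# Hypothetical fitted quantities; 3.182 is the tabulated t value for 3 df.
fit = dict(a_hat=0.05, b_hat=1.99, s=0.1889, n=5, x_bar=3.0, sxx=10.0,
           t_crit=3.182)
at_mean = mean_response_ci(x0=3.0, **fit)   # X0 equal to the sample mean
away = mean_response_ci(x0=5.0, **fit)      # X0 two units from the mean

for label, (y0, se, lower, upper) in (("X0 = 3", at_mean), ("X0 = 5", away)):
    print(f"{label}: estimate = {y0:.2f}, se = {se:.4f}, "
          f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Because (X0 − X̄)² enters the standard error directly, estimates far from the centre of the observed X values carry visibly wider intervals.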
Practice Question 1.5
The age and systolic blood pressure of 100 people gave the following information:
∑X=4421, ∑Y=12130, ∑XY=542735, ∑X²=208349, ∑Y²=1498976
i. Compute the regression line, which is used to estimate the true value μY.X.
ii. Assume normality and construct a 95% confidence interval for α, β, and the true value of blood pressure for the age of 50 years.
iii. Predict blood pressure for the age of 50 years and compute the 95% confidence interval for this estimate.
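Solution: The OLS method is used to estimate the parameters.
i. Estimation of Regression Line
n = 100, X̄ = 4421/100 = 44.21, Ȳ = 12130/100 = 121.30
∑(X − X̄)² = ∑X² − (∑X)²/n = 208349 − 195452.41 = 12896.59
∑(X − X̄)(Y − Ȳ) = ∑XY − (∑X)(∑Y)/n = 542735 − 536267.30 = 6467.70
∑(Y − Ȳ)² = ∑Y² − (∑Y)²/n = 1498976 − 1471369 = 27607
β^ = 6467.70 / 12896.59 = 0.5015
α^ = Ȳ − β^ X̄ = 121.30 − 0.5015 (44.21) = 99.13
The estimated regression line is given by:
Y^ = α^ + β^ X
Y^ = 99.13 + 0.5015X
ii. 95% confidence intervals for α, β, and the true mean blood pressure at X0 = 50
1 − α = 0.95
α = 0.05
tα/2(n − 2) = t0.025(98) ≈ 1.984
s = √[ (27607 − 0.5015 × 6467.70) / 98 ] = √248.61 = 15.767
sα^ = 15.767 √[ 208349 / (100 × 12896.59) ] = 6.337
sβ^ = 15.767 / √12896.59 = 0.1388
95% confidence interval for α
α^ ± tα/2(n − 2) sα^
99.13 ± 1.984 (6.337)
99.13 ± 12.57
86.56 < α < 111.70
95% confidence interval for β
β^ ± tα/2(n − 2) sβ^
0.5015 ± 1.984 (0.1388)
0.5015 ± 0.2754
0.2261 < β < 0.7769
95% confidence interval for the true mean blood pressure at X0 = 50
Y^0 = 99.13 + 0.5015 × 50 = 124.20
sY^ = 15.767 √[ 1/100 + (50 − 44.21)² / 12896.59 ] = 1.770
Y^0 ± tα/2(n − 2) sY^
124.20 ± 1.984 (1.770)
124.20 ± 3.51
120.69 < μY.X < 127.71
iii. Prediction of blood pressure for the age of 50 years, i.e., X0 = 50
Y^0 = α^ + β^ X0
Y^0 = 99.13 + 0.5015 × 50
Y^0 = 124.20
For an individual prediction, the standard error also includes the variability of a single observation about the line:
sY0 = 15.767 √[ 1 + 1/100 + (50 − 44.21)² / 12896.59 ] = 15.87
95% prediction interval for an individual value
Y^0 ± tα/2(n − 2) sY0
124.20 ± 1.984 (15.87)
124.20 ± 31.49
92.71 ≤ Y0 ≤ 155.69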
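The arithmetic for Practice Question 1.5 can be re-derived directly from the five sums given in the question. The script below is a sketch for checking the hand calculation; the variable names are arbitrary, and the critical value 1.984 is an assumption taken as the tabulated t0.025 for roughly 98 degrees of freedom.

```python
import math

# Summary sums from the practice question (age X, systolic pressure Y).
n = 100
sum_x, sum_y = 4421, 12130
sum_xy, sum_x2, sum_y2 = 542735, 208349, 1498976
T_CRIT = 1.984   # assumed table value of t_{0.025} for about 98 df
x0 = 50          # age at which the response is estimated

x_bar, y_bar = sum_x / n, sum_y / n
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

b_hat = sxy / sxx                          # slope
a_hat = y_bar - b_hat * x_bar              # intercept
s = math.sqrt((syy - b_hat * sxy) / (n - 2))

se_a = s * math.sqrt(sum_x2 / (n * sxx))
se_b = s / math.sqrt(sxx)
y0_hat = a_hat + b_hat * x0
leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
se_mean = s * math.sqrt(leverage)          # for the mean response
se_pred = s * math.sqrt(1 + leverage)      # for an individual value

rows = [
    ("intercept", a_hat, se_a),
    ("slope", b_hat, se_b),
    ("mean response at 50", y0_hat, se_mean),
    ("individual value at 50", y0_hat, se_pred),
]
for name, estimate, se in rows:
    half = T_CRIT * se
    print(f"{name}: {estimate:.4f} +/- {half:.4f} "
          f"-> ({estimate - half:.4f}, {estimate + half:.4f})")
```

Keeping the mean-response and individual-value standard errors side by side shows that they differ only by the extra 1 under the square root, which is why the prediction interval is much wider.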