MULTICOLLINEARITY
Multicollinearity is the presence of a linear relationship among some or all of the predictor variables included in a multiple linear regression model.
Let the salary (Y)
In this model, Years of schooling (X1)
If this assumption is violated and there is a linear relationship between some or all predictor variables in a multiple linear regression model, the problem of multicollinearity is said to exist.
In multicollinearity
Let the salary(Y)
Consider the General linear model:
The regression model with two regressors is given
Nature (Types) of Multicollinearity
We can distinguish between two types of multicollinearity.
1. Perfect multicollinearity.
2. Imperfect multicollinearity.
Perfect Multicollinearity
Perfect multicollinearity exists when two or more predictor variables in an econometric model have an exact linear relationship. Consider the following regression model.
Imperfect Multicollinearity
Imperfect (near) multicollinearity exists when there is an approximate linear relationship between two or more predictor variables included in an econometric model. The correlation between two predictor variables (r12) will lie strictly between -1 and +1, while being close to one of these limits.
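The difference between the two types can be sketched numerically through the correlation r12. The example below is only an illustration under assumed data: the values of x1, the relation x2 = 2*x1 + 1, the small disturbance terms, and the helper name pearson_r are all made up for this sketch.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values of the first predictor.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Perfect case: x2 is an exact linear function of x1.
x2_perfect = [2.0 * v + 1.0 for v in x1]

# Imperfect case: the same relation plus a small made-up disturbance.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1]
x2_near = [2.0 * v + 1.0 + e for v, e in zip(x1, noise)]

r_perfect = pearson_r(x1, x2_perfect)  # equal to 1 up to rounding
r_near = pearson_r(x1, x2_near)        # close to 1, but strictly below it

print("perfect:", r_perfect)
print("near:   ", r_near)
```

In the perfect case the correlation reaches the limit of the range, whereas in the imperfect case it stays inside the interval while remaining close to the limit.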
Consider a multiple linear regression model with two predictors.
The relation between the predictors included in the above regression model is
Diagrammatic representation of weak and strong multicollinearity
The Reason(s) behind Multicollinearity's existence
1. An overdefined econometric model
An overdefined model is one that contains more explanatory variables than observations. When a large number of explanatory variables are included in the model to make it more realistic, the number of observations "n" can become smaller than the number of explanatory variables "k". Such a situation can arise in medical research, where the number of patients is relatively small even though information on a large number of factors is collected.
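The overdefined case can be checked on a small made-up example. The sketch below assumes a design matrix with n = 3 observations, an intercept column, and k = 4 explanatory variables; the numbers and the helper name matrix_rank are hypothetical. It uses exact fractions so that the rank is not affected by rounding.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below position `rank` with a nonzero entry here.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank

# Hypothetical data: 3 patients (rows), intercept plus 4 factors (columns).
X = [
    [1, 2, 7, 1, 5],
    [1, 4, 3, 8, 2],
    [1, 6, 5, 2, 9],
]
n_obs, n_columns = len(X), len(X[0])

# Cross-product matrix X'X, which has n_columns rows and columns.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(n_columns)]
       for i in range(n_columns)]

rank_x = matrix_rank(X)
rank_xtx = matrix_rank(XtX)
print(rank_x, rank_xtx, n_columns)
```

Because the rank can never exceed the number of observations, X'X has fewer independent columns than coefficients to estimate, so the columns of X are necessarily linearly related.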
2. The technique of data collection
When the researcher samples only a subspace of the predictor region, this data collection strategy can lead to multicollinearity.
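A small simulation can illustrate how sampling only part of the predictor region induces correlation. The sketch below is built on assumptions: two predictors that are independent over the unit square, and a hypothetical sampling rule that only observes units whose two predictor values differ by less than 0.1.

```python
import math
import random

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two predictors drawn independently over the whole unit square.
points = [(rng.random(), rng.random()) for _ in range(5000)]

# Hypothetical sampling rule: keep only a narrow band around the diagonal.
subset = [(a, b) for a, b in points if abs(a - b) < 0.1]

r_full = pearson_r([p[0] for p in points], [p[1] for p in points])
r_subset = pearson_r([p[0] for p in subset], [p[1] for p in subset])

print("whole region:", round(r_full, 3))
print("subspace:    ", round(r_subset, 3))
```

Over the whole region the predictors are nearly uncorrelated, while within the sampled band they are strongly correlated, even though nothing in the population ties them together.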
3. Population and model restrictions
Certain constraints may apply to the model or to the population from which the sample is drawn; for example, the predictors should be uncorrelated and should only influence the response variable, not vice versa.
4. Inclusion of predictor variables that can be computed from other predictor variables
The inclusion of predictors that can be calculated from other predictor variables in the regression model can lead to the problem of multicollinearity. For example, investment income and savings income may both be used as predictor variables in a regression model.
5. Using the same variable twice
When two measures of the same concept are included in an econometric model, the problem of multicollinearity arises. For example, weight in pounds and weight in kilograms may both be used as predictor variables in a regression model.
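The weight example can be worked through numerically. The sketch below assumes five made-up weights in kilograms and an approximate conversion factor of 2.20462 pounds per kilogram; the helper name centered_cross_product is hypothetical. Exact fractions are used so that the result is not blurred by rounding.

```python
from fractions import Fraction

# Approximate conversion factor, stored exactly as a fraction.
LB_PER_KG = Fraction(220462, 100000)

# Hypothetical weights; the pounds column is derived from the kilograms column.
kg = [Fraction(v) for v in (55, 62, 70, 81, 94)]
lb = [v * LB_PER_KG for v in kg]

def centered_cross_product(x, y):
    """Sum of products of deviations from the respective means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

s_kk = centered_cross_product(kg, kg)
s_ll = centered_cross_product(lb, lb)
s_kl = centered_cross_product(kg, lb)

# Determinant of the 2x2 matrix of centered sums of squares and cross products.
# A zero determinant means the matrix cannot be inverted, so the two slope
# coefficients cannot be estimated separately.
determinant = s_kk * s_ll - s_kl ** 2
print(determinant)
```

Since one column is a constant multiple of the other, the determinant is exactly zero, which is the perfect multicollinearity case.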
6. Dummy variable trap
When categorical variables, such as gender (male/female) or season (summer/winter/fall/spring), are included as independent variables in a regression model, they are coded as dummy variables that take the values 0 and 1, signifying the absence or presence of the category.
Consider the model with a dummy variable
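The number of dummy variables must be smaller than the number of categories when the model contains an intercept. If we use as many dummy variables as there are categories, we get perfect multicollinearity, because the dummy variables then sum to one for every observation.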
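The trap can be seen directly on a small example. The sketch below uses the season categories mentioned above with a made-up sample of six observations, and assumes the model contains an intercept column of ones; the variable names are hypothetical.

```python
# Hypothetical sample of six observations of a categorical variable.
seasons = ["summer", "winter", "fall", "spring", "summer", "fall"]
categories = ["summer", "winter", "fall", "spring"]

# One 0/1 dummy per category (the trap: as many dummies as categories).
dummies = {c: [1 if s == c else 0 for s in seasons] for c in categories}
intercept = [1] * len(seasons)

# Row by row, the four dummies add up to the intercept column,
# which is an exact linear relationship among the regressors.
row_sums = [sum(dummies[c][i] for c in categories) for i in range(len(seasons))]
trap = row_sums == intercept

# Dropping one category ("spring" becomes the baseline) breaks that identity.
kept = categories[:-1]
row_sums_kept = [sum(dummies[c][i] for c in kept) for i in range(len(seasons))]
identity_broken = row_sums_kept != intercept

print("all dummies sum to the intercept:", trap)
print("identity broken after dropping one dummy:", identity_broken)
```

With all four dummies the intercept column is reproduced exactly, while with three dummies the omitted category acts as the reference group and the exact relationship disappears.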
If
we use one dummy variable “D1