Violation of assumptions
Tomás Rosa, Notes, 14 September 2015
Under the Gauss-Markov assumptions (named after Carl Friedrich Gauss and Andrey Markov), the model's estimators can be proven to be BLUE (best linear unbiased estimators). These assumptions are a keystone of every successful OLS model: they allow econometricians to perform the usual inference methods with confidence.
Assumption 1: Linear in parameters
This assumption holds because the dependent variable, wage, is specified as a linear function of the parameters on the independent variables, plus an error term.
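Using the regressors named later in the paper (the variable name for quarterback rate is an assumption, since the paper does not give its code name), a specification that is linear in parameters takes a form like:

```latex
\mathit{wage} = \beta_0 + \beta_1\,\mathit{TDperseason} + \beta_2\,\mathit{intperseason}
  + \beta_3\,\mathit{draft} + \beta_4\,\mathit{experience} + \beta_5\,\mathit{qbrate} + u
```

Linearity in parameters means the betas enter additively and multiplicatively; the regressors themselves may be transformed (logs, squares) without violating the assumption.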
Assumption 2: Random Sampling
This assumption cannot fail since our 'sample' is the whole population, which was small enough (102 observations) to be studied in its entirety.
Assumption 3: Sample Variation in the Explanatory Variable
This assumption almost always holds in interesting regression applications; without it we cannot even obtain the OLS estimators. In our model this assumption holds because the sample outcomes on each xj (such as experience, draft position, age...) are not all the same value.
Assumption 4: Zero Conditional Mean
This assumption holds because u has an expected value of zero given any value of the explanatory variables; in other words, E(u|x) = 0.
Since none of the first four assumptions fails, we can guarantee the estimators are unbiased.
Assumption 5: Homoskedasticity
This assumption holds if the error term u has the same variance given any value of the explanatory variables. If the error terms do not have constant variance, they are said to be heteroskedastic.
To determine whether heteroskedasticity exists in our model, we ran a Breusch-Pagan test, which is designed to detect any linear form of heteroskedasticity. The Breusch-Pagan/Cook-Weisberg test evaluates the null hypothesis that the error variances are all equal against the alternative that the error variances are a multiplicative function of one or more variables. Since we rejected the null hypothesis, we concluded that the homoskedasticity assumption does not hold. The estimators remain unbiased; however, OLS estimates are no longer BLUE. That is, among all the linear unbiased estimators, OLS no longer provides the estimate with the smallest variance. To be able to make valid inference, we corrected for heteroskedasticity in Stata by computing robust standard errors.
Results
[pic 1]
This is the output we obtained from our first linear regression model:
-R-squared is 61.21%, which means that our explanatory (independent) variables explain that percentage of the variation in the dependent variable. This value is similar to the values found in papers that studied the same subject.
-We can say that our model is jointly significant, since the p-value of the F-test is extremely low (0.000), which keeps us confident about the model.
-Finally, when we analysed the individual statistical significance of our explanatory variables, we verified that touchdowns per season (TDperseason), interceptions per season (intperseason), draft round in which the player was picked (draft), and years of experience (experience) were individually statistically significant, with very low p-values (0; 0.025; 0.004; 0.001; 0.005). Even quarterback rate did not end up far from significance (0.11). We tried to understand the non-significance of this variable, which is the most ''unnatural'' of them all, since the rate is elaborated by specialized magazines and is therefore subject to the personal considerations of their journalists. However, we were not fully convinced by this explanation, and we expected that the result might change once we corrected for heteroskedasticity and obtained the robust standard errors.
...