Although we focused on logarithms in class, other non-linear transformations can also be used within a regression model. For the wage and education example considered in lecture, suppose that we wanted to relate wages to the square root of education. The SLR model would then be $wage = \beta_0 + \beta_1 \sqrt{educ} + u$.
a. Derive the formula for the effect of educ on wage by taking the derivative dE(wage|educ)/d educ. Note that, unlike in the basic SLR model, this derivative depends on the x variable (here, educ).
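As a reminder of the mechanics, here is the same calculation for the logarithmic case covered in class (a level-log model, used purely as an illustration and not the model asked about here). The zero-conditional-mean assumption E(u|educ) = 0 is assumed to hold, as in the basic SLR setup:

```latex
\begin{align*}
E(wage \mid educ) &= \beta_0 + \beta_1 \log(educ) \\
\frac{dE(wage \mid educ)}{d\,educ} &= \beta_1 \cdot \frac{d \log(educ)}{d\,educ} = \frac{\beta_1}{educ}
\end{align*}
```

The square-root model in this question is handled the same way, with the derivative of the square root taking the place of the derivative of the logarithm.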
Here is the Stata output for the regression of wage on the square root of educ (using wage1.dta):
. gen sqrteduc = sqrt(educ)
. regr wage sqrteduc

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  1,   524) =   78.27
       Model |  930.606128     1  930.606128           Prob > F      =  0.0000
    Residual |  6229.80816   524  11.8889469           R-squared     =  0.1300
-------------+------------------------------           Adj R-squared =  0.1283
       Total |  7160.41429   525  13.6388844           Root MSE      =   3.448

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    sqrteduc |   2.947042   .3331004     8.85   0.000     2.292666    3.601419
       _cons |  -4.464346   1.180639    -3.78   0.000    -6.783714   -2.144978
------------------------------------------------------------------------------
b. Estimate the effect of educ on wage at both educ=12 and educ=16 (using your formula from the previous part). How do these effects compare to the estimated effects from the original SLR model? For your reference, the original regression (wage on educ) results were:
. regr wage educ

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  1,   524) =  103.36
       Model |  1179.73204     1  1179.73204           Prob > F      =  0.0000
    Residual |  5980.68225   524  11.4135158           R-squared     =  0.1648
-------------+------------------------------           Adj R-squared =  0.1632
       Total |  7160.41429   525  13.6388844           Root MSE      =  3.3784

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5413593    .053248    10.17   0.000     .4367534    .6459651
       _cons |  -.9048516   .6849678    -1.32   0.187    -2.250472    .4407687
------------------------------------------------------------------------------
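One way to sanity-check the effects computed by hand in part b is a finite-difference approximation: compare the predicted wage at educ and at educ plus a very small step, then divide by the step. The following is only a sketch. It assumes wage1.dta is in the current working directory, and the scalar name `step` and its value are illustrative choices rather than part of the assignment.

```stata
* Sketch only: assumes wage1.dta is in the current working directory
use wage1, clear
gen sqrteduc = sqrt(educ)
regress wage sqrteduc

* Small step for the finite difference (hypothetical name and value)
scalar step = 0.001

* Approximate slope of the fitted curve at educ = 12.
* The intercept cancels in the difference, so only _b[sqrteduc] appears.
display (_b[sqrteduc]*sqrt(12 + scalar(step)) - _b[sqrteduc]*sqrt(12)) / scalar(step)

* Same approximation at educ = 16
display (_b[sqrteduc]*sqrt(16 + scalar(step)) - _b[sqrteduc]*sqrt(16)) / scalar(step)
```

The two displayed numbers should agree closely with the values obtained from the derivative formula in part a. A noticeable gap would suggest an error in the derivation or in the arithmetic.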
c. Which regression (wage on educ or wage on sqrteduc) appears to give a better overall fit? Should overall fit (as measured by R-squared) be the only reason to choose one model specification over another? Explain.