Lab 4   2/27/16   lalonde ancova

##### But wait, we must say "we are not done until the ancova is run"
# refer back to week 1 (or week 5): the social science practice is to put
# in the treatment variable (week 1 exs, breastfeeding, sexy media) and
# a whole bunch of other variables to "control" for self-selection, nonequivalence etc.
# week 5 we saw that this was usually equivalent to analysis of covariance, by whatever name

> data(lalonde)
> attach(lalonde)
> dim(lalonde)
[1] 614  10
> names(lalonde)
 [1] "treat"    "age"      "educ"     "black"    "hispan"   "married"  "nodegree" "re74"
 [9] "re75"     "re78"
> ancova.lalonde = lm( re78 ~ treat + age + educ + black + hispan + married + nodegree + re74 + re75)
> summary(ancova.lalonde)

Call:
lm(formula = re78 ~ treat + age + educ + black + hispan + married +
    nodegree + re74 + re75)

Residuals:
   Min     1Q Median     3Q    Max
-13595  -4894  -1662   3929  54570

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  6.651e+01  2.437e+03   0.027   0.9782
treat        1.548e+03  7.813e+02   1.982   0.0480 *
age          1.298e+01  3.249e+01   0.399   0.6897
educ         4.039e+02  1.589e+02   2.542   0.0113 *
black       -1.241e+03  7.688e+02  -1.614   0.1071
hispan       4.989e+02  9.419e+02   0.530   0.5966
married      4.066e+02  6.955e+02   0.585   0.5590
nodegree     2.598e+02  8.474e+02   0.307   0.7593
re74         2.964e-01  5.827e-02   5.086 4.89e-07 ***
re75         2.315e-01  1.046e-01   2.213   0.0273 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6948 on 604 degrees of freedom
Multiple R-squared:  0.1478,	Adjusted R-squared:  0.1351
F-statistic: 11.64 on 9 and 604 DF,  p-value: < 2.2e-16

So it turns out we were wrong all along (along with the labor economists):
there is a significant effect of treat (job training), $1548 (p = 0.048).
Well, I'll be.....
But if you use different subsets of these covariates you may/will (don't want
to spoil the surprise) get quite different results.
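
One way to try the covariate-subset experiment is sketched below. This is only a sketch: the helper treat_by_subset and the four subsets are hypothetical, picked for illustration and not taken from the lab, and it assumes treat is coded 0/1 as in lalonde. Output is left out so the surprise stays unspoiled.

```r
# Hypothetical helper (not part of the lab): refit outcome ~ treat + covariates
# for each covariate subset and collect the treat row of the coefficient table.
treat_by_subset <- function(dat, subsets, outcome = "re78", treatment = "treat") {
  rows <- lapply(subsets, function(covs) {
    # reformulate() builds a formula such as re78 ~ treat + re74 + re75
    f <- reformulate(c(treatment, covs), response = outcome)
    coef(summary(lm(f, data = dat)))[treatment, ]
  })
  out <- do.call(rbind, rows)
  rownames(out) <- names(subsets)
  out
}

# Illustrative subsets only; this particular choice is an assumption
subsets <- list(
  none         = character(0),
  earnings     = c("re74", "re75"),
  demographics = c("age", "educ", "black", "hispan", "married", "nodegree"),
  all          = c("age", "educ", "black", "hispan", "married", "nodegree",
                   "re74", "re75")
)

# Runs only if lalonde is already loaded, as in the session above
if (exists("lalonde")) print(treat_by_subset(lalonde, subsets))
```

Each row of the result is the treat line from one summary(lm(...)) table, so the estimate, standard error, t value and p-value can be compared across subsets at a glance. The "all" row should reproduce the treat line of the ancova above.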
(Or to increase the chaos, try some version of IV, i.e. instrumental variables)

---------------------------------
just for reference, the stat60 t-test or regression version -- no effect of treat (in opposite direction)

> t.test( re78 ~ treat)

	Welch Two Sample t-test

data:  re78 by treat
t = 0.9377, df = 326.412, p-value = 0.3491
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -697.192 1967.244
sample estimates:
mean in group 0 mean in group 1
       6984.170        6349.144

> summary(lm(re78 ~ treat))  # regression version of t-test

Call:
lm(formula = re78 ~ treat)

Residuals:
   Min     1Q Median     3Q    Max
 -6984  -6349  -2048   4100  53959

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   6984.2      360.7  19.362   <2e-16 ***
treat         -635.0      657.1  -0.966    0.334
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7471 on 612 degrees of freedom
Multiple R-squared:  0.001524,	Adjusted R-squared:  -0.0001079
F-statistic: 0.9338 on 1 and 612 DF,  p-value: 0.3342
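
The two outputs above differ in sign and slightly in size (t = 0.9377 versus -0.966). t.test reports group 0 minus group 1 while the lm slope is group 1 minus group 0, and t.test defaults to the Welch (unequal variance) version whereas lm pools the variance. The sketch below checks this; compare_t_lm is a hypothetical helper, not part of the lab, and assumes a 0/1 numeric group variable.

```r
# Hypothetical helper: line up the lm() slope with the pooled and Welch t-tests.
# y is the outcome, g a 0/1 numeric group indicator.
compare_t_lm <- function(y, g) {
  fit    <- coef(summary(lm(y ~ g)))
  pooled <- t.test(y ~ g, var.equal = TRUE)  # pooled variance, as lm() assumes
  welch  <- t.test(y ~ g)                    # default: unequal variances
  c(lm_slope  = fit["g", "Estimate"],
    # mean of group 1 minus mean of group 0, same direction as the slope
    mean_diff = unname(pooled$estimate[2] - pooled$estimate[1]),
    lm_t      = fit["g", "t value"],
    # t.test uses group 0 minus group 1, so flip the sign to match lm
    pooled_t  = -unname(pooled$statistic),
    welch_t   = -unname(welch$statistic))
}

# Runs only if lalonde is already loaded, as in the session above
if (exists("lalonde")) print(compare_t_lm(lalonde$re78, lalonde$treat))
```

lm_slope should equal mean_diff and lm_t should equal pooled_t exactly; welch_t will be close but not identical when the two groups have different variances or sizes.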