Lab 4 Stat209 2/27/16
Rogosa Lab 4 R-session (base: secs 1-3)
Preliminaries, install MatchIt and get lab data
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
# I started from a relatively clean install; get MatchIt and optmatch
# in some years the order matters because of complications with the license for the optmatch algorithms
# this year appears smooth
> install.packages("optmatch")
> library(optmatch)
> install.packages("MatchIt")
> library(MatchIt)
> data(nuclearplants) # we don't need this, just for illustration
Now to get going on our Lab 4 tasks
> data(lalonde) # get lalonde data from MatchIt, there are different versions in existence under this name (see M-B text)
# or simply
help(lalonde) #produces
-----------------------------------
lalonde package:MatchIt R Documentation
Data from National Supported Work Demonstration and PSID, as analyzed by Dehejia and Wahba (1999).
Description:
This is a subsample of the data from the treated group in the
National Supported Work Demonstration (NSW) and the comparison
sample from the Current Population Survey (CPS). This data was
previously analyzed extensively by Lalonde (1986) and Dehejia and
Wahba (1999). The full dataset is available at . [note: broken link still in current documentation]
Usage:
data(lalonde)
Format:
A data frame with 313 [sic, 614] observations (185 treated, 429 control).
There are 10 variables measured for each individual. "treat" is
the treatment assignment (1=treated, 0=control). "age" is age in
years. "educ" is education in number of years of schooling.
"black" is an indicator for African-American (1=African-American,
0=not). "hispan" is an indicator for being of Hispanic origin
(1=Hispanic, 0=not). "married" is an indicator for married
(1=married, 0=not married). "nodegree" is an indicator for
whether the individual has a high school degree (1=no degree,
0=degree). "re74" is income in 1974, in U.S. dollars. "re75" is
income in 1975, in U.S. dollars. "re78" is income in 1978, in
U.S. dollars.
References:
Lalonde, R. (1986). Evaluating the econometric evaluations of
training programs with experimental data. American Economic
Review 76: 604-620.
Dehejia, R.H. and Wahba, S. (1999). Causal Effects in
Nonexperimental Studies: Re-Evaluating the Evaluation of
Training Programs. Journal of the American Statistical
Association 94: 1053-1062.
-----------------------------------------------
note: Week 8 primary readings include Matching Methods for Causal Inference,
Elizabeth Stuart and Donald Rubin [does the lalonde lab4 example]
http://www.corwin.com/upm-data/18066_Chapter_11.pdf
> dim(lalonde) # we have 614 cases, 10 vars; see lab script ("exposition and commands") part 2
[1] 614 10
> names(lalonde)
[1] "treat" "age" "educ" "black" "hispan" "married"
[7] "nodegree" "re74" "re75" "re78"
> attach(lalonde)
> table(treat) # so these summaries synch with data description
treat
0 1
429 185
> lalonde[1:10,] # here's the first 10 observations, all from the treatment group
treat age educ black hispan married nodegree re74 re75 re78
NSW1 1 37 11 1 0 1 1 0 0 9930.0
NSW2 1 22 9 0 1 0 1 0 0 3595.9
NSW3 1 30 12 1 0 0 0 0 0 24909.5
NSW4 1 27 11 1 0 0 1 0 0 7506.1
NSW5 1 33 8 1 0 0 1 0 0 289.8
NSW6 1 22 9 1 0 0 1 0 0 4056.5
NSW7 1 23 12 1 0 0 0 0 0 0.0
NSW8 1 32 11 1 0 0 1 0 0 8472.2
NSW9 1 22 16 1 0 0 0 0 0 2164.0
NSW10 1 33 12 0 0 1 0 0 0 12418.1
## before main events here's a warm-up
> # brute-force matching-- Lab4 script item 3.1
> numObs = dim(lalonde)[1]
> numObs
[1] 614
> # that's the 429 + 185, not going to distinguish between T/C here
#Do it yourself matching function, Lab text part 3#######
> indiv1 = lalonde[1,]
> indiv1 # the target for this matching exercise
treat age educ black hispan married nodegree re74 re75 re78
NSW1 1 37 11 1 0 1 1 0 0 9930
> # aside indiv1 is in the treatment group, the first 185 cases are all in treatment
> # for example, examine lalonde[185,] lalonde[186,]
> matches = matrix( nrow = numObs, ncol = 1, data = FALSE )
> # set up a dummy matrix and then loop through the data with specified calipers
> for ( i in 1:numObs) {
+ matches[i,1] <- all(
+ c(lalonde[i,4:7] == indiv1[4:7],
+ (lalonde[i,3] < indiv1[3] + 3) && (lalonde[i,3] > indiv1[3] - 3),
+ (lalonde[i,2] < indiv1[2] + 3) && (lalonde[i,2] > indiv1[2] - 3)
+ ) )
+ }
# Karen's function picks out exact matches on the four dichotomous variables (black, hispan, married, nodegree)
# and caliper matching on age and educ
> sum(as.numeric(matches) )
[1] 4
# note this count includes a match to itself (the target indiv1)
> # so we only got 3 matches to other cases: 183, 191, 212, the latter 2 being control group members
> matches # you can print out all 614; only selected rows shown here
[,1]
[1,] TRUE
[2,] FALSE
[182,] FALSE
[183,] TRUE
[190,] FALSE
[191,] TRUE
[211,] FALSE
[212,] TRUE
[613,] FALSE
[614,] FALSE
> lalonde[191,]
treat age educ black hispan married nodegree re74 re75 re78
PSID6 0 37 9 1 0 1 1 13685 12756 17833
> lalonde[212,]
treat age educ black hispan married nodegree re74 re75 re78
PSID27 0 36 9 1 0 1 1 13256 8457 0
> lalonde[183,]
treat age educ black hispan married nodegree re74 re75 re78
NSW183 1 35 9 1 0 1 1 13602 13831 12804
> indiv1
treat age educ black hispan married nodegree re74 re75 re78
NSW1 1 37 11 1 0 1 1 0 0 9930
> # if we took re74 re75 into account we wouldn't have any control matches for indiv1; matching not so easy as dimension increases
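The brute-force loop above can be packaged as a reusable helper. A sketch (my own function and argument names, not part of the lab code): exact matching on a set of dichotomous variables plus a +/- caliper on the continuous ones, returning the row indices of all matches, including the target itself, as above.

```r
# Sketch (hypothetical helper, not from the lab): generalize the brute-force matcher.
# Exact-match on `exact_vars`; require |difference| < caliper on `caliper_vars`.
find_matches <- function(data, target, exact_vars, caliper_vars, caliper = 3) {
  ok_exact <- apply(data[, exact_vars, drop = FALSE], 1,
                    function(row) all(row == unlist(target[exact_vars])))
  ok_caliper <- rep(TRUE, nrow(data))
  for (v in caliper_vars) {
    ok_caliper <- ok_caliper & (abs(data[[v]] - target[[v]]) < caliper)
  }
  unname(which(ok_exact & ok_caliper))
}

# e.g. find_matches(lalonde, lalonde[1, ],
#                   exact_vars   = c("black", "hispan", "married", "nodegree"),
#                   caliper_vars = c("age", "educ"))
```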
########### aside, optmatch demo
> # optmatch (Ben Hansen) could also find matches to person 1
>
# I just used age and education for a demo; if I had also specified the dichotomous variables
# I would have had an alternative to the results above
> m1 = matchit(treat ~ age + educ, data = lalonde, method = "optimal", ratio = 2)
Loading required package: optmatch
> m1.data = match.data(m1)
# so I created 185 subclasses, each with 3 members (2 control, 1 treatment), fun
> names(m1.data)
[1] "treat" "age" "educ" "black" "hispan" "married" "nodegree" "re74" "re75" "re78"
[11] "distance" "weights" "subclass"
# pick off subclass 1; two control members (not the same as above, since I didn't use the other vars)
> s1 = subset(m1.data, subset = subclass < 2, select = c(age, educ, distance, subclass))
> dim(s1)
[1] 3 4
> s1
age educ distance subclass
NSW1 37 11 0.2537 1
PSID9 38 9 0.2467 1
PSID400 37 8 0.2499 1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
########## back to the main event: propensity scores and treatment effects using stratification
## for reference, the stat60 t-test or regression version shows no significant effect of treat
## point estimate, negative treatment effect (in opposite direction)
> t.test( re78 ~ treat)
Welch Two Sample t-test
data: re78 by treat
t = 0.9377, df = 326.412, p-value = 0.3491
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-697.192 1967.244
sample estimates:
mean in group 0 mean in group 1
6984.170 6349.144
> summary(lm(re78 ~ treat)) #regression version of t-test
Call:
lm(formula = re78 ~ treat)
Residuals:
Min 1Q Median 3Q Max
-6984 -6349 -2048 4100 53959
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6984.2 360.7 19.362 <2e-16 ***
treat -635.0 657.1 -0.966 0.334
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7471 on 612 degrees of freedom
Multiple R-squared: 0.001524, Adjusted R-squared: -0.0001079
F-statistic: 0.9338 on 1 and 612 DF, p-value: 0.3342
> # Lab script 3.2
# now do the logistic regression that computes propensity scores (matching packages will do this for you with propen as distance measure)
> glm.lalonde = glm( treat ~ age + educ + black + hispan + married + nodegree + re74 + re75, data = lalonde, family = binomial)
> summary(glm.lalonde)
Call:
glm(formula = treat ~ age + educ + black + hispan + married +
nodegree + re74 + re75, family = binomial, data = lalonde)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.764 -0.474 -0.286 0.751 2.717
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.73e+00 1.02e+00 -4.65 3.3e-06 ***
age 1.58e-02 1.36e-02 1.16 0.2452
educ 1.61e-01 6.51e-02 2.48 0.0133 *
black 3.07e+00 2.87e-01 10.70 < 2e-16 ***
hispan 9.84e-01 4.26e-01 2.31 0.0208 *
married -8.32e-01 2.90e-01 -2.87 0.0042 **
nodegree 7.07e-01 3.38e-01 2.09 0.0362 *
re74 -7.18e-05 2.87e-05 -2.50 0.0125 *
re75 5.34e-05 4.63e-05 1.15 0.2488
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 751.49 on 613 degrees of freedom
Residual deviance: 487.84 on 605 degrees of freedom
AIC: 505.8
Number of Fisher Scoring iterations: 5
# best case is counter-intuitive; you really want not-wonderful predictions (analog to Rsq) so that
# you have both treat=1 and treat=0 cases at levels of propen--i.e. good overlap
> propen = fitted(glm.lalonde) # now we have the propensity scores, Lab script calls these propScores
> quantile(propen) # overall distrib
0% 25% 50% 75% 100%
0.00908 0.04854 0.12068 0.63872 0.8531
> tapply(propen, treat, quantile) # look at overlap via 5-number summary (or side-by-side boxplots)
# not good overlap, as noted in class handout snippet
$`0`
0% 25% 50% 75% 100%
0.00908 0.03888 0.07585 0.19514 0.78917
$`1`
0% 25% 50% 75% 100%
0.02495 0.52646 0.65368 0.72660 0.85315
> # we are fitting prob(treat = 1), so the fits for those in the treatment group
> # will be larger; we need good overlap for matching purposes
> boxplot(propen ~ treat) #gives side-by-side boxplots, you can add labels, not wonderful overlap compared to class exs
### could get fancier and recreate class displays of superimposed histograms (e.g. SAT, Ben Hansen)
### package such as 'HistogramTools' you can even do operations such as IntersectHistogram, but focus on basics here
###########
# superimposed histograms
# credit to Lucas Janson, creating the histogram code in real-time
# pick off propensity scores for control (p0) and treatment (p1) as separate vars, and check it
> p1 = propen[treat == 1]
> length(p1)
[1] 185
> p0 = propen[treat == 0]
> length(p0)
[1] 429
> fivenum(p1)
NSW124 NSW156 NSW50 NSW119 NSW178
0.02495179 0.52646352 0.65368426 0.72659995 0.85315284
> fivenum(p0)
PSID296 PSID347 PSID221 PSID334 PSID118
0.009080193 0.038880745 0.075849106 0.195135746 0.789172834
> # ok, this matches fivenum for full data set, I did subsetting right for once
> hist(p0,col=rgb(0,0,1,0.7),xlim=range(c(p0,p1)))
> hist(p1,col=rgb(1,0,0,0.7),add=T)
> # superimposed propensity histograms, like Ben Hansen SAT; control is blue, treatment is red, overlap close to perfect Stanford Cardinal red
# could supply more useful labels than my lazy implementation
# could also adjust the binning ("breaks") to make more bins, or make bins same as strata
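A slightly more careful version of the overlap plot: passing the same breaks to both calls guarantees the two histograms' bins line up (with separate calls, hist chooses breaks for each group independently). The scores below are simulated stand-ins so the sketch runs on its own; substitute the p0 and p1 computed above.

```r
# Sketch: superimposed propensity histograms on a common bin grid.
# p0/p1 here are simulated stand-ins for the control/treated scores above.
set.seed(209)
p0 <- rbeta(429, 1.5, 8)   # stand-in for control propensities
p1 <- rbeta(185, 5, 3)     # stand-in for treated propensities

brks <- seq(0, 1, by = 0.05)          # one grid shared by both groups
hist(p0, breaks = brks, col = rgb(0, 0, 1, 0.7),
     main = "Propensity score overlap", xlab = "propensity score")
hist(p1, breaks = brks, col = rgb(1, 0, 0, 0.7), add = TRUE)
legend("topright", fill = c(rgb(0, 0, 1, 0.7), rgb(1, 0, 0, 0.7)),
       legend = c("control", "treated"))
```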
##############
> tapply(propen, treat, stem) #less useful, stacked stem-and-leafs, something you can output in ascii
The decimal point is 1 digit(s) to the left of the | # treat = 0
0 | 11111111111112222222222222222222222222222222222222222222233333333333+57
0 | 55555555555555555555555555555556666666666666666666666666777777777777+39
1 | 000000000000000011111111111111112222222222223334444
1 | 56677888888889
2 | 0000011122244
2 | 55667788
3 | 0
3 | 6778899
4 | 1233444
4 | 568999
5 | 1144
5 | 5679
6 | 11112234444
6 | 555556667778888899999999
7 | 111112222444
7 | 55555555689
The decimal point is 1 digit(s) to the left of the | # treat = 1
0 | 244
0 | 6788899
1 | 000122334444
1 | 55
2 | 012
2 | 579
3 |
3 | 8899
4 | 0134
4 | 5559
5 | 1112334
5 | 6666677778899999
6 | 00111122222334444
6 | 55555555555666666778888888899999
7 | 000000001111222222222223333344444
7 | 555555566666777777777778888888999
8 | 1122
8 | 5
$`0`
NULL
$`1`
NULL
# or you can have MatchIt compute propensity scores for you and do a simple match (as in the Aspirin Study (Love))
# or more complex 'additional matching section'
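For intuition about what a simple nearest-neighbor match does before handing the job to MatchIt, here is a base-R sketch of greedy 1:1 matching on the propensity score. This illustrates the idea only (it is not MatchIt's actual algorithm) and assumes at least as many controls as treated units.

```r
# Sketch: greedy 1:1 nearest-neighbor matching on a propensity score,
# without replacement. ps = score vector, z = 0/1 treatment indicator.
greedy_match <- function(ps, z) {
  treated <- which(z == 1)
  pool    <- which(z == 0)
  # conventionally, match the hardest cases (highest scores) first
  treated <- treated[order(ps[treated], decreasing = TRUE)]
  pairs <- matrix(NA_integer_, nrow = length(treated), ncol = 2,
                  dimnames = list(NULL, c("treated", "control")))
  for (k in seq_along(treated)) {
    i <- treated[k]
    j <- pool[which.min(abs(ps[pool] - ps[i]))]  # closest remaining control
    pairs[k, ] <- c(i, j)
    pool <- setdiff(pool, j)                     # use each control only once
  }
  pairs
}

# e.g. nn.pairs <- greedy_match(propen, lalonde$treat)
```

Greedy matching is order-dependent; the "optimal" method used in the optmatch demo above instead minimizes the total distance across all pairs.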
######subclassification (by hand) #################
> # a common use of the propensity scores (backed by theory, class handout week 8)
> # is to stratify by quintiles
> # the simple-minded way I do it is to use "cut", Lab script has fancier programming
> ?cut # this is a simple function to create bins
> k = 1:4
> quantile(propen, k/5) # defines 5 bins (equal sample size) for the 614 propensity scores
20% 40% 60% 80%
0.04015 0.08721 0.26978 0.67085
> propbin = cut(propen, c(0, .04015,.08721,.26978,.67085,1))
> propen[1:10] # show propen values that get put into bins for first 10 cases
NSW1 NSW2 NSW3 NSW4 NSW5 NSW6 NSW7 NSW8 NSW9 NSW10
0.63877 0.22463 0.67824 0.77632 0.70164 0.69907 0.65368 0.78972 0.77984 0.04292
> propbin[1:10] # here are the bins for those values
[1] (0.27,0.671] (0.0872,0.27] (0.671,1] (0.671,1]
[5] (0.671,1] (0.671,1] (0.27,0.671] (0.671,1]
[9] (0.671,1] (0.0401,0.0872]
Levels: (0,0.0401] (0.0401,0.0872] (0.0872,0.27] (0.27,0.671] (0.671,1]
> # levels are the 5 quintiles
> table(propbin, treat) # either way you display it, we do not have good overlap in the bottom
# two quintiles: few treatment cases have a low estimated probability of being in treatment
# (see your side-by-side boxplots)
treat
propbin 0 1
(0,0.0401] 122 1
(0.0401,0.0872] 116 7
(0.0872,0.27] 101 21
(0.27,0.671] 53 71
(0.671,1] 37 85
> tapply(propbin, treat,table) # show same thing sideways
$`0`
(0,0.0401] (0.0401,0.0872] (0.0872,0.27] (0.27,0.671] (0.671,1]
122 116 101 53 37
$`1`
(0,0.0401] (0.0401,0.0872] (0.0872,0.27] (0.27,0.671] (0.671,1]
1 7 21 71 85
> table(treat)
treat
0 1
429 185
> #we have (real) overlap only in the upper 3 quintiles of the propensity score
##### looking at the code above, here's a one-line version using cut, cleaner, gives same binning
> pbin2 = cut(propen, quantile(propen, seq(0, 1, 1/5)), include.lowest = TRUE, labels = FALSE)
> table(pbin2 , lalonde$treat)
pbin2 0 1
1 122 1
2 116 7
3 101 21
4 53 71
5 37 85
##############################
## an important diagnostic step, which we focused on in class examples Friday, is checking balance
## on each of the confounding vars (i.e. covariates) within each subclassification, cf. Rosenbaum-Rubin ex
## in class we showed balance on covariates for Aspirin (nearest neighbor matching) and SAT (full matching)
# a quick look is not overwhelmingly impressive, but some improvement
# for the two groups, before stratification
> tapply(propen, treat, mean) # of course these should differ
0 1
0.1822248 0.5774355
> tapply(age, treat, mean) # see HW8, prob 4 we looked at age
0 1
28.03030 25.81622
> tapply(educ, treat, mean)
0 1
10.23543 10.34595
> tapply(married, treat, mean) # big diff here
0 1
0.5128205 0.1891892
> tapply(nodegree, treat, mean)
0 1
0.5967366 0.7081081
> tapply(re74, treat, mean) # big diff
0 1
5619.237 2095.574
> tapply(re75, treat, mean)
0 1
2466.484 1532.055
# after this initial quintile subclassification on propensity score, look at balance within quintile
# compare to overall above
# keep in mind the top 3 quintiles is where we have data for treat-control comparison
### there's a limit to how many tools I want to introduce, but I should mention here the package
### PSAgraphics which displays balance or lack thereof in propensity score analyses using stratification (subclasses)
### if you like pretty try the following (using the pbin2 version of binning above)
---------------------
> install.packages("PSAgraphics")
> library(PSAgraphics)
> box.psa(age, treat, pbin2)
## also version for categorical predictors; vignette at https://www.jstatsoft.org/article/view/v029i06/v29i06.pdf
---------------------------------
# first look at propen score
> tapply(propen, list(propbin, treat), mean)
0 1
(0,0.0401] 0.02603935 0.02495179
(0.0401,0.0872] 0.06106601 0.06589413
(0.0872,0.27] 0.13761477 0.14830101
(0.27,0.671] 0.51792739 0.57075500
(0.671,1] 0.71796548 0.73766399
> tapply(propen, list(propbin, treat), median)
0 1
(0,0.0401] 0.02592168 0.02495179
(0.0401,0.0872] 0.05951047 0.07382733
(0.0872,0.27] 0.11515593 0.13557377
(0.27,0.671] 0.53948683 0.59404299
(0.671,1] 0.71467967 0.73802599
> names(lalonde)
[1] "treat" "age" "educ" "black" "hispan" "married" "nodegree" "re74" "re75"
[10] "re78"
> tapply(age, list(propbin, treat), median)
0 1
(0,0.0401] 29 27
(0.0401,0.0872] 26 23
(0.0872,0.27] 20 23
(0.27,0.671] 24 25
(0.671,1] 19 25
> tapply(age, list(propbin, treat), mean) # not better than overall
0 1
(0,0.0401] 32.09016 27.00000
(0.0401,0.0872] 28.93103 24.00000
(0.0872,0.27] 23.92079 24.95238
(0.27,0.671] 28.52830 26.18310
(0.671,1] 22.32432 25.85882
> tapply(educ, list(propbin, treat), mean) # not bad, but not better than overall
0 1
(0,0.0401] 9.877049 13.00000
(0.0401,0.0872] 10.086207 10.71429
(0.0872,0.27] 10.910891 10.52381
(0.27,0.671] 9.886792 10.05634
(0.671,1] 10.540541 10.48235
> tapply(married, list(propbin, treat), mean) # a lot better than overall, esp in top 3 where we have data
0 1
(0,0.0401] 0.9754098 1.00000000
(0.0401,0.0872] 0.5603448 0.28571429
(0.0872,0.27] 0.1188119 0.19047619
(0.27,0.671] 0.4528302 0.38028169
(0.671,1] 0.0000000 0.01176471
> tapply(nodegree, list(propbin, treat), mean) # better than overall esp in top 3
0 1
(0,0.0401] 0.5245902 0.0000000
(0.0401,0.0872] 0.5775862 0.2857143
(0.0872,0.27] 0.6534653 0.7142857
(0.27,0.671] 0.5660377 0.5915493
(0.671,1] 0.7837838 0.8470588
> tapply(re74, list(propbin, treat), mean) # better than overall
0 1
(0,0.0401] 11879.0344 9381.5660
(0.0401,0.0872] 4716.3099 1490.1237
(0.0872,0.27] 1550.6141 3177.4622
(0.27,0.671] 4319.4811 3540.6004
(0.671,1] 777.6701 585.4043
> tapply(re75, list(propbin, treat), mean) # better than overall
0 1
(0,0.0401] 3762.2162 853.7225
(0.0401,0.0872] 2617.8838 1500.3862
(0.0872,0.27] 1254.2185 2009.7170
(0.27,0.671] 2504.0532 1858.2195
(0.671,1] 974.7579 1152.1901
##############################
> # lab text does a quintile function (see below), like my pbin2 command above
> # IMPLEMENT LAB TEXT 3.3
> quintilePropScores = quantile( propen, probs = seq( from = 0, to = 1, by = .2) )
> quintilePropScores
0% 20% 40% 60% 80% 100%
0.00908 0.04015 0.08721 0.26978 0.67085 0.85315
# has bin breaks to more significant digits; that makes a small difference, it turns out
> lalonde$PropScores = propen # make var names consistent
#################################
######### now start looking at outcome
> tapply(re78, list(propbin, treat), mean) # here are the means of re78 (the outcome),
# stratified by propensity quintile and treatment
# direction of the mean diffs favors treatment, job training
0 1
(0,0.0401] 10467 0
(0.0401,0.0872] 5797 7919
(0.0872,0.27] 6043 9211
(0.27,0.671] 4977 5819
(0.671,1] 4666 6030
# contrast that with the comparison ignoring any concerns about self-selection (selection bias): effect in the other direction, but not significant
> tapply(re78, treat, mean)
0 1
6984.170 6349.144
> t.test(re78 ~ treat)
Welch Two Sample t-test
data: re78 by treat
t = 0.9377, df = 326.412, p-value = 0.3491
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-697.192 1967.244
sample estimates:
mean in group 0 mean in group 1
6984.170 6349.144
> lm(re78 ~ treat, data = lalonde) # another way to look
Call:
lm(formula = re78 ~ treat, data = lalonde)
Coefficients:
(Intercept) treat
6984 -635
##### do t-tests by subclassification (strata)
> tapply(re78, list(propbin, treat),t.test) # can't do a full set of t-tests
Error in t.test.default(X[[6L]], ...) : not enough 'x' observations
> # so for the 3 upper quintiles is the mean difference significant?
# attributes for case 1: bin and propensity score
> propbin[1]
[1] (0.27,0.671]
Levels: (0,0.0401] (0.0401,0.0872] (0.0872,0.27] (0.27,0.671] (0.671,1]
> propen[1]
NSW1
0.6388
> bins = c("(0,0.0401]", "(0.0401,0.0872]", "(0.0872,0.27]", "(0.27,0.671]", "(0.671,1]")
> # i.e. make a character vector of the bin designations, then pull off entries
##### actually you can just do t.test(re78[propbin == propbin[5]] ~ treat[propbin == propbin[5]]) # t-test for quintile 5
> t.test(re78[propbin == bins[5]] ~ treat[propbin == bins[5]]) # t-test for quintile 5
Welch Two Sample t-test
data: re78[propbin == bins[5]] by treat[propbin == bins[5]]
t = -0.9844, df = 100.7, p-value = 0.3273
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4113 1385
sample estimates:
mean in group 0 mean in group 1
4666 6030
# since we are doing 3 of these, best practice would be to Bonferroni
# and do each test at Type I error .017 instead of .05. Also, Karen did these one-sided, which you might justify;
# it won't make a difference here; see Lab text version below
> t.test(re78[propbin == bins[4]] ~ treat[propbin == bins[4]])
Welch Two Sample t-test
data: re78[propbin == bins[4]] by treat[propbin == bins[4]]
t = -0.7516, df = 108.7, p-value = 0.4539
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3061 1378
sample estimates:
mean in group 0 mean in group 1
4977 5819
> t.test(re78[propbin == bins[3]] ~ treat[propbin == bins[3]])
Welch Two Sample t-test
data: re78[propbin == bins[3]] by treat[propbin == bins[3]]
t = -1.536, df = 23.57, p-value = 0.1379
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-7428 1093
sample estimates:
mean in group 0 mean in group 1
6043 9211
> # even one-sided versions of these tests are not going to be signif; should also Bonferroni the Type I error
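The Bonferroni arithmetic, for concreteness. The three p-values below are the two-sided Welch results printed above for strata 5, 4, and 3; p.adjust gives the equivalent "inflate the p-values" view of the same correction.

```r
# Sketch: Bonferroni adjustment for the three within-stratum tests above.
pvals <- c(bin5 = 0.3273, bin4 = 0.4539, bin3 = 0.1379)
alpha_each <- 0.05 / length(pvals)        # per-test level, ~.017
p.adjust(pvals, method = "bonferroni")    # equivalently, inflate the p-values
# none are close to significant either way
```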
> t.test(re78[propbin == bins[2]] ~ treat[propbin == bins[2]]) # can also try bin 2 but really don't have enough data
Welch Two Sample t-test
data: re78[propbin == bins[2]] by treat[propbin == bins[2]]
t = -1.083, df = 7.38, p-value = 0.3128
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-6708 2462
sample estimates:
mean in group 0 mean in group 1
5797 7919
## so in summary we can't find any evidence for the effectiveness of job training looking at each of the subclasses
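Before turning to lmer, the textbook way to combine the strata into a single number is a weighted average of the within-stratum treated-minus-control mean differences; weighting by each stratum's share of treated units gives an ATT-style estimate. A base-R sketch (my own helper, not from the lab; strata with an empty treated or control cell are dropped, which is exactly the bin-1 overlap problem above):

```r
# Sketch: subclassification estimate = weighted average of within-stratum
# treated-minus-control mean differences, weights = share of treated units.
strat_att <- function(y, z, strata) {
  per <- tapply(seq_along(y), strata, function(idx) {
    yt <- y[idx][z[idx] == 1]
    yc <- y[idx][z[idx] == 0]
    if (length(yt) == 0 || length(yc) == 0) return(c(NA, 0))  # no overlap: drop
    c(mean(yt) - mean(yc), length(yt))
  })
  per <- do.call(rbind, per)         # one row per stratum: (mean diff, n treated)
  w <- per[, 2] / sum(per[, 2])      # treated-count weights
  sum(w * per[, 1], na.rm = TRUE)
}

# e.g. strat_att(re78, treat, propbin)
```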
##### lmer, a better way to do the t-tests ########################
# Refer to week 4, lab2, TH1 for our background on lmer
# What we are going to do here is carry out the t-test within each of the 5 subclassifications (level-1 model)
# and then allow lmer to 'average' (weighting by the precision basically) those 5 comparisons. There are no
# conditioning variables in the level 2 model so the fixed effects are just the 'averages' of the level 1 effects
# being careful I added propen and the bin classification to the data set
> detach(lalonde)
> lalonde$propen = propen
> lalonde$bins = propbin
> lalonde[1:10,]
treat age educ black hispan married nodegree re74 re75 re78 propen bins
NSW1 1 37 11 1 0 1 1 0 0 9930.0460 0.63876993 (0.27,0.671]
NSW2 1 22 9 0 1 0 1 0 0 3595.8940 0.22463424 (0.0872,0.27]
NSW3 1 30 12 1 0 0 0 0 0 24909.4500 0.67824388 (0.671,1]
NSW4 1 27 11 1 0 0 1 0 0 7506.1460 0.77632408 (0.671,1]
NSW5 1 33 8 1 0 0 1 0 0 289.7899 0.70163874 (0.671,1]
NSW6 1 22 9 1 0 0 1 0 0 4056.4940 0.69906990 (0.671,1]
NSW7 1 23 12 1 0 0 0 0 0 0.0000 0.65368426 (0.27,0.671]
NSW8 1 32 11 1 0 0 1 0 0 8472.1580 0.78972311 (0.671,1]
NSW9 1 22 16 1 0 0 0 0 0 2164.0220 0.77983825 (0.671,1]
NSW10 1 33 12 0 0 1 0 0 0 12418.0700 0.04292461 (0.0401,0.0872]
> str(lalonde)
'data.frame': 614 obs. of 12 variables:
$ treat : int 1 1 1 1 1 1 1 1 1 1 ...
$ age : int 37 22 30 27 33 22 23 32 22 33 ...
$ educ : int 11 9 12 11 8 9 12 11 16 12 ...
$ black : int 1 0 1 1 1 1 1 1 1 0 ...
$ hispan : int 0 1 0 0 0 0 0 0 0 0 ...
$ married : int 1 0 0 0 0 0 0 0 0 1 ...
$ nodegree: int 1 1 0 1 1 1 0 1 0 0 ...
$ re74 : num 0 0 0 0 0 0 0 0 0 0 ...
$ re75 : num 0 0 0 0 0 0 0 0 0 0 ...
$ re78 : num 9930 3596 24909 7506 290 ...
$ propen : num 0.639 0.225 0.678 0.776 0.702 ...
$ bins : Factor w/ 5 levels "(0,0.0401]","(0.0401,0.0872]",..: 4 3 5 5 5 5 4 5 5 2 ...
> library(lme4) # lmer is in the lme4 package
> propen.lmer = lmer(re78 ~ treat + (1 + treat|bins), data = lalonde)
> summary(propen.lmer)
Linear mixed model fit by REML
Formula: re78 ~ treat + (1 + treat | bins)
Data: lalonde
AIC BIC logLik deviance REMLdev
12649 12676 -6319 12668 12637
Random effects:
Groups Name Variance Std.Dev. Corr
bins (Intercept) 5208931 2282.3
treat 2069968 1438.7 -1.000
Residual 52597983 7252.4
Number of obs: 614, groups: bins, 5
Fixed effects:
Estimate Std. Error t value
(Intercept) 6434.3 1090.2 5.902
treat 383.9 951.5 0.403
Correlation of Fixed Effects:
(Intr)
treat -0.796
# so here we have an overall estimate of the effect of the treatment on re78 of positive $384, but
# far from significant. A much smaller point estimate than in some of the individual strata
# of course could/should do confint(propen.lmer), gonna be wide here
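For intuition about the 'averaging', here is the fixed-effect (inverse-variance weighted) combination sketched by hand; lmer additionally shrinks the per-stratum effects toward each other, so the numbers won't agree exactly. In the usage comment, the mean differences are from the strata table above, and the standard errors are backed out from the printed t statistics (SE = |diff / t|), so treat them as approximate.

```r
# Sketch: precision-weighted average of per-stratum treatment effects.
precision_avg <- function(est, se) {
  w <- 1 / se^2                        # precision = 1 / variance
  c(estimate = sum(w * est) / sum(w),  # weighted mean of the effects
    se       = sqrt(1 / sum(w)))       # SE of the combined estimate
}

# e.g., for bins 2-5 above (SEs approximated from the Welch t statistics):
# precision_avg(est = c(2122, 3168, 842, 1364),
#               se  = c(1959, 2062, 1120, 1386))
```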
################################
##Ancova with propensity score alternative ##############################
# One not-terrible, but not great, use of propensity scores, which you will likely encounter
# in the literature, is to use the propensity score as a covariate.
# Not great, but not the worst analysis possible. An analogy could be made to covariance vs blocking,
# where here we use regression smoothing rather than blocks (strata). Or to nonrandom assignment
# (week 5) where at each level of propensity, use a (biased) coin flip to assign to treat or control.
# One reason the ancova is not great is that the OLS regression will give most influence to extremes
# of propen, where we have no overlap.
> propen_ancova = lm(re78 ~ treat + propen)
> summary(propen_ancova)
Call:
lm(formula = re78 ~ treat + propen)
Residuals:
Min 1Q Median 3Q Max
-8925 -5632 -2148 3993 54953
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7833.8 430.7 18.188 < 2e-16 ***
treat 1207.6 834.1 1.448 0.148200
propen -4662.4 1319.4 -3.534 0.000441 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7402 on 611 degrees of freedom
Multiple R-squared: 0.02152, Adjusted R-squared: 0.01832
F-statistic: 6.72 on 2 and 611 DF, p-value: 0.001298
# so bigger point estimate (in the same direction) than lmer on strata, but still not significant.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
### Karen's original functions (for analysis the full lmer version is better; Karen followed the t-tests in Gary King's tutorial)
> # lab text does a more sophisticated quintile function
> # IMPLEMENT LAB TEXT 3.3
> quintilePropScores = quantile( propen, probs = seq( from = 0, to = 1, by = .2) )
> quintilePropScores[1:10]
0% 20% 40% 60% 80% 100%
0.00908 0.04015 0.08721 0.26978 0.67085 0.85315 NA NA NA NA
> quintilePropScores
0% 20% 40% 60% 80% 100%
0.00908 0.04015 0.08721 0.26978 0.67085 0.85315
> lalonde$PropScores = propen
> lalonde$PropScores[1:10]
[1] 0.63877 0.22463 0.67824 0.77632 0.70164 0.69907 0.65368 0.78972 0.77984 0.04292
> # here's a pretty fancy do-loop version from Lab script
> treated = list(5) # (vector('list', 5) is the more standard way to preallocate a 5-slot list)
> controls = list(5)
> for ( i in 1:5 ) {
+ treated[[i]] = subset( x = lalonde,
+ subset = PropScores >= quintilePropScores[i] &
+ PropScores < quintilePropScores[i+1] &
+ treat == 1)
+ controls[[i]] = subset( x = lalonde,
+ subset = PropScores >= quintilePropScores[i] &
+ PropScores < quintilePropScores[i+1] &
+ treat == 0)
+ }
> for ( i in 1:5 ) {
+ t.test( treated[[i]]$re78,
+ controls[[i]]$re78,
+ alternative = "greater" )
+ }
Error in t.test.default(treated[[i]]$re78, controls[[i]]$re78, alternative = "greater") :
not enough 'x' observations
# so I changed the loop in the Lab text to exclude testing bin 1
> for ( i in 2:5 ) {
+ t.test( treated[[i]]$re78,
+ controls[[i]]$re78,
+ alternative = "greater" )
+ }
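Note that a for loop in R evaluates but neither auto-prints nor stores its results (the loop's value is NULL), which is why the loop above appears to produce nothing. A sketch (my own helper) that actually keeps the tests, using the treated/controls lists built above:

```r
# Sketch: collect the per-stratum one-sided Welch tests instead of discarding them.
run_strata_tests <- function(treated, controls, idx = 2:5) {
  tests <- lapply(idx, function(i)
    t.test(treated[[i]]$re78, controls[[i]]$re78, alternative = "greater"))
  names(tests) <- paste0("bin", idx)
  sapply(tests, function(tt) tt$p.value)   # one-sided p-value per stratum
}

# e.g. run_strata_tests(treated, controls)
```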
> ptests = for ( i in 2:5 ) {
+ t.test( treated[[i]]$re78,
+ controls[[i]]$re78,
+ alternative = "greater" )
+ }
> ptests
Welch Two Sample t-test
data: treated[[i]]$re78 and controls[[i]]$re78
t = 1.120, df = 108.3, p-value = 0.1326
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
-735.8 Inf
sample estimates:
mean of x mean of y
6027 4499
> t.test( treated[[5]]$re78,
+ controls[[5]]$re78,
+ alternative = "greater" )
Welch Two Sample t-test
data: treated[[5]]$re78 and controls[[5]]$re78
t = 1.120, df = 108.3, p-value = 0.1326
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
-735.8 Inf
sample estimates:
mean of x mean of y
6027 4499
> t.test( treated[[4]]$re78,
+ controls[[4]]$re78,
+ alternative = "greater" )
Welch Two Sample t-test
data: treated[[4]]$re78 and controls[[4]]$re78
t = 0.6149, df = 103.4, p-value = 0.27
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
-1193 Inf
sample estimates:
mean of x mean of y
5819 5117
> t.test( treated[[3]]$re78,
+ controls[[3]]$re78,
+ alternative = "greater" )
Welch Two Sample t-test
data: treated[[3]]$re78 and controls[[3]]$re78
t = 1.536, df = 23.57, p-value = 0.06895
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
-363.7 Inf
sample estimates:
mean of x mean of y
9211 6043
> length(treated[[3]]$re78)
[1] 21
> mean(treated[[3]]$re78)
[1] 9211
> length(controls[[3]]$re78)
[1] 101
> mean(controls[[3]]$re78)
[1] 6043
> # those match my bins
> length(treated[[4]]$re78)
[1] 71
> mean(treated[[4]]$re78)
[1] 5819
> length(controls[[4]]$re78)
[1] 51
> mean(controls[[4]]$re78)
[1] 5117
> # our two binning procedures are a little discrepant, and that matters: the bin-4 control counts differ by 2 (51 vs 53)
##########################################################
####Doing this by hand vs letting MatchIt do all
##
##if you type at the R-prompt 'demo(subclass)' you will be taken
##on a tour of MatchIt output. Last time I tried it, this demo used a subset of our matching
##variables, hardwired to 6 subclasses, but these failed to be of equal
##size (or even close) as they are supposed to be. It shows you the balance statistics
##and some charming plots (e.g. absolute standardized diffs) if you let it run.
##For pedagogy I chose to do all this step by step.
# The matchit JSS paper, linked in week 8, provides better explanation of using the subclass option
# than the main matchit docs
#####################*****end Lab script part 3 ***********###################################