Stat209/Ed260   D Rogosa   2/24/18
Solutions Assignment 7. Compliance and experimental protocols; intent to treat

Problem 1  Non-compliance.
a. ITT estimate is 639 - 380 = 259 (259 excess deaths per 100,000 from no Vitamin A), or reverse the sign.
b. Proportion in the treatment group complying is .8. Control group compliance was perfect (as stated).
c. IV estimate is ITT/.8 = 259/.8, about 324 deaths per 100,000, which has the interpretation of CACE, the complier average causal effect.
---------------------------
Problem 2
> # problem 2
> # mean compliance in JHU PIRC study is reported as .479
> # ITT estimate is difference in group means in change in anti-social behavior
> # ITT = .364
> # CACE estimate is ITT/.479 = .76 ; usual compliance adjustment to ITT

Slides for the Booil Jo examples are in the class handout or linked on the class page at
http://www-stat.stanford.edu/~rag/stat209/jorogosa06.pdf
Part 2 is shown in Don Rubin, "For Objective Causal Inference, Design Trumps Analysis", posted at
http://www.bristol.ac.uk/media-library/sites/cmm/migrated/documents/trumps.pdf
====================
Problem 3  Artificial data in the image of Efron-Feldman
# cholesterol reduction outcome measure
> library(AER)
> efdat = read.table(file = "http://web.stanford.edu/~rag/stat209/hw7efdata", header = T)
> attach(efdat)
> head(efdat)
   comp G     Y
1 0.528 0 -9.57
2 0.862 1 58.50
3 0.980 0 10.10
4 0.673 1 33.90
5 0.660 0  3.60
6 0.551 1 29.10
> table(G)
G
  0   1
150 150
###### got data
> t.test(Y ~ G)   # ITT estimate about 20 (not using compliance info)
                  # approximating E-F paper analysis

        Welch Two Sample t-test

data:  Y by G
t = -14.9477, df = 275.264, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -22.52955 -17.28584
sample estimates:
mean in group 1 mean in group 2
       9.663234       29.570932

# try an IV with compliance as a measured variable
# and random assignment as instrument (analog to CACE)
# Doesn't seem to work well, but see below
> caceIV = ivreg(Y ~ comp|G)
> confint(caceIV)
                2.5 %   97.5 %
(Intercept) -4262.740 2386.813
comp        -3991.215 7220.521
# pretty wide
> summary(caceIV)

Call:
ivreg(formula = Y ~ comp | G)

Residuals:
    Min      1Q  Median      3Q     Max
-634.86 -202.04   22.26  196.10  840.55

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)     -938       1696  -0.553    0.581
comp            1615       2860   0.565    0.573

Residual standard error: 305.4 on 298 degrees of freedom
Multiple R-Squared: -399.6,     Adjusted R-squared: -400.9
Wald test: 0.3187 on 1 and 298 DF,  p-value: 0.5728

# but as you probably presumed, that's not really the right analysis
# the research question is: what is the effect of the drug?
# the correct IV picture is as displayed in the Greenland handout
# in the treatment group, dose of the drug is confounded with compliance
# in the control group, dose of the drug is 0 (regardless of compliance level)
# so the correct analog to AIR is
> caceIV2 = ivreg(Y ~ I(comp*G)|G)
> summary(caceIV2)

Call:
ivreg(formula = Y ~ I(comp * G) | G)

Residuals:
     Min       1Q   Median       3Q      Max
-28.5645  -6.1078  -0.2812   6.5362  29.6355

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   9.6645     0.8059   11.99   <2e-16 ***
I(comp * G)  33.2189     1.9021   17.46   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.871 on 298 degrees of freedom
Multiple R-Squared: 0.5814,     Adjusted R-squared: 0.58
Wald test:   305 on 1 and 298 DF,  p-value: < 2.2e-16

> # not far from the E-F paper estimate of 34.5 for the ("CACE") effect (cholesterol reduction)
> # for perfect compliers (these data are an imitation, not an exact recreation, of E-F)
> confint(caceIV2)
                2.5 %   97.5 %
(Intercept)  8.084849 11.24406
I(comp * G) 29.490842 36.94686
> # nice, tight interval

################################################
# Now go back to the conventional Rubin approach with compliance as 0,1
> efdat$compT = efdat$comp > .8   # comp indicator
# calculate the simpler form of CACE with the traditional .8 cutpoint,
# which yields low compliance, 15%; maybe not the best choice of cut
> mean(efdat$compT)
[1] 0.15
> 20/.15   # divide ITT by proportion defined as compliant
[1] 133.3333
> # wow
> caceIVT = ivreg(Y ~ compT|G, data = efdat)
> summary(caceIVT)

Call:
ivreg(formula = Y ~ compT | G, data = efdat)

Residuals:
   Min     1Q Median     3Q    Max
-872.8  131.8  143.8  155.9  185.8

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -129.7      306.5  -0.423    0.673
compTTRUE      995.3     2039.0   0.488    0.626
# a good bit different from the ITT of 20; the ratio CACE above is 133

Residual standard error: 353.2 on 298 degrees of freedom
Multiple R-Squared: -534.8,     Adjusted R-squared: -536.6
Wald test: 0.2383 on 1 and 298 DF,  p-value: 0.6258

> confint(caceIVT)
                 2.5 %    97.5 %
(Intercept)  -730.4593  471.1126
compTTRUE   -3001.0986 4991.6386
# this interval shows the CACE could be almost anywhere for this IV version

# but as above, the right IV analysis uses dose (which equals compT in treatment, 0 in control)
> efdat$compT = efdat$comp > .8   # comp indicator
> attach(efdat)
> caceIV2 = ivreg(Y ~ I(compT*G)|G)   # pretending dose is (0,1)
> summary(caceIV2)   # gives about the usual CACE, way off from the E-F effect for perfect compliance

Call:
ivreg(formula = Y ~ I(compT * G) | G)

Residuals:
     Min       1Q   Median       3Q      Max
-114.173   -4.502    5.436   16.561   46.436

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)     9.664      2.447   3.950 9.78e-05 ***
I(compT * G)  124.409     21.628   5.752 2.18e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 29.97 on 298 degrees of freedom
Multiple R-Squared: -2.858,     Adjusted R-squared: -2.871
Wald test: 33.09 on 1 and 298 DF,  p-value: 2.184e-08

> confint(caceIV2)
                 2.5 %    97.5 %
(Intercept)   4.868521  14.46039
I(compT * G) 82.018297 166.79920
>
> cor(efdat$G, efdat$compT)
[1] 0.0280056
# G is a pretty weak instrument here
> cor.test(efdat$G, as.numeric(efdat$compT))

        Pearson's product-moment correlation

data:  efdat$G and as.numeric(efdat$compT)
t = 0.4836, df = 298, p-value = 0.629
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.08550641  0.14079991
sample estimates:
      cor
0.0280056

> cor.test(efdat$G, efdat$comp)

        Pearson's product-moment correlation

data:  efdat$G and efdat$comp
t = 0.5565, df = 298, p-value = 0.5783
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.08131855  0.14493103
sample estimates:
       cor
0.03221898

----------------------------------
end solutions HW7 2018
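
Appendix sketch (not part of the original solutions): the ratio estimates in Problems 1 and 2, and the dose-based IV idea in Problem 3, can be cross-checked in base R. This is a minimal sketch under stated assumptions: the helper name wald_cace and the simulated data frame are hypothetical illustrations, not the hw7efdata file or the course's own code. It uses the fact that, with a binary instrument G, the IV slope cov(Y, G)/cov(D, G) equals the difference in mean outcome divided by the difference in mean dose between the assigned arms.

```r
# Hypothetical helper: Wald (ratio) form of the IV estimator.
# itt       = intent-to-treat difference in mean outcome
# p_treat   = proportion receiving treatment in the assigned-treatment arm
# p_control = proportion receiving treatment in the control arm (0 if none)
wald_cace <- function(itt, p_treat, p_control = 0) {
  itt / (p_treat - p_control)
}

# Problem 1: deaths per 100,000; .8 of the treatment arm complied
cace1 <- wald_cace(639 - 380, p_treat = 0.8)
print(cace1)   # 323.75, i.e. about 324

# Problem 2: ITT = .364, mean compliance = .479
cace2 <- wald_cace(0.364, p_treat = 0.479)
print(round(cace2, 2))

# Problem 3 analog on SIMULATED data (assumed values, not the course data):
# dose is comp in the treatment arm and 0 in the control arm.
set.seed(209)
n    <- 300
G    <- rep(0:1, each = n / 2)          # random assignment indicator
comp <- runif(n, min = 0.3, max = 1)    # compliance, measured in both arms
dose <- comp * G                        # drug actually received
Y    <- 10 + 33 * dose + rnorm(n, sd = 10)

# IV slope with instrument G, written as a ratio of covariances
iv_slope <- cov(Y, G) / cov(dose, G)

# Same quantity as ITT on outcome divided by ITT on dose
wald <- (mean(Y[G == 1]) - mean(Y[G == 0])) /
        (mean(dose[G == 1]) - mean(dose[G == 0]))

print(c(iv_slope = iv_slope, wald = wald))
```

The two simulated quantities agree up to floating-point error, which is why dividing the ITT by a compliance proportion and running ivreg with a binary instrument give the same point estimate; the ivreg call additionally supplies standard errors.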