EPI292/STAT266/CHPR266/EDUC260B-- Lectures, Course Files, and Readings

Week 1--Course Introduction; Matching Methods part 1 (intro and theory)

Lecture Topics             Lecture 1 slide deck (companion audio part 1) (companion audio part 2)
1. Course outline and logistics
2. A matched observational study (DOS, Chap 7)
3. Study design versus inference
4. Basic tools of multivariate matching (DOS, Secs 8.1-8.4)

Text Readings
Rosenbaum DOS: Chapters 7 and 8 (8.1-8.4)
Additional Resources
Observational Studies according to Donald B. Rubin
   For objective causal inference, design trumps analysis. Annals of Applied Statistics, Volume 2, Number 3 (2008), 808-840.    Rubin talk.
   Another Rubin overview of matching: Matching Methods for Causal Inference. Stuart, E.A. and Rubin, D.B. (2007). Best Practices in Quasi-Experimental Designs: Matching methods for causal inference. Chapter 11 (pp. 155-176) in Best Practices in Quantitative Social Science. J. Osborne (Ed.). Thousand Oaks, CA: Sage Publications.

Computing Corner: Extended Data Analysis Examples
Lalonde NSW data (DOS sec 2.1). Subclassification/Stratification and Full matching.
      pdf slides for CC1     2021 audio companion

   Week 1 handout Lalonde NSW data
Rogosa R-session (using R 3.3.3)        4/1/18 redo in R 3.4.4 (sparse)
      2019 lalonde Matchit: full matching, balance with cobalt  love.plot and bal.tab
      2019 lalonde optmatch: fullmatch with outcome analysis  
MatchIt provides a wrapper that can call optmatch or Sekhon's genetic matching
MatchIt: Nonparametric Preprocessing for Parametric Causal Inference Daniel Ho, Kosuke Imai, Gary King, Elizabeth Stuart
   MatchIt vignette
JSS May 2011 exposition: MatchIt: Nonparametric Preprocessing for Parametric Causal Inference  
Cobalt:     Using cobalt with Other Preprocessing Packages     Covariate Balance Tables and Plots: A Guide to the cobalt Package
Optmatch:   Ben Hansen (local hero)   optmatch manual     R News Oct 2007
optmatch:fullmatch vignette      optmatch another version     another good tutorial  optmatch Functions for Optimal Matching

Week 1 Review Questions
From Computing Corner
1.  In Week 1 Computing Corner with the Lalonde data (effect of job training on earnings), we started out (see R-session) by showing the ubiquitous [epidemiology to economics] analysis for observational data: an analysis of covariance, aka tossing the treatment variable and all the confounders into a regression equation predicting outcome and hoping for the best (cf. 2016 Week 1 in the news analyses: mom fish consumption on child cognition). The statement made in class (technical details week 1 stat209) is that regression does not "control" for confounders; instead the coefficient of treatment (putative causal effect) is obtained from a straight-line regression of outcome on the residuals from a prediction of treatment by all the other predictors in the regression. Demonstrate that equivalence using the ancova in CC1.
 Solution for Review Question 1
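
The equivalence in RQ1 can be checked on toy data before trying it on lalonde. A minimal sketch in plain Python (not the class R-session): the data are simulated, the variable names (x1, x2, t, y) and the helper ols are made up for illustration, and the true effect of 2.0 is an arbitrary choice.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated (hypothetical) observational data: treatment depends on confounders.
random.seed(1)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
t = [1.0 if 0.8 * a - 0.5 * b + random.gauss(0, 1) > 0 else 0.0
     for a, b in zip(x1, x2)]
y = [2.0 * ti + 1.5 * a - 1.0 * b + random.gauss(0, 1)
     for ti, a, b in zip(t, x1, x2)]

# (a) ancova: outcome on intercept, treatment, and all confounders.
coef_ancova = ols([[1.0, ti, a, b] for ti, a, b in zip(t, x1, x2)], y)[1]

# (b) residualize treatment on the confounders, then a straight-line
#     regression of outcome on those residuals.
X_conf = [[1.0, a, b] for a, b in zip(x1, x2)]
g = ols(X_conf, t)
resid = [ti - sum(gj * xj for gj, xj in zip(g, row))
         for ti, row in zip(t, X_conf)]
coef_resid = ols([[1.0, r] for r in resid], y)[1]

print(coef_ancova, coef_resid)  # the two slopes agree up to rounding error
```

In R the same check is two lm calls: one with everything in, one of the outcome on the residuals of treatment regressed on the other predictors.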
2. RQ1 uses the Week 1 Computing Corner Lalonde data (effect of job training on earnings) analysis of covariance: tossing the treatment variable and all the confounders into a regression equation predicting outcome and hoping for the best. Compare that ancova with an ancova that uses just the significant predictors of re78. Also compare with an ancova which uses the single available covariate/confounder having the highest correlation with outcome. Are these analyses consistent?
 Solution for Review Question 2
From Lecture
3. We will be working a lot with matching-based techniques. One of the best thinkers/writers on the topic of matching is Elizabeth Stuart from Johns Hopkins. For this problem, take a look at her paper: "Matching Methods for Causal Inference: A Review and a Look Forward." In lecture 01 you were introduced to "balance tables" (a.k.a. "Table 1"), which summarize the covariate distributions of the observations. A handful of questions: (a) as concisely as possible, state why we focus on balance assessments as part of our argumentation when attempting to perform causal inference, (b) in addition to a balance table, name other tools used to report balance, (c) why do we use standardized mean differences instead of p-values when assessing the quality of a match design?, and (d) why is it kinda weird to use a p-value of the covariates in a randomized trial to assess balance?
 Solution for Review Question 3
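
For part (c) of RQ3, it can help to see the two quantities side by side. A minimal sketch, assuming the common pooled-SD version of the standardized mean difference, sqrt((s_t^2 + s_c^2)/2); the age values are invented, and the function names smd and t_statistic are made up for illustration. Packages such as cobalt offer other standardization choices.

```python
import math
from statistics import mean, variance

def smd(treated, control):
    """Standardized mean difference using the pooled SD sqrt((s_t^2 + s_c^2)/2)."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

def t_statistic(treated, control):
    """Welch-style t statistic: mean difference divided by its standard error."""
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    return (mean(treated) - mean(control)) / se

# Hypothetical covariate (age) in a small study ...
age_t = [30, 35, 40, 45, 50]
age_c = [28, 33, 38, 43, 48]
# ... and the same covariate distribution with 20 times as many subjects.
age_t_big, age_c_big = age_t * 20, age_c * 20

smd_small, smd_big = smd(age_t, age_c), smd(age_t_big, age_c_big)
t_small, t_big = t_statistic(age_t, age_c), t_statistic(age_t_big, age_c_big)

print(smd_small, smd_big)  # roughly the same imbalance either way
print(t_small, t_big)      # the test statistic grows with sample size
```

The imbalance is the same in both versions of the data; only the sample size changed.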
4. In lecture 1 we quickly outlined some of the big challenges to causal inference when using observational data (see slide 41, "There should be strong effort to show the two groups are similar..."). These challenges include: inclusion/exclusion of observations, observational units that may be completely missing (censored, survival bias), missing data, imbalances in observed data, and imbalances in unobserved data. We'll address each of these at different points in the course. But let's focus on the decision to include/exclude observations. What we're doing when matching -- i.e., removing observations that do not have adequate counterparts in the contrast group -- may seem a bit subversive. The intuition is: why "throw away" data? I think there are two reasons people worry about "throwing away data." First, it seems that by limiting the kinds of observations in our study we may lose the ability to generalize our conclusions to a wider swath of the population. The counter to that is: yes, we are trading off the ability to generalize (i.e., external validity) for the ability to make stronger claims about a candidate causal effect (i.e., internal validity). The second concern is that it seems like more data is better. Formulate a response to this concern. (Note: OMG, this question seems so nebulous. Yup. That's how this works; you're playing Big Kid academics now. We made sure to mention this argument during lecture 01, so you know it. It's a common statistical argument nowadays. If you want to read your way out of this one... here's a good paper.)
 Solution for Review Question 4
From Computing Corner
5. Exercise in pair matching. In DOS Sec 2.1, Rosenbaum works with the randomized experiment data from NSW. In Week 1,2 Computing Corner we used the constructed observational study version of these data. Use the observational study data to do a version of the 1:1 matching in DOS section 2.1. Compare the balance improvement achieved from nearest neighbor matching with the full matching results in Computing Corner Week 1,2.      
 Solution for Review Question 5
6. For the fullmatch analysis done in the Lalonde class presentation weeks 1 and 2, the outcome comparison was carried out using lmer to average the treatment effects over the 104 subclasses. A hand-wavy analogy to the paired t-test here would be to use the mean difference within each subclass. Show that (because some of the subclasses are large) this simplified analysis doesn't replicate the lmer results well.
 Solution for Review Question 6
7. optmatch package, fullmatch, lalonde.
MatchIt uses the optmatch package fullmatch command for its "full" option, as used in the class example. Using the raw optmatch (without the MatchIt wrapper) allows additional specifications and controls for the full or optimal matching.
For the lalonde data, try out optmatch fullmatching and compare results for subclasses and balance with the class example using optmatch through MatchIt.
 Solution for Review Question 7

Week 2-- Matching Methods Part 2 (implementation); Potential Outcomes and Study Design

Lecture Topics            Lecture 2 slide deck      (companion audio)
1. Basic tools of multivariate matching (DOS, Secs 8.1-8.4)
2. Potential outcomes framework (DOS 2.2) 
3. Fisher's sharp null; permutation test (DOS 2.3) 
4. Various practical issues in matching (DOS, Chap 9)
Text Readings
Rosenbaum DOS: Chapter 2 (plus week1 items)
Additional Resources
From Donald B. Rubin
   First section of Basic Concepts of Statistical Inference for Causal Effects in Experiments and Observational Studies    Similar material Chaps 1 and 2 Causal Inference in Statistics, Social and Biomedical Sciences: An Introduction, Guido Imbens and Don Rubin linked on main page.

Computing Corner: Extended Data Analysis Examples
     Lindner data, Percutaneous Coronary Intervention with 'evidence based medicine'.
Percutaneous coronary intervention (PCI), commonly known as coronary angioplasty or simply angioplasty, is a non-surgical procedure used to treat the stenotic (narrowed) coronary arteries of the heart found in coronary heart disease.
Lindner data in package PSAgraphics     Use of Lindner data in Vignette JSS   PSAgraphics: An R Package to Support Propensity Score Analysis  Journal of Statistical Software February 2009, Volume 29, Issue 6. http://www.jstatsoft.org/

       cc2 pdf slides Lindner example     2021 audio companion
         Week 2 handout       Rogosa R-session
          Additional R-session  Lindner fullmatch in optmatch and cobalt

Week 2 Review Questions
From Computing Corner
1. The JSS vignette for PSAgraphics (linked week 2 Computing Corner) does subclassification matching for Lindner data. Repeat their subclassification analyses and try out their balance displays and tests. They have some specialized functions. Compare with our basic approach.       
Lindner data  package PSAgraphics Vignette JSS           outcome analysis, Rogosa session
2. The Week 2 presentation showed an alternative propensity score analysis -- analysis of covariance with propensity score as covariate. A rough analogy is to ancova vs blocking (where blocking is our subclassification, say quintiles). Try out the basic (here logistic regression) ancova approach for the dichotomous outcome lifepres.
 Solution for Review Question 2

From Lecture
3. Modify Fisher's Sharp Null to reflect the null hypothesis that the treatment adds five units to the outcome under control. Build a small simulation (e.g., 10 observations) and construct a table that summarizes the potential outcomes. Randomize using a fair coin flip to assign treatment or control for each observational unit. Use the permutation test to assess your data set using (i) Fisher's Sharp Null and (ii) the null hypothesis that the treatment adds five units to the outcome under control.       
 Solution for Review Question 3
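
One way to set up the simulation in RQ3 is sketched below. This is a sketch under stated assumptions, not the posted solution: the potential outcomes are simulated with a true constant effect of 5, the test statistic is the difference in means, and assignments that leave one group empty are skipped because the difference in means is undefined there. The function names are made up for illustration.

```python
import random
from itertools import product
from statistics import mean

def diff_in_means(z, y):
    treated = [yi for zi, yi in zip(z, y) if zi == 1]
    control = [yi for zi, yi in zip(z, y) if zi == 0]
    return mean(treated) - mean(control)

def permutation_p_value(z_obs, y_obs, tau0=0.0):
    """Two-sided p-value for H0: Y_i(1) = Y_i(0) + tau0 for every unit i.

    tau0 = 0 is Fisher's sharp null. Under H0 the outcome under control is
    known for every unit: subtract tau0 from each treated unit's outcome.
    """
    y0 = [yi - tau0 * zi for zi, yi in zip(z_obs, y_obs)]
    t_obs = diff_in_means(z_obs, y0)
    count = total = 0
    for z in product([0, 1], repeat=len(y0)):   # every coin-flip assignment
        if 0 < sum(z) < len(z):                 # assumption: skip empty groups
            total += 1
            if abs(diff_in_means(z, y0)) >= abs(t_obs) - 1e-12:
                count += 1
    return count / total

# Table of potential outcomes for 10 hypothetical units (true effect = 5).
random.seed(266)
n = 10
y_control = [round(random.gauss(50, 4), 1) for _ in range(n)]
y_treated = [yc + 5 for yc in y_control]

# Fair coin flip per unit; redraw in the rare case that a group is empty.
z = [random.randint(0, 1) for _ in range(n)]
while not 0 < sum(z) < n:
    z = [random.randint(0, 1) for _ in range(n)]
y_obs = [yt if zi else yc for zi, yt, yc in zip(z, y_treated, y_control)]

p_sharp = permutation_p_value(z, y_obs, tau0=0.0)  # (i) Fisher's sharp null
p_shift = permutation_p_value(z, y_obs, tau0=5.0)  # (ii) treatment adds five
print(p_sharp, p_shift)
```

With only 10 units the full enumeration is about a thousand assignments, so there is no need to sample permutations.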

4. Building off of RQ#3 above, sort your observations so they are in ascending order based on the outcome under control. Randomize two at a time: one fair coin flip now assigns either the first or second observation to treatment (and the other to control). A second fair coin flip assigns either the third or the fourth observation to treatment (and the other to control). Continue in this way for the remaining pairs. Use the appropriate permutation test to assess your data set using (i) Fisher's Sharp Null and (ii) the null hypothesis that the treatment adds five units to the outcome under control. Contrast the results here with the results from RQ#3.
 Solution for Review Question 4
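
For the paired design in RQ4, the permutation distribution comes from flipping the sign of each within-pair difference. A minimal sketch, again with simulated potential outcomes (true effect 5) and the mean within-pair difference as an assumed test statistic; paired_permutation_p_value is a made-up name, not a function from the course code.

```python
import random
from itertools import product
from statistics import mean

def paired_permutation_p_value(pairs, tau0=0.0):
    """Two-sided p-value for H0: Y_i(1) = Y_i(0) + tau0, paired randomization.

    pairs is a list of (treated outcome, control outcome), one per pair.
    Under H0, swapping the assignment within a pair flips the sign of the
    adjusted within-pair difference.
    """
    d = [yt - tau0 - yc for yt, yc in pairs]
    t_obs = mean(d)
    count = total = 0
    for signs in product([1, -1], repeat=len(d)):  # 2^(number of pairs) flips
        total += 1
        if abs(mean([s * di for s, di in zip(signs, d)])) >= abs(t_obs) - 1e-12:
            count += 1
    return count / total

# Ten hypothetical units sorted by outcome under control, paired off in order.
random.seed(266)
y_control = sorted(round(random.gauss(50, 4), 1) for _ in range(10))
pairs = []
for i in range(0, 10, 2):
    first_treated = random.randint(0, 1) == 1      # one fair coin flip per pair
    t_idx, c_idx = (i, i + 1) if first_treated else (i + 1, i)
    pairs.append((y_control[t_idx] + 5, y_control[c_idx]))  # true effect = 5

p_sharp = paired_permutation_p_value(pairs, tau0=0.0)  # (i) Fisher's sharp null
p_shift = paired_permutation_p_value(pairs, tau0=5.0)  # (ii) treatment adds five
print(p_sharp, p_shift)
```

Note that with 5 pairs there are only 32 sign patterns, so the two-sided p-value can never fall below 2/32.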

From Computing Corner
5. Pair matching--nuclear plants data. See also week 8, Stat209. Another (small) canonical matching example for optmatch expositions is the nuclear plants data from the Cox and Snell text.
Data set is nuclearplants from optmatch  optmatch manual    Ben Hansen (local hero) exposition of nuclearplants example in  R News Oct 2007
Additional exercises (checking balance) using the nuclearplants data from Mark Fredrickson here
Data cleaning gives 7 "treatment" units and a reservoir of 19 controls. Try out 1:2 optimal pair matching using MatchIt (see also stat209 exs) and compare with pairmatch in optmatch plus balance diagnostics.
 nuclearplants using Matchit (Stat209 handout)           optmatch for Review Question 5

Week 3-- Full matching, Inclusion and Exclusion, and Defining Treatment Effects

Lecture Topics             Lecture 3 slide deck     (companion audio)
                                    optmatch example from lecture
1. Finish up: Basic tools of multivariate matching (DOS, Secs 8.1-8.4)
2. Various practical issues in matching (DOS, Chap 9)
3. Inverse probability weighting (Robins & Hernan, Chap 2.4)
Text Readings
Rosenbaum DOS: Chapters 8 and 9
Additional Resources
Smoking study (Prochaska et al 2016)
Dealing with limited overlap in estimation of average treatment effects (Crump et al 2009)  (or see http://public.econ.duke.edu/~vjh3/working_papers/overlap.pdf )
Defining the Study Population for an Observational Study to Ensure Sufficient Overlap: A Tree Approach (Traskin & Small 2011)
CONSORT Statement   (randomized trials)
STROBE Statement   (observational studies)

Computing Corner resumes Week 4 with IPW methods

Week 3 Review Questions:
From Lecture
1. In this class we've shown you a couple of tools to assess the adequacy of a matched set - for example: Love plots, balance tables, standardized mean differences, and histogram plots of fitted propensity scores (or covariates). Why haven't we shown you a statistical test? That's weird, right? A ton of researchers fall for this, failing to see why assessing balance using a hypothesis test in an observational study is problematic. There are a couple of valid critiques; try articulating at least one such critique. (Hint: think about how we calculate the SMD vs a standard error.) Once you've given it a go, check out Section 6.6 of this paper (great paper!) for a couple of solutions to this question.

2. In section 6.7 of that same paper, the authors say their preferred tool for assessing balance is an empirical QQ plot. What's a QQ plot? Compare and contrast the use of QQ plots and a balance table. Neither of these tools is dominant, so what are the benefits and drawbacks of each?
 Solution for Review Question 2

Week 4-- Models for Observational Studies

Lecture Topics             Lecture 4 slide deck     (companion audio)
 First model for observational studies (DOS, Sections  15.1 and 15.4; 3.1-3.3)

Computing Corner:  Extended Data Analysis Examples
Alternative propensity score analyses. Propensity score weighting: Inverse Probability of Treatment Weighting (IPTW). Treatment effect estimation without matching.
Primary sources:
Review paper: Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Peter C. Austin and Elizabeth A. Stuart, Statistics in Medicine 2015, 34, 3661-3679.
A thorough R exposition using the Lalonde data   A Practical Guide for Using Propensity Score Weighting in R Practical Assessment, Research & Evaluation, v20 n13 Jun 2015.

               pdf slides cc4         2021 audio companion         Rogosa R-session

Additional Resources:
package bcaboot    intro vignette   paper: The automatic construction of bootstrap confidence intervals
A Guide to Using WeightIt for Estimating Balancing Weights   Noah Greifer
Propensity score techniques and the assessment of measured covariate balance to test causal associations in psychological research. Valerie S. Harder, M.H.S., Ph.D., Elizabeth A. Stuart, Ph.D., and James C. Anthony, Ph.D.             .R file (readable code showing matchit fullmatch and IPW) for paper        Also    Cox Regression, comparison with full matching (Elizabeth Stuart)

Week 4 Review Questions
From Computing Corner
1. Try out the ATE IPTW analysis (done in week4 computing corner) for the dichotomous outcome lifepres in the Lindner data. Compare with full matching results shown in class.       
 Solution for Review Question 1

2. Try an ATT IPTW analysis for log(cardbill) outcome in the Lindner data.       
 Solution for Review Question 2
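
The ATE and ATT analyses in RQ1 and RQ2 differ only in the weights. A minimal sketch of the two weighting schemes described in the Austin and Stuart review, assuming the propensity scores e have already been estimated (e.g., by logistic regression); the four data points and the function names are invented for illustration and are not the Lindner data or the class R-session.

```python
def iptw_weights(z, e, estimand="ATE"):
    """Propensity score weights for treatment indicator z and score e.

    ATE: 1/e for treated units, 1/(1 - e) for controls.
    ATT: 1 for treated units, e/(1 - e) for controls.
    """
    if estimand == "ATE":
        return [1 / ei if zi else 1 / (1 - ei) for zi, ei in zip(z, e)]
    if estimand == "ATT":
        return [1.0 if zi else ei / (1 - ei) for zi, ei in zip(z, e)]
    raise ValueError("estimand must be 'ATE' or 'ATT'")

def weighted_mean_difference(y, z, w):
    """Weighted mean of treated outcomes minus weighted mean of control outcomes."""
    num_t = sum(wi * yi for yi, zi, wi in zip(y, z, w) if zi)
    den_t = sum(wi for zi, wi in zip(z, w) if zi)
    num_c = sum(wi * yi for yi, zi, wi in zip(y, z, w) if not zi)
    den_c = sum(wi for zi, wi in zip(z, w) if not zi)
    return num_t / den_t - num_c / den_c

# Hypothetical toy data: outcome, treatment indicator, estimated propensity score.
y = [10, 12, 8, 6]
z = [1, 1, 0, 0]
e = [0.5, 0.8, 0.5, 0.2]

w_ate = iptw_weights(z, e, "ATE")
w_att = iptw_weights(z, e, "ATT")
ate_hat = weighted_mean_difference(y, z, w_ate)
att_hat = weighted_mean_difference(y, z, w_att)
print(w_ate, ate_hat)
print(w_att, att_hat)
```

Balance should be checked on the weighted sample before looking at outcomes, just as with matching.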

From Lecture
3. The Wilcoxon signed rank test takes as its input a fixed number, designate this number I, of matched pairs. The Wilcoxon signed rank test is a permutation test with a specific test statistic. Let's explore the behavior of its statistic compared to the behavior of the average of the within-pair differences. You can use the sample code provided to simulate (i.e., simulation 1 here). Consider playing around with the sd in the data generating functions to see the impact in the histograms.
Question: what happens when we introduce one really 'weird' data point in our matched sets? Compare what happens to the distributions for mean(y_t - y_c | matched pairs) vs the Wilcoxon signed rank statistic. The solution is in the comments in simulation 3 in the link above.
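
Independent of the linked sample code, the two statistics can be compared on a handful of made-up within-pair differences. A minimal sketch, assuming the signed rank statistic is the sum of the ranks of |d| over pairs with positive d (zeros dropped, ties given average ranks); the function name and the numbers are hypothetical.

```python
from statistics import mean

def signed_rank_statistic(d):
    """Sum of the ranks of |d_i| over pairs with d_i > 0."""
    d = [x for x in d if x != 0]                  # drop zero differences
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        # extend j over a run of tied absolute differences
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1                # average rank for the tie run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return sum(r for r, x in zip(ranks, d) if x > 0)

# Hypothetical within-pair differences y_t - y_c for I = 5 matched pairs.
d_clean = [1.0, -2.0, 3.0, 4.0, 5.0]
d_weird = [1.0, -2.0, 3.0, 4.0, -500.0]           # one 'weird' pair

print(mean(d_clean), signed_rank_statistic(d_clean))
print(mean(d_weird), signed_rank_statistic(d_weird))
```

Try making the weird value even more extreme and see which of the two statistics keeps moving.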

Week 5-- Randomized Experiments and Design Sensitivity