1. Course outline and logistics

2. Potential outcomes framework (DOS 2.2)

3. Study design versus inference

4. Fisher's sharp null; permutation test (DOS 2.3)

5. First model for observational studies (DOS, Sections 3.1-3.3)

Rosenbaum DOS: Chapter 2 (secs 2.1-2.3); Chapter 1 (esp. secs 1.1, 1.2, 1.7); Chapter 3 (secs 3.1-3.3)

Observational Studies according to Donald B. Rubin

For objective causal inference, design trumps analysis. Annals of Applied Statistics, Volume 2, Number 3 (2008), 808-840. Rubin talk.

Another Rubin overview of matching: Matching Methods for Causal Inference. Stuart, E.A. and Rubin, D.B. (2007). Best Practices in Quasi-Experimental Designs: Matching methods for causal inference. Chapter 11 (pp. 155-176) in Best Practices in Quantitative Social Science. J. Osborne (Ed.). Thousand Oaks, CA: Sage Publications.

Lalonde NSW data (DOS sec 2.1). Subclassification/Stratification and Full matching.
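As a rough illustration of subclassification, the sketch below stratifies on a propensity score and averages the within-stratum treated-minus-control differences. It is written in standard-library Python rather than the R used in class, it uses simulated data (not the Lalonde data), and it treats the true propensity score as known; in practice the score would be estimated, for example by logistic regression. The function name `subclass_estimate` and all simulation settings are assumptions made for this example.

```python
import math
import random
from statistics import mean


def subclass_estimate(score, treat, y, n_strata=5):
    """Size-weighted average of within-stratum treated-minus-control mean differences."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    n = len(order)
    total, used = 0.0, 0
    for s in range(n_strata):
        # Equal-size strata (quintiles of the score when n_strata=5).
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y_t = [y[i] for i in idx if treat[i] == 1]
        y_c = [y[i] for i in idx if treat[i] == 0]
        if y_t and y_c:  # skip any stratum that lacks one of the two groups
            total += len(idx) * (mean(y_t) - mean(y_c))
            used += len(idx)
    return total / used


# Simulated observational study (assumption): one confounder x, true effect 1.
random.seed(1)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
score = [1 / (1 + math.exp(-xi)) for xi in x]          # true propensity score
treat = [1 if random.random() < p else 0 for p in score]
y = [2 * xi + 1.0 * t + random.gauss(0, 1) for xi, t in zip(x, treat)]

naive = (mean(yi for yi, t in zip(y, treat) if t == 1)
         - mean(yi for yi, t in zip(y, treat) if t == 0))
strat = subclass_estimate(score, treat, y)
print(f"naive difference: {naive:.2f}, subclassified estimate: {strat:.2f}")
```

With five strata some confounding remains inside each stratum, so the subclassified estimate should land much closer to the true effect than the naive difference, but not exactly on it.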

Week 1 handout Rogosa R-session (using R 3.3.3) pdf slides shown in class

MatchIt vignette

1. In Week 1 Computing Corner with the Lalonde data (effect of job training on earnings), we started out (see R-session) by showing the analysis of covariance that is ubiquitous [epidemiology to economics] for observational data, aka tossing the treatment variable and all the confounders into a regression equation predicting outcome and hoping for the best (cf. 2016 Week 1

Breastfeeding May Not Lead to Smarter Preschoolers; Breastfeeding does NOT boost a baby's IQ: Nourishing infants the natural way only makes them less hyper; Breast-feeding study sheds light on benefits for babies

Publication: Breastfeeding, Cognitive and Noncognitive Development in Early Childhood: A Population Study. Lisa-Christine Girard, Orla Doyle, Richard E. Tremblay. PEDIATRICS Volume 139, Number 4, April 2017.

1. Finish up: First model for observational studies (DOS, Sections 3.1-3.3)

2. Fisher's sharp null; permutation test (DOS 2.3)

3. Why randomized controlled studies produce high-quality data (DOS, Sections 15.1 & 15.4 [skim the sections in between] also Holland paper)

4. Why randomized controlled studies do not produce high-quality data (DOS, Section 2.6)

5. A matched observational study (DOS, Chap 7)

Percutaneous coronary intervention (PCI), commonly known as coronary angioplasty or simply angioplasty, is a non-surgical procedure used to treat the stenotic (narrowed) coronary arteries of the heart found in coronary heart disease.

Lindner data in package

Use of Lindner data in Vignette JSS PSAgraphics: An R Package to Support Propensity Score Analysis

Week 2 handout Rogosa R-session pdf slides shown in class

1. Exercise in pair matching. In DOS Sec 2.1, Rosenbaum works with the randomized experiment data from NSW. In Week 1,2 Computing Corner we used the constructed observational study version of these data. Use the observational study data to do a version of the 1:1 matching in DOS section 2.1. Compare the balance improvement achieved from nearest neighbor matching with the full matching results in Computing Corner Week 1,2.

2. For the fullmatch analysis done in the Lalonde class presentation weeks 1 and 2, the outcome comparison was carried out using lmer to average the treatment effects over the 104 subclasses. A hand-wavy analogy to the paired t-test here would be to use the mean difference within each subclass. Show that (because some of the subclasses are large) this simplified analysis doesn't replicate the lmer results well.

3. The JSS vignette for PSAgraphics (linked week 2 Computing Corner) does subclassification matching for Lindner data. Repeat their subclassification analyses and try out their balance displays and tests. They have some specialized functions. Compare with our basic approach.

4. The Week 2 presentation showed an alternative propensity score analysis -- analysis of covariance with propensity score as covariate. A rough analogy is to ancova vs blocking (where blocking is our subclassification, say quintiles). Try out the basic (here logistic regression) ancova approach for the lifepres dichotomous outcome.
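For the pair-matching exercise, a minimal sketch of greedy 1:1 nearest-neighbor matching with a before/after balance check may help fix ideas. This is standard-library Python on simulated data, not the MatchIt implementation and not the NSW data; the helper names `greedy_pair_match` and `std_mean_diff`, the largest-score-first ordering, and the selection model are assumptions for illustration only.

```python
import math
import random
from statistics import mean, variance


def greedy_pair_match(score, treat):
    """Match each treated unit to the nearest unused control on the score."""
    treated = sorted((i for i, t in enumerate(treat) if t == 1),
                     key=lambda i: score[i], reverse=True)  # largest score first
    controls = [i for i, t in enumerate(treat) if t == 0]
    pairs = []
    for i in treated:
        if not controls:
            break
        j = min(controls, key=lambda c: abs(score[c] - score[i]))
        pairs.append((i, j))
        controls.remove(j)  # matching without replacement
    return pairs


def std_mean_diff(x, treated_idx, control_idx):
    """Standardized mean difference using the pooled standard deviation."""
    xt = [x[i] for i in treated_idx]
    xc = [x[i] for i in control_idx]
    pooled_sd = math.sqrt((variance(xt) + variance(xc)) / 2)
    return (mean(xt) - mean(xc)) / pooled_sd


# Simulated data (assumption): many more controls than treated units.
random.seed(2)
n = 1200
x = [random.gauss(0, 1) for _ in range(n)]
logit_ps = [-2 + xi for xi in x]                     # true logit propensity
treat = [1 if random.random() < 1 / (1 + math.exp(-l)) else 0 for l in logit_ps]

all_t = [i for i, t in enumerate(treat) if t == 1]
all_c = [i for i, t in enumerate(treat) if t == 0]
pairs = greedy_pair_match(logit_ps, treat)

before = std_mean_diff(x, all_t, all_c)
after = std_mean_diff(x, [i for i, _ in pairs], [j for _, j in pairs])
print(f"standardized mean difference: before {before:.2f}, after {after:.2f}")
```

The same before/after comparison of standardized differences is what one would set beside the full matching results from Computing Corner.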

NY Times Runners World Publication: Running as a Key Lifestyle Medicine for Longevity Prog Cardiovasc Dis. 2017 Mar 29.

1. A matched observational study (DOS, Chap 7)

2. Basic tools of multivariate matching (DOS, Secs 8.1-8.4)

3. Various practical issues in matching (DOS, Chap 9)

Alternative propensity score analyses. Propensity score weighting: Inverse Probability of Treatment Weighting (IPTW). Treatment effect estimation without matching.
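The weighting idea can be sketched in a few lines. For an ATE analysis each unit is weighted by the inverse probability of the treatment it actually received; for an ATT analysis treated units get weight 1 and controls get weight e/(1-e), where e is the propensity score. The sketch below is standard-library Python on simulated data with the true propensity score taken as known; the function name `iptw_estimate`, the normalized (weighted-mean) form of the estimator, and the simulation settings are assumptions for this example.

```python
import math
import random
from statistics import mean


def iptw_estimate(y, treat, ps, estimand="ATE"):
    """Difference of weighted outcome means under ATE or ATT weights."""
    if estimand == "ATE":
        w = [t / p + (1 - t) / (1 - p) for t, p in zip(treat, ps)]
    elif estimand == "ATT":
        w = [t + (1 - t) * p / (1 - p) for t, p in zip(treat, ps)]
    else:
        raise ValueError("estimand must be 'ATE' or 'ATT'")
    num_t = sum(wi * yi for wi, yi, t in zip(w, y, treat) if t == 1)
    den_t = sum(wi for wi, t in zip(w, treat) if t == 1)
    num_c = sum(wi * yi for wi, yi, t in zip(w, y, treat) if t == 0)
    den_c = sum(wi for wi, t in zip(w, treat) if t == 0)
    return num_t / den_t - num_c / den_c


# Simulated data (assumption): one confounder, constant treatment effect 1,
# so ATE and ATT coincide.
random.seed(3)
n = 20000
x = [random.gauss(0, 1) for _ in range(n)]
ps = [1 / (1 + math.exp(-xi)) for xi in x]           # true propensity score
treat = [1 if random.random() < p else 0 for p in ps]
y = [xi + 1.0 * t + random.gauss(0, 1) for xi, t in zip(x, treat)]

naive = (mean(yi for yi, t in zip(y, treat) if t == 1)
         - mean(yi for yi, t in zip(y, treat) if t == 0))
ate = iptw_estimate(y, treat, ps, "ATE")
att = iptw_estimate(y, treat, ps, "ATT")
print(f"naive {naive:.2f}, IPTW ATE {ate:.2f}, IPTW ATT {att:.2f}")
```

Propensity scores near 0 or 1 produce very large weights, which is one reason to inspect the weight distribution before trusting a weighted estimate.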

A thorough R exposition using the Lalonde data A Practical Guide for Using Propensity Score Weighting in R Practical Assessment, Research & Evaluation, v20 n13 Jun 2015.

Also Cox Regression, comparison with full matching (Elizabeth Stuart)

Rogosa R-session pdf slides shown in class

1. Try out the ATE IPTW analysis (done in week3 computing corner) for the dichotomous outcome lifepres in the Lindner data. Compare with full matching results shown in class.

2. Try an ATT IPTW analysis for log(cardbill) outcome in the Lindner data.

3. Modify Fisher's Sharp Null to reflect the null hypothesis that the treatment adds five units to the outcome under control. Build a small simulation (e.g., 10 observations) and construct a table that summarizes the potential outcomes. Randomize using a fair coin flip to assign treatment or control for each observational unit. Use the permutation test to assess your data set using (i) Fisher's Sharp Null and (ii) the null hypothesis that the treatment adds five units to the outcome under control.

4. Building off of RQ#3 above, sort your observations so they are in ascending order based on the outcome under control. Randomize two at a time: one fair coin flip now assigns either the first or second observation to treatment (and the other to control). A second fair coin flip assigns either the third or the fourth observation to treatment (and the other to control). This continues so on and so forth. Use the appropriate permutation test to assess your data set using (i) Fisher's Sharp Null and (ii) the null hypothesis that the treatment adds five units to the outcome under control. Contrast the results here with the results from RQ#3.
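One way to organize the last two exercises is sketched below in standard-library Python (the class sessions use R). Under the null that treatment adds tau0 to the outcome under control, subtracting tau0 from each treated unit's observed outcome recovers every unit's outcome under control, so the adjusted outcomes are fixed and the test statistic can be recomputed for every possible assignment. The potential-outcome table, the difference-in-means statistic, the two-sided p-value, and the helper names are all assumptions chosen for illustration.

```python
import itertools
import random
from statistics import mean


def diff_in_means(a, z):
    return (mean(ai for ai, zi in zip(a, z) if zi == 1)
            - mean(ai for ai, zi in zip(a, z) if zi == 0))


def perm_pvalue(y, z, assignments, tau0=0.0):
    """Two-sided randomization p-value for H0: Y(1) = Y(0) + tau0 for every unit."""
    adj = [yi - tau0 * zi for yi, zi in zip(y, z)]   # equals Y(0) under H0
    obs = abs(diff_in_means(adj, z))
    stats = [abs(diff_in_means(adj, a)) for a in assignments]
    return sum(s >= obs - 1e-12 for s in stats) / len(stats)


def coin_flip_assignments(n):
    """All assignments from n fair coin flips, dropping all-treated/all-control."""
    return [a for a in itertools.product([0, 1], repeat=n) if 0 < sum(a) < n]


def paired_assignments(n):
    """One coin flip per consecutive pair: exactly one unit of each pair is treated."""
    out = []
    for flips in itertools.product([0, 1], repeat=n // 2):
        a = []
        for f in flips:
            a += [f, 1 - f]
        out.append(tuple(a))
    return out


# Hypothetical potential-outcome table: Y(0) sorted ascending, true effect 5.
y0 = [10, 11, 13, 14, 17, 18, 20, 22, 25, 26]
y1 = [v + 5 for v in y0]
n = len(y0)

random.seed(4)
while True:  # unit-by-unit coin flips, redrawn if one group is empty
    z_coin = [random.randint(0, 1) for _ in range(n)]
    if 0 < sum(z_coin) < n:
        break
z_pair = random.choice(paired_assignments(n))

results = {}
for label, z, assignments in [("coin flips", z_coin, coin_flip_assignments(n)),
                              ("pairs", z_pair, paired_assignments(n))]:
    y_obs = [b if zi else a for a, b, zi in zip(y0, y1, z)]
    for tau0 in (0.0, 5.0):
        results[(label, tau0)] = perm_pvalue(y_obs, z, assignments, tau0)
        print(f"{label:10s} tau0={tau0}: p = {results[(label, tau0)]:.4f}")
```

In this table every within-pair gap in Y(0) is smaller than the effect of 5, so under the paired design the observed statistic is the most extreme of the 32 assignments (together with its mirror image) and the sharp-null p-value is 2/32 = 0.0625 whichever pair assignment is drawn; the coin-flip p-values depend on the draw.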

Study links diet soda to higher risk of stroke, dementia Wash Post; Just ONE Diet Coke or Pepsi Max a day can "TRIPLE the risk of a deadly stroke" and dementia, researchers claim Sun; A diet soda a day might affect dementia risk, study suggests AHA news

Publication: Sugar and Artificially Sweetened Beverages and the Risks of Incident Stroke and Dementia: A Prospective Cohort Study

Coverage continues and widens. PepsiCo focuses on 'guilt-free' beverages, yet more research casts a pall over diet soda MarketWatch. Drinking Too Much Soda May Be Linked to Alzheimer's Bloomberg.

Finish up: Various practical issues in matching (DOS Chap 9)

Sensitivity analysis (DOS Sections 3.4-3.7 and 3.9)

Designs to strengthen your analysis: multiple control groups, "known" effects (DOS Chap 6)

Alternative computation of propensity scores (trees, boosting). Teamed with IPTW in

Toolkit for Weighting and Analysis of Nonequivalent Groups: A tutorial for the twang package Lalonde data, yet again.

Rogosa twang and ATT session with Lalonde data Week 4 slides

Additional Resources:

Package

Package

To come, sensitivity analysis computations: package

1. Try out, using the Lalonde data (Week 1), the boosted regression approach to computing propensity scores using Ridgeway's (via Friedman)

2. Try out, using the Lindner data shown in the PSAgraphics vignette (JSS linked week 2), the regression tree classification (use rpart) approach for propensity score estimation. Examine resulting propensity scores, balance for matching in six subclassifications, and outcome analysis for cardbill measure.

Low-sodium diet might not lower blood pressure Higher sodium intake associated with lower blood pressure. You read that right. Abstract Low Sodium Intakes are Not Associated with Lower Blood Pressure Levels among Framingham Offspring Study Adults

Summarize Gamma sensitivity analysis (DoS 3.4-3.8)

What to do with missing data, and a word of warning (DoS 9)

Using multiple outcomes - coherence and known null effects (DoS 5.2.3 and 5.2.4)

Using a second control group - mitigating bias (DoS 5.2.2)

Sensitivity analysis computations:

package

Rosenbaum packages

Rogosa sensitivity session CC_5 slides
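To make the Gamma calculation concrete, here is a minimal standard-library Python sketch for the simplest case, the sign test in matched pairs. It is not the computation done by the packages above, which use other test statistics. If hidden bias can change the odds of treatment within a pair by at most a factor Gamma, the chance that the treated unit has the larger outcome under the null is at most Gamma/(1+Gamma), which gives an upper bound on the one-sided p-value from a binomial tail. The pair differences and function names below are hypothetical.

```python
from math import comb


def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))


def sign_test_upper_pvalue(diffs, gamma):
    """Upper bound on the one-sided sign-test p-value at sensitivity parameter gamma."""
    nonzero = [d for d in diffs if d != 0]           # ties are dropped
    n = len(nonzero)
    positives = sum(d > 0 for d in nonzero)
    return binom_upper_tail(n, positives, gamma / (1 + gamma))


# Hypothetical treated-minus-control differences: 32 of 40 pairs positive.
diffs = [1.0] * 32 + [-1.0] * 8

bounds = {g: sign_test_upper_pvalue(diffs, g) for g in (1, 1.5, 2, 3)}
for g, p in bounds.items():
    print(f"Gamma = {g}: upper bound on p-value = {p:.4f}")
```

Gamma = 1 reproduces the usual sign test; the value of Gamma at which the bound first exceeds the chosen significance level summarizes how much hidden bias would be needed to explain away the association.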

1. Mercury example (2 controls) from section 3 and 6 of Rosenbaum vignette (linked in CC_5)

Fish often contains mercury. Does eating large quantities of fish increase levels of mercury in the blood? Data set mercury in the sensitivitymw package is from the 2009-2010 National Health and Nutrition Examination Survey (NHANES) and is the example in Rosenbaum (2014). There are 397 rows or matched triples and three columns, one treated with two controls. The values are methylmercury levels in blood. Column 1, Treated, describes an individual who had at least 15 servings of fish or shellfish in the previous month. Column 2, Zero, describes an individual who had 0 servings of fish or shellfish in the previous month. Column 3, One, describes an individual who had 1 serving of fish or shellfish in the previous month. In the comparison here, Zero and One are not distinguished; both are controls. Sets were matched for gender, age, education, household income, black race, Hispanic, and cigarette consumption.

2. Demonstration--see solution. Mechanics of setting up a matched data set for the sensitivity functions. Easiest to create the data set for the most common 1:1 matching situation (merge works without needing thought); steps for 1:1 matching setting below

1. Note: RCT (cross-over design). Damn right! The secret of success is swearing: How shouting four letter words can help make you stronger Swearing can help you boost your physical performance The full power of swearing is starting to be discovered

2. Another RCT. Talking to yourself out loud helps boost brainpower and could indicate higher intelligence Is talking to yourself a sign of mental illness? An expert delivers her verdict

Bonus item: not just observational studies have problems. Medical studies are almost always bogus

(i) Using multiple outcomes - coherence and known null effects (DoS 5.2.3 and 5.2.4)

(ii) Using a second control group - mitigating bias (DoS 5.2.2)

(iii) inverse probability weighting (link)

Baiocchi R-session

TOO much exercise causes a leaky gut and increases health risks. Publication: Changes in intestinal microbiota composition and metabolism coincide with increased intestinal permeability in young adults under prolonged physiologic stress. American Journal of Physiology - Gastrointestinal and Liver Physiology. Published 23 March 2017.

(1) Inference (DoS 2.3-2.4)

(2) Arguments for observational studies (DoS 2.6)

(3) Crossover designs (link)

package

Rogosa session, causaldrf examples

also covariate balancing propensity score, package CBPS

The Propensity Score with Continuous Treatments

Causal Inference With General Treatment Regimes: Generalizing the Propensity Score, Journal of the American Statistical Association, Vol. 99, No. 467 (September), pp. 854-866.

In week 7 Computing Corner we showed results for ADRF (average dose-response function) estimates using Imbens's very clever artificial data example from the linked causaldrf vignette (see also CC_7 slides).

IPW results (see Weeks 3 and 4 Computing Corner for examples with binary treatments) were notable for their apparently very bad performance (all other estimates did pretty well). Keep in mind this artificial data test is not even a "phase 2" hurdle, as we are given the selection variables (X_1, X_2) that, apart from randomness, are responsible for individuals selecting dose (here denoted by T).

As IPW is dominant in applications like long-term occupational exposures (to bad stuff), the dose-response setting is quite relevant. The artificial data ADRF has an important feature of a non-monotonic dip, reminiscent of alcohol or even salt (a bit above 0 is better than zero) for health outcomes. So for another look at IPW, I tried to make a much easier example, with basically a straight-line ADRF (just with a little wiggle) by limiting dose (T) to > .5.

So try out the comparison of the hi_estimate (shown in class) and the iptw_estimate both from the causaldrf package with the true ADRF from the artificial data construction using values T > .5 (about half the data).

Are we any happier with the value of IPW (importance sampling)? Solution indicates to me: "no", YMMV.

Eating salt could help you to LOSE weight, study reveals Vanderbilt Publication: High salt intake reprioritizes osmolyte and energy metabolism for body fluid conservation. J Clin Invest. 2017;127(5):1944-1959.

Encouragement design (Holland 1988)

Instrumental variable methods for causal inference (Baiocchi, Cheng and Small 2014)

Regression discontinuity - Lee and Lemieux 2011

Example from rdd manual (Stat209 handout) ascii version

Angrist-Lavy Maimonides (class size) data: sections 1.3, 3.2, 5.2.3, 5.3 of DOS text

read data

R-package--rdd; Regression Discontinuity Estimation Author Drew Dimmery

Also Package

Stat209, Regression Discontinuity handout

Trochim W.M. & Cappelleri J.C. (1992). "Cutoff assignment strategies for enhancing randomized clinical trials." Controlled Clinical Trials, 13, 190-212. pubmed link

Journal of Econometrics (special issue) Volume 142, Issue 2, February 2008, The regression discontinuity design: Theory and applications Regression discontinuity designs: A guide to practice, Guido W. Imbens, Thomas Lemieux

Another Econometric treatment

Also from Journal of Econometrics (special issue) Volume 142, Issue 2, February 2008, The regression discontinuity design: Theory and applications Waiting for Life to Arrive: A history of the regression-discontinuity design in Psychology, Statistics and Economics, Thomas D Cook

The original paper: Thistlethwaite, D., and D. Campbell (1960): "Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment," Journal of Educational Psychology, 51, 309-317.

Capitalizing on Nonrandom Assignment to Treatments: A Regression-Discontinuity Evaluation of a Crime-Control Program Richard A. Berk; David Rauma

Berk, R.A. & de Leeuw, J. (1999). "An evaluation of California's inmate classification system using a generalized regression discontinuity design."

To come: Instrumental Variable Methods: packages

Extra: try out also the

i. Create artificial data with the following specification. 10,000 observations; premeasure (Y_uc in my session) Gaussian with mean 10 and variance 1. Effect of intervention (rho) if in the treatment group is 2 (or close to 2) and uncorrelated with Y_uc. Probability of being in the treatment group depends on Y_uc but is not a deterministic step-function ("sharp design"):

ii. Try out analysis of covariance with Y_uc as covariate. Obtain a confidence interval for the effect of the treatment.

iii. Try out the fancy econometric estimators (using finite support) as in the rdd package. See if you find that they work poorly in this very basic fuzzy design example.
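Parts i and ii can be sketched as follows, in standard-library Python rather than the R session used in class. The logistic form of the assignment probability, the small spread of rho around 2, the outcome being Y_uc plus rho for treated units, and the helper name `ancova_effect` are all assumptions made for this illustration; they are not taken from the class specification. The rdd estimators in part iii are not attempted here.

```python
import math
import random


def ancova_effect(y, g, x):
    """OLS of y on an intercept, group indicator g and covariate x.

    Returns the coefficient on g and its conventional standard error."""
    n = len(y)
    my, mg, mx = sum(y) / n, sum(g) / n, sum(x) / n
    sgg = sum((gi - mg) ** 2 for gi in g)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sgx = sum((gi - mg) * (xi - mx) for gi, xi in zip(g, x))
    sgy = sum((gi - mg) * (yi - my) for gi, yi in zip(g, y))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    det = sgg * sxx - sgx ** 2
    b_g = (sgy * sxx - sxy * sgx) / det
    b_x = (sxy * sgg - sgy * sgx) / det
    b_0 = my - b_g * mg - b_x * mx
    rss = sum((yi - b_0 - b_g * gi - b_x * xi) ** 2 for yi, gi, xi in zip(y, g, x))
    se_g = math.sqrt(rss / (n - 3) * sxx / det)
    return b_g, se_g


random.seed(209)
n = 10000
y_uc = [random.gauss(10, 1) for _ in range(n)]
# Assumed fuzzy assignment: probability of treatment rises smoothly with Y_uc.
p_treat = [1 / (1 + math.exp(-2 * (v - 10))) for v in y_uc]
g = [1 if random.random() < p else 0 for p in p_treat]
rho = [random.gauss(2, 0.1) for _ in range(n)]       # effect close to 2, unrelated to Y_uc
y = [v + r * gi for v, r, gi in zip(y_uc, rho, g)]

est, se = ancova_effect(y, g, y_uc)
lo, hi = est - 1.96 * se, est + 1.96 * se
print(f"ancova estimate {est:.3f}, approximate 95% interval ({lo:.3f}, {hi:.3f})")
```

The interval uses the conventional equal-variance OLS standard error, so it is only a rough guide here because the residual variance differs between the two groups in this construction.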

Extra: try out also the

1. Causal direction. Journalists drink too much, are bad at managing emotions, and operate at a lower level than average, according to a new study the neuroscience, Study into the mental resilience of journalists

2. Eating Chocolate, A Little Each Week, May Lower The Risk Of A Heart Flutter. Publication: Mostofsky E, Berg Johansen M, Tjønneland A, et al. Chocolate intake and risk of clinically apparent atrial fibrillation: the Danish Diet, Cancer, and Health Study.

Instrumental variable methods for causal inference (Baiocchi, Cheng and Small 2014)

Regression discontinuity - Lee and Lemieux 2011

IV handout CC_9 slides Rogosa IV sessions, examples
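The simplest IV calculation, the Wald estimator for a binary instrument, divides the instrument's effect on the outcome by its effect on treatment received. The sketch below is standard-library Python on simulated data in the spirit of an encouragement design; it is not the Card data analysis and does not use the ivmodel package. The data-generating model, the constant treatment effect of 2, and the function name `wald_iv` are assumptions for illustration.

```python
import random
from statistics import mean


def wald_iv(z, d, y):
    """Wald estimator: (effect of Z on Y) / (effect of Z on D) for binary Z."""
    reduced_form = (mean(yi for zi, yi in zip(z, y) if zi == 1)
                    - mean(yi for zi, yi in zip(z, y) if zi == 0))
    first_stage = (mean(di for zi, di in zip(z, d) if zi == 1)
                   - mean(di for zi, di in zip(z, d) if zi == 0))
    return reduced_form / first_stage


random.seed(9)
n = 20000
u = [random.gauss(0, 1) for _ in range(n)]           # unmeasured confounder
z = [random.randint(0, 1) for _ in range(n)]         # randomized encouragement
# Treatment uptake depends on both the encouragement and the confounder.
d = [1 if 1.5 * zi + ui - 0.75 > 0 else 0 for zi, ui in zip(z, u)]
y = [2.0 * di + ui + random.gauss(0, 1) for di, ui in zip(d, u)]

naive = (mean(yi for di, yi in zip(d, y) if di == 1)
         - mean(yi for di, yi in zip(d, y) if di == 0))
iv = wald_iv(z, d, y)
print(f"naive treated-vs-untreated difference {naive:.2f}, Wald IV estimate {iv:.2f}")
```

Because the confounder drives both uptake and outcome, the naive comparison overstates the effect, while the instrument-based ratio should sit near 2; a weak first stage would make that ratio unstable.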

Additional resources:

2. Use the Card data, described in the ivmodel vignette, to carry out some basic IV analyses. Compare