Stat 209B-- Lectures, Course Files, and Readings

Week 0
Course introduction (lecture and audio posted on main page)
Background readings
1.   Correlation and Causation: A Comment, Stephen Stigler, Perspectives in Biology and Medicine, volume 48, number 1 supplement (winter 2005)
2.    Secret to Winning a Nobel Prize? Eat More Chocolate  (Time)   Publication: Chocolate Consumption, Cognitive Function, and Nobel Laureates Franz H. Messerli, M.D. N Engl J Med 2012; 367:1562-1564 October 18, 2012
3.  David Freedman chapters.
   From Association to Causation: Some Remarks on the History of Statistics;  
   Statistical Models for Causation: A critical review    
   Statistical Models and Shoe Leather, Sociological Methodology, Vol. 21. (1991), pp. 291-313. JStor link

Week 1

Lecture slides, week 1 (pdf)
Audio companion, week 1
parta   partb   partc
1. Encouragement Designs: example of potential outcomes formulation.

       Illustration using encouragement design representation in Holland (1988).    copies of selected overheads.
       Encouragement Designs. Potential outcomes formulation and IV parameter estimation in Holland (1988).    Estimation handout
       Do regression methods (path analysis) identify causal effects? Demonstrations of failure for Holland's encouragement design.    class handout    Encouragement design slides

Primary Readings
Paul Holland, Causal Effects and Encouragement Designs. Causal Inference, Path Analysis, and Recursive Structural Equations Models
Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484. (Encouragement design results; sections 3-5)
Holland Appendix (esp pp. 475-480) presents the potential outcomes formulation.
Abstract Rubin's model for causal inference in experiments and observational studies is enlarged to analyze the problem of "causes causing causes" and is compared to path analysis and recursive structural equations models. A special quasi-experimental design, the encouragement design, is used to give concreteness to the discussion by focusing on the simplest problem that involves both direct and indirect causation. It is shown that Rubin's model extends easily to this situation and specifies conditions under which the parameters of path analysis and recursive structural equations models have causal interpretations.

Encouragement Design research examples:
   Sesame Street evaluation
Gelman-Hill text sec 10.5; Data Analysis Using Regression and Multilevel/Hierarchical Models
   Salt and Blood Pressure clinical trial
Publication: Feasibility and efficacy of sodium reduction in the Trials of Hypertension Prevention, phase I. Trials of Hypertension Prevention Collaborative Research Group. S K Kumanyika, P R Hebert, J A Cutler, V I Lasser, C P Sugars, L Steffen-Batey, A A Brewer, MI. Hypertension 1993;22:502-512. doi: 10.1161/01.HYP.22.4.502

2. Mediating (process) variables
Historical (Baron-Kenny) methods  David Kenny web page
R-implementations: mediating variables
Baron-Kenny method via the sobel function in the multilevel package.
 More extensive implementation (incl. BCa bootstrapping): function mediation in package MBESS (Ken Kelley);
power and sample size calculations in package powerMediation.
 mediation package: takes the topic to a considerably higher level of complexity and capability.
      data analysis example    data file

Primary Readings
Vignette for mediation package   Causal Mediation Analysis Using R   
Mediation Analysis David P. MacKinnon, Amanda J. Fairchild, and Matthew S. Fritz Department of Psychology, Arizona State University, Tempe, Arizona 85287-1104; Annu. Rev. Psychol. 2007. 58:593-614

Mediation research examples:
  Framing experiment
Brader T, Valentino NA, Suhay E (2008). "What Triggers Public Opposition to Immigration? Anxiety, Group Cues, and Immigration Threat." American Journal of Political Science, 52(4), 959-978.  jstor link
Data in mediation package; data description and analyses in mediation package vignette (linked below)
  Bench Science vs Path Analysis: Exercise and Alzheimer's
NYTimes: How Exercise May Help Keep Our Memory Sharp.
Publication: Exercise-linked FNDC5/irisin rescues synaptic plasticity and memory defects in Alzheimer's models   Nature Medicine volume 25, pages 165-175 (2019)
  Mediated moderation?
   Stanford Medicine     Common opioids less effective for patients on SSRI antidepressants    Publication: Predicting inadequate postoperative pain management in depressed patients: A machine learning approach. Arjun Parthipan, Imon Banerjee, Keith Humphreys, Steven M. Asch, Catherine Curtin, Ian Carroll, Tina Hernandez-Boussard. Published: February 6, 2019
   New Yorker. December 23, 2013. The Power of the Hoodie-Wearing C.E.O.    Publication: The Red Sneakers Effect: Inferring Status and Competence from Signals of Nonconformity Author(s): Silvia Bellezza, Francesca Gino, and Anat Keinan Source: Journal of Consumer Research

Additional Resources
Mediators and Moderators of Treatment Effects in Randomized Clinical Trials  Helena Chmura Kraemer; G. Terence Wilson; Christopher G. Fairburn; W. Stewart Agras Arch Gen Psychiatry. 2002;59:877-883
Additional technical papers:
Causal Mediation Analysis Using R, K. Imai, L. Keele, D. Tingley, and T. Yamamoto
Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies, American Political Science Review Vol. 105, No. 4, November 2011
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104.
      Useful expositions Using R
Chapter 14: Mediation and Moderation  Alyssa Blair
Mediation and Moderation Analyses with R - OSF  presentation slides

Week 1 Review Questions

Question 1. Mediating Variable Computations: Class example continued
The data set ss423 shown in the class example is linked above and in the legacy directory.
It has predictor (IV) 'belong', outcome 'depress', and (potential) mediating variable 'master'. The class example showed you the Baron-Kenny analysis using functions from the multilevel and MBESS packages.
Here just use basic 'lm' regression and the recipes from the class handout to recreate the point estimates, asymptotic standard errors, and significance tests for the mediating variable effect.
Compare your result with the class example posting.
Extra: also try out the more 'sophisticated' functions in the mediation package.
Solution for question 1
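
A minimal sketch of the three-regression (Baron-Kenny) recipe with a first-order Sobel standard error, using only 'lm'. The data are simulated stand-ins, not the ss423 file; the names x, m, y and the coefficient values are assumptions chosen for illustration.

```r
# Simulated stand-in data (hypothetical; replace with belong, master, depress)
set.seed(209)
n <- 500
x <- rnorm(n)                      # predictor (role of 'belong')
m <- 0.5 * x + rnorm(n)            # mediator  (role of 'master')
y <- 0.4 * m + 0.2 * x + rnorm(n)  # outcome   (role of 'depress')

fit_total <- lm(y ~ x)      # regression 1: total effect c
fit_med   <- lm(m ~ x)      # regression 2: path a (x -> m)
fit_out   <- lm(y ~ x + m)  # regression 3: path b (m -> y) and direct effect c'

a  <- unname(coef(fit_med)["x"])
sa <- coef(summary(fit_med))["x", "Std. Error"]
b  <- unname(coef(fit_out)["m"])
sb <- coef(summary(fit_out))["m", "Std. Error"]

indirect <- a * b                             # product-of-coefficients estimate
se_sobel <- sqrt(a^2 * sb^2 + b^2 * sa^2)     # first-order (Sobel) standard error
z_sobel  <- indirect / se_sobel
p_sobel  <- 2 * pnorm(-abs(z_sobel))          # two-sided normal-theory p-value

# For OLS on the same cases, c - c' equals a*b exactly
c_minus_cprime <- unname(coef(fit_total)["x"] - coef(fit_out)["x"])
round(c(indirect = indirect, c_minus_cprime = c_minus_cprime,
        se = se_sobel, z = z_sobel, p = p_sobel), 4)
```

The normal-theory test is only an asymptotic approximation; the bootstrap intervals in MBESS and the mediation package are the usual alternative.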

Question 2. Potential Outcomes, Encouragement Design Estimation and (Causal) Mediation
Task 1. Create a potential outcomes dataset following the first ALICE specification in the posted slides (week 3).
## ALICE example: beta = 3, rho = 3, tau = 1, delta = 3 (I did n = 400; larger would be better, so I redid it with n = 6400)

Task 2. Use the artificial data to show the results for the mediation (indirect) effect, and compare each with rho*beta:
   by hand, doing the 3 regressions
   using the multilevel package (sobel)
   using the MBESS package
   using the causal mediation estimate (ACME) from the mediation package

Task 3. Estimate beta by the Wald estimator (assuming tau = 0) and estimate the mediation effect.

Solution for question 2
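
A worked check on Task 3, sketched under the assumption that the ALICE parameters carry their meaning from Holland (1988): rho is the effect of encouragement on the mediator R, beta is the effect of R on the outcome Y, and tau is the direct effect of encouragement on Y. The indicator Z for encouragement is notation introduced here, not taken from the slides.

```latex
\begin{align*}
Y_t(u) - Y_c(u) &= \tau + \beta\,\bigl(R_t(u) - R_c(u)\bigr) = \tau + \rho\beta, \\
\hat\beta_{\text{Wald}} &= \frac{\bar Y_{Z=1} - \bar Y_{Z=0}}{\bar R_{Z=1} - \bar R_{Z=0}}
  \;\longrightarrow\; \frac{\tau + \rho\beta}{\rho} \;=\; \beta + \frac{\tau}{\rho}.
\end{align*}
```

With beta = 3, rho = 3, tau = 1 the total effect of encouragement is 10 and the indirect effect rho*beta is 9, so the Wald estimator converges to 10/3 (about 3.33) rather than 3. The gap tau/rho = 1/3 vanishes only when tau = 0, which is the assumption named in Task 3.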

Question 3. Sesame Street: Encouragement Design research example
The Sesame Street research setting and data description are given on pdf p. 30 of Lecture 1 (also in the Gelman text).
For this exercise use postnumb : posttest on numbers (0-54), along with the measures encour and regular from the class example in Lecture 1.
Use the encouragement design formulation to estimate the effect on child cognitive development (postnumb here) of watching more Sesame Street.
What assumption is necessary for the IV estimation in this design?
Obtain a point and interval estimate for the effect of viewing (use ivreg as in class example).
From simple descriptives reproduce this instrumental variables estimate (Wald estimator).
The second approach (path analysis) analyzed by Holland requires what assumption?
Obtain the path analyses (regression) estimate for the effect on child cognitive development (postnumb here) of watching more Sesame Street.
Compare with the IV estimate (which employs different assumptions).
Solution for question 3
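
A sketch of the Wald estimator computed from simple descriptives and compared with the regression estimate. The data are simulated, not the Sesame Street file; z, d, y stand in for encour, regular, postnumb, and the unobserved variable u and all coefficient values are assumptions for illustration.

```r
# Simulated encouragement design (hypothetical values throughout)
set.seed(1)
n <- 2000
z <- rbinom(n, 1, 0.5)                      # randomized encouragement
u <- rnorm(n)                               # unobserved trait affecting both d and y
d <- as.numeric(0.8 * z + 0.5 * u + rnorm(n) > 0.5)  # regular viewing (0/1)
y <- 20 + 5 * d + 3 * u + rnorm(n)          # outcome; effect of viewing set to 5

# Wald estimator: effect of z on y divided by effect of z on d
itt_y <- mean(y[z == 1]) - mean(y[z == 0])
itt_d <- mean(d[z == 1]) - mean(d[z == 0])
wald  <- itt_y / itt_d

# Same point estimate as a ratio of covariances, and as hand-rolled 2SLS
iv_cov <- cov(y, z) / cov(d, z)
d_hat  <- fitted(lm(d ~ z))
tsls   <- unname(coef(lm(y ~ d_hat))["d_hat"])

# Regression (path analysis) estimate, which ignores u
ols <- unname(coef(lm(y ~ d))["d"])

round(c(wald = wald, iv_cov = iv_cov, tsls = tsls, ols = ols), 3)
```

The standard error printed by the second-stage 'lm' is not valid for the IV estimate, so the interval estimate should come from a dedicated routine such as the ivreg function used in the class example.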

Week 2

Moderating Variables in experimental studies (heterogeneous treatment effects)

Lecture slides, week 2 (pdf)
Audio companion, week 2
parta  partb   partc
Lecture topics
0. Moderation, mediation recap slide
1. Review: formulation and purposes of analysis of covariance
    basic (old) ancova exposition slides           ancova and extensions, math notes
   High School and Beyond (observational study) school means data example HSB ancova handout (ascii version)      data for HSB ancova     HSB ancova, scanned pdf
2. Moderating variables, Heterogeneous Treatment Effects (CATE).
      Analyzing treatment effects as a function of covariate(s)
     CNRL, including Johnson-Neyman technique   cnrl data   cnrl analysis (extended)

Primary Readings
Ancova and extensions   
Rogosa, D. R. (1980). Comparing nonparallel regression lines.   Psychological Bulletin, 88, 307-321. [a better quality scan from the APA site]
R resources (below).

Moderation research examples:
       Gender differences in effectiveness of aspirin.
 Aspirin may be less effective heart treatment for women than men
Publication:      Aspirin Resistance in Patients with Stable Coronary Artery Disease, in the Annals of Pharmacotherapy April 2007
     Moderating variables can be your friend (statistics is the only friend you need)           music: I've got friends in low places
Wash Post: Why smart people are better off with fewer friends .
Publication: Country roads, take me home... to my friends: How intelligence, population density, and friendship affect modern happiness.   British Journal of Psychology 2016
    ATI research
Snow R.E. (1978) Aptitude-Treatment Interactions in Educational Research. In: Pervin L.A., Lewis M. (eds) Perspectives in Interactional Psychology. Springer, Boston, MA.

R implementations and Resources
package probemod    manual
package interactions    intro     vignette: Exploring interactions with continuous predictors in regression models    manual

Additional Resources,  Ancova and extensions
Improving Present Practices in the Visual Display of Interactions Advances in Methods and Practices in Psychological Science
      analysis of covariance: Background/historical papers:
Covariance Adjustment in Randomized Experiments and Observational Studies Paul R. Rosenbaum Statistical Science, Vol. 17, No. 3. (Aug., 2002), pp. 286-304.   Jstor
Some Aspects of Analysis of Covariance, A Biometrics Invited Paper with Discussion. D. R. Cox; P. McCullagh Biometrics, Vol. 38, No. 3, (Sep., 1982), pp. 541-561.   Jstor
Analysis of Covariance: Its Nature and Uses William G. Cochran Biometrics, Vol. 13, No. 3, Special Issue on the Analysis of Covariance. (Sep., 1957), pp. 261-281. Jstor
The Use of Covariance in Observational Studies W. G. Cochran Applied Statistics, Vol. 18, No. 3. (1969), pp. 270-275. Jstor
Estimation of the Slope and Analysis of Covariance when the Concomitant Variable is Measured with Error James S. Degracie; Wayne A. Fuller Journal of the American Statistical Association, Vol. 67, No. 340. (Dec., 1972), pp. 930-937. Jstor
Deep background Neter-Wasserman text (Applied linear statistical models. Neter, Kutner, Nachtsheim and Wasserman 1996. Fifth edition. Homewood IL: Irwin, Inc.) chapters 22 and 8.
     Johnson-Neyman technique and aptitude-treatment interaction (ATI)
Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods: A handbook for research on interactions. Irvington
Regions of Significant Criterion Differences in Aptitude-Treatment-Interaction Research Leonard S. Cahen; Robert L. Linn American Educational Research Journal, Vol. 8, No. 3. (May, 1971), pp. 521-530. Jstor
Identifying Regions of Significance in Aptitude-by-Treatment-Interaction Research Ronald C. Serlin; Joel R. Levin American Educational Research Journal, Vol. 17, No. 3. (Autumn, 1980), pp. 389-399. Jstor
Defining Johnson-Neyman Regions of Significance in the Three-Covariate ANCOVA Using Mathematica Steve Hunka; Jacqueline Leighton Journal of Educational and Behavioral Statistics, Vol. 22, No. 4. (Winter, 1997), pp. 361-387.  Jstor
discussion of substantive issues: Trait-Treatment Interaction and Learning David C. Berliner; Leonard S. Cahen Review of Research in Education, Vol. 1. (1973), pp. 58-94. Jstor

Week 2 Review Questions
[more to be posted]

Question 1. Background: standard analysis of covariance (no moderating variable)

A researcher is studying the effect of an incentive on the retention of subject matter and is also interested in the role of time devoted to study.
Subjects are randomly assigned to two groups, one receiving (C3 = 1) and the other not receiving (C3 = 0) an incentive. Within these groups, subjects are randomly assigned to 5, 10, 15, or 20 minutes of study (C2) of a passage specifically prepared for the experiment. At the end of the study period, a test of retention is administered.
Treat the study time as a covariate for investigating the differential effects of the incentive.   Does using the covariate improve precision in estimating the effect of incentive?
Does the ancova assumption of a constant treatment effect across levels of study time (StudyMin) appear reasonable? Full data are in the file retention.dat.
Solution for question 1
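
A sketch of the two comparisons this question asks for, on simulated data with the same layout (two incentive groups crossed with 5, 10, 15, 20 minutes of study). The file retention.dat is not read here; the five subjects per cell, the variable names, and the coefficient values are assumptions.

```r
# Simulated balanced design (hypothetical; not the retention.dat values)
set.seed(22)
dat <- expand.grid(rep = 1:5, study = c(5, 10, 15, 20), incentive = c(0, 1))
dat$score <- 10 + 3 * dat$incentive + 0.8 * dat$study + rnorm(nrow(dat), sd = 2)

fit_t      <- lm(score ~ incentive, data = dat)          # no covariate (t-test)
fit_ancova <- lm(score ~ incentive + study, data = dat)  # standard ancova
fit_inter  <- lm(score ~ incentive * study, data = dat)  # separate slopes

# Precision: standard error of the incentive effect with and without the covariate
se_t      <- coef(summary(fit_t))["incentive", "Std. Error"]
se_ancova <- coef(summary(fit_ancova))["incentive", "Std. Error"]
c(unadjusted = se_t, ancova = se_ancova)

# Constant treatment effect: F test of the incentive-by-study interaction
anova(fit_ancova, fit_inter)
```

Because study time is balanced across the incentive groups, the two point estimates of the incentive effect coincide and only the standard error changes.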

Question 2. Revisit High School and Beyond ancova from Week 2 lecture

In the class example we used school level (mean, gradient) outcomes and used school mean ses as a covariate. Investigate the usefulness of that covariate by comparing the ancova in class example with just a simple t-test (sector) on these school level outcomes. What is the difference in precision between using the covariate or not? As this is not an RCT (revisit in Unit 2), also look at differences in the estimate of the sector effect (bias?).
Solution for question 2

Question 3. Comparing Regressions (demonstration data, not an RCT)

Let's give recognition to the guys who made S (and R) and take some data from Venables, W. N. and Ripley, B. D. (1999) Modern Applied Statistics with S-PLUS. Third Edition. Springer (now up to 4th edition). Chap 6 section 1 considers analysis of the data set whiteside (available as part of the MASS subset of the VR package). To access:
> library(MASS) # do need to load library, MASS is part of base R
> data(whiteside)
> ?whiteside
Mr Derek Whiteside of the UK Building Research Station recorded the weekly gas consumption and average external temperature at his own house in south-east England for two heating seasons, one of 26 weeks before, and one of 30 weeks after cavity-wall insulation was installed. The object of the exercise was to assess the effect of the insulation on gas consumption.
Format: The whiteside data frame has 56 rows and 3 columns:
Insul A factor, before or after insulation.
Temp Purportedly the average outside temperature in degrees Celsius. (These values are far too low for any 56-week period in the 1960s in South-East England. It might be the weekly average of daily minima.)
Gas The weekly gas consumption in 1000s of cubic feet.
Source. A data set collected in the 1960s by Mr Derek Whiteside of the UK Building Research Station. Reported by Hand, D. J., Daly, F., McConway, K., Lunn, D. and Ostrowski, E. eds (1993) A Handbook of Small Data Sets. Chapman & Hall, p. 69.

Carry out a comparing regressions analysis with Insul as the group variable, Gas as outcome, and Temp as within-group predictor.
Construct a 95% confidence interval for the effect of Insul on Gas at Temp = 4 (pick-a-point procedure).
For what values of Temp does there appear to be an effect of Insul on Gas (simultaneous region of significance)?
Solution for question 3
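
A sketch of the pick-a-point and region-of-significance computations on whiteside, using only 'lm'. Using the critical value sqrt(2 F) with (2, residual df) degrees of freedom for the simultaneous region is an assumption about the intended procedure; the helper diff_at is a hypothetical name introduced here.

```r
library(MASS)      # provides the whiteside data frame
data(whiteside)

# Separate regression lines for the Before and After groups
fit <- lm(Gas ~ Insul * Temp, data = whiteside)

# Pick-a-point at Temp = 4: with Temp centered at 4, the Insul coefficient
# is the After-minus-Before difference in expected Gas at Temp = 4
fit4 <- lm(Gas ~ Insul * I(Temp - 4), data = whiteside)
est4 <- unname(coef(fit4)["InsulAfter"])
ci4  <- confint(fit4, "InsulAfter", level = 0.95)

# Same difference at any temperature t0, from the uncentered fit
b <- coef(fit)
V <- vcov(fit)
diff_at <- function(t0) {
  est <- b["InsulAfter"] + t0 * b["InsulAfter:Temp"]
  se  <- sqrt(V["InsulAfter", "InsulAfter"] +
              t0^2 * V["InsulAfter:Temp", "InsulAfter:Temp"] +
              2 * t0 * V["InsulAfter", "InsulAfter:Temp"])
  c(est = unname(est), se = unname(se))
}

# Simultaneous region over the observed Temp range (assumed critical value)
grid <- seq(min(whiteside$Temp), max(whiteside$Temp), by = 0.1)
res  <- t(sapply(grid, diff_at))
crit <- sqrt(2 * qf(0.95, 2, df.residual(fit)))
sig  <- abs(res[, "est"] / res[, "se"]) > crit
range(grid[sig])   # Temp values on the grid where the difference is significant
```

Replacing crit with qt(0.975, df.residual(fit)) gives the pointwise (non-simultaneous) version of the same region.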

Question 4.