Stat 209B-- Lectures, Course Files, and Readings
Course introduction (lecture and audio posted on main page)
1. Correlation and Causation: A Comment, Stephen Stigler
Perspectives in Biology and Medicine, volume 48, number 1 supplement (winter 2005)
2. Secret to Winning a Nobel Prize? Eat More Chocolate (Time) Publication: Chocolate Consumption, Cognitive Function, and Nobel Laureates Franz H. Messerli, M.D. N Engl J Med 2012; 367:1562-1564 October 18, 2012
3. David Freedman chapters.
From Association to Causation: Some Remarks on the History of Statistics;
Statistical Models for Causation: A critical review
Statistical Models and Shoe Leather, Sociological Methodology, Vol. 21. (1991), pp. 291-313. JStor link
1. Encouragement Designs: example of potential outcomes formulation.
Illustration using encouragement design representation in Holland (1988). copies of selected overheads.
Encouragement Designs. Potential outcomes formulation and IV parameter estimation in Holland (1988). Estimation handout
Do regression methods (path analysis) identify causal effects? Demonstrations of failure for Holland's encouragement design.
class handout Encouragement design slides
Paul Holland, Causal Effects and Encouragement Designs. Causal Inference, Path Analysis, and Recursive Structural Equations Models
Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484. (Encouragement design results; sections 3-5)
Holland Appendix (esp pp. 475-480) presents the potential outcomes formulation.
Abstract Rubin's model for causal inference in experiments and observational studies is enlarged to analyze the problem of "causes causing causes" and is compared to path analysis and recursive structural equations models. A special quasi-experimental design, the encouragement design, is used to give concreteness to the discussion by focusing on the simplest problem that involves both direct and indirect causation. It is shown that Rubin's model extends easily to this situation and specifies conditions under which the parameters of path analysis and recursive structural equations models have causal interpretations.
Encouragement Design research examples:
Sesame Street evaluation
Gelman-Hill text sec 10.5; Data Analysis Using Regression and Multilevel/Hierarchical Models
Salt and Blood Pressure clinical trial
Publication: Feasibility and efficacy of sodium reduction in the Trials of Hypertension Prevention, phase I
Trials of Hypertension Prevention Collaborative Research Group. S K Kumanyika, P R Hebert, J A Cutler, V I Lasser, C P Sugars, L Steffen-Batey, A A Brewer, MI.
Hypertension. 1993;22:502-512. doi: 10.1161/01.HYP.22.4.502
2. Mediating (process) variables
Historical (Baron-Kenny) methods David Kenny web page
R-implementations: mediating variables
Baron-Kenny method via the sobel function in the multilevel package.
More extensive implementation (incl BCa bootstrapping) function mediation in package MBESS Ken Kelley;
power and sample size calculations in package powerMediation
mediation package: takes the topic to a much higher level of complexity and capability
data analysis example data file
Vignette for mediation package Causal Mediation Analysis Using R
Mediation Analysis David P. MacKinnon, Amanda J. Fairchild,
and Matthew S. Fritz Department of Psychology, Arizona State University, Tempe, Arizona 85287-1104; Annu. Rev. Psychol. 2007. 58:593-614
Mediation research examples:
Brader T, Valentino NA, Suhay E (2008). "What Triggers Public Opposition to Immigration?
Anxiety, Group Cues, and Immigration Threat." American Journal of Political Science, 52(4),
959-978. jstor link
Data in mediation package; data description and analyses in mediation package vignette (linked below)
Bench Science vs Path Analysis: Exercise and Alzheimer's
NYTimes: How Exercise May Help Keep Our Memory Sharp.
Publication: Exercise-linked FNDC5/irisin rescues synaptic plasticity and memory defects in Alzheimer's models Nature Medicine volume 25, pages 165-175 (2019)
Stanford Medicine Common opioids less effective for patients on SSRI antidepressants Publication: Predicting inadequate postoperative pain management in depressed patients: A machine learning approach Arjun Parthipan, Imon Banerjee, Keith Humphreys, Steven M. Asch, Catherine Curtin, Ian Carroll, Tina Hernandez-Boussard
Published: February 6, 2019 https://doi.org/10.1371/journal.pone.0210575
New Yorker. December 23, 2013. The Power of the Hoodie-Wearing C.E.O. Publication: The Red Sneakers Effect: Inferring Status and Competence from Signals of Nonconformity
Author(s): Silvia Bellezza, Francesca Gino, and Anat Keinan Source: Journal of Consumer Research
Mediators and Moderators of Treatment Effects in Randomized Clinical Trials
Helena Chmura Kraemer; G. Terence Wilson; Christopher G. Fairburn; W. Stewart Agras
Arch Gen Psychiatry. 2002;59:877-883
additional technical papers. Causal Mediation Analysis Using R K. Imai, L. Keele, D. Tingley, and T. Yamamoto American Political Science Review Vol. 105, No. 4 November 2011
Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison
of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104.
Useful expositions Using R
Chapter 14: Mediation and Moderation Alyssa Blair
Mediation and Moderation Analyses with R - OSF presentation slides
Week 1 Review Questions
Question 1. Mediating Variable Computations: Class example continued
The data set shown in class example ss423 is linked above and
in the legacy directory
with predictor (IV) 'belong', outcome 'depress', and (potential) mediating variable 'master'
The class example showed you the Baron-Kenny analysis using functions from
the multilevel and MBESS packages.
Here just use basic 'lm' regression and the recipes from the class handout
to recreate the point estimates, asymptotic standard errors, and significance tests
for the mediating variable effect.
Compare your result with the class example posting.
Extra: also try out the more 'sophisticated' functions in
the mediation package.
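A minimal sketch of the three-regression recipe, run on simulated stand-in data (not the ss423 data). The variable names x, m, y and the simulation coefficients are hypothetical, and the standard error shown is the usual first-order (Sobel) formula, which is assumed here to be the one in the class handout.

```r
# Simulated stand-in data; NOT the class ss423 data set.
set.seed(209)
n <- 500
x <- rnorm(n)                      # predictor (plays the role of 'belong')
m <- 0.5 * x + rnorm(n)            # mediator  (plays the role of 'master')
y <- 0.4 * m + 0.2 * x + rnorm(n)  # outcome   (plays the role of 'depress')

fit_total <- lm(y ~ x)       # regression 1: total effect c
fit_med   <- lm(m ~ x)       # regression 2: path a
fit_out   <- lm(y ~ x + m)   # regression 3: path b and direct effect c'

a  <- coef(fit_med)["x"]
sa <- coef(summary(fit_med))["x", "Std. Error"]
b  <- coef(fit_out)["m"]
sb <- coef(summary(fit_out))["m", "Std. Error"]

indirect <- unname(a * b)                           # mediated (indirect) effect
se_sobel <- unname(sqrt(a^2 * sb^2 + b^2 * sa^2))   # first-order delta-method SE
z        <- indirect / se_sobel
p_value  <- 2 * pnorm(-abs(z))

# With OLS on the same sample the total effect decomposes as c = c' + a*b
c_total  <- unname(coef(fit_total)["x"])
c_direct <- unname(coef(fit_out)["x"])
print(c(indirect = indirect, se = se_sobel, z = z, p = p_value))
```

Swapping in the real variables only changes the three lm formulas; the rest of the arithmetic is unchanged.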
Question 2. Potential Outcomes, Encouragement Design Estimation and (Causal) Mediation
Task 1. Create a potential outcomes dataset following the
first ALICE specification in the posted slides (week 3)
## ALICE example beta = 3 rho = 3 tau = 1, delta = 3
(I did n=400; larger would be better so I redid with n = 6400)
Task 2. Use the artificial data to show the results
for the mediation (indirect) effect
by hand doing the 3 regressions
using multilevel package (sobel)
using MBESS package
using the causal mediation estimation ACME from the mediation package
and compare with rho*beta
estimate beta by the Wald estimator (assuming tau = 0)
and estimate mediation effect
Question 3. Sesame Street: Encouragement Design research example
Sesame Street research setting and data description given pdf p.30 of Lecture 1 (also Gelman text).
For this exercise use postnumb: posttest on numbers (0-54), along with the measures encour and regular from the class example in Lecture 1.
Use the encouragement design formulation to estimate the effect on child cognitive development (postnumb here) of watching more Sesame Street.
What assumption is necessary for the IV estimation in this design?
Obtain a point and interval estimate for the effect of viewing (use ivreg as in class example).
From simple descriptives reproduce this instrumental variables estimate (Wald estimator).
The second approach (path analysis) analyzed by Holland requires what assumption?
Obtain the path analysis (regression) estimate for the effect on child cognitive development (postnumb here) of watching more Sesame Street.
Compare with the IV estimate (which employs different assumptions).
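A minimal sketch of the Wald estimator computed from simple descriptives, using simulated stand-in data (not the Sesame Street data). The names encour, regular, and post mirror the exercise, but the data-generating values, including the unobserved confounder u, are hypothetical. The last regression is only in the spirit of the path-analysis approach, not a reproduction of Holland's analysis.

```r
# Simulated stand-in data; NOT the Sesame Street data.
set.seed(1)
n <- 2000
encour  <- rbinom(n, 1, 0.5)                  # randomized encouragement (instrument)
u       <- rnorm(n)                           # unobserved confounder
regular <- as.numeric(0.8 * encour + 0.5 * u + rnorm(n) > 0.5)  # viewing indicator
post    <- 25 + 5 * regular + 3 * u + rnorm(n)  # outcome; viewing effect set to 5

# Wald estimator: ratio of the two encouraged-minus-not mean differences
itt_y <- mean(post[encour == 1])    - mean(post[encour == 0])
itt_d <- mean(regular[encour == 1]) - mean(regular[encour == 0])
wald  <- itt_y / itt_d

# The same number as the simple IV ratio cov(Y, Z) / cov(D, Z)
iv_ratio <- cov(post, encour) / cov(regular, encour)

# Regression of the outcome on encouragement and viewing, for comparison
path_fit <- lm(post ~ encour + regular)
path_est <- unname(coef(path_fit)["regular"])

print(c(wald = wald, iv_ratio = iv_ratio, regression = path_est))
```

With a binary instrument the Wald ratio and the covariance ratio agree exactly, so the descriptive calculation should match the ivreg point estimate when no other covariates are included.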
Moderating Variables in experimental studies (heterogeneous treatment effects)
0. Moderation, mediation recap slide
1. Review: formulation and purposes of analysis of covariance
basic (old) ancova exposition slides ancova and extensions, math notes
High School and Beyond (observational study) school means data example HSB ancova handout (ascii version) data for HSB ancova HSB ancova, scanned pdf
2. Moderating variables, Heterogeneous Treatment Effects (CATE).
Analyzing treatment effects as a function of covariate(s)
CNRL, including Johnson-Neyman technique cnrl data cnrl analysis (extended)
Ancova and extensions
Rogosa, D. R. (1980). Comparing nonparallel regression lines. Psychological Bulletin, 88, 307-321. [a better quality scan from the APA site]
R resources (below).
Moderation research examples:
Gender differences in effectiveness of aspirin.
Aspirin may be less effective heart treatment for women than men
Publication: Aspirin Resistance in Patients with Stable Coronary Artery Disease, in the Annals of Pharmacotherapy April 2007
Moderating variables can be your friend (statistics is the only friend you need) music: I've got friends in low places
Wash Post: Why smart people are better off with fewer friends.
Publication: Country roads, take me home... to my friends: How intelligence, population density, and friendship affect modern happiness.
British Journal of Psychology 2016
Snow R.E. (1978) Aptitude-Treatment Interactions in Educational Research. In: Pervin L.A., Lewis M. (eds)
Perspectives in Interactional Psychology. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-3997-7_10
R implementations and Resources
package probemod manual
package interactions intro
vignette: Exploring interactions with continuous predictors in regression models manual
Additional Resources, Ancova and extensions
Improving Present Practices in the Visual Display of Interactions Advances in Methods and Practices in Psychological Science
analysis of covariance: Background/historical papers:
Covariance Adjustment in Randomized Experiments and Observational Studies Paul R. Rosenbaum Statistical Science, Vol. 17, No. 3. (Aug., 2002), pp. 286-304. Jstor
Some Aspects of Analysis of Covariance, A Biometrics Invited Paper with Discussion. D. R. Cox; P. McCullagh Biometrics, Vol. 38, No. 3, (Sep., 1982), pp. 541-561. Jstor
Analysis of Covariance: Its Nature and Uses William G. Cochran Biometrics, Vol. 13, No. 3, Special Issue on the Analysis of Covariance. (Sep., 1957), pp. 261-281. Jstor
The Use of Covariance in Observational Studies W. G. Cochran Applied Statistics, Vol. 18, No. 3. (1969), pp. 270-275. Jstor
Estimation of the Slope and Analysis of Covariance when the Concomitant Variable is Measured with Error James S. Degracie; Wayne A. Fuller Journal of the American Statistical Association, Vol. 67, No. 340. (Dec., 1972), pp. 930-937. Jstor
Deep background Neter-Wasserman text (Applied linear statistical models. Neter, Kutner, Nachtsheim and Wasserman 1996. Fifth edition. Homewood IL: Irwin, Inc.) chapters 22 and 8.
Johnson-Neyman technique and aptitude-treatment interaction (ATI)
Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods: A handbook for research on interactions. Irvington
Regions of Significant Criterion Differences in Aptitude-Treatment-Interaction Research Leonard S. Cahen; Robert L. Linn American Educational Research Journal, Vol. 8, No. 3. (May, 1971), pp. 521-530. Jstor
Identifying Regions of Significance in Aptitude-by-Treatment-Interaction Research Ronald C. Serlin; Joel R. Levin American Educational Research Journal, Vol. 17, No. 3. (Autumn, 1980), pp. 389-399. Jstor
Defining Johnson-Neyman Regions of Significance in the Three-Covariate ANCOVA Using Mathematica Steve Hunka; Jacqueline Leighton Journal of Educational and Behavioral Statistics, Vol. 22, No. 4. (Winter, 1997), pp. 361-387. Jstor
discussion of substantive issues: Trait-Treatment Interaction and Learning David C. Berliner; Leonard S. Cahen Review of Research in Education, Vol. 1. (1973), pp. 58-94. Jstor
Week 2 Review Questions
[more to be posted]
Question 1. Background: standard analysis of covariance (no moderating variable)
A researcher is studying the effect of an incentive on the
retention of subject matter and is also interested in the role of
time devoted to study.
Subjects are randomly assigned to two groups,
one receiving (C3 = 1) and the other not receiving (C3 = 0) an
incentive. Within these groups, subjects are randomly assigned to 5,
10, 15, or 20 minutes of study (C2) of a passage specifically
prepared for the experiment. At the end of the study period, a test
of retention is administered.
Treat the study time as a
covariate for investigating the differential effects of the
incentive. Does using the covariate improve precision in estimating the effect of incentive?
Does the ancova assumption of a constant treatment effect at levels of StudyMin appear reasonable?
full data are in file retention.dat
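A minimal sketch of the two comparisons, using simulated stand-in data laid out like the design described (not retention.dat). The column names Incentive, StudyMin, and Retention, the cell size of 5, and all simulation values are hypothetical.

```r
# Simulated stand-in data; NOT retention.dat.
set.seed(42)
design <- expand.grid(rep = 1:5,
                      StudyMin = c(5, 10, 15, 20),
                      Incentive = c(0, 1))
design$Retention <- 10 + 0.6 * design$StudyMin + 3 * design$Incentive +
  rnorm(nrow(design), sd = 2)

fit_t      <- lm(Retention ~ Incentive, data = design)             # no covariate
fit_ancova <- lm(Retention ~ Incentive + StudyMin, data = design)  # standard ancova
fit_inter  <- lm(Retention ~ Incentive * StudyMin, data = design)  # nonparallel slopes

# Precision: standard error of the incentive effect with and without the covariate
se_t      <- coef(summary(fit_t))["Incentive", "Std. Error"]
se_ancova <- coef(summary(fit_ancova))["Incentive", "Std. Error"]
print(c(se_without = se_t, se_with = se_ancova))

# Constant-effect assumption: F test of the Incentive:StudyMin interaction
inter_test <- anova(fit_ancova, fit_inter)
print(inter_test)
```

Because study time is balanced across the incentive groups in this layout, the covariate leaves the point estimate of the incentive effect unchanged and only shrinks the residual variance.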
Question 2. Revisit High School and Beyond ancova from Week 2 lecture
In the class example we used school level (mean, gradient)
outcomes and used school mean ses as a covariate.
Investigate the usefulness of that covariate by comparing
the ancova in class example with just a simple t-test (sector) on these school
level outcomes. What is the difference in precision between using the covariate or not?
As this is not an RCT (revisit in Unit 2), also look at differences in the estimate of the sector effect (bias?).
Question 3. Comparing Regressions (demonstration data, not an RCT)
Let's give recognition to the guys who made S (and R)
and take some data from
Venables, W. N. and Ripley, B. D. (1999) Modern Applied
Statistics with S-PLUS. Third Edition. Springer
(now up to 4th edition). Chap 6 section 1 considers
analysis of the data set whiteside (available as part
of MASS subset of VR package)
> library(MASS) # do need to load the library; MASS is a recommended package shipped with the standard R distribution
Mr Derek Whiteside of the UK Building Research Station recorded
the weekly gas consumption and average external temperature at
his own house in south-east England for two heating seasons, one
of 26 weeks before, and one of 30 weeks after cavity-wall
insulation was installed. The object of the exercise was to
assess the effect of the insulation on gas consumption.
The whiteside data frame has 56 rows and 3 columns:
Insul A factor, before or after insulation.
Temp Purportedly the average outside temperature in degrees
Celsius. (These values are far too low for any 56-week period in
the 1960s in South-East England. It might be the weekly average
of daily minima.)
Gas The weekly gas consumption in 1000s of cubic feet.
A data set collected in the 1960s by
Mr Derek Whiteside of the UK Building Research Station.
Hand, D. J., Daly, F., McConway, K., Lunn, D. and Ostrowski, E. eds (1993)
A Handbook of Small Data Sets. Chapman & Hall, p. 69.
carry out a comparing regressions analysis with Insul as the group variable,
Gas as outcome, and Temp as within-group predictor.
construct a 95% confidence interval for the effect of Insul on Gas
at Temp = 4 (pick-a-point procedure)
for what values of Temp does there appear to be an effect of Insul
on Gas (simultaneous region of significance)
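A minimal sketch of one way to set up these calculations with lm. Centering Temp at 4 is a convenience for the pick-a-point interval. The Scheffe-type critical value used for the simultaneous region is an assumption about the procedure intended in the class materials, so check it against the handout.

```r
library(MASS)   # provides the whiteside data frame

# Separate regression lines for Before and After insulation
fit <- lm(Gas ~ Insul * Temp, data = whiteside)

# Pick-a-point: centre Temp at 4 so the InsulAfter coefficient is the
# After-minus-Before difference in expected Gas at Temp = 4
fit4 <- lm(Gas ~ Insul * I(Temp - 4), data = whiteside)
ci4  <- confint(fit4, "InsulAfter", level = 0.95)
print(ci4)

# Difference between the two lines, with standard error, over a grid of Temp
b <- coef(fit)
V <- vcov(fit)
temp_grid <- seq(min(whiteside$Temp), max(whiteside$Temp), length.out = 50)
diff_hat  <- unname(b["InsulAfter"]) + unname(b["InsulAfter:Temp"]) * temp_grid
diff_se   <- sqrt(V["InsulAfter", "InsulAfter"] +
                  2 * temp_grid * V["InsulAfter", "InsulAfter:Temp"] +
                  temp_grid^2 * V["InsulAfter:Temp", "InsulAfter:Temp"])

# Simultaneous region (assumed Scheffe-type critical value, 2 and n - 4 df)
crit_sim  <- sqrt(2 * qf(0.95, 2, df.residual(fit)))
sig_temps <- temp_grid[abs(diff_hat) > crit_sim * diff_se]
print(range(sig_temps))
```

The grid only approximates the boundary of the region; solving the quadratic in Temp gives the exact endpoints.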