Stat 209-- Course Files, Readings, Examples

Week 1--Course Introduction; properties of regression models

Lecture topics
Quick Tour of course logistics and course materials
Main Topic: Meaning of regression coefficients: simple and multiple regression (including logistic)
Technical facts and foibles:
a. adjusted variables and regression coefficients--values of coefficients depend crucially on what else is used in the regression fit;   conditioning vs controlling
b. effects of errors in measurement on regression coefficients
c. standardized regression coefficients.
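A minimal sketch of point (a), using made-up data rather than the Coleman file: the coefficient of x1 in the two-predictor fit equals the slope obtained by regressing "y adjusted for x2" on "x1 adjusted for x2", and it differs from the slope of y on x1 alone. All variable and function names here are illustrative assumptions.

```python
# Hypothetical data (not the Coleman data): two correlated predictors and an outcome.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3, 4, 8, 7, 13, 12]


def mean(v):
    return sum(v) / len(v)


def simple_fit(x, y):
    """Least-squares intercept and slope for y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals from the simple regression of y on x (y 'adjusted for' x)."""
    a, b = simple_fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def two_predictor_slopes(x1, x2, y):
    """Slopes of y on x1 and x2 jointly, from the centered normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Slope of y on x1 ignoring x2.
_, b1_alone = simple_fit(x1, y)

# Coefficient of x1 when x2 is also in the fit.
b1_joint, b2_joint = two_predictor_slopes(x1, x2, y)

# Adjusted-variable version: residualize both y and x1 on x2, then regress.
_, b1_adjusted = simple_fit(residuals(x2, x1), residuals(x2, y))

print("x1 alone:", b1_alone)
print("x1 with x2 in the fit:", b1_joint)
print("x1 from adjusted variables:", b1_adjusted)
```

Plotting the two residual vectors against each other gives the added (adjusted) variable plot for x1.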

Lecture materials
MT woes of regression coefficients slides
a. Class Handout. Coleman data: adjusted-variables multiple regression (ascii version)      Coleman scanned pdf
       Additional materials: data file, 20 schools      using pairs command
      Adjusted variable plot         Standard regression diagnostic plots, Coleman regression
     Added (adjusted) Variable plots in various R-packages: car, olsrr  Coleman avPlots in car
      nice vignette on regression diagnostics, including added variable plots, from the olsrr package.
slide for regression recursion
b. Class handout: Week 1 Math facts, Measurement error: Basic Results handout
     Also   Faraway book (linked below) Ch. 4 single predictor case;    Maindonald-Braun sec 6.7 results and R-functions, Stigler example
c. Class handout: Standardized regression coefficients standardized variables
            (aside "beta weights" in Kool-Aid Psychology Scientific American, Jan 2010)
       Hooke's Law example in Statistical Models for Causation: A critical review    
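The basic results behind items (b) and (c) can be summarized in two standard formulas. The notation below is an assumption for illustration and may not match the class handouts: the true predictor is written as xi, the observed predictor as X = xi + e, and the reliability of X as rho_{XX'}.

```latex
% Classical measurement error in a single predictor (assumed notation).
% Observed X = \xi + e, with e uncorrelated with \xi and with the error in Y.
% The slope on the fallible X is attenuated by the reliability of X:
\[
  \beta_{Y \cdot X}
  = \beta_{Y \cdot \xi}\,\frac{\sigma_\xi^2}{\sigma_\xi^2 + \sigma_e^2}
  = \beta_{Y \cdot \xi}\,\rho_{XX'}
\]
% Standardized coefficient for predictor X_j: the raw coefficient
% rescaled by the ratio of sample standard deviations.
\[
  b_j^{*} = b_j \, \frac{s_{X_j}}{s_Y},
  \qquad \text{and with a single predictor } b^{*} = r_{XY}.
\]
```

The attenuation result holds as stated only for the single-predictor case; with several predictors, error in one predictor can bias the other coefficients in either direction.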

Regression examples, publications:
a.  Do Breast-Fed Baby Boys Grow Into Better Students?   Publication: Breastfeeding Duration and Academic Achievement at 10 Years (Stanford access). Wendy H. Oddy, Jianghong Li, Andrew J. O. Whitehouse, Stephen R. Zubrick, Eva Malacova. Pediatrics; Vol 127, No. 1, Jan 2011   
        Ohio State breastfeeding study. Is breast truly best? Estimating the effects of breastfeeding on long-term child health and wellbeing in the United States using sibling comparisons. Cynthia G. Colen, David M. Ramey. Social Science & Medicine Volume 109, May 2014, Pages 55-65.
Ohio State press release.   Breast-feeding Benefits Appear to be Overstated
b.  Pediatrics 2006;117;1018-1027   Sexy Media Matter: Exposure to Sexual Content in Music, Movies, Television, and Magazines Predicts Black and White Adolescents' Sexual Behavior  (Stanford access)  

Week 1 Readings
Primary Readings
Background piece: Correlation and Causation: A Comment, (Stanford access) Stephen Stigler Perspectives in Biology and Medicine, volume 48, number 1 supplement (winter 2005)
Freedman text Ch. 1 (esp. Yule on paupers, Snow on Cholera and Sec 1.5);(Ch 2-5 are advanced review of regression models)
   Chap 1 exs also in From Association to Causation: Some Remarks on the History of Statistics;  

Additional Resources
Mosteller-Tukey, Chap 13 (Woes of regression coefficients)
Practical Regression and Anova using R Julian J. Faraway,   chapter 4. errors in predictors
MB 3rd ed Ch.6. esp 6.2.2 adjusted variables; 6.2 Interpreting regression coefficients; 6.7 errors in variables
Background info, errors in variables. Short primer on test reliability  (William Trochim, Cornell)  Informal exposition in Shoe Shopping and the Reliability Coefficient    extensive technical material in Chap 7 Revelle text
       Source technical papers:   Errors of Measurement in Statistics, W. G. Cochran, Technometrics, Vol. 10, No. 4. (Nov., 1968), pp. 637-666. JSTOR URL esp sections 8, 9, 11
Some Effects of Errors of Measurement on Multiple Correlation, W. G. Cochran, Journal of the American Statistical Association Vol. 65, No. 329 (Mar., 1970), pp. 22-34. JSTOR URL esp sec 8 discussion.
An overview of latent variables in Ch 1 of Generalized Latent Variable Modeling Multilevel, Longitudinal, and Structural Equation Models Anders Skrondal and Sophia Rabe-Hesketh Chapman and Hall/CRC 2004

Week 2-- Association vs Causation; Experiments vs observational studies; Neyman-Rubin-Holland formulation

In the news
1.  Depression in girls linked to higher use of social media (Guardian)      Social media linked to higher risk of depression in teen girls (Reuters).   Publication: Social Media Use and Adolescent Mental Health: Findings From the UK Millennium Cohort Study  EClinicalMedicine published by The Lancet, 2019  has multiple regression and path analysis, wow.
2. perennial favorite Spurious Correlation examples   
From 2018
Correlation study.  New study finds sweary people are more honest    Publication: Frankly, We Do Give a Damn: The Relationship Between Profanity and Honesty, Social Psychological and Personality Science.
    Recent (last spring). RCT (cross-over design Week 9). Damn right! The secret of success is swearing: How shouting four letter words can help make you stronger    Swearing can help you boost your physical performance    The full power of swearing is starting to be discovered

Lecture topics
       From week 1, Standardized variables and regression
Week 2
    Third-variable Topics
Class handout: Third Variables
A. Spurious Correlation: some historical notes; partial and part correlations. (class slides)
B. Simpson's paradox wiki page  Kidney stone example (dichotomous outcome slide)
C. Mediating/moderating variables  David Kenny web page   mediation handout   data analysis example   c.f. R-packages multilevel, MBESS, mediation
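A short sketch of the kidney stone example in topic B. The counts are the ones commonly reported for this example (successes out of patients, by treatment and stone size); treat them as an assumption to check against the class slide. Treatment A has the higher success rate within each stone size, yet treatment B has the higher rate when the sizes are pooled.

```python
# (successes, patients) by treatment and stone size; counts as commonly
# reported for the kidney stone example (an assumption, verify before reuse).
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def stratum_rate(treatment, size):
    """Success rate for one treatment within one stone-size stratum."""
    successes, patients = counts[treatment][size]
    return successes / patients


def pooled_rate(treatment):
    """Success rate for one treatment, ignoring stone size."""
    successes = sum(s for s, _ in counts[treatment].values())
    patients = sum(n for _, n in counts[treatment].values())
    return successes / patients


# Does A beat B within every stratum?
a_wins_each_stratum = all(
    stratum_rate("A", size) > stratum_rate("B", size) for size in ("small", "large")
)

# Does B beat A once strata are pooled?
b_wins_pooled = pooled_rate("B") > pooled_rate("A")

for size in ("small", "large"):
    print(size, round(stratum_rate("A", size), 3), round(stratum_rate("B", size), 3))
print("pooled", round(pooled_rate("A"), 3), round(pooled_rate("B"), 3))
```

The reversal arises because treatment A was used mostly on large stones, which have lower success rates under either treatment, so stone size acts as the third variable.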
    First pass: experiments vs observational studies
Class handout: Neyman-Rubin-Holland
D. Design Trumps Analysis.    Rubin paper     Rubin talk .   Other exs: Breast-feeding,   Knee surgery.
E. Surveys of results from experimental and observational studies (see HRT, Mosteller below)
F. Introduction to Neyman-Rubin-Holland formulation (potential outcomes) for causal effects.
       Imbens and Rubin text (linked on main page) Chap 1.
       presentation of NRH formulation for comparative studies based on Appendix of Holland (1988). Class handout.
      shorter (modern) versions of ATE, ATT intros:    Causal inference from Harvard (slides 1-12);
           treatment effects from MIT (pages 1-4; handout pp.2-3);     Wooldridge, estimating average treatment effects from Michigan State.
      Wellesley example, Science Table, pp. 16-22.
G. Encouragement Designs: example of potential outcomes formulation.
       Illustration using encouragement design representation in Holland (1988).    copies of selected overheads.
       Encouragement Designs. Potential outcomes formulation and parameter estimation (Holland, 1988).    Estimation handout
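A minimal sketch of the potential outcomes bookkeeping in topic F, using an entirely hypothetical "science table" in which both potential outcomes are known for every unit (never true in practice). It computes the average treatment effect (ATE), the average effect on the treated (ATT), and the prima facie difference in observed means, and shows that the prima facie difference equals ATT plus a selection term. The numbers and names are illustrative assumptions, not the Wellesley example.

```python
# Hypothetical science table: (Y(0), Y(1), W) for each unit,
# where W = 1 if the unit actually received treatment.
science_table = [
    (10, 14, 1),
    (8, 12, 1),
    (12, 13, 1),
    (4, 6, 0),
    (5, 6, 0),
    (3, 5, 0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


treated = [row for row in science_table if row[2] == 1]
control = [row for row in science_table if row[2] == 0]

# Unit-level causal effects Y(1) - Y(0); observable only in this toy table.
ate = mean(y1 - y0 for y0, y1, _ in science_table)
att = mean(y1 - y0 for y0, y1, _ in treated)

# What the data reveal: Y(1) for treated units, Y(0) for control units.
prima_facie = mean(y1 for _, y1, _ in treated) - mean(y0 for y0, _, _ in control)

# Selection term: treated and control units differ in Y(0) before any treatment.
selection_bias = mean(y0 for y0, _, _ in treated) - mean(y0 for y0, _, _ in control)

print("ATE:", ate)
print("ATT:", att)
print("prima facie difference:", prima_facie)
print("ATT + selection term:", att + selection_bias)
```

Under random assignment the selection term is zero in expectation, which is why the observed difference in means estimates the causal effect in an experiment but not, in general, in an observational study.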

Primary Readings
1. A multi-decade example: Experiments vs Observational studies, Hormone Replacement Therapy
   D.B. Petitti and D.A. Freedman. Invited commentary: How far can epidemiologists get with statistical adjustment? American Journal of Epidemiology vol. 162 (2005) pp. 415-18.       Freedman handout page
2. Freedman text Ch. 1 (esp Snow on Cholera and Sec 1.5) value of modeling Chap 10; response schedules sec 6.4
or online from week 1   Freedman Chap 1 exs also in From Association to Causation: Some Remarks on the History of Statistics;  
or   more on response schedules (text sec 5.4) in Statistical Models for Causation: A critical review    
    and   Statistical Models and Shoe Leather, Sociological Methodology, Vol. 21. (1991), pp. 291-313. JSTOR link
3. Paul Holland, Causal Effects and Encouragement Designs. Causal Inference, Path Analysis, and Recursive Structural Equations Models Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484.
Holland Appendix (esp pp. 475-480) presents the potential outcomes formulation.
Abstract: Rubin's model for causal inference in experiments and observational studies is enlarged to analyze the problem of "causes causing causes" and is compared to path analysis and recursive structural equations models. A special quasi-experimental design, the encouragement design, is used to give concreteness to the discussion by focusing on the simplest problem that involves both direct and indirect causation. It is shown that Rubin's model extends easily to this situation and specifies conditions under which the parameters of path analysis and recursive structural equations models have causal interpretations.

Additional Resources
Spurious correlation?
R-Package ppcor October 29, 2012 Title Partial and Semi-partial (Part) correlation
Correlations Genuine and Spurious in Pearson and Yule, John Aldrich, Statistical Science, Vol. 10, No. 4. (Nov., 1995), pp. 364-376.  JSTOR link
Spurious Correlation: A Causal Interpretation. Herbert A. Simon, Journal of the American Statistical Association, Vol. 49, No. 267. (Sep., 1954), pp. 467-479. JSTOR link
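For reference, the first-order partial and part (semi-partial) correlations can be computed directly from the three pairwise correlations. The sketch below uses the standard textbook formulas with hypothetical correlation values; it is not the ppcor implementation, and the function names are assumptions.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def part_corr(r_xy, r_xz, r_yz):
    """Correlation of y with x after z is removed from x only (semi-partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz ** 2)


# Hypothetical case: x and y are related only through z,
# so r_xy equals the product r_xz * r_yz.
r_xz, r_yz = 0.8, 0.6
r_xy = r_xz * r_yz

spurious_partial = partial_corr(r_xy, r_xz, r_yz)
spurious_part = part_corr(r_xy, r_xz, r_yz)

# Hypothetical case with association left over after adjusting for z.
genuine_partial = partial_corr(0.7, r_xz, r_yz)

print("partial (x, y | z), spurious case:", spurious_partial)
print("part, spurious case:", spurious_part)
print("partial (x, y | z), second case:", genuine_partial)
```

A marginal correlation of 0.48 that drops to zero once z is held fixed is the pattern usually labeled a spurious correlation.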

Simpson's Paradox.
R-package Simpsons.   Frontiers in Psychology. 2013; 4: 513. Simpson's paradox in psychological science: a practical guide

Experiments vs Observational studies:
Mosteller-Tukey Ch. 13 (esp sec 13G)
Intent-to-treat Analysis of Randomized Clinical Trials Michael P. LaValley Boston University ACR/ARHP Annual Scientific Meeting Orlando 10/27/2003
Bringing Evidence-Driven Progress To Education:    Coalition for Evidence-Based Policy          
Overdoing a good thing? Evidence-based medicine.    Hazardous journey Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials Gordon C S Smith, professor, Jill P Pell, consultant BMJ 2003;327:1459-1461
Classic paper on Medical experimentation. Statistics and Ethics in Surgery and Anesthesia. John P. Gilbert; Bucknam McPeek; Frederick Mosteller Science, New Series, Vol. 198, No. 4318. (Nov. 18, 1977), pp. 684-689.     JSTOR link

mediating/moderating variables
R-implementations: Baron-Kenny method via Sobel function in the multilevel package.  More extensive implementation (incl BCa bootstrapping) function mediation in package MBESS (Ken Kelley); power and sample size calculations in package powerMediation
NEW and improved  mediation package. Causal Mediation Analysis Using R   This package (and pubs) takes the topic to a much higher level of complexity and capability
Additional technical papers by K. Imai, L. Keele, D. Tingley, and T. Yamamoto: Causal Mediation Analysis Using R;    Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies, American Political Science Review Vol. 105, No. 4, November 2011
Mediation Analysis David P. MacKinnon, Amanda J. Fairchild, and Matthew S. Fritz Department of Psychology, Arizona State University, Tempe, Arizona 85287-1104; Annu. Rev. Psychol. 2007. 58:593–614
Mediators and Moderators of Treatment Effects in Randomized Clinical Trials Helena Chmura Kraemer; G. Terence Wilson; Christopher G. Fairburn; W. Stewart Agras Arch Gen Psychiatry. 2002;59:877-883
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104.
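A minimal sketch of the Sobel (first-order delta method) test for an indirect effect a*b, where a is the coefficient of the treatment in the regression for the mediator and b is the coefficient of the mediator in the regression for the outcome. The coefficient and standard-error values are hypothetical, and this is not the code used by the multilevel or MBESS packages.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """Sobel z statistic for the indirect effect a*b."""
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical path estimates and standard errors.
a, se_a = 0.5, 0.1   # treatment -> mediator
b, se_b = 0.4, 0.1   # mediator -> outcome, adjusting for treatment

indirect_effect = a * b
z = sobel_z(a, se_a, b, se_b)
p = two_sided_p(z)

print("indirect effect:", indirect_effect)
print("Sobel z:", z)
print("two-sided p:", p)
```

The normal approximation for a product of coefficients can be poor in small samples, which is one reason the packages listed above offer bootstrap intervals as an alternative.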

Neyman-Rubin-Holland models for comparative experiments (causal inference)
Rosenbaum Ch 2 (esp 2.5)
Statistics and Causal Inference, Paul W. Holland pp. 945-960 JASA 1986, another JSTOR link
Commentaries Donald Rubin, David Cox
Rubin, D. B., 1974, Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies, Journal of Educational Psychology, 66, 688-701.
Direct and Indirect Causal Effects via Potential Outcomes, Donald B. Rubin, Scandinavian Journal of Statistics Volume 31, Issue 2, Pages 161-170, Jun 2004.
Causal Inference, Annotated Bibliography - Oregon Research Institute Winship's repository Counterfactual Causal Analysis in Sociology (link is broken even from his own webpage)
Counterfactuals Stanford Encyclopedia of Philosophy Counterfactual Theories of Causation   wiki page     long Nancy Cartwright

Week 3-- Path analysis and causal modeling  multiple regression with pictures