Lectures, Course Files, and Readings, Stat209B 2021

Stat 209B-- Lectures, Course Files, and Readings

Week 0
Course introduction (lecture and audio posted on main page)
Background readings
1. Correlation and Causation: A Comment, Stephen Stigler Perspectives in Biology and Medicine, volume 48, number 1 supplement (winter 2005)
2.    Secret to Winning a Nobel Prize? Eat More Chocolate (Time)   Publication: Chocolate Consumption, Cognitive Function, and Nobel Laureates Franz H. Messerli, M.D. N Engl J Med 2012; 367:1562-1564 October 18, 2012
3. David Freedman chapters.
   From Association to Causation: Some Remarks on the History of Statistics;
   Statistical Models for Causation: A critical review
   Statistical Models and Shoe Leather, Sociological Methodology, Vol. 21. (1991), pp. 291-313. JStor link

Week 1

Lecture slides, week 1 (pdf)
Audio companion, week 1
parta partb partc

1. Encouragement Designs: example of potential outcomes formulation.

Lecture Topics
Illustration using encouragement design representation in Holland (1988).    copies of selected overheads.
Encouragement Designs. Potential outcomes formulation and IV parameter estimation in Holland (1988).    Estimation handout
Do regression methods (path analysis) identify causal effects? Demonstrations of failure for Holland's encouragement design.    class handout    Encouragement design slides

Primary Readings
Paul Holland, Causal Effects and Encouragement Designs. Causal Inference, Path Analysis, and Recursive Structural Equations Models
Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484. (Encouragement design results; sections 3-5)
Holland Appendix (esp pp. 475-480) presents the potential outcomes formulation.
Abstract Rubin's model for causal inference in experiments and observational studies is enlarged to analyze the problem of "causes causing causes" and is compared to path analysis and recursive structural equations models. A special quasi-experimental design, the encouragement design, is used to give concreteness to the discussion by focusing on the simplest problem that involves both direct and indirect causation. It is shown that Rubin's model extends easily to this situation and specifies conditions under which the parameters of path analysis and recursive structural equations models have causal interpretations.

Encouragement Design research examples:
   Sesame Street evaluation
Gelman-Hill text sec 10.5; Data Analysis Using Regression and Multilevel/Hierarchical Models
   Salt and Blood Pressure clinical trial
Publication: Feasibility and efficacy of sodium reduction in the Trials of Hypertension Prevention, phase I Trials of Hypertension Prevention Collaborative Research Group. S K Kumanyika, P R Hebert, J A Cutler, V I Lasser, C P Sugars, L Steffen-Batey, A A Brewer, MI. Hypertension doi: 10.1161/01.HYP.22.4.5021993;22:502-512

2. Mediating (process) variables

Lecture Topics
Historical (Baron-Kenny) methods David Kenny web page
R-implementations: mediating variables         data analysis example    data file
    Baron-Kenny method via the sobel function in the multilevel package.
    More extensive implementation (incl BCa bootstrapping): function mediation in package MBESS (Ken Kelley);
    power and sample size calculations in package powerMediation
    mediation package: takes the topic to a much higher level of complexity/capabilities

Primary Readings
Vignette for mediation package Causal Mediation Analysis Using R
Mediation Analysis David P. MacKinnon, Amanda J. Fairchild, and Matthew S. Fritz Department of Psychology, Arizona State University, Tempe, Arizona 85287-1104; Annu. Rev. Psychol. 2007. 58:593-614

Mediation research examples:
  Framing experiment
Brader T, Valentino NA, Suhay E (2008). "What Triggers Public Opposition to Immigration? Anxiety, Group Cues, and Immigration Threat." American Journal of Political Science, 52(4), 959-978.  jstor link
Data in mediation package; data description and analyses in mediation package vignette (linked below)
  Bench Science vs Path Analysis: Exercise and Alzheimers
NYTimes: How Exercise May Help Keep Our Memory Sharp.
Publication: Exercise-linked FNDC5/irisin rescues synaptic plasticity and memory defects in Alzheimer's models   Nature Medicine volume 25, pages 165-175 (2019)
  Mediated moderation?
   Stanford Medicine     Common opioids less effective for patients on SSRI antidepressants    Publication: Predicting inadequate postoperative pain management in depressed patients: A machine learning approach Arjun Parthipan, Imon Banerjee, Keith Humphreys, Steven M. Asch, Catherine Curtin, Ian Carroll, Tina Hernandez-Boussard Published: February 6, 2019 https://doi.org/10.1371/journal.pone.0210575
   New Yorker. December 23, 2013. The Power of the Hoodie-Wearing C.E.O.    Publication: The Red Sneakers Effect: Inferring Status and Competence from Signals of Nonconformity Author(s): Silvia Bellezza, Francesca Gino, and Anat Keinan Source: Journal of Consumer Research


Additional Resources
Mediators and Moderators of Treatment Effects in Randomized Clinical Trials Helena Chmura Kraemer; G. Terence Wilson; Christopher G. Fairburn; W. Stewart Agras Arch Gen Psychiatry. 2002;59:877-883
additional technical papers. Causal Mediation Analysis Using R K. Imai, L. Keele, D. Tingley, and T. Yamamoto    American Political Science Review Vol. 105, No. 4 November 2011 Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M.,West, S. G., Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104.
      Useful expositions Using R
Chapter 14: Mediation and Moderation Alyssa Blair
Mediation and Moderation Analyses with R - OSF presentation slides

Week 1 Review Questions

Question 1. Mediating Variable Computations: Class example continued
The data set shown in class example ss423 is linked above and in the legacy directory http://web.stanford.edu/~rag/stat209/ss423
for predictor (IV) 'belong' outcome 'depress' and (potential) mediating variable 'master' The class example showed you the Baron-Kenny analysis using functions from the multilevel and MBESS packages.
Here just use basic 'lm' regression and the recipes from the class handout to recreate the point estimates, asymptotic standard errors, and significance tests for the mediating variable effect.
Compare your result with the class example posting.
Extra: also try out the more 'sophisticated' functions in the mediation package.
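
The 'lm' recipe in Question 1 can be sketched as follows. This is a minimal illustration on simulated stand-in data, not the ss423 data: the variable names mirror the exercise, but the generating coefficients and sample size are assumptions made up for the sketch. It uses the usual three regressions and the first-order (Sobel) standard error for the product of coefficients.

```r
# Simulated stand-in data (NOT the ss423 file); coefficients are assumptions.
set.seed(209)
n <- 400
belong  <- rnorm(n)                                   # predictor (IV)
master  <- 0.5 * belong + rnorm(n)                    # potential mediator
depress <- -0.4 * master - 0.2 * belong + rnorm(n)    # outcome
dat <- data.frame(belong, master, depress)

# The three regressions
fit_c <- lm(depress ~ belong, data = dat)             # total effect c
fit_a <- lm(master ~ belong, data = dat)              # path a
fit_b <- lm(depress ~ master + belong, data = dat)    # path b and direct effect c'

a  <- coef(summary(fit_a))["belong", "Estimate"]
sa <- coef(summary(fit_a))["belong", "Std. Error"]
b  <- coef(summary(fit_b))["master", "Estimate"]
sb <- coef(summary(fit_b))["master", "Std. Error"]

# Indirect (mediated) effect and its first-order asymptotic standard error
indirect <- a * b
se_sobel <- sqrt(b^2 * sa^2 + a^2 * sb^2)
z_sobel  <- indirect / se_sobel
p_sobel  <- 2 * pnorm(-abs(z_sobel))

# For least squares fits on the same cases, c - c' equals a * b
c_total  <- coef(fit_c)[["belong"]]
c_direct <- coef(fit_b)[["belong"]]

print(c(indirect = indirect, se = se_sobel, z = z_sobel, p = p_sobel))
print(c(c_total = c_total, c_direct = c_direct, difference = c_total - c_direct))
```

With the real data, only the data frame changes; the results can then be set beside the multilevel and MBESS output from the class example.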

Solution for question 1

Question 2. Potential Outcomes, Encouragement Design Estimation and (Causal) Mediation
Task 1. Create a potential Outcomes dataset following the first ALICE specification in the posted slides (week 3) ## ALICE example beta = 3 rho = 3 tau = 1, delta = 3 (I did n=400; larger would be better so I redid with n = 6400)

Task 2. Use the artificial data to show the results for the mediation (indirect) effect: by hand, doing the 3 regressions; using the multilevel package (sobel); using the MBESS package; and using the causal mediation estimate (ACME) from the mediation package. Compare with rho*beta.

Task 3 estimate beta by the Wald estimator (assuming tau = 0) and estimate mediation effect

Solution for question 2

Question 3. Sesame Street: Encouragement Design research example
Sesame Street research setting and data description given pdf p.30 of Lecture 1 (also Gelman text).
For this exercise use postnumb : posttest on numbers (0-54), along with the measures encour and regular from the class example in Lecture 1.
Use the encouragement design formulation to estimate the effect on child cognitive development (postnumb here) of watching more Sesame Street.
What assumption is necessary for the IV estimation in this design?
Obtain a point and interval estimate for the effect of viewing (use ivreg as in class example).
From simple descriptives reproduce this instrumental variables estimate (Wald estimator).
The second approach (path analysis) analyzed by Holland requires what assumption?
Obtain the path analyses (regression) estimate for the effect on child cognitive development (postnumb here) of watching more Sesame Street.
Compare with the IV estimate (which employs different assumptions).
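
A sketch of the Wald estimator "from simple descriptives" asked for in Question 3, on simulated data. The variable names encour, regular and postnumb follow the exercise, but the data-generating model (including the unobserved confounder u and the effect size 5) is an assumption invented for the illustration, not the Sesame Street data.

```r
# Simulated encouragement design (NOT the Sesame Street data).
set.seed(1)
n <- 2000
encour   <- rbinom(n, 1, 0.5)                       # randomized encouragement
u        <- rnorm(n)                                # unobserved confounder (assumption)
regular  <- as.numeric(0.8 * encour + 0.7 * u + rnorm(n) > 0.5)  # regular viewing
postnumb <- 20 + 5 * regular + 4 * u + rnorm(n, sd = 3)          # outcome

# Wald estimator: effect of encouragement on outcome over its effect on viewing
itt_y <- mean(postnumb[encour == 1]) - mean(postnumb[encour == 0])
itt_d <- mean(regular[encour == 1]) - mean(regular[encour == 0])
wald  <- itt_y / itt_d

# The same quantity as a ratio of covariances
wald_cov <- cov(postnumb, encour) / cov(regular, encour)

# The same point estimate from two-stage least squares done by hand
stage1 <- lm(regular ~ encour)
stage2 <- lm(postnumb ~ fitted(stage1))
tsls   <- coef(stage2)[[2]]

# Regression (path analysis) estimate, which relies on different assumptions
ols <- coef(lm(postnumb ~ regular))[["regular"]]

print(c(wald = wald, wald_cov = wald_cov, tsls = tsls, ols = ols))
```

The standard errors printed by the hand-made second stage are not valid; ivreg, as in the class example, is the place to get the interval estimate.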

Solution for question 3

Week 2

Moderating Variables in experimental studies (heterogeneous treatment effects)

Lecture slides, week 2 (pdf)
Audio companion, week 2
parta partb partc

Lecture topics
0. Moderation, mediation recap slide
1. Review: formulation and purposes of analysis of covariance
    basic (old) ancova exposition slides           ancova and extensions, math notes
   High School and Beyond (observational study) school means data example HSB ancova handout (ascii version)      data for HSB ancova     HSB ancova, scanned pdf
2. Moderating variables, Heterogeneous Treatment Effects (CATE).
      Analyzing treatment effects as a function of covariate(s)
     CNRL, including Johnson-Neyman technique   cnrl data   cnrl analysis (extended)

Primary Readings
Ancova and extensions
Rogosa, D. R. (1980). Comparing nonparallel regression lines. Psychological Bulletin, 88, 307-321. [a better quality scan from the APA site]
R resources (below).

Moderation research examples:
       Gender differences in effectiveness of aspirin.
Aspirin may be less effective heart treatment for women than men
Publication:      Aspirin Resistance in Patients with Stable Coronary Artery Disease, in the Annals of Pharmacotherapy April 2007
     Moderating variables can be your friend (statistics is the only friend you need)           music: I've got friends in low places
Wash Post: Why smart people are better off with fewer friends .
Publication: Country roads, take me home... to my friends: How intelligence, population density, and friendship affect modern happiness.   British Journal of Psychology 2016
    ATI research
Snow R.E. (1978) Aptitude-Treatment Interactions in Educational Research. In: Pervin L.A., Lewis M. (eds) Perspectives in Interactional Psychology. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-3997-7_10
     Family SES as a moderating variable in nature/nurture:
Why Rich Parents Don't Matter UTexas press release: Being Poor Can Suppress Children's Genetic Potentials    Publication: Emergence of a Gene × Socioeconomic Status Interaction on Infant Mental Ability Between 10 Months and 2 years DOI: 10.1177/0956797610392926 Psychological Science published online 17 December 2010 Elliot M. Tucker-Drob, Mijke Rhemtulla, K. Paige Harden, Eric Turkheimer and David Fask

R implementations and Resources
package probemod    manual
package interactions    intro     vignette: Exploring interactions with continuous predictors in regression models    manual

Additional Resources, Ancova and extensions
Improving Present Practices in the Visual Display of Interactions Advances in Methods and Practices in Psychological Science
      analysis of covariance: Background/historical papers:
Covariance Adjustment in Randomized Experiments and Observational Studies Paul R. Rosenbaum Statistical Science, Vol. 17, No. 3. (Aug., 2002), pp. 286-304.   Jstor
Some Aspects of Analysis of Covariance, A Biometrics Invited Paper with Discussion. D. R. Cox; P. McCullagh Biometrics, Vol. 38, No. 3, (Sep., 1982), pp. 541-561.   Jstor
Analysis of Covariance: Its Nature and Uses William G. Cochran Biometrics, Vol. 13, No. 3, Special Issue on the Analysis of Covariance. (Sep., 1957), pp. 261-281. Jstor
The Use of Covariance in Observational Studies W. G. Cochran Applied Statistics, Vol. 18, No. 3. (1969), pp. 270-275. Jstor
Estimation of the Slope and Analysis of Covariance when the Concomitant Variable is Measured with Error James S. Degracie; Wayne A. Fuller Journal of the American Statistical Association, Vol. 67, No. 340. (Dec., 1972), pp. 930-937. Jstor
Deep background Neter-Wasserman text (Applied linear statistical models. Neter, Kutner, Nachtsheim and Wasserman 1996. Fifth edition. Homewood IL: Irwin, Inc.) chapters 22 and 8.
     Johnson-Neyman technique and aptitude-treatment interaction (ATI)
Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods: A handbook for research on interactions. Irvington
Regions of Significant Criterion Differences in Aptitude-Treatment-Interaction Research Leonard S. Cahen; Robert L. Linn American Educational Research Journal, Vol. 8, No. 3. (May, 1971), pp. 521-530. Jstor
Identifying Regions of Significance in Aptitude-by-Treatment-Interaction Research Ronald C. Serlin; Joel R. Levin American Educational Research Journal, Vol. 17, No. 3. (Autumn, 1980), pp. 389-399. Jstor
Defining Johnson-Neyman Regions of Significance in the Three-Covariate ANCOVA Using Mathematica Steve Hunka; Jacqueline Leighton Journal of Educational and Behavioral Statistics, Vol. 22, No. 4. (Winter, 1997), pp. 361-387. Jstor
discussion of substantive issues: Trait-Treatment Interaction and Learning David C. Berliner; Leonard S. Cahen Review of Research in Education, Vol. 1. (1973), pp. 58-94. Jstor

Week 2 Review Questions

Question 1. Background: standard analysis of covariance.(no moderating variable)

A researcher is studying the effect of an incentive on the retention of subject matter and is also interested in the role of time devoted to study.
Subjects are randomly assigned to two groups, one receiving (C3 = 1) and the other not receiving (C3 = 0) an incentive. Within these groups, subjects are randomly assigned to 5, 10, 15, or 20 minutes of study (C2) of a passage specifically prepared for the experiment. At the end of the study period, a test of retention is administered.
Treat the study time as a covariate for investigating the differential effects of the incentive. Does using the covariate improve precision in estimating the effect of incentive?
Does the ancova assumption of a constant treatment effect at levels of StudyMin appear reasonable? full data are in file retention.dat http://statweb.stanford.edu/~rag/stat209/retention.dat
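
The comparison asked for in Question 1 can be sketched on simulated data with the same layout (two incentive groups crossed with four study times). The column names Incentive, StudyMin and Retention, the 10 subjects per cell, and all coefficients are assumptions for the illustration; the layout and names in retention.dat may differ.

```r
# Simulated balanced design (NOT retention.dat); all values are assumptions.
set.seed(3)
design <- expand.grid(rep = 1:10,
                      StudyMin = c(5, 10, 15, 20),
                      Incentive = c(0, 1))
n <- nrow(design)
design$Retention <- 10 + 2 * design$Incentive + 0.5 * design$StudyMin +
  rnorm(n, sd = 2)

fit_t      <- lm(Retention ~ Incentive, data = design)             # no covariate
fit_ancova <- lm(Retention ~ Incentive + StudyMin, data = design)  # standard ancova
fit_inter  <- lm(Retention ~ Incentive * StudyMin, data = design)  # nonparallel slopes

# Precision: standard error of the incentive effect with and without the covariate
se_t      <- coef(summary(fit_t))["Incentive", "Std. Error"]
se_ancova <- coef(summary(fit_ancova))["Incentive", "Std. Error"]
print(c(se_without = se_t, se_with = se_ancova))

# Constant treatment effect across StudyMin? Compare parallel vs nonparallel fits
print(anova(fit_ancova, fit_inter))
```

Because study time is balanced across the incentive groups here, the point estimate of the incentive effect is the same in both fits and only the precision changes.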

Solution for question 1

Question 2. Revisit High School and Beyond ancova from Week 2 lecture

In the class example we used school level (mean, gradient) outcomes and used school mean ses as a covariate. Investigate the usefulness of that covariate by comparing the ancova in class example with just a simple t-test (sector) on these school level outcomes. What is the difference in precision between using the covariate or not? As this is not an RCT (revisit in Unit 2), also look at differences in the estimate of the sector effect (bias?).

Solution for question 2

Question 3. Comparing Regressions (demonstration data, not an RCT)

Let's give recognition to the guys who made S (and R) and take some data from Venables, W. N. and Ripley, B. D. (1999) Modern Applied Statistics with S-PLUS. Third Edition. Springer (now up to 4th edition). Chap 6 section 1 considers analysis of the data set whiteside (available as part of MASS subset of VR package) to access
> library(MASS) # do need to load library, MASS is part of base R
> data(whiteside)
> ?whiteside
Description
Mr Derek Whiteside of the UK Building Research Station recorded the weekly gas consumption and average external temperature at his own house in south-east England for two heating seasons, one of 26 weeks before, and one of 30 weeks after cavity-wall insulation was installed. The object of the exercise was to assess the effect of the insulation on gas consumption.
Format. The whiteside data frame has 56 rows and 3 columns:
Insul A factor, before or after insulation.
Temp Purportedly the average outside temperature in degrees Celsius. (These values is far too low for any 56-week period in the 1960s in South-East England. It might be the weekly average of daily minima.)
Gas The weekly gas consumption in 1000s of cubic feet.
Source. A data set collected in the 1960s by Mr Derek Whiteside of the UK Building Research Station. Reported by Hand, D. J., Daly, F., McConway, K., Lunn, D. and Ostrowski, E. eds (1993) A Handbook of Small Data Sets. Chapman & Hall, p. 69.

carry out a comparing regressions analysis with Insul as the group variable, Gas as outcome, and Temp as within-group predictor.
construct a 95% confidence interval for the effect of Insul on Gas at Temp = 4 (pick-a-point procedure)
for what values of temp does there appear to be an effect of Insul on Gas (simultaneous region of significance)
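
One way to set up the pick-a-point interval is sketched below, assuming the nonparallel-lines model Gas ~ Insul * Temp fit by lm. The contrast approach and the recentering check are one reasonable route, not necessarily the one used in the posted solution, and the simultaneous region of significance is not attempted here.

```r
library(MASS)   # provides the whiteside data frame

# Separate regression lines for Before and After insulation
fit <- lm(Gas ~ Insul * Temp, data = whiteside)

t0 <- 4                       # the chosen point on Temp
b  <- coef(fit)
V  <- vcov(fit)

# Contrast for (After - Before) at Temp = t0: group shift + t0 * slope difference
w <- setNames(numeric(length(b)), names(b))
w["InsulAfter"]      <- 1
w["InsulAfter:Temp"] <- t0

est <- sum(w * b)
se  <- sqrt(drop(t(w) %*% V %*% w))
ci  <- est + c(-1, 1) * qt(0.975, df.residual(fit)) * se
print(c(estimate = est, lower = ci[1], upper = ci[2]))

# Check: recentering Temp at t0 makes the Insul coefficient the same quantity
fit_centered <- lm(Gas ~ Insul * I(Temp - t0), data = whiteside)
print(confint(fit_centered)["InsulAfter", ])
```

This interval is pointwise at the single chosen Temp; it does not carry the simultaneous coverage needed for the region-of-significance question.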

Solution for question 3

Question 4. R packages interactions and probemod
In lecture there was a short mention of these two R packages, whose main functions carry out the pick-a-point and Johnson-Neyman calculations developed in Rogosa (1980).
Try out these functions using the cnrl dataset (also from Rogosa,1980) which we worked out in the lecture materials.
Solutions spoiler alert: no joy from these packages.

Solution for question 4

Week 3

Lecture slides, week 3 (pdf)
week 3, part a (pdf)
week 3, part b (pdf)
Audio companion, week 3
parta partb

1. Compliance in RCT

Lecture topics
1. Compliance background: Intent-to-treat analyses, CACE estimators, research examples
2. Compliance and Dose-response data analysis (Efron-Feldman)
3. Rubin-Holland approach via Booil Jo presentation: Potential Outcomes Approach: A Brief Introduction
    Class handouts:   Compliance examples     Compliance overview     Compliance math notes     Little-Rubin Ann Rev Pub Health formulation

Primary Readings
Compliance Background: Intent-to-Treat (ITT), the FDA mandate.    simple definitions: wiki    Encyclopedia of epidemiology, Volume 1 (google books)
Potential outcomes formulation (CACE): Causal Effects in Clinical and Epidemiological Studies Via Potential Outcomes: Concepts and Analytical Approaches Roderick J. Little and Donald B. Rubin, Annual Review of Public Health, Vol. 21: 121-145, May 2000.
Epidemiology exposition: An introduction to instrumental variables for epidemiologists, Sander Greenland, International Journal of Epidemiology 2000;29:722-729

Compliance research examples.
     Clofibrate in Coronary Drug Project
Influence of adherence to treatment and response of cholesterol on mortality in the coronary drug project. New England Journal of Medicine Volume 303:1038-1041 October 30, 1980 Number 18
    Vitamin A in Central America
An introduction to instrumental variables for epidemiologists, Sander Greenland, International Journal of Epidemiology 2000;29:722-729
    Cholestyramine in Cholesterol trial (measured compliance)
Compliance as an Explanatory Variable in Clinical Trials. B. Efron; D. Feldman Journal of the American Statistical Association, Vol. 86, No. 413. (Mar., 1991), pp. 9-17. Jstor
    Draft Lottery and Vietnam Service
Joshua D. Angrist; Guido W. Imbens; Donald B. Rubin "Identification of Causal Effects Using Instrumental Variables" Journal of the American Statistical Association, Vol. 91, No. 434. (Jun., 1996), pp. 444-455. JStor

Additional resources
Compliance as an Explanatory Variable in Clinical Trials. B. Efron; D. Feldman Journal of the American Statistical Association, Vol. 86, No. 413. (Mar., 1991), pp. 9-17. Jstor
David Freedman on Compliance Adjustments:      Statistical Models for Causation: What Inferential Leverage Do They Provide? Evaluation Review 2006; 30: 691-713.       On regression adjustments to experimental data Advances in Applied Mathematics vol. 40 (2008) pp. 180-93.
Intent-to-treat Analysis of Randomized Clinical Trials Michael P. LaValley Boston University ACR/ARHP Annual Scientific Meeting Orlando 10/27/2003
Intention to treat--who should use ITT? J. A. Lewis and D. Machin Br J Cancer. 1993 October; 68(4): 647-650.
Compliance analyses, R-implementations: Imai experiment package     Package icsw, Inverse Compliance Score Weighting
What is meant by intention to treat analysis? Survey of published randomised controlled trials Sally Hollis and Fiona Campbell British Medical Journal 1999;319;670-674
Booil Jo, Dept of Psychiatry   Estimation of Intervention Effects with Noncompliance Journal of Educational and Behavioral Statistics
   Compliance Publications based on Neyman-Rubin causal models:
Direct and Indirect Causal Effects via Potential Outcomes Donald B. Rubin Scandinavian Journal of Statistics Volume 31, Issue 2, Page 161-170, Jun 2004 .
Imbens GW and Rubin DB (1997) Bayesian Inference for Causal Effects in Randomized Experiments with Noncompliance The Annals of Statistics, 25, 305-327.
Principal Stratification in Causal Inference Constantine E. Frangakis and Donald B. Rubin, Biometrics, 2002, 58, 21-29.
Addressing Complications of Intention-to-Treat Analysis in the Combined Presence of All-or-None Treatment-Noncompliance and Subsequent Missing Outcomes. Constantine E. Frangakis; Donald B. Rubin Biometrika, Vol. 86, No. 2. (Jun., 1999), pp. 365-379. Jstor link
    Additional Case Studies
Principal Stratification Approach to Broken Randomized Experiments: A Case Study of School Choice Vouchers in New York City Barnard, Frangakis, Hill, and Rubin Journal of the American Statistical Association June 2003, Vol. 98, No. 462, Applications and Case Studies
The British Journal of Psychiatry (2003) 183: 323-331 Estimating psychological treatment effects from a randomised controlled trial with both non-compliance and loss to follow-up Graham Dunn and Mohammad Maracy

2. Regression Discontinuity Designs (systematic assignment)

Lecture Topics
Non-random assignment on the basis of the covariate, such as regression discontinuity designs.
    Regression Discontinuity handout     Example from rdd manual    ascii version

Primary Readings
Regression Discontinuity Designs Useful primers by Wm Trochim: William Trochim's Knowledge Base
Rubin, D. B., (1977), "Assignment to a Treatment Group on the Basis of a Covariate", Journal of Educational Statistics, 2, 1-26.   Jstor link

Regression Discontinuity Research Examples
    The original: PSAT and National Merit
Thistlethwaite, D., and D. Campbell (1960): "Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment," Journal of Educational Psychology, 51, 309-317.
    Class size, Maimonides' Rule
In Rosenbaum, Design of Observational Studies (linked on main page).    sections 1.3, 3.2, 5.2.3, 5.3 DOS text
Angrist-Lavy Maimonides (class size) data   Angrist and Lavy, 1999.               read data ang = read.dta("http://stats.idre.ucla.edu/stat/stata/examples/methods_matter/chapter9/angrist.dta")

R implementations and Resources
R-package--rdd; Regression Discontinuity Estimation Author Drew Dimmery
Also Package rdrobust Title Robust data-driven statistical inference in Regression-Discontinuity designs
   RJournal for rdrobust, rdrobust: An R Package for Robust Nonparametric Inference in Regression-Discontinuity Designs

Additional Resources: Regression Discontinuity Designs
Journal of Econometrics (special issue) Volume 142, Issue 2, February 2008, The regression discontinuity design: Theory and applications    Regression discontinuity designs: A guide to practice, Guido W. Imbens, Thomas Lemieux
    Also from Journal of Econometrics (special issue) Volume 142, Issue 2, February 2008, The regression discontinuity design: Theory and applications Waiting for Life to Arrive: A history of the regression-discontinuity design in Psychology, Statistics and Economics, Thomas D Cook
the original paper: Thistlethwaite, D., and D. Campbell (1960): "Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment," Journal of Educational Psychology, 51, 309-317.
Trochim W.M. & Cappelleri J.C. (1992). "Cutoff assignment strategies for enhancing randomized clinical trials." Controlled Clinical Trials, 13, 190-212. pubmed link
Capitalizing on Nonrandom Assignment to Treatments: A Regression-Discontinuity Evaluation of a Crime-Control Program Richard A. Berk; David Rauma Journal of the American Statistical Association, Vol. 78, No. 381. (Mar., 1983), pp. 21-27. Jstor
Berk, R.A. & de Leeuw, J. (1999). "An evaluation of California's inmate classification system using a generalized regression discontinuity design." Journal of the American Statistical Association, 94(448), 1045-1052. Jstor
another econometric treatment

Week 3 Review Questions

Regression Discontinuity

Question 1. Regression Discontinuity, classic "Sharp" design.

Replicate the package rdd toy example: cutpoint = 0, sharp design, with treatment effect of 3 units (instead of 10). Try out the analysis of covariance (Rubin 1977) estimate and compare with the rdd output and plot. Pick off the observations used in the Half-BW estimate and verify using a t-test or Wilcoxon test.
Extra: try out also the rdrobust package for this sharp design.
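
A sketch of the analysis of covariance estimate for a sharp design with cutpoint 0 and a 3-unit effect. The data-generating line (intercept 3, slope 2, unit error variance, n = 1000) and the window half-width h are assumptions for the illustration; this is not the rdd manual's code, and it does not call rdd.

```r
# Simulated sharp regression discontinuity; all values are assumptions.
set.seed(42)
n     <- 1000
x     <- runif(n, -1, 1)            # assignment covariate, cutpoint at 0
treat <- as.numeric(x >= 0)         # deterministic (sharp) assignment
y     <- 3 + 2 * x + 3 * treat + rnorm(n)

# Analysis of covariance estimate of the treatment effect
fit <- lm(y ~ treat + x)
est <- coef(fit)[["treat"]]
ci  <- confint(fit)["treat", ]
print(c(estimate = est, lower = ci[[1]], upper = ci[[2]]))

# Local comparison using only observations within h of the cutpoint
h     <- 0.1                        # hypothetical window half-width
local <- data.frame(y = y[abs(x) <= h], treat = treat[abs(x) <= h])
print(t.test(y ~ treat, data = local))
```

The local t-test ignores the slope in x inside the window, so its difference in means still carries a little of that slope; the rdd estimates fit local lines on each side instead.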

Solution for Review Question 1

Question 2. Systematic Assignment, "fuzzy design". Probabilistic assignment on the basis of the covariate.

i. Create artificial data with the following specification. 10,000 observations; premeasure (Y_uc in my session) gaussian mean 10 variance 1. Effect of intervention (rho) if in the treatment group is 2 (or close to 2) and uncorrelated with Y_uc. Probability of being in the treatment group depends on Y_uc but is not a deterministic step-function ("sharp design"): Pr(treatment|Y_uc) = pnorm(Y_uc, 10,1) . Plot that function.
ii. Try out analysis of covariance with Y_uc as covariate. Obtain a confidence interval for the effect of the treatment.
iii. Try out the fancy econometric estimators (using finite support) as in the rdd package. See if you find that they work poorly in this very basic fuzzy design example.
Extra: try out also the rdrobust package for this fuzzy design.
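
Parts i and ii can be sketched as below. The exercise fixes n, the distribution of Y_uc, and the assignment probability; the additive outcome Y = Y_uc + rho * G and the spread of rho around 2 (sd 0.25) are assumptions added to make the sketch concrete.

```r
# Probabilistic assignment on the basis of the covariate (fuzzy design)
set.seed(7)
n       <- 10000
Y_uc    <- rnorm(n, mean = 10, sd = 1)       # premeasure
p_treat <- pnorm(Y_uc, 10, 1)                # Pr(treatment | Y_uc)
G       <- rbinom(n, 1, p_treat)             # treatment indicator
rho     <- rnorm(n, mean = 2, sd = 0.25)     # effect close to 2 (assumed spread)
Y       <- Y_uc + rho * G                    # observed outcome (assumed additive)

# Part i: plot the assignment probability as a function of Y_uc
curve(pnorm(x, 10, 1), from = 6, to = 14,
      xlab = "Y_uc", ylab = "Pr(treatment | Y_uc)")

# Unadjusted comparison of group means, for contrast
naive <- mean(Y[G == 1]) - mean(Y[G == 0])

# Part ii: analysis of covariance with Y_uc as covariate
fit <- lm(Y ~ G + Y_uc)
est <- coef(fit)[["G"]]
ci  <- confint(fit)["G", ]
print(c(naive = naive, ancova = est, lower = ci[[1]], upper = ci[[2]]))
```

Since assignment depends only on Y_uc, adjusting for Y_uc is what removes the selection difference that inflates the unadjusted comparison.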

Solution for Review Question 2

Question 3. Controlled Assignment (class example)

From Rubin, D. B., (1977), "Assignment to a Treatment Group on the Basis of a Covariate", linked on course page

From page 16 Rubin

              7. A SIMPLE EXAMPLE

Table I presents the raw data from an evaluation of a computer-
aided program designed to teach mathematics to children in fourth 
grade. There were 25 children in Program 1 (the computer-aided 
program) and 47 children in Program 2 (the regular program). All 
children took a Pretest and Posttest, each test consisting of 20 
problems, a child's score being the number of problems correctly 
solved. These data will be used to illustrate the estimation 
methods discussed in Sections 4, 5, and 6. We do not attempt a 
complete statistical analysis nor do we question the assumption 
of no interference between units.

TABLE I

Raw Data for 25 Program 1 Children and 47 Program 2 Children
Pretest              Posttest Scores
Scores  
         Program 1             Program 2
10            15                 6,7
9             16                 7,11,12
8             12                 5,6,9,12
7            8,11,12             6,6,6,6,7,8
6        9,10,11,13,20           5,5,6,6,6,6,6,6,6,8,8,8,9,10
5          5,6,7,16              3,5,5,6,6,7,8
4           5,6,6,12             4,4,4,5,7,11
3           4,7,8,9,12           0,5,7
2             4                   4
1              -                   -
0              -                   7

Does assignment appear to be random, or does this appear to be Assignment on the Basis of Pretest?
Try to estimate the assignment rule, presuming it is based on pretest. How does this differ from a regression discontinuity design (simplest version)?
Assuming that assignment to Program 1 or Program 2 was solely on the basis of pretest (plus perhaps a probabilistic component) estimate the effect of program (new vs regular).
note data in table 1 exist in a more convenient form in file hw5rubin.dat http://statweb.stanford.edu/~rag/stat209/hw5rubin.dat and data file included in the solutions

Solution for Review Question 3

Compliance in RCT

Question 4 Non-compliance. Class example week 3.

Adapted from (linked on class page): An introduction to instrumental variables for epidemiologists, Sander Greenland, International Journal of Epidemiology 2000;29:722-729
Additional Reference: Sommer and Zeger (1991). On Estimating Efficacy from Clinical Trials. Statistics in Medicine

Greenland discusses randomized trials with non-compliance where Z indicates treatment assignment, which is randomized; X indicates treatment received, which is affected but not fully determined by assignment Z.

To illustrate, Greenland presents in his Table 1 individual one-year mortality data from a cluster-randomized trial of vitamin A supplementation in childhood. Of 450 villages, 229 were assigned to a treatment in which village children received two oral doses of vitamin A; children in the 221 control villages were assigned none. This protocol resulted in 12,094 children assigned to the treatment (Z = 1) and 11,588 assigned to the control (Z = 0). Only children assigned to treatment received the treatment; that is, no one had Z = 0 and X = 1. Unfortunately, 2419 (20%) of those assigned to the treatment did not receive the treatment (had Z = 1 and X = 0), resulting in only 9675 receiving treatment (X = 1). Class handout has depiction and Greenland's table of results. Use as the outcome measure Y, the Deaths per 100,000 within one year (labeled Risk in Greenland's Table 1).

Part 1, using data summary from class handout
a. Give the ITT (intent-to-treat) estimate of the effect of vitamin A on Risk
b. What is the compliance rate in the treatment group (Z=1)? In the control group (Z=0)?
c. What is the instrumental variables estimate (following Angrist Imbens Rubin) of the effect of vitamin A on Risk?
What interpretation is given to this estimate (c.f. Booil Jo presentation)? Compare with part (a) result and comment.
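
The arithmetic behind parts a to c can be sketched as a small function. The group sizes are the ones stated above; the two risk values passed in at the end are hypothetical placeholders, not Greenland's Table 1 entries, and should be replaced with the values from the class handout. The function name iv_estimate is made up for this sketch.

```r
# Group sizes stated in the exercise text
n_z1    <- 12094    # assigned to treatment (Z = 1)
n_z0    <- 11588    # assigned to control (Z = 0)
n_z1_x1 <- 9675     # assigned to treatment and received it
n_z0_x1 <- 0        # assigned to control and received treatment (none)

# Proportion receiving treatment in each assignment group
p_x_z1 <- n_z1_x1 / n_z1
p_x_z0 <- n_z0_x1 / n_z0

# ITT difference and the IV (ratio) estimate built from it
iv_estimate <- function(risk_z1, risk_z0, p_x_z1, p_x_z0) {
  itt <- risk_z1 - risk_z0
  c(ITT = itt, IV = itt / (p_x_z1 - p_x_z0))
}

# HYPOTHETICAL risks per 100,000, for illustration only
res <- iv_estimate(risk_z1 = 400, risk_z0 = 600, p_x_z1, p_x_z0)
print(c(p_x_z1 = p_x_z1, p_x_z0 = p_x_z0))
print(res)
```

The IV estimate is the ITT difference scaled up by the difference in treatment receipt between the assignment groups, so it is always at least as large in magnitude as the ITT estimate.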

Don Rubin has a great overview talk For Objective Causal Inference, Design Trumps Analysis Don Rubin, posted at http://www.bristol.ac.uk/media-library/sites/cmm/migrated/documents/trumps.pdf
Starting pdf page 21 Rubin takes up noncompliance using the Vitamin A data (slightly different tabulated values than in the Greenland paper handout)
d. Recreate the calculations (ITT, As-treated, Per Protocol) shown on pdf p.23; refer to Booil Jo handout
e. also CACE estimate pdf p.24
The Bayesian estimates (Imbens and Rubin 1997) pdf page 25 onward are implemented in part in the experiment package (Imai) mentioned in class and class materials.

Solution for question 4

Question 5
From the Booil Jo presentation slides in lecture, consider the JHU PIRC Intervention Study: N=284
Estimate Intervention Effects With Noncompliance
The Johns Hopkins Public School Preventive Intervention Study was conducted by the Johns Hopkins University Preventive Intervention Research Center (JHU PIRC) in 1993-1994 (Ialongo et al., 1999). The study was designed to improve academic achievement and to reduce early behavioral problems of school children. Teachers and first-grade children were randomly assigned to intervention conditions. The control condition and the Family-School Partnership Intervention condition are compared in this example. In the intervention condition, parents were asked to implement 66 take-home activities related to literacy and mathematics over a six-month period. One of the major outcome measures in the JHU PIRC preventive trial was the TOCA-R (Teacher Observation of Classroom Adaptation).
• Completed at least 45 activities = compliers.
• Outcome: change score (baseline - followup) of anti-social behavior .
From the means and compliance data given in the class materials (also linked Booil talk) compute treatment effect estimate of change in anti-social behavior: give ITT estimate and CACE estimate

Solution for question 5

Question 6 Broken RCT: Compliance, measured or binary

Compliance as a measured variable. In Stat209 week 3 we examine compliance adjustments: both those based on a dichotomous compliance variable and the much more common measured compliance (often unwisely dichotomized to match the Rubin formulation). The Efron-Feldman study (handout description) used a continuous compliance measure. An artificial data set (a data frame containing Compliance, Group, and Outcome) is constructed for Stat209 so that the ITT for cholesterol reduction is about 20 (compliance .6) and the effect of cholestyramine for perfect compliance is about 35.
Try out some IV estimators for CACE. Obtain ITT estimate of group (treatment) effect with a confidence interval. Try using G as an instrument for the Y ~ comp regression. What does that produce?
Alternatively use the Rubin formulation with a dichotomous compliance indicator defined as TRUE for compliance > .8 in these data. What is your CACE estimate? What assumptions did you make? Compare with the ITT estimate. In this problem the ivreg function from the AER package is used for IV estimation.
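A minimal sketch of the ITT and a simple IV estimate on simulated data. This is not the class data frame: the variable names, the data-generating model, and the `dose` variable (compliance counted only in the treatment arm) are all assumptions for illustration. The `ivreg` call runs only if AER happens to be installed.

```r
set.seed(209)
n    <- 4000
G    <- rbinom(n, 1, 0.5)               # randomized assignment (1 = drug, 0 = control)
comp <- rbeta(n, 3, 2)                  # compliance proportion, mean 0.6 (assumed)
dose <- G * comp                        # hypothetical: drug actually taken
Y    <- 35 * dose + rnorm(n, sd = 10)   # cholesterol reduction (assumed model)

# ITT: difference in mean outcome by assigned group, with 95% CI
itt_fit <- lm(Y ~ G)
itt     <- unname(coef(itt_fit)["G"])
confint(itt_fit)["G", ]

# Simple IV estimate with G as the instrument for dose
iv <- cov(Y, G) / cov(dose, G)
c(ITT = itt, IV = iv)

# Same point estimate via AER::ivreg, if the package is available
if (requireNamespace("AER", quietly = TRUE)) {
  print(coef(AER::ivreg(Y ~ dose | G)))
}
```

In this setup the ITT should land near 35 * 0.6 = 21 and the IV estimate near 35, mirroring the "about 20" and "about 35" targets described for the class data set.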

Solution for Review Question 6

More Question 6. Compliance data, IV analysis, imitating the Efron-Feldman cholestyramine trial. The solution showed you the widely used ivreg function from the AER package. Redo the ivreg analyses using functions from the ivmodel package.

Solution for more Review Question 6

Week 4

Lecture slides, week 4 (pdf)
week 4, part a (pdf)
week 4, part b (pdf)
Audio companion, week 4
parta partb partc

1. Regression Adjustments (analysis of covariance) in Observational Studies

Lecture topics

      1. Regression coefficients: Technical facts and foibles:
a. adjusted variables and regression coefficients--values of coefficients depend crucially on what else is used in the regression fit;   conditioning vs controlling
MT woes of regression coefficients slides    Class Handout. Coleman data: adjusted-variables multiple regression (ascii version)      Coleman scanned pdf
Additional materials: data file, 20 schools      using pairs command       Adjusted variable plot         Standard regression diagnostic plots, Coleman regression
Added (adjusted) Variable plots in various R-packages, car,olsrr Coleman avPlots in car             nice vignette on regression diagnostics, including added variable plots, from the olsrr package.
slide for regression recursion
b. effects of errors in measurement on regression coefficients
Class handout: Measurement error: Basic Results handout      Also   Faraway book (linked below) Ch.4 single predictor case;    Maindonald-Braun sec6.7 results and R-functions, Stigler example
c. standardized regression coefficients.   Class handout: Standardized regression coefficients standardized variables
       Hooke's Law example in Statistical Models for Causation: A critical review
       see Lab 1 legacy Stat209 Multiple regression basics for data analysis demonstration for standardized variables and regression coefficients (and regression from correlation matrix)             (aside "beta weights" in Kool-Aid Psychology Scientific American, Jan 2010)

      2. Failures of ancova regression adjustments in observational studies.   Regression adjustments in quasiexperiments handout

Primary Readings
Weisberg, H. I. Statistical adjustments and uncontrolled studies. Psychological Bulletin, 1979, 86, 1149-1164.
Background piece: Correlation and Causation: A Comment, (Stanford access) Stephen Stigler Perspectives in Biology and Medicine, volume 48, number 1 supplement (winter 2005)
Freedman text Ch. 1 (esp. Yule on paupers, Snow on Cholera)    Chap 1 exs also in From Association to Causation: Some Remarks on the History of Statistics;

Regression Adjustments Research Examples:
     Breastfeeding
Do Breast-Fed Baby Boys Grow Into Better Students?   Publication: Breastfeeding Duration and Academic Achievement at 10 Years (Stanford access). Wendy H. Oddy, Jianghong Li, Andrew J. O. Whitehouse, Stephen R. Zubrick, Eva Malacova. Pediatrics; Vol 127, Numb 1, Jan 2011
        Ohio State breastfeeding study. Is breast truly best? Estimating the effects of breastfeeding on long-term child health and wellbeing in the United States using sibling comparisons Cynthia G. Colen, , David M. Ramey Social Science & Medicine Volume 109, May 2014, Pages 55-65.
Ohio State press release.   Breast-feeding Benefits Appear to be Overstated

     Sexy Media
Pediatrics 2006;117;1018-1027   Sexy Media Matter: Exposure to Sexual Content in Music, Movies, Television, and Magazines Predicts Black and White Adolescents' Sexual Behavior (Stanford access)
2008 uproar, RAND corp.
Sex-saturated TV shows making teens pregnant
Sex on TV linked to teen pregnancies: Watching lots of racy shows can affect adolescents over time
Publication: Does Watching Sex on Television Predict Teen Pregnancy? Findings from a National Longitudinal Survey of Youth Pediatrics, v. 122, no. 5, Nov. 2008, p. 1047-1054
The real truth on sex and rock-and-roll from Frank Zappa: Zappa on Crossfire 1987;      Zappa vs Tipper Gore on Nightline 1985 with Ted Koppel

     (go) Fish makes your babies smart, but fat (regression analysis says)   Music: Fishin' Blues
a. High fish consumption in pregnancy tied to brain benefits for kids   Publication: Maternal Consumption of Seafood in Pregnancy and Child Neuropsychological Development: A Longitudinal Study Based on a Population With High Consumption Levels. American Journal of Epidemiology (2016) Vol 183(3), 169-182.
b. Eating lots of fish in pregnancy linked to obesity risk for kids    Publication: Fish Intake in Pregnancy and Child Growth: A Pooled Analysis of 15 European and US Birth Cohorts. JAMA Pediatrics. Published online February 15, 2016. doi:10.1001/jamapediatrics.2015.4430
    There's more. From the publication:
     "To select the confounders for adjustment in multivariable models, we used a directed acyclic graph approach based on prior knowledge about parental and child covariates that may be related to child adiposity and/or fish intake in pregnancy. We constructed the graph using DAGitty version 2.1 (DAGitty) to identify minimally sufficient adjustment sets of covariates and chose the set on which we had the best available information"
   DAGitty resources. Drawing and Analyzing Causal DAGs with DAGitty   Main website: DAGitty -- drawing and analyzing causal diagrams (DAGs)

    Survival analysis example, Cox regression
a. Marriage and Cancer survival.   Music: Love and Marriage
   Marriage may help fight cancer     Marriage is good for cancer patients
Publication: Martinez, M. E., Anderson, K., Murphy, J. D., Hurley, S., Canchola, A. J., Keegan, T. H. M., Cheng, I., Clarke, C. A., Glaser, S. L. and Gomez, S. L. (2016),   Differences in marital status and mortality by race/ethnicity and nativity among California cancer patients.   Cancer. doi: 10.1002/cncr.29886
b. Greenery and Longevity               Music: Don't fence me in   and    Green Acres
Living Near Green Spaces Helps You Live Longer, New Study Shows     Why living around nature could make you live longer      Publication: Exposure to Greenness and Mortality in a Nationwide Prospective Cohort Study of Women Environ Health Perspect; DOI:10.1289/ehp.1510363      Also a Mediation analysis example (c.f.stat209 week 1).     Gelman on mediation
c. A multi-decade example: Experiments vs Observational studies, Hormone Replacement Therapy
   D.B. Petitti and D.A. Freedman. Invited commentary: How far can epidemiologists get with statistical adjustment? American Journal of Epidemiology vol. 162 (2005) pp. 415-18.       Freedman handout page

Additional Resources
Mosteller-Tukey, Chap 13 (Woes of regression coefficients)
Practical Regression and Anova using R Julian J. Faraway,   chapter 4. errors in predictors
MB 3rd ed Ch.6. esp 6.2.2 adjusted variables; 6.2 Interpreting regression coefficients; 6.7 errors in variables
Background info, errors in variables. Short primer on test reliability (Wm Trochim, Cornell)  Informal exposition in Shoe Shopping and the Reliability Coefficient    extensive technical material in Chap 7 Revelle text
       Source technical papers: Errors of Measurement in Statistics, W. G. Cochran , Technometrics, Vol. 10, No. 4. (Nov., 1968), pp. 637-666. JStor URL esp sections 8,9,11
Some Effects of Errors of Measurement on Multiple Correlation, W. G. Cochran Journal of the American Statistical Association Vol. 65, No. 329 (Mar., 1970), pp. 22-34 JStor URL esp sec 8 discussion.
An overview of latent variables in Ch 1 of Generalized Latent Variable Modeling Multilevel, Longitudinal, and Structural Equation Models Anders Skrondal and Sophia Rabe-Hesketh Chapman and Hall/CRC 2004
Covariance Adjustment in Randomized Experiments and Observational Studies Paul R. Rosenbaum Statistical Science, Vol. 17, No. 3. (Aug., 2002), pp. 286-304.   Jstor
Some Aspects of Analysis of Covariance, A Biometrics Invited Paper with Discussion. D. R. Cox; P. McCullagh Biometrics, Vol. 38, No. 3, (Sep., 1982), pp. 541-561.   Jstor
Analysis of Covariance: Its Nature and Uses William G. Cochran Biometrics, Vol. 13, No. 3, Special Issue on the Analysis of Covariance. (Sep., 1957), pp. 261-281. Jstor
The Use of Covariance in Observational Studies W. G. Cochran Applied Statistics, Vol. 18, No. 3. (1969), pp. 270-275. Jstor

2. Instrumental Variables Analyses in Observational Studies

Lecture topics
Intro IV (Disattenuation, omitted variables, "selection effects") and other IV applications for broken regression models
   IV basics and measurement error example    IV intro Stat266        Music: Wishin' and hopin'

Primary Reading
Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Joshua D. Angrist; Alan B. Krueger, The Journal of Economic Perspectives Vol. 15, No. 4 (Autumn, 2001), pp. 69-85

Instrumental Variables Research Examples
      Returns to schooling? (if any)
See Angrist and Krueger, primary reading
      Does Television Cause Autism?
and should instrumental variables (IV) provide the answer? Is Rain the magic IV?
A cautionary comment, including by Nobel-laureate Jim Heckman
Economists' Full paper: Does Television Cause Autism?
Now it's rainfall.    Autism Prevalence and Precipitation Rates in California, Oregon, and Washington Counties Michael Waldman; Sean Nicholson; Nodir Adilov; John Williams Arch Pediatr Adolesc Med. 2008;162(11):1026-1034.
      Kindergarten and Money
$320,000 Kindergarten Teachers    Paper: How does your kindergarten classroom affect your earnings? evidence from project STAR Raj Chetty, Harvard University and NBER John N. Friedman, Harvard University and NBER Nathaniel Hilger, Harvard University Emmanuel Saez, UC Berkeley and NBER Diane Whitmore Schanzenbach, Northwestern University and NBER Danny Yagan, Harvard University March 2011.
  Policy Brief, Kennedy School of Government Talks: How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project STAR Raj Chetty, Harvard   another version
     Before spotify
Other IV applications.   The Effect of File Sharing on Record Sales An Empirical Analysis

R Implementations and Resources.
ivreg in AER package AER: Applied Econometrics with R
   ivmodel package    vignette       Dylan Small ivpack
Two-stage Least Squares in R (tsls in sem package) by John Fox.     (older package: systemfit)

Week 4 Review Questions

Multiple Regression
Question 1. Yule's Data via Freedman (deep review regression)
Yule (1899), "An investigation into the causes of changes in pauperism in England, chiefly during the last two intercensal decades."
File yuledoc.dat contains the data in Table 3, p.10 of Freedman text (1871 to 1881 comp), also described in sec. 4 (p.6) of class reading "Association to Causation" and elsewhere: http://statweb.stanford.edu/~rag/stat209/yuledoc.dat
(Note: the file has some preamble text. I commented out the preamble so it will load without issue, but it is still good practice to look at a file before opening it.)
I scanned and posted p10-11 of Freedman's text for reference on the variables and fit http://statweb.stanford.edu/~rag/stat209/DAFp10.pdf
a. Replicate Yule's regression equation for the metropolitan unions, 1871--81 (parameters a b c d). Yule offered a regression equation, "DELTA"Paup = a + b * "DELTA"Out + c * "DELTA"Old + d * "DELTA"Pop + error.
In this equation, "DELTA" is percentage change over time, "Out" is the out-relief ratio N/D, N = number on welfare outside the poor-house, D = number inside, "Old" is the percentage of the population over 65, "Pop" is the population.
Data are from the English Censuses of 1871, 1881 (Subtract 100 from each entry to get the percentage changes cf Freedman text pp10-11.)
Arithmetic addendum. Example: a variable has value 50 in 1870 and 60 in 1880; that's an increase of 10 units, or 20% (the metric used in the regression equation).
The data in yuledoc.dat reside in a somewhat cryptic form: in this case the entry would be (60/50)*100 = 120. So we obtain the (desired) 20% entry by subtracting 100 from the value in yuledoc.dat.
Example 2: a value of 70 in 1870 and 56 in 1880; the yuledoc.dat entry would be (56/70)*100 = 80. Subtract 100 to get -20 (a 20% decline).
b. More complex regression review, which you may have done before. The original version of this problem, Freedman p.63, problem 4, asks you to test that the regression parameters for the change in population over 65 (c) and the change in population (d) are both 0 (i.e. null hypothesis c=d=0 in Yule's regression above).
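A sketch of the two mechanical steps in this question: converting ratio-style entries to percentage changes, and testing c = d = 0 by comparing nested fits. The data frame below is simulated, and its column names (`paup`, `out`, `old`, `pop`) and coefficients are hypothetical; they are not the contents of yuledoc.dat.

```r
# Entries in the file's convention, (later / earlier) * 100, to percentage changes
ratio_entry <- c(120, 80)
pct_change  <- ratio_entry - 100   # 20% increase, 20% decline

# Simulated stand-in for the percentage-change data (hypothetical values)
set.seed(1899)
n <- 32
d <- data.frame(out = rnorm(n, 0, 30),
                old = rnorm(n, 0, 10),
                pop = rnorm(n, 0, 15))
d$paup <- 10 + 0.7 * d$out + 0.1 * d$old - 0.3 * d$pop + rnorm(n, sd = 10)

full    <- lm(paup ~ out + old + pop, data = d)   # a + b*out + c*old + d*pop
reduced <- lm(paup ~ out, data = d)               # model under H0: c = d = 0

ftest <- anova(reduced, full)   # F test on 2 numerator degrees of freedom
ftest
```

With the real data, the same `anova(reduced, full)` comparison would apply after subtracting 100 from each entry.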

Solution for question 1

Question 2. Revisit Coleman data example from week 4 lecture.
part a. In the discussion of the Coleman data in Chap 13 Green book (Mosteller and Tukey) their Example 9 commentary suggests trying one school resource variable and one family demographic variable (instead of the bunch of redundant variables) in predicting vach.
See what happens with momed with simpler two predictor models: predictors tverb momed or ssal momed.
Are these regressions "better" than the full model? What you get from regression critically depends on what else is in the model ??

part b. As the Coleman data snippet used in the Green book (Mosteller and Tukey) is only 20 schools (with 5 predictors), for expository purposes I created a larger artificial data set, with 320 rows, for a population having the same means and covariances as the n=20 sample.
http://statweb.stanford.edu/~rag/stat209/coleman320.dat
Repeat the multiple regression and adjusted variables demonstration for momed.
Also, as was done in the class handout, plot outcome (or adjusted outcome) vs the adjusted predictor (residuals from momed on the other predictors), to obtain the scatterplot for the multiple regression weight (cf plot from week 1 materials).
Computation note, data generation for this example: to create the artificial data set with 320 rows I used the mvrnorm function in R, which requires the package MASS (part of the basic R distribution).
> library(MASS) # brings in the contents of the library
> ?mvrnorm # shows you the help file for this useful data generation function
Then, to obtain a sample of 320 drawn from a multivariate normal population with mean and covariance the same as the n=20 sample (that data frame is called "ed" here):
> eddat320 = mvrnorm(n = 320, colMeans(ed), cov(ed), tol = 1e-6, empirical = FALSE)
if empirical = TRUE then the sample moments would match exactly those in the "ed" dataset
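A short check of the `empirical = TRUE` behavior. The data frame `ed` here is a simulated stand-in with hypothetical columns, not the Coleman data.

```r
library(MASS)

# Simulated stand-in for the n = 20 data frame (hypothetical values)
set.seed(20)
ed <- data.frame(x1 = rnorm(20, 3, 1),
                 x2 = rnorm(20, 25, 5),
                 y  = rnorm(20, 35, 6))

# empirical = TRUE forces the sample mean vector and covariance matrix
# of the generated data to equal mu and Sigma (up to rounding)
big <- mvrnorm(n = 320, mu = colMeans(ed), Sigma = cov(ed), empirical = TRUE)

means_match <- isTRUE(all.equal(colMeans(big), colMeans(ed)))
covs_match  <- isTRUE(all.equal(cov(big), cov(ed)))
c(means_match, covs_match)
```

With `empirical = FALSE` the two comparisons would generally fail, since the 320 rows are then an ordinary random sample from the population.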

extra bit, Coleman data avPlots
I mentioned in class that John Fox's car package generates the adjusted variable plots we created for momed in the avPlot command. Try that out for the base Coleman data n=20.

Solution for question 2

Question 3.
part a. From Class handout, top frame of "Math facts" , prove the result in the 2-predictor case that the multiple regression parameter-- coefficient of X_1 -- is identical to the coefficient for Y regressed on the adjusted variable. Hint: use the formulas on the 'regression recursion' slide
part b. The "Regression Recursion" slide (useful trick) is worth revisiting. Linked in week 4 materials.
Take the second version (using vars labelled 1 2 3), and use from the Coleman data vach as var1, momed as var2, and ses as var3.
Demonstrate that this relation holds in the sample for the parameter estimates from the corresponding regressions.
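The identity in part a can be checked numerically before proving it. A minimal sketch on simulated data (variable names and coefficients are assumptions, not the Coleman variables):

```r
set.seed(13)
n  <- 200
x2 <- rnorm(n)
x1 <- 0.6 * x2 + rnorm(n)                 # x1 correlated with x2
y  <- 1 + 2 * x1 - 1 * x2 + rnorm(n)

# Coefficient of x1 in the two-predictor multiple regression
b_multiple <- unname(coef(lm(y ~ x1 + x2))["x1"])

# Adjusted variable: residuals of x1 after regressing on x2
x1_adj     <- resid(lm(x1 ~ x2))
b_adjusted <- unname(coef(lm(y ~ x1_adj))["x1_adj"])

c(multiple = b_multiple, adjusted = b_adjusted)
```

The two numbers agree to rounding error. Regressing the residuals of y on x2 against `x1_adj` gives the same slope, which is the version plotted in an adjusted variable plot.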

Solution for question 3

Question 4. Measurement error, single predictor linear regression
Construct a simple artificial data illustration of effects of measurement error in a single predictor variable on a regression slope. Result is shown in class handout week 4.
Set the reliability coefficient for the predictor variable to be .8. Set the slope for the perfectly measured predictor to be 1.5. Compare slope for perfectly measured predictor with the slope using the fallible predictor measurement.
A couple of ways of doing this exercise (your choice)
a. generate true predictor values, predictor error, and outcome variable and do the two regressions
b. use mvrnorm and generate (all at once) outcome, true predictor, fallible predictor, and do the two regressions
c. use the R-package DAAG function errorsINx (linked in week 4 materials)
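A sketch of option a, under assumed values for everything except the reliability (.8) and the true slope (1.5). The true-score variance, intercept, and residual standard deviation below are arbitrary choices.

```r
set.seed(4)
n        <- 10000
rel      <- 0.8                             # reliability of the fallible predictor
var_true <- 4                               # variance of the true predictor (assumed)
err_var  <- var_true * (1 - rel) / rel      # error variance giving reliability 0.8

xi <- rnorm(n, mean = 0, sd = sqrt(var_true))   # perfectly measured predictor
x  <- xi + rnorm(n, sd = sqrt(err_var))         # fallible measurement of xi
y  <- 2 + 1.5 * xi + rnorm(n, sd = 1)           # outcome depends on the true predictor

slope_true     <- unname(coef(lm(y ~ xi))["xi"])
slope_fallible <- unname(coef(lm(y ~ x))["x"])

c(true = slope_true, fallible = slope_fallible, expected = rel * 1.5)
```

The slope on the fallible predictor is attenuated toward reliability times the true slope, here .8 * 1.5 = 1.2.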

Solution for question 4

Question 5. Observational Studies: Regression Adjustments.
The display from lecture of the regression adjustments also has a numerical example (page 2 of pdf). Recreate the results shown for the Anderson et al Head Start example.
Also from the lecture materials, Regression Adjustments with Non-equivalent groups Week 4, show that the Belson adjustment procedure (using control group slope) is equivalent to evaluating the vertical distance between the within-group regression fits at the mean of the treatment group. Give a written-out proof.
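A numerical check of the Belson equivalence can accompany the proof. This sketch assumes the Belson estimate is the treatment-group mean outcome minus the control-group regression prediction averaged over the treatment cases; the data are simulated and all values are hypothetical.

```r
set.seed(1956)
nC <- 60; nT <- 50
xC <- rnorm(nC, 50, 10); yC <- 5  + 0.8 * xC + rnorm(nC, sd = 5)   # control group
xT <- rnorm(nT, 45, 10); yT <- 12 + 0.6 * xT + rnorm(nT, sd = 5)   # treatment group

fitC <- lm(yC ~ xC)   # within-group fit, control
fitT <- lm(yT ~ xT)   # within-group fit, treatment

# Belson-style adjustment: observed treatment mean minus the mean
# control-regression prediction for the treatment cases
belson <- mean(yT) - mean(predict(fitC, newdata = data.frame(xC = xT)))

# Vertical distance between the two fitted lines at the treatment-group mean of x
at_mean <- data.frame(xT = mean(xT), xC = mean(xT))
gap <- unname(predict(fitT, newdata = at_mean) - predict(fitC, newdata = at_mean))

c(belson = belson, gap = gap)
```

The agreement follows because a least-squares line passes through its own group's means, so the treatment fit evaluated at mean(xT) equals mean(yT).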

   Instrumental variables
Question 6. Mroz87 data analysis, "Panel Study of Income Dynamics" (PSID).
Extended Instrumental Variables data analysis examples in Lab 3 of legacy Stat209.
     Lab 3 Instrumental Variables.       Lab3, exposition and commands
     Lab 3, Rogosa R-session        Mroz87 data description     Lab3
note: read.table("http://statweb.stanford.edu/~rag/stat209/Mroz87.dat", header = T) reads in the 753 cases.

Question 7.
Recreate the IV artificial data demo from class handout week 4, the "measurement error example" using mvrnorm to get the artificial data (n=1000) to match exactly the specified parameters of the data generation.

Solution for question 7

Question 8. Simulation from Freedman.
Try to recreate the simulation described in Freedman text bottom p.191-top p.192 (note: page 199 revised version)
scan of pages 191-192 at
http://statweb.stanford.edu/~rag/stat209/dafp191.pdf

Solution for question 8

Week 5

Lecture slides, week 5 (pdf)
week 5, part a (pdf)
week 5, part b (pdf)
Audio companion, week 5
parta partb

1. Failures of Traditional Path Analysis (and Structural Equation Models) in Observational Studies: Multiple Regression with Pictures

Lecture topics
1. Traditional Path Analysis introduction and examples (incl Blau-Duncan from Freedman chap 5).   class handouts; basics and examples
        [time permitting a little on Structural equation models: introduction and examples.   old class handout]
2. Three-strikes against these causal models. Does path analysis identify causal effects? Demonstrations of failure for Holland's encouragement design, Rogosa longitudinal examples (Goldstein, simplex).        class handout      Encouragement design slides
3. Traditional Path Analysis (regression) models are NOT modern causal graphs or DAG (directed acyclic graph). Quick overview.

Primary Readings
1. From David Freedman    response schedules, path analysis examples and potential outcomes in Statistical Models for Causation: A critical review
Freedman text Chap 5 (Chap 6 in revised ver).
2. David Rogosa. Casual Models Do Not Support Scientific Conclusions: A Comment in Support of Freedman.
Journal of Educational Statistics, Vol. 12, No. 2. (Summer, 1987), pp. 185-195. Jstor link
3. Revisit Week 1--Paul Holland: Encouragement design results; sections 3-5 Causal Inference, Path Analysis, and Recursive Structural Equations Models Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484.

Path Analysis Research Examples
    Adolescent depression via path analysis
Depression in girls linked to higher use of social media (Guardian)      Social media linked to higher risk of depression in teen girls (Reuters).   Publication: Social Media Use and Adolescent Mental Health: Findings From the UK Millennium Cohort Study EClinicalMedicine published by The Lancet, 2019 has multiple regression and path analysis, wow.
    Video Game Violence and Aggression via path analysis
CNN: Violent video games linked to child aggression    publication: Longitudinal Effects of Violent Video Games on Aggression in Japan and the United States Craig A. Anderson, Akira Sakamoto, Douglas A. Gentile, Nobuko Ihori, Akiko Shibuya, Shintaro Yukawa, Mayumi Naito, and Kumiko Kobayashi Pediatrics 2008; 122: e1067-e1072.
More using latent growth curve methods (structural equation models) Do video games fuel mental health problems?     New Study Links Video Games and Mental Problems   Publication: Douglas A. Gentile, Hyekyung Choo, Albert Liau, Timothy Sim, Dongdong Li, Daniel Fung, and Angeline Khoo Pathological Video Game Use Among Youths: A Two-Year Longitudinal Study Pediatrics published online January 17, 2011 (10.1542/peds.2010-1353)
    Stress and illness? follow the path
Life events, fitness, hardiness, and health: A simultaneous analysis of proposed stress-resistance effects. Roth, David L.; Wiebe, Deborah J.; Fillingim, Roger B.; Shay, Kathleen A. Journal of Personality and Social Psychology. Vol 57(1), Jul 1989, 136- 142.
    TV Viewing and ADHD by LISREL
There Is No Meaningful Relationship Between Television Exposure and Symptoms of Attention-Deficit/Hyperactivity Disorder. Tara Stevens and Miriam Mulsow Pediatrics 2006;117;665-672 DOI: 10.1542/peds.2005-0863
    Social Stratification, Blau-Duncan
see D. Freedman, Statistical Models for Causation

R Implementations and Resources.
(See also Social Science and Psychometrics Task Views)
John Fox sem exposition   talk format   another talk   also Sec.5 Stats with R
Structural Equation Models package in R,   sem manual    OpenMx - Advanced Structural Equation Modeling   Using R for Structural Equation Model:
R-implementations: Graphical Models, Causal Diagrams. CRAN Task View: gRaphical Models in R .   Peter Buehlmann and pcalg package.

Additional Resources
   Path Analysis
Class Theme Song  Ballad of the casual modeler http://rogosateaching.com/stat209/ballad.mp3
Path analysis intros    Useful classnotes:     Notre Dame
Path Analysis: Sociological Examples. Otis Dudley Duncan The American Journal of Sociology, Vol. 72, No. 1. (Jul., 1966), pp. 1-16. Jstor link
D.A. Freedman, Comments on Standardizing Path Diagrams: What Are the Parameters?
A reconsideration by a wise psychologist: The Path Analysis Controversy: A new statistical approach to strong appraisal of verisimilitude Meehl, Paul E; Waller, Niels G Psychological Methods. Vol 7(3), Sep 2002, pp. 283-300. available from SU APA pubs
Path Analysis, special issue: Journal of Educational Statistics Publication Info Vol. 12, No. 2, Summer, 1987 Issue        As Others See Us: A Case Study in Path Analysis(pp. 101-128) D. A. Freedman
Original publication on the longitudinal path analysis:   Some Models for Analysing Longitudinal Data on Educational Attainment. Harvey Goldstein       Journal of the Royal Statistical Society. Series A (General), Vol. 142, No. 4. (1979), pp. 407-442. Jstor link
Technical details on Rogosa longitudinal examples:
     Rogosa, D. R. (1993). Individual unit models versus structural equations: Growth curve examples.
     In Statistical modeling and latent variables, K. Haagen, D. Bartholomew, and M. Deistler, Eds. Amsterdam: Elsevier North Holland, 259-281.
     Rogosa, D. R., & Willett, J. B. (1985). Satisfying a simplex structure is simpler than it should be.
     Journal of Educational Statistics, 10, 99-107. Jstor link
   Structural equation models
Structural equation modeling is a major industry in social and behavioral science with many texts (such as Principles and Practice of Structural Equation Modeling 2nd Edition Rex B. Kline; here's a long list), specialized courses, dedicated journals (Structural Equation Modeling: A Multidisciplinary Journal), and specialized computer programs (e.g., LISREL, EQS, AMOS).
Maximum likelihood factor analysis: A General Method for Analysis of Covariance Structures, K. G. Joreskog, Biometrika, Vol. 57, No. 2. (Aug., 1970), pp. 239-251.
Structural equation modeling from Scientific Software International home of LISREL Student editions, documentation, examples, etc
   Graphical Models, Causal Diagrams
Original Epi exposition. Greenland S., Pearl J., and Robins J.M. Causal diagrams for epidemiologic research. Epidemiology, 10(1):37-48, 1999.
Richardson and Robins attempts at unification. Single World Intervention Graphs: A Primer   Longer version
Graphical Markov Models: Overview   Nanny Wermuth and D.R. Cox
C. Shalizi. Advanced Data Analysis from an Elementary Point of View, 2017; Chapter 24 (except 24.2)

2. Interpreting Associations: Spurious Correlation and Simpson's Paradox

Lecture topics
Class handout: Third Variables (page 1)
1. Spurious Correlation: some historical notes; partial and part correlations. (class slides)
2. Simpson's paradox wiki page (dichotomous outcome slide)

Research Examples
   Correlation studies.
From Week 0 intro:    Secret to Winning a Nobel Prize? Eat More Chocolate (Time)   Publication: Chocolate Consumption, Cognitive Function, and Nobel Laureates Franz H. Messerli, M.D. N Engl J Med 2012; 367:1562-1564 October 18, 2012
New study finds sweary people are more honest    Publication: Frankly, We Do Give a Damn: The Relationship Between Profanity and Honesty, Social Psychological and Personality Science.
Size does matter. Bigger is smarter: Overall, not relative, brain size predicts intelligence. Publication: Deaner RO, Isler K, Burkart J, van Schaik C:   Overall Brain Size, and Not Encephalization Quotient, Best Predicts Cognitive Ability across Non-Human Primates.   Brain Behav Evol 2007;70:115-124 (DOI: 10.1159/000102973)
   Spurious Correlation
perennial favorite Spurious Correlation examples
Correlations Genuine and Spurious in Pearson and Yule, John Aldrich Statistical Science, Vol. 10, No. 4. (Nov., 1995), pp. 364-376. Jstor link
Spurious Correlation: A Causal Interpretation. Herbert A. Simon Journal of the American Statistical Association, Vol. 49, No. 267. (Sep., 1954), pp. 467-479. Jstor link
   Simpson's Paradox
Kidney stone example Confounding and Simpson's paradox, BMJ, vol309, 1480-1, 1994
UC Berkeley admissions, Racial bias in Death Penalty in wiki page.

R Implementations and Resources.
Spurious correlation?
R-Package ppcor October 29, 2012 Title Partial and Semi-partial (Part) correlation
Simpson's Paradox.
R-package Simpsons.   Frontiers in Psychology. 2013; 4: 513. Simpson's paradox in psychological science: a practical guide

Week 5 Review Questions

Path Analysis and Friends
Question 1. Freedman, Blau-Duncan example in class handout.
Freedman links "Stat Models for Causation" (pp3-4) or Freedman text Ch6 (revised)
Replicate class handout computations for the path analysis
Plus questions from Freedman text: scan of pp.80-81 (pp86-7 revised ed) at http://web.stanford.edu/~rag/stat209/DAFtextp8081.pdf (includes standardization material, Hooke's Law, on week 4 class handout).
Freedman pp80-1 (set A) prob 1 prob 5 prob 6 prob 8
pdf scan also includes Freedman Set E, p.97 prob 4(a,b) (p.103 revised)

Solution for question 1

Question 2. Causal Models of Publishing Productivity
Freedman p.101 prob 5 (page 107 in revised version)
This Homework problem considers one of the path analysis models from "Causal Models of Publishing Productivity in Psychology", Rogers & Maranto, J. Applied Psychology, 1989, 74(4), 636-649.
direct link to paper http://content.apa.org/journals/apl/74/4/636.pdf
The path analysis conducted by the authors from a sample of 86 men and 76 women is shown in p.101 of Freedman's text and on page 647 of the publication; that page also exists at http://www-stat.stanford.edu/~rag/stat209/pathpage647.pdf
You do have the correlation matrix from adding Table 7 fits and residuals. But here all the problem asks you to do is look at and consider the usefulness of this analysis. Note they don't display the disturbance paths so we don't get a look at Rsq values.
What are the predictors of Pubs (direct effects) in this picture?
What are the predictors of Cites (direct effects) in this picture?
The diagram provides estimates of supposed causal effects ("causal model of publishing" is the article title); it displays regression coefficients, with coefficient estimates shown on the edges.
Consider a "productive researcher" to be defined in terms of the number of publications and the number of cites. The good news is that ability "affects" pubs and cites with a positive coefficient in each case. Therefore, higher ability leads to a more "productive researcher", according to the causal path gospel. Some bad news is that sex is a predictor of pubs with a large coefficient value. However, it is likely that there are confounding variables between sex and pubs.

Solution for question 2

Question 3. Longitudinal path analysis (based on the Goldstein example)
Apply the path analysis model taken from Goldstein (1979) (in class handouts week 5; also Rogosa eq 2, 1988, "casual models...") to verify results for path coefficients in eq 3 of Rogosa (1988) (also in handouts).
Data are given in http://statweb.stanford.edu/~rag/stat209/casualdat using the top frame of 40 observations for variables (perfectly measured) Xi(1) Xi(3) Xi(5) and taking the times of observation to be 1 3 5 respectively.
These data are in wide form--each row is a subject.
You can verify, if you like, that each subject's data lie on a straight line (constant rate of change).
Try pairs on the three measurements to see the scatter plots over persons.
Obtain values for the path coefficients and the multiple correlations for the regression fits.
Can you obtain standard errors for the path coefficients for this small sample?
Any interpretations of the results from the path analysis?

Solution for question 3

Question 4. ENRICHMENT ITEM, Structural Equation Models, Method-of-moments for two-variable, two-indicator model
Problem 4 is an "enrichment" item, and you may want to look at the solution which is linked.
For latent variable models with multiple indicators: how do structural equation model (latent variable) methods provide a correction for measurement error?
Method-of-moments for two-variable, two-indicator model
For the Structural Equation Models handout from Joreskog book, which is linked in the week 5 lecture materials (class handout) but we did not take up in detail in class, obtain parameter estimates for the no-correlated error version (9 parameters, top covariance matrix) in terms of the sample variance and covariances among the four indicators (y_ij).
Brute force substitution will get you a non-optimal estimate, which suffices for instructional purposes.

Solution for question 4

Spurious Correlation
Question 5. Spurious correlation. Consider the spurious correlation (common cause type) discussed in class, week 5. Additional examples from class page links are in Simon (1954) or Aldrich (1995, sec 7 "illusory").
The data for this problem at http://rogosateaching.com/stat209/spuriousRQ.dat
Is the association between X and Y a consequence of common cause Z? Give a point estimate, corresponding scatterplot, and 95% confidence interval for the appropriate partial correlation. Does the partial correlation coefficient settle the causal question?
One real-life multi-million-dollar invocation version is:
Spurious correlations have been used by tobacco companies to argue that the association between smoking and lung cancer may actually be a result of some other factor such as a genetic factor that predisposes people both to nicotine addiction and lung cancer.
If this is true, then smoking cannot be blamed for causing cancer?
## Also if you like try out the ppcor package; I did in solutions, doesn't give you much
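The partial correlation and an approximate interval can be computed in base R. A sketch on simulated common-cause data, not spuriousRQ.dat: the variable names and effect sizes are assumptions, and the interval uses the Fisher z approximation with one conditioning variable.

```r
set.seed(5)
n <- 150
z <- rnorm(n)                 # common cause
x <- 0.8 * z + rnorm(n)
y <- 0.8 * z + rnorm(n)       # x and y related only through z

# Partial correlation of x and y given z: correlate the residuals
rx <- resid(lm(x ~ z))
ry <- resid(lm(y ~ z))
r_xy   <- cor(x, y)
r_xy_z <- cor(rx, ry)

# Fisher z interval; with one conditioning variable the SE is 1/sqrt(n - 3 - 1)
se <- 1 / sqrt(n - 3 - 1)
ci <- tanh(atanh(r_xy_z) + c(-1, 1) * qnorm(0.975) * se)

c(marginal = r_xy, partial = r_xy_z, lower = ci[1], upper = ci[2])
```

`plot(rx, ry)` gives the corresponding scatterplot of the two sets of residuals.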

Solution for question 5

Week 6

Lecture slides, week 6 (pdf)
week 6, part a (pdf)
week 6, part b (pdf)
Audio companion, week 6
parta partb

1. Multilevel Data and Contextual Effects in Observational Studies

Lecture topics
1. Background: nested data, ecological fallacy, aggregation bias, levels of analysis. levels of analysis handout
2. Traditional approaches to multilevel analysis: contextual effects, school effects.
  Class handouts:   Multilevel regressions     NELS example (NELS data schools in CMatching package; larger subset in influence.ME package)   Contextual effects (p.2)
3. Modern multilevel analyses: mixed effects models,   High School and Beyond (HSB) data. (2-level via lme4).
       Legacy Stat209 Lab 2. Multilevel analysis (mixed-effects models) High School and Beyond example.
        complete Bryk dataset     first pass, Bryk data:   session    plots
  Lab2, exposition and commands provides a full write up (annotated) of the analyses
   Lab2 (Rogosa session) using lme4, lmer (with additional plots)
         Lecture slide, lme lmer for Bryk data        side-by-side boxplots, SFYS analysis

Primary Readings
        Aggregation bias, Ecological fallacy.
D.A. Freedman. "Ecological inference and the ecological fallacy." International Encyclopedia for the Social and Behavioral Sciences. Elsevier (2001) vol. 6 pp. 4027-30. N. J. Smelser and Paul B. Baltes, eds. A one-page version: D.A. Freedman. "The ecological fallacy." In the Encyclopedia of Social Science Research Methods. Sage Publications (2004) Vol. 1 p. 293. M. Lewis-Beck, A. Bryman, and T. F. Liao, eds
        Current statistical analyses in social science: multilevel models.
Using R, lme, nlme.    John Fox lme tutorial (HSB data)       Fitting linear mixed models in R Using the lme4 package Douglas Bates (pp.27-30)
High School and Beyond (HSB) data. (2-level via lme4).    Collection of HSB data analyses from various text sources              Teaching document from Indiana, HSB from every statistical package

Multilevel Data Research Examples
The original: Ecological Correlations and the Behavior of Individuals W. S. Robinson American Sociological Review Vol. 15, No. 3, Jun., 1950 .
One of many followups: Some Alternatives to Ecological Correlation Leo A. Goodman American Journal of Sociology Vol. 64, No. 6, May, 1959
A good sociological/medical overview. Ecological effects in multi-level studies. Blakely TA, Woodward AJ. J Epidemiol Community Health. 2000 May;54(5):367-74. pubmed   full text
Klein, S. P. and Freedman, D. A. (1993), "Ecological regression in voting rights cases" Chance, 6, 38-43.

Additional Resources
Aggregation bias, Ecological fallacy.
D.A. Freedman. "The ecological fallacy." In the Encyclopedia of Social Science Research Methods. Sage Publications (2004) Vol. 1 p. 293. M. Lewis-Beck, A. Bryman, and T. F. Liao, eds
A Rule for Inferring Individual-Level Relationships from Aggregate Data, Glenn Firebaugh American Sociological Review Vol. 43, No. 4 (Aug., 1978), pp. 557-572   JStor URL
American Journal of Epidemiology Vol. 139, No. 8: 747-760 Invited Commentary: Ecologic Studies -- Biases, Misconceptions, and Counterexamples S Greenland, J Robins
The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology J. Michael Oakes Social Science and Medicine 58 (2004) 1929-1952
R-package eiPack: R x C Ecological Inference and Higher-Dimension Data Management. R News Oct 2007
Educational multilevel data.
The Analysis of Multilevel Data in Educational Research and Evaluation Leigh Burstein Review of Research in Education, Vol. 8. (1980), pp. 158-233. Jstor link
Methodological Advances in Analyzing the Effects of Schools and Classrooms on Student Learning, Stephen W. Raudenbush; Anthony S. Bryk Review of Research in Education, Vol. 15. (1988 - 1989), pp. 423-475. Jstor link
Analyzing Multilevel Data in the Presence of Heterogeneous within-Class Regressions Leigh Burstein; Robert L. Linn; Frank J. Capell    Journal of Educational Statistics, Vol. 3, No. 4. (Winter, 1978), pp. 347-383. Jstor link
examples from analyses of voting data.
Bias in ecological regression Stephen Ansolabehere and Douglas Rivers
David A. Freedman et al., "Ecological Regression and Voting Rights," Evaluation Review 1991, pp. 673-711,
D.A. Freedman, S.P. Klein, M. Ostland, and M.R. Roberts. "Review of 'A Solution to the Ecological Inference Problem.' " Journal of the American Statistical Association, vol. 93 (1998) pp. 1518-22; with discussion, vol. 94 (1999) pp. 352-57.
Multilevel models.
Using SAS PROC mixed:
Judith Singer HLM/PROC Mixed papers: Multilevel Modelling Newsletter ; JEBS1998 Using SAS PROC MIXED to Fit Multilevel Models, Jstor
HLM - Hierarchical Linear and Nonlinear Modeling (HLM): descriptions and student edition HLM6
Freedman, D. A. (census adjustments). Hierarchical Linear Regression
Using R: lme4 (lmer and nlme) and mlmRev.    John Fox lme tutorial
Doug Bates draft book (Feb 2010)     Doug Bates SASmixed package
Fitting linear mixed models in R Using the lme4 package Douglas Bates (pp.27-30)
London exam data example in Examples from Multilevel Software Comparative Reviews Douglas Bates
Regression diagnostics for lmer models. Package influence.ME
mlmRev data examples. Also, Tennessee's Student Teacher Achievement Ratio (STAR) from Creating an R data set from STAR Douglas Bates
STATA does it also
lmer for SAS PROC MIXED Users Douglas Bates Department of Statistics University of Wisconsin Madison

2. Reciprocal Causal Effects and non-recursive models in Observational Studies

Lecture topics
1. Cross-sectional Data: Simultaneous equations (2SLS, IV in butter, peer aspirations, ed and fertility, Freedman), nonrecursive models
     Simultaneous equations handouts     Duncan et al ascii
2. Reciprocal effects and non-recursive models in longitudinal data.
      Empirical research on reciprocal effects, including cross-lagged correlation. clc slides

Primary Readings
An (old) review of reciprocal effects. Rogosa, D. R. (1985). Analysis of reciprocal effects. In International Encyclopedia of Education, T. Husen and N. Postlethwaite, Eds. London: Pergamon Press, 4221-4225. (reprinted in Educational Research,Methodology & Measurement: An international handbook, J. P. Keeves Ed. Oxford: Pergamon Press, 1988.)

Reciprocal Effects Examples
Michelob ULTRA® Super Bowl LV Spot Online. Are You Happy Because You Win? Or Do You Win Because You're Happy?
      Screen time rots kids' minds.
Fox17 Nashville:     Increased screen time in young children associated with developmental delays
Publication: Association Between Screen Time and Children's Performance on a Developmental Screening Test JAMA Pediatr. Published online January 28, 2019. doi:10.1001/jamapediatrics.2018.5056
     Internet use and depression
Study links excessive internet use to depression   Publication: The Relationship between Excessive Internet Use and Depression: A Questionnaire-Based Study of 1,319 Young People and Adults. Catriona M. Morrison, Helen Gore Psychopathology 2010;43:121-126 . available from Lane e-journals
     Peer Influences
Peer Influences on Aspirations: A Reinterpretation Otis Dudley Duncan, Archibald O. Haller, Alejandro Portes American Journal of Sociology, Vol. 74, No. 2 (Sep., 1968), pp. 119-137   Jstor
     Education and Fertility
Rindfuss example (Freedman Chap 8; paper reprinted in Freedman text). Education and Fertility: Implications for the Roles Women Occupy Ronald R. Rindfuss; Larry Bumpass; Craig St. John American Sociological Review, Vol. 45, No. 3. (Jun., 1980), pp. 431-447.   from Jstor
     Longitudinal Data: original TV Violence and Aggression
Eron LD, Huesmann LR, Lefkowitz MM, Walder LO. Does television violence cause aggression? Am Psychol. 1972;27:253–63. PubMed
     Money Supply
Granger Causality. Nobel 2003. Complete Granger
Relationships--and the Lack Thereof--Between Economic Time Series, with Special Reference to Money and Interest Rates. David A. Pierce Journal of the American Statistical Association, Vol. 72, No. 357. (Mar., 1977), pp. 11-26. Jstor

Additional Resources
Reciprocal effects: Rogosa, D. R. (1980). A critique of cross-lagged correlation. Psychological Bulletin, 88, 245-258. APA site version
Structural Equation Modeling With the sem Package in R John Fox STRUCTURAL EQUATION MODELING, 13(3), 465-486     John Fox home page

Week 6 Review Questions

Question 1. Grouping and multilevel regressions
Illustrate relations among individual-level (ignoring groups), group-level, and relative standing regression results.
Part I groups formed on X
Create 200 individual level observations on X and Y having correlation around .65.
I started with x values 1:200 (simple integers) for convenience, but you can be fancier.
Do an individual level Y on X regression (i.e. "total", ignoring groups, which don't exist yet).
Group these 200 individuals into 10 groups of size 20 on the basis of the X-values (i.e. group 1 contains the individuals with the smallest 20 X-values, group 10 contains the individuals with the largest 20 X-values). So within-groups will be as homogeneous as possible on X, and between group differences on X will be largest.
Do a regression on group means (between groups regression). These may be classroom means, for example, and you may not have individual level data.
Get a relative standing measure: individual score minus group mean for each individual.
Do a relative standing regression
Now do the multiple regression analyses (class handouts; Burstein, de Leeuw & Kreft)
1. "context" Y on X and X-bar (X-bar is an attribute of each individual)
2. "Cronbach" (Kreft's term) Y on X minus X-bar and X-bar (predictors uncorrelated)
Demonstrate the coefficients match the basic relations shown in lecture
Part II groups formed independent of X (random)
Repeat the analyses of Part I using a different (as different as can be) mechanism for assigning individuals to groups. Form the 10 groups of size 20 at random, making the groups heterogeneous on X within group and similar between groups.
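One way to set up Parts I and II is sketched below in base R. This is only an illustration under assumptions: the error variance is chosen so the X, Y correlation is near .65, and the helper group_regressions and all variable names are hypothetical, not taken from the class handouts. It runs the total, between, and relative standing (within) regressions plus the "context" and "Cronbach" multiple regressions for any grouping.

```r
# Artificial data: x = 1:200, y built to correlate about .65 with x.
set.seed(1)
n <- 200
x <- as.numeric(1:n)
y <- x + rnorm(n, sd = sd(x) * sqrt(1 / 0.65^2 - 1))

# All five regressions for a given grouping (grp = group label per individual).
group_regressions <- function(x, y, grp) {
  xbar <- ave(x, grp)                     # group mean of X, repeated for each member
  ybar <- ave(y, grp)
  mx <- as.vector(tapply(x, grp, mean))   # one value per group
  my <- as.vector(tapply(y, grp, mean))
  xdev <- x - xbar                        # relative standing on X
  ydev <- y - ybar                        # relative standing on Y
  context  <- coef(lm(y ~ x + xbar))      # "context" model
  cronbach <- coef(lm(y ~ xdev + xbar))   # "Cronbach" model
  c(total         = unname(coef(lm(y ~ x))[2]),
    between       = unname(coef(lm(my ~ mx))[2]),
    within        = unname(coef(lm(ydev ~ xdev))[2]),
    context_x     = unname(context["x"]),
    context_xbar  = unname(context["xbar"]),
    cronbach_dev  = unname(cronbach["xdev"]),
    cronbach_xbar = unname(cronbach["xbar"]))
}

grp_on_x   <- rep(1:10, each = 20)        # Part I: x is already sorted, so groups are formed on X
grp_random <- sample(grp_on_x)            # Part II: random assignment to 10 groups of 20

part1 <- group_regressions(x, y, grp_on_x)
part2 <- group_regressions(x, y, grp_random)
round(rbind(part1, part2), 3)
```

With equal group sizes, the coefficient on X-bar in the context model should equal the between slope minus the within slope, and the Cronbach model should return the within and between slopes directly, under either grouping mechanism.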

Solution for question 1

Question 2. Contextual Effects Coefficient
Use the regression recursion relation from week 4 to show that the contextual effects coefficient defined in week 6 handouts is equal as stated in the handouts (and literature) to the between groups slope minus the within-pooled slope.

Solution for question 2

Question 3. Simplified version of HSB analysis
The ubiquitous analyses of the HSB data use a level 2 model, with meanses as a covariate in addition to the 'group treatment' indicator sector (P/C).
For intro instruction use of these multilevel methods for comparing 'effects' of Public vs Catholic, it would be cleaner just to do a 't-test' in the level 2 model-- i.e. the only predictor of level and gradient being sector.
Try out that simpler model and compare with the standard analysis. Note that the side-by-side boxplots are still relevant for this reduced model, as the boxplots only reflect the Level 1 specifications.

Solution for question 3

Question 4. Enrichment problem (better to spend time on HSB analyses etc)
Ecological fallacy: Is Radon good for you?
Treat this as an extended example of ecological bias.
At one time I went through the Greenland and Robins paper in class...
Solutions show you data generation procedures and illustrate the sometimes very large effects of aggregation bias. If the topic interests you, read through the G-R paper to see the point.
Consider the artificial data example described in Ex 3, p.750, Greenland and Robins, American Journal of Epidemiology Vol. 139, No. 8: 747-760, Ecologic Studies -- Biases, Misconceptions, and Counterexamples (article linked on class page, week 6 under additional resources)
Intro to their Example 3:
Suppose that our study data are limited to regional values of mean radon, mean smoking (in packs per day), and lung-cancer rates among males aged 70-74 years, for 41 regions indexed by r = 0, . . . , 40.
Follow their example setup and create your own artificial data example; produce the regression function and plot in their Figure 1 for the effect of radon levels on lung cancer rates.
From G&R, you are demonstrating the ecological fallacy because "the regressions yield an inverse association of radon and lung cancer, despite the fact that radon is a positive risk factor in the underlying model used to generate the data,"
"Even though the lung-cancer rates show the strong upward relation to smoking one would expect from model 1, and the ecologic correlation between radon and smoking is only 0.01, there is a significant negative ecologic association of radon with lung cancer rates."

Solution for question 4

Question 5. Simultaneous effects.
For the Duncan Haller Portes occupational aspiration example from class handout (cf Fox Soc Meth 1979 paper) replicate the 2SLS (IV) analysis of this non-recursive model from the class handout.
Extra item: Can you fit a model which adds a path from Friend's family SES to respondent's occupational aspiration?
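The mechanics of 2SLS for a two-equation non-recursive model can be sketched in base R as below. This uses simulated data, not the Duncan, Haller and Portes correlations; the variables y1, y2, z1, z2 and the coefficients b1, b2 are hypothetical stand-ins (think of y1 and y2 as respondent's and friend's aspiration, each with its own exogenous predictor).

```r
# Simulated non-recursive system (all names and values are assumptions):
#   y1 = b1*y2 + z1 + u1
#   y2 = b2*y1 + z2 + u2
set.seed(42)
n  <- 5000
z1 <- rnorm(n); z2 <- rnorm(n)      # exogenous variables
u1 <- rnorm(n); u2 <- rnorm(n)      # structural disturbances
b1 <- 0.4; b2 <- 0.3

# Reduced form: solve the two equations jointly for y1 and y2.
d  <- 1 - b1 * b2
y1 <- (z1 + u1 + b1 * (z2 + u2)) / d
y2 <- (z2 + u2 + b2 * (z1 + u1)) / d

# OLS on the first structural equation is biased because y2 is correlated with u1.
ols <- unname(coef(lm(y1 ~ y2 + z1))["y2"])

# 2SLS by hand: stage 1 regresses y2 on all exogenous variables,
# stage 2 replaces y2 by its fitted values (z2 serves as the instrument).
y2hat <- fitted(lm(y2 ~ z1 + z2))
tsls  <- unname(coef(lm(y1 ~ y2hat + z1))["y2hat"])

round(c(true = b1, ols = ols, tsls = tsls), 3)
```

The point estimate from the second stage is the 2SLS estimate, but the standard errors printed by that second lm fit are not the correct 2SLS standard errors, so a dedicated routine is preferable for inference.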

Solution for question 5

Week 7

Lecture slides, week 7
week 7 (pdf)
Audio companion, week 7
parta partb

1. Matching Methods for Observational Data: Part I

Lecture topics
0. Review: Matching for increased precision, Randomized block designs (see Review Questions)   package blockTools
1. Traditional matching methods: subclassification, pair matching. Case-control studies.
          handout for smoking ex, Cochran subclassification
2. Modern implementations of matching methods: the advent/onslaught of propensity score matching methodology for treatment-control comparisons
         propensity score intro      checking balance, aspirin ex

Primary Readings
Case-control studies:    Case-control overview from Encyclopedia of Public Health
Non-technical matching overviews:    Donald Rubin Nonrandomized Comparative Clinical Studies   another version,[Lane library from campus] Annals of Internal Medicine, 1997, 15 October 1997, Vol. 127. No. 8_Part_2
      Cochran's smoking, subclassification and Rubin's Breast Cancer example also discussed in Rubin "Design Trumps Analysis"    Rubin paper .   also set of slides
    Another Rubin overview of matching,    Matching Methods for Causal Inference Elizabeth Stuart Donald Rubin [does Lalonde example]
Joffe, Marshall M. and Paul R. Rosenbaum. 1999. "Invited Commentary: Propensity Scores." American Journal of Epidemiology 150(4):327-33.
Rosenbaum and Rubin, Reducing Bias in Observational Studies Using Subclassification on the Propensity Score, JASA 79[387], September 1984, 516-524. JStor [one of the original technical papers]

Matching Research Examples
   Case-Control, CSD
Carbonated Soft Drink Consumption and Risk of Esophageal Adenocarcinoma JNCI: Journal of the National Cancer Institute, Volume 98, Issue 1, 4 January 2006, Pages 72-75,
   Aspirin Pair Matching
Aspirin use and all-cause mortality among patients being evaluated for known or suspected coronary artery disease: A propensity analysis.   Gum PA, Thamilarasan M, Watanabe J, Blackstone EH, Lauer MS. JAMA. 2001 Sep 12;286(10):1187-94.
   SAT Coaching, Full Matching
Optmatch application paper: Hansen, Ben B. Full matching in an observational study of coaching for the SAT.(Scholastic Assessment Test) Journal of the American Statistical Association; 9/1/2004;
   Coronary Artery Disease
Rosenbaum and Rubin, Reducing Bias in Observational Studies Using Subclassification on the Propensity Score, JASA 79[387], September 1984, 516-524. JStor
   Breastfeeding and Propensity Scores
Breastfeeding May Not Lead to Smarter Preschoolers       Breastfeeding does NOT boost a baby's IQ: Nourishing infants the natural way only makes them less hyper      Breast-feeding study sheds light on benefits for babies
Publication: Breastfeeding, Cognitive and Noncognitive Development in Early Childhood: A Population Study. Lisa-Christine Girard, Orla Doyle, Richard E. Tremblay. PEDIATRICS Volume 139, number 4, April 2017.

Additional resources
Talks and tutorials
Strategies for Using Propensity Scores Well. A Workshop given by Thomas E. Love, Ph. D., Case Western Reserve University      Love workshop ASA
A broad review of matching and bias-reduction methods. Opiates for the Matches: Matching Methods for Causal Inference Jasjeet S. Sekhon. Annual Review of Political Science 2009
UNC, Chapel Hill Social Work: Introduction to Propensity Score Matching: A Review and Illustration     Propensity Score Matching: A New Device for Program Evaluation UNC, Chapel Hill Social Work 2004    flash version
An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies Peter C. Austin Multivariate Behav Res. 2011 May; 46(3): 399-424.
Methods to assess intended effects of drug treatment in observational studies are reviewed Journal of Clinical Epidemiology 57(2004)1223-1231 [an overview of many of past weeks topics]
Average causal effects from nonrandomized studies: A practical guide and simulated example. Schafer, Joseph L.; Kang, Joseph Psychological Methods, Vol 13(4), Dec 2008, 279-313.
A Primer for Applying Propensity-Score Matching Office of Strategic Planning and Development Effectiveness, Inter-American Development Bank
Tutorial in biostatistics: Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group Statist. Med. 17, 2265-2281 (1998)

R packages and examples:
1. Ben Hansen (local hero)   optmatch manual    R News Oct 2007        Hansen presentation: Flexible, Optimal Matching for Comparative Studies Using the optmatch package
Optmatch application paper: Full matching in an observational study of coaching for the SAT.(Scholastic Assessment Test) Journal of the American Statistical Association; 9/1/2004; Hansen, Ben B.
Additional exercises (checking balance) using the nuclearplants data (class handout ex) from Mark Fredrickson here
2. MatchIt: Nonparametric Preprocessing for Parametric Causal Inference Daniel Ho, Kosuke Imai, Gary King, Elizabeth Stuart [MatchIt provides a wrapper that can call optmatch or Sekhon's genetic matching]
JSS May 2011 exposition: MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. More R-fun from Gary King: WhatIf: Software for Evaluating Counterfactuals
Another application (including matchit): Attributing Effects to a Get-Out-The-Vote Campaign Using Full Matching and Randomization Inference Jake Bowers and Ben Hansen.    Data archive and computing resources for the New Haven get-out-the-vote
Also:
3. Multivariate and Propensity Score Matching Software for Causal Inference Jasjeet S. Sekhon

    Propensity etc Original Technical Publications [jstor links]
Rosenbaum, P. R. and D. B. Rubin, 1983, The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika 70[1], April 1983, 41-55. JStor
P. Rosenbaum, Chapters 2 and 3 (on exact inference for treatment effects) in Observational Studies, New York: Springer, 1995.
Dropping out of High School in the United States: An Observational Study Paul R. Rosenbaum Journal of Educational Statistics, Vol. 11, No. 3. (Autumn, 1986), pp. 207-224. Jstor
Paul R. Rosenbaum; Donald B. Rubin. "Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score" The American Statistician, Vol. 39, No. 1. (Feb., 1985), pp. 33-38   JStor   Danish downers example
D. Rubin, Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies, Statistical Science 5[4], November 1990, 472-480. JStor
Rubin, D. B., 1974, Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies, Journal of Educational Psychology, 66, 688-701.
Rubin, D. B., 1978, Bayesian Inference for Causal Effects: The Role of Randomization, Annals of Statistics 6[1], January 1978, 34-58. JStor

    Case-control studies
Case-control overview (shown in class) from Encyclopedia of Public Health
Breslow NE. Statistics in epidemiology: the case-control study. J Am Stat Assoc. 1996 Mar;91(433):14-28
Carbonated Soft Drink Consumption and Risk of Esophageal Adenocarcinoma JNCI: Journal of the National Cancer Institute, Volume 98, Issue 1, 4 January 2006, Pages 72-75,
Smoking and Lung Cancer in Chap 18 of HSAUR3 (Handbook of Statistical Analysis Using R). Also driving and backpain data in Chap 7 HSAUR2
Some R-packages and resources: SensitivityCaseControl: Sensitivity Analysis for Case-Control Studies; multipleNCC: Inverse Probability Weighting of Nested Case-Control Data;    Two-phase designs in epidemiology (Thomas Lumley) ;   Exact McNemar's Test and Matching Confidence Intervals

Weeks 7 and 8 Review Questions

Randomized Blocks, Experimental Designs
Question 1. Matching and Paired t-test example from lecture
(Stat 141 exam problem (circa 2005))
An experiment on treating depression by Imipramine, an antidepressant drug, employed a matched-pairs design. A total of 60 patients were paired on a combination of age, sex, and time of entry in study to form 30 matched pairs. That is, each pair consisted of patients who entered the study within a month of each other, were of the same sex and were similar in age. One member of each pair was randomly assigned to receive Imipramine and the other to receive a placebo. The outcome measure was the score on the Hamilton rating scale for depression (higher score = more severe depression) after 5 weeks of treatment.
The file http://web.stanford.edu/~rag/stat209/depressdata contains the outcome scores for each of the 30 pairs (Imipramine vs Placebo).
a. Carry out a statistical test of the equality of treatment outcomes. That is, test null hypothesis that Imipramine and Placebo produce equivalent outcomes versus a non-directional alternative. Use Type 1 error rate .05. State the result of the statistical test.
b. Pretend that an erstwhile graduate assistant lost all records of the matched pairs before the data analysis could be completed. Consequently, all the investigator has available is the 30 scores for the patients receiving Imipramine and the 30 scores for the patients receiving Placebo (but not the information on the matching). Carry out a statistical test of the hypothesis in part a using the available information. Is the result of the test the same? Explain why or why not.
c. Regard part (b) as a bad dream and return to the data set with full matching information. But now you are told that the differences between Hamilton scale scores shouldn't be regarded as having numerical value. Comparing two Hamilton scores only indicates relative standing, that is which of the two patients in the matched pair is showing greater symptoms of depression. Under that limitation of the data carry out an appropriate statistical test of the hypothesis in part (a). Explain why the result is the same or different from the result in part (a).
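The three analyses in parts a, b and c map onto three standard base-R calls. The sketch below uses simulated scores (the pair structure, effect size and variable names are assumptions, not the depressdata file) just to show the calls side by side.

```r
# Simulated matched-pairs data: 30 pairs, a shared pair-level component plus individual noise.
set.seed(141)
n_pairs     <- 30
pair_effect <- rnorm(n_pairs, mean = 20, sd = 8)             # what matching holds in common
placebo     <- pair_effect + rnorm(n_pairs, sd = 1.5)
imipramine  <- pair_effect - 2 + rnorm(n_pairs, sd = 1.5)    # built-in 2-point reduction

# (a) Paired t-test: uses the within-pair differences.
paired <- t.test(imipramine, placebo, paired = TRUE)

# (b) Matching information lost: two independent samples (Welch t-test by default).
unpaired <- t.test(imipramine, placebo)

# (c) Only the direction of each within-pair difference is used: sign test.
d <- imipramine - placebo
sign_test <- binom.test(sum(d < 0), sum(d != 0), p = 0.5)

round(c(paired = paired$p.value, unpaired = unpaired$p.value, sign = sign_test$p.value), 4)
```

When pairs are well matched, the paired analysis removes the between-pair variation from the error term, which is typically why (a) and (b) can disagree. The sign test discards the magnitudes of the differences and so usually has less power than (a).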

Solution for question 1

Question 2. Matching to increase precision: Factorial Randomized blocks designs
Example from lecture, Neter-Wasserman problem DENTAL PAIN.
An anesthesiologist made a comparative study of the effects of acupuncture and codeine on postoperative dental pain in male subjects. The four treatments were (1) placebo treatment--a sugar capsule and two inactive acupuncture points; (2) codeine treatment only--a codeine capsule and two inactive acupuncture points; (3) acupuncture only--a sugar capsule and two active acupuncture points; (4) both codeine and acupuncture. These 4 conditions have a 2x2 factorial structure.
Thirty-two subjects were grouped into 8 blocks of four according to an initial evaluation of their level of pain tolerance. The subjects in each block were then randomly assigned to the 4 treatments. Pain relief scores were obtained 2 hours after dental treatment. Data were collected on a double-blind basis.
Data in file: http://statweb.stanford.edu/~rag/stat209/dental.dat
c1 is pain relief score (higher means more pain relief); c2 is block; c3 is codeine; c4 is acupuncture--for c3 and c4, 1=no.
a. obtain cell means for the 2x2 factorial design
b. carry out the randomized blocks analysis of variance, factors are Block, main effects for Codeine Acup and interaction term Codeine*Acup
c. Give a measure for the relative efficiency of the blocking on pain tolerance--how much better, in terms of precision or number of subjects needed, is the analysis using blocking versus a 2x2 factorial design that ignores pain tolerance?
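A sketch of parts a through c in base R follows, using simulated data with the same layout (8 blocks by 4 treatments) rather than dental.dat; the effect sizes and variable names are assumptions. The relative efficiency line uses the textbook estimate that compares the error variance expected without blocking to the randomized-blocks error mean square; treat the exact form as an assumption to check against the course text.

```r
# Simulated stand-in for the dental pain data: 8 blocks, 2x2 factorial within each block.
set.seed(7)
nb <- 8    # number of blocks
r  <- 4    # treatments per block
dental <- expand.grid(codeine = c("no", "yes"), acup = c("no", "yes"),
                      block = factor(1:nb))
block_level <- rnorm(nb, sd = 0.5)                    # pain tolerance differences between blocks
dental$relief <- 1 + 0.5 * (dental$codeine == "yes") + 0.6 * (dental$acup == "yes") +
  block_level[as.integer(dental$block)] + rnorm(nrow(dental), sd = 0.15)

# (a) Cell means for the 2x2 factorial.
cell_means <- with(dental, tapply(relief, list(codeine = codeine, acup = acup), mean))

# (b) Randomized blocks ANOVA: block, two main effects, and the interaction.
tab <- summary(aov(relief ~ block + codeine * acup, data = dental))[[1]]

# (c) Estimated efficiency of blocking relative to a completely randomized design.
ms_block <- tab[1, "Mean Sq"]           # first row is the block term
ms_error <- tab[nrow(tab), "Mean Sq"]   # last row is the residual
rel_eff  <- ((nb - 1) * ms_block + nb * (r - 1) * ms_error) / ((nb * r - 1) * ms_error)

cell_means
tab
rel_eff
```

A value of rel_eff around k suggests that a design ignoring pain tolerance would need roughly k times as many subjects per treatment for the same precision. In the real file the factor codes are numeric (1 = no), so convert c2, c3 and c4 with factor() before calling aov.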

Solution for question 2

Matching and propensity score methods, Observational Studies
Question 3.
Recreate the matching demonstration for Ben Hansen's "gender equity" example (done in the week 7 class handout, posted not hard copy), an example of optimal full matching with only one matching variable. This is Example 2 in Hansen's talk, about p.48 in the linked pdf. Here's the data in cut-and-paste form:

> geneq
  Grant gender
1   5.7      W
2   4.0      W
3   3.4      W
4   3.1      W
5   5.5      M
6   5.3      M
7   4.9      M
8   4.9      M
9   3.9      M
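Before calling any matching software it can help to look at the distance matrix that the matching is built on. The base-R sketch below enters the data shown above and computes absolute differences in Grant between each woman and each man. The nearest-neighbor step at the end is only a simple greedy illustration (with replacement), not the optimal full match the question asks for.

```r
geneq <- data.frame(
  Grant  = c(5.7, 4.0, 3.4, 3.1, 5.5, 5.3, 4.9, 4.9, 3.9),
  gender = c("W", "W", "W", "W", "M", "M", "M", "M", "M")
)

w <- which(geneq$gender == "W")   # rows 1-4
m <- which(geneq$gender == "M")   # rows 5-9

# Absolute Grant differences: rows are women, columns are men (labelled by row number).
dist <- abs(outer(geneq$Grant[w], geneq$Grant[m], "-"))
dimnames(dist) <- list(W = w, M = m)
round(dist, 1)

# Row number of the closest man for each woman (a man may be reused).
nearest <- m[apply(dist, 1, which.min)]
nearest
```

Three of the four women share the same closest man, which is the kind of situation where full matching (matched sets of varying size) is more natural than forcing 1:1 pairs.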

Solution for question 3

Question 4. Multivariate matching
The example shown in lecture, from Anderson et al.
Example 6.5 Multivariate caliper matching: Consider a hypothetical study comparing two therapies effective in reducing blood pressure, where the investigators want to match on three variables: previously measured diastolic blood pressure (DBP), age, and sex. Such confounding variables can be divided into two types: categorical variables, such as sex, for which the investigators may insist on a perfect match (e = 0); and numerical variables, such as age and blood pressure, which require a specific value of the caliper tolerances. Let the blood pressure tolerance be specified as 5 mm Hg and the age tolerance as 5 years. The data contains measurements of these three confounding variables. (The subjects are grouped by sex to make it easier to follow the example.)
Data with columns DBP age sex and Grp (Treatment Group or Comparison Reservoir) http://statweb.stanford.edu/~rag/stat209/matchex.dat

Table 6.6 Hypothetical Measurements on Confounding Variables   
Treatment Group                               Comparison Reservoir
Subject Diastolic Blood                  Subject  Diastolic Blood
Number  Pressure (mm Hg) Age Sex         Number    Pressure (mm Hg)   Age Sex
1          94             39  F            1              80           35  F
2          108            56  F            2              120          37  F
3          100            50  F            3              85           50  F
4          92             42  F            4              90           41  F
5          65             45  M            5              90           47  F
6          90             37  M            6              90           56  F
                                           7              108          53  F
                                           8              94           46  F
                                           9              78           32  F
                                           10             105          50  F
                                           11             88           43  F
                                           12             100          42  M
                                           13             110          56  M
                                           14             100          46  M
                                           15             100          54  M
                                           16             110          48  M
                                           17             85           60  M
                                           18             90           35  M
                                           19             70           50  M
                                           20             90           49  M

a. show preexisting difference between comparison and treatment, no matching.
b. try to do a match by hand, finding a best match for each of the treatment subjects.
c. use the 3 confounding variables to compute a propensity score (for membership in the treatment); match subjects on the propensity scores (i.e. nearest comparison to each treatment subject) by hand, or use optmatch functions to do optimal matching either 1:1 or 1:2. See which provides better (less bad) balance in the covariates.
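As a starting point for parts a and b, the caliper rule from Example 6.5 can be applied mechanically in base R. The sketch below types in Table 6.6 (the data frame and variable names are my own choice) and lists, for each treatment subject, the comparison subjects with the same sex, DBP within 5 mm Hg, and age within 5 years. It does not resolve conflicts when two treatment subjects share a candidate, and it does not do the propensity score matching of part c.

```r
trt <- data.frame(
  dbp = c(94, 108, 100, 92, 65, 90),
  age = c(39, 56, 50, 42, 45, 37),
  sex = c("F", "F", "F", "F", "M", "M")
)
comp <- data.frame(
  dbp = c(80, 120, 85, 90, 90, 90, 108, 94, 78, 105, 88,
          100, 110, 100, 100, 110, 85, 90, 70, 90),
  age = c(35, 37, 50, 41, 47, 56, 53, 46, 32, 50, 43,
          42, 56, 46, 54, 48, 60, 35, 50, 49),
  sex = c(rep("F", 11), rep("M", 9))
)

# (a) Pre-existing differences, no matching.
colMeans(trt[, c("dbp", "age")]) - colMeans(comp[, c("dbp", "age")])

# (b) Caliper candidates: exact on sex, within 5 mm Hg on DBP, within 5 years on age.
candidates <- lapply(seq_len(nrow(trt)), function(i) {
  which(comp$sex == trt$sex[i] &
        abs(comp$dbp - trt$dbp[i]) <= 5 &
        abs(comp$age - trt$age[i]) <= 5)
})
names(candidates) <- paste0("T", seq_len(nrow(trt)))
candidates
```

Where a comparison subject appears in more than one candidate list, a by-hand match has to decide who gets that subject, which is the step that optimal matching software automates.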

Solution for question 4

Question 5. Extended Example: Propensity scores versus regression adjustment, single confounder
Artificial data construction
1. start with 10000 subjects-- outcome measure Y
2. subjects belong to groups (G=0,1) based on probabilistic assignment on a single unobserved variable X (normal, mean 10, variance 4): Prob(G = 1|X) = 1 - (1/(1 + 1/exp(-5 + .5*X)) )
3. Outcome measure Y also highly correlated with X. Y = 1.2*G + X + u (u is Normal, mean 0, variance 1.69); the treatment effect is built in as 1.2 (about half a sd)
4. Besides Y and G, the observable that is available is a version of X obscured by measurement error; let Z be a fallible version of X with reliability about .72 (i.e. correlation about .85).
a. compare group differences on Z (preexisting diffs)
b. try out regression adjustment estimate for treatment effect-- Use observable Z as covariate. Compare with using X as covariate.
c. use Z to compute propensity score for each of 10000 subjects. Stratify into quintiles on propensity (as in Rubin Arch Int Med from lecture), and compute a treatment/control comparison within each of the 5 propensity strata. Also get overall comparison from main effect in the 2x5 anova.
d. repeat part c using the unobservable X. Does X give better results?
e. which works better in the 1-dimensional case, propensity matching or regression adjustment?
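The data construction in steps 1 through 4 and the analyses in parts b and c can be sketched in base R as follows. The assignment probability is coded exactly as written in step 2; the measurement error variance is my derivation from the stated reliability (var(X)/var(Z) = .72), and the seed and object names are arbitrary.

```r
set.seed(2021)
n <- 10000
x <- rnorm(n, mean = 10, sd = 2)                  # unobserved confounder, variance 4
p <- 1 - (1 / (1 + 1 / exp(-5 + 0.5 * x)))        # Prob(G = 1 | X) as written in step 2
g <- rbinom(n, 1, p)
y <- 1.2 * g + x + rnorm(n, sd = 1.3)             # error variance 1.69, true effect 1.2
z <- x + rnorm(n, sd = sqrt(4 / 0.72 - 4))        # fallible X with reliability about .72

# (b) Regression adjustment with the observable Z, then with the unobservable X.
adj_z <- unname(coef(lm(y ~ g + z))["g"])
adj_x <- unname(coef(lm(y ~ g + x))["g"])

# (c) Propensity score from Z, quintile strata, within-stratum comparisons.
ps      <- fitted(glm(g ~ z, family = binomial))
stratum <- cut(ps, quantile(ps, 0:5 / 5), include.lowest = TRUE, labels = 1:5)
within  <- tapply(y[g == 1], stratum[g == 1], mean) -
           tapply(y[g == 0], stratum[g == 0], mean)
strat_z <- unname(coef(lm(y ~ g + stratum))["g"])   # main effect in the 2x5 layout

round(c(true = 1.2, adj_x = adj_x, adj_z = adj_z, strat_z = strat_z), 3)
round(within, 3)
```

Part d repeats the propensity steps with x in place of z in the glm call. Comparing adj_z with adj_x shows how much of the confounding the fallible covariate fails to remove.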

Solution for question 5

Question 6. Cochran subclassification for confounding variable
Week 7 class example, age as a confounder on effects of (cig) smoking.
Use lalonde data as play data to show a simple implementation of subclassification adjustment with re78 as outcome, treatment group comparison, and just consider age as the confounder.

> library(MatchIt)
> data(lalonde)
> head(lalonde)
     treat age educ black hispan married nodegree re74 re75       re78
NSW1     1  37   11     1      0       1        1    0    0  9930.0460
NSW2     1  22    9     0      1       0        1    0    0  3595.8940
NSW3     1  30   12     1      0       0        0    0    0 24909.4500
NSW4     1  27   11     1      0       0        1    0    0  7506.1460
NSW5     1  33    8     1      0       0        1    0    0   289.7899
NSW6     1  22    9     1      0       0        1    0    0  4056.4940
> attach(lalonde)
> table(treat) # 185 in job training
treat
  0   1 
429 185

> tapply(re78, treat, mean) # oh my, seems better off with no job training, can the Republicans be right?
       0        1 
6984.170 6349.144 

> tapply(age, treat, mean) # there is a mean age diff
       0        1 
28.03030 25.81622 
> tapply(age, treat, fivenum) # same medians,but some controls older
$`0`
[1] 16 19 25 35 55

$`1`
[1] 17 20 25 29 48

Solution for question 6

Week 8

Lecture slides, week 8
week 8 (pdf)
Audio companion, week 8
parta partb

Matching Methods for Observational Data: Part II

Lecture topics
Computational Examples of Matching Methods
1. Ben Hansen's Nuclear Plants data
         optmatch exs, nuclear plants, gender      ascii version for some Ben Hansen matching exs using MatchIt/optmatch
        Pair matching--nuclear plants data. 1:2 optimal pair matching using MatchIt and pairmatch in optmatch plus balance diagnostics.
2. Lalonde job training data
Lalonde NSW data. Subclassification/Stratification and Full matching.
      Lalonde data class handout
      Rogosa R-session (using R 3.3.3)        4/1/18 redo in R 3.4.4 (sparse)
      2019 lalonde Matchit: full matching, balance with cobalt love.plot and bal.tab
      2019 lalonde optmatch: fullmatch with outcome analysis
   Legacy Stat209 Lab 4, Lalonde Data, is arranged in pieces
a.   Lab4, exposition and commands
b.   Lab 4, Rogosa R-session, Base (sections 1-3)
c.   Lab 4, Rogosa R-session, additional matching exercises (incl secs 4-6)
d.   Lab 4, Rogosa R-session: not done until ancova is run
3. Alternative (non-matching) propensity score analyses. Propensity score weighting: Inverse Probability of Treatment Weighting (IPTW).
   twang package from RAND, tutorials and resources.    Also, an exposition using the Lalonde data  and    another exposition
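
As a rough illustration of the IPTW idea in topic 3, here is a minimal base-R sketch on simulated data. It is not the twang workflow and does not use the Lalonde data; the variable names (x, treat, y) and all parameter values are hypothetical, chosen only so that the true treatment effect is known to be 2.

```r
# Simulated data; every name and parameter value here is hypothetical.
set.seed(209)
n <- 5000
x <- rnorm(n)                              # a single confounder
treat <- rbinom(n, 1, plogis(-0.5 + x))    # treatment more likely when x is large
y <- 1 + 2 * treat + 3 * x + rnorm(n)      # true treatment effect is 2

# Naive comparison of group means is confounded by x
naive <- mean(y[treat == 1]) - mean(y[treat == 0])

# Step 1: estimate propensity scores by logistic regression
ps <- fitted(glm(treat ~ x, family = binomial))

# Step 2: ATE weights, 1/ps for treated and 1/(1 - ps) for controls
w <- ifelse(treat == 1, 1 / ps, 1 / (1 - ps))

# Step 3: difference of weighted outcome means
iptw <- weighted.mean(y[treat == 1], w[treat == 1]) -
        weighted.mean(y[treat == 0], w[treat == 0])

round(c(naive = naive, iptw = iptw), 2)
```

With a correctly specified propensity model the weighted difference should land near the true effect, while the naive difference is pulled upward by the confounder. In real data the propensity model is unknown, so balance on the weighted covariates should be checked before the outcome comparison is trusted.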

R Implementations and Resources
1. MatchIt provides a wrapper that can call optmatch or Sekhon's genetic matching
MatchIt: Nonparametric Preprocessing for Parametric Causal Inference Daniel Ho, Kosuke Imai, Gary King, Elizabeth Stuart
   MatchIt vignette
JSS May 2011 exposition: MatchIt: Nonparametric Preprocessing for Parametric Causal Inference

2. Ben Hansen (local hero)   optmatch manual    R News Oct 2007
        optmatch:fullmatch vignette     optmatch another version     another good tutorial optmatch Functions for Optimal Matching
  Hansen presentation: Flexible, Optimal Matching for Comparative Studies Using the optmatch package
Additional exercises (checking balance) using the nuclearplants data (class handout ex) from Mark Fredrickson here
Optmatch application paper: Full matching in an observational study of coaching for the SAT (Scholastic Assessment Test). Journal of the American Statistical Association; 9/1/2004; Hansen, Ben B.
Another optmatch example presentation: Attributing Effects to a Get-Out-The-Vote Campaign Using Full Matching and Randomization Inference Jake Bowers and Ben Hansen.    Data archive and computing resources for the New Haven get-out-the-vote

3. Cobalt:     Using cobalt with Other Preprocessing Packages     Covariate Balance Tables and Plots: A Guide to the cobalt Package

4. R Package PSAgraphics: Vignette JSS   PSAgraphics: An R Package to Support Propensity Score Analysis Journal of Statistical Software February 2009, Volume 29, Issue 6.

5. Matching package Multivariate and Propensity Score Matching Software for Causal Inference Jasjeet S. Sekhon

Week 9, Week 10

Time-1, Time-2 (Longitudinal) Data in Experimental Designs and Observational Studies

Lecture slides, weeks 9,10 (pdf)
Audio companion, weeks 9,10
parta partb

Lecture topics

1. Experimental Designs Cross-over designs (usually time-1, time-2).
Primary reading: Laird-Ware text slides (pdf pages 135-150).
Crossover design data from slide 137,    anova for crossover design ex       ascii version, anova for crossover design ex
     also see slides 5-14 Repeated Measures Design Mark Conaway
   R-resources for crossover designs. package Crossover    Crossover vignette     package crossdes   see Rnews Vol. 5/2, November 2005

2. Experimental Designs Comparing groups on time-1, time-2 measurements: repeated measures anova vs lmer OR the t-test
Primary reading: Comparative Analyses of Pretest-Posttest Research Designs, Donna R. Brogan; Michael H. Kutner, The American Statistician, Vol. 34, No. 4. (Nov., 1980), pp. 229-232.   JSTOR link
     urea synthesis, BK data       data, long-form
    BK plots (by group)     BK overview
    2017 Analysis handout     Extended BK lmer analysis
Additional stuff
     BK repeated measures analysis      pdf version
    Stat141 analysis
    archival example analyses. SAS and minitab

3. Observational studies   Economist's differences in differences (or diffs in diffs with matching) for observational studies.
  class slide
    A very popular subject these days. Pretty good Wiki page     LSE slides
    Austin Nichols slides. Causal inference with observational data A brief review of quasi-experimental methods July 2009
     Angrist Ch 5, MHE. Card and Krueger (1994) data, minimum wage ex
paper On the Use of Linear Fixed Effects Regression Models for Causal Inference(sec 3.2)
       R-package did
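
For the basic two-group, two-period case, the differences-in-differences estimate can be sketched in base R as follows. The data are simulated as independent repeated cross-sections (not the Card and Krueger data), and the column names and effect sizes are hypothetical. The sketch shows that the difference of differences in the four cell means equals the interaction coefficient in a saturated regression.

```r
# Simulated repeated cross-sections; names and effect sizes are hypothetical.
set.seed(1994)
n <- 200                                    # observations per group-by-period cell
d <- expand.grid(unit = 1:n, group = 0:1, post = 0:1)

# group effect 1, common time trend 0.5, treatment effect 2 (group 1, post period)
d$y <- 3 + 1 * d$group + 0.5 * d$post + 2 * d$group * d$post + rnorm(nrow(d))

# Difference in differences computed from the four cell means
m <- tapply(d$y, list(group = d$group, post = d$post), mean)
did_means <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])

# The same quantity as the interaction coefficient in a saturated regression
did_lm <- unname(coef(lm(y ~ group * post, data = d))["group:post"])

c(did_means = did_means, did_lm = did_lm)
```

The two numbers agree to rounding error. The causal reading rests on the parallel-trends assumption built into the simulation (the same 0.5 time trend in both groups); nothing in the arithmetic checks that assumption.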

4. Observational studies      Lord's paradox; pre-post group comparisons.
Lord notes
Primary readings:   Publication: Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.
      Wainer, H. (1991). Adjusting for differential base rates: Lord's Paradox again. Psychological Bulletin, 109, 147-151.
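
The flavor of Lord's paradox can be reproduced with a small simulation in base R. This is a sketch under hypothetical settings (group means 50 and 60, within-group pre-post correlation 0.5, no average change in either group), not Lord's original example data.

```r
# Simulated pre-post data; all names and parameter values are hypothetical.
set.seed(1967)
n <- 1000                                   # subjects per group
group <- rep(0:1, each = n)
mu <- 50 + 10 * group                       # group 1 is 10 units higher at both times

# Equal variances at both times, within-group pre-post correlation 0.5,
# and no average change in either group
pre  <- mu + rnorm(2 * n, sd = 5)
post <- mu + 0.5 * (pre - mu) + rnorm(2 * n, sd = 5 * sqrt(0.75))

# Analysis 1: compare mean gain scores between groups
gain_diff <- unname(coef(lm(I(post - pre) ~ group))["group"])

# Analysis 2: ANCOVA, comparing groups on post adjusted for pre
ancova_diff <- unname(coef(lm(post ~ pre + group))["group"])

round(c(gain_diff = gain_diff, ancova_diff = ancova_diff), 2)
```

The gain-score comparison finds essentially no group difference, while the covariance adjustment finds a difference near 5, which is 10 minus the within-group slope (0.5) times the 10-unit pretest gap. Both computations are correct; they answer different questions, which is the point of the paradox.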

5. Observational studies      Exogenous Variables and Correlates of Change (use of lagged dependent variables)
   Time-1,time-2 data analysis examples    Measurement of change: time-1,time-2 data
      data example for handout    scan of regression handout      ascii version of data analysis handout
   Extra material for Correlates and predictors of change: time-1,time-2 data
    Rogosa R-session to replicate handout, demonstrate wide-to-long data set conversion, and descriptive fitting of individual growth curves. Some useful plots from Rogosa R-session
        Technical results: Section 3.2.2 esp Equation 27 in Rogosa, D. R., & Willett, J. B. (1985). Understanding correlates of change by modeling individual differences in growth. Psychometrika, 50, 203-228.      Talk slides

   Observational studies: Additional Special topics (likely unaddressed).
Interrupted Time-series designs
      Gene Glass overview      Time Series Analysis with R section 4.6   R package BayesSingleSub: Computation of Bayes factors for interrupted time-series designs
  Current implementations of value-added analysis
   American Statistical Association Statement on Using Value-Added Models for Educational Assessment
J.R. Lockwood, Harold Doran, and Daniel F. McCaffrey. Using R for estimating longitudinal student achievement models. R News, 3(3):17-23, December 2003.

Time-1, Time-2 Research Examples
Pre-post design Experiments
a. Get moving    Can't Focus? 10 Minutes Of Exercise Gives Brain Burst Of Energy        Short-term exercise equals big-time brain boost.
   Publication: Executive-related oculomotor control is improved following a 10-min single-bout of aerobic exercise: Evidence from the antisaccade task Neuropsychologia Volume 108, 8 January 2018, Pages 73-81.
b. When Adolescents Give Up Pot, Their Cognition Quickly Improves
c. Stents?   A Controversial Experiment Upends The Conventional Wisdom On Heart Stents    Publication: Percutaneous coronary intervention in stable angina (ORBITA): a double-blind, randomised controlled trial The Lancet.
d. Mere Visual Perception of Other People's Disease Symptoms Facilitates a More Aggressive Immune Response Psychological Science, April 2010   Pre-post data and difference scores (see Table 1)
e. Guns and testosterone. Guns Up Testosterone, Male Aggression
Guns, Testosterone, and Aggression: An Experimental Test of a Mediational Hypothesis Klinesmith, Jennifer; Kasser, Tim; McAndrew, Francis T, Psychological Science. Vol 17(7), Jul 2006, pp. 568-571.
Crossover designs.
      a.  This time with 3 conditions   For Exercise, Nothing Like the Great Outdoors   Publication: Niedermeier M, Einwanger J, Hartl A, Kopp M (2017) Affective responses in mountain hiking-- randomized crossover trial focusing on differences between indoor and outdoor activity. PLoS ONE 12(5): e0177719. https://doi.org/10.1371/journal.pone.0177719
      b.   Does nutrition science know anything?     Is white or whole wheat bread 'healthier?' Depends on the person    Publication: Bread Affects Clinical Parameters and Induces Gut Microbiome-Associated Personal Glycemic Responses Cell Metabolism, Korem et al DOI: 10.1016/j.cmet.2017.05.002
      c. RCT (cross-over design). Damn right! The secret of success is swearing: How shouting four letter words can help make you stronger    Swearing can help you boost your physical performance    The full power of swearing is starting to be discovered
      d. One thing at a time. Why listening to a podcast while running could harm performance    Publication: A trade-off between cognitive and physical performance, with relative preservation of brain function Scientific Reports 7, Article number: 13709 (2017) nature.com.

Additional resources
1. Repeated measures analysis of variance
Models for Pretest-Posttest Data: Repeated Measures ANOVA Revisited Earl Jennings Journal of Educational Statistics, Vol. 13, No. 3. (Autumn, 1988), pp. 273-280. Jstor
A good R-primer on repeated measures (and lots else). Notes on the use of R for psychology experiments and questionnaires Jonathan Baron, Yuelin Li.   Another version
Multilevel package has behavioral sciences applications including estimates of within-group agreement, and routines using random group resampling (RGR) to detect group effects.
More repeated measures resources: Background primer on analysis of variance (with R); see sections 6.8, 6.9 of Notes on the use of R for psychology experiments and questionnaires Jonathan Baron, Yuelin Li.   Pdf version    The ez package provides extended anova capabilities.   Examples (blog notes) : Repeated measures ANOVA with R (functions and tutorials)   Repeated Measures ANOVA using R    Obtaining the same ANOVA results in R as in SPSS - the difficulties with Type II and Type III sums of squares

2. Lord's Paradox, pre-post group comparisons.
Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.
Wainer, H. (1991). Adjusting for differential base rates: Lord's Paradox again. Psychological Bulletin, 109, 147-151.
or Wainer and Brown Three Statistical Paradoxes in the Interpretation of Group Differences: Illustrated with Medical School Admission and Licensing Data
a quick low-level read: Lord's Paradox and the Assessment of Change During College    Journal of College Student Development, May/Jun 2004 by Pike, Gary R
Another time1-time2 reading covering old-fashioned ground including Lord's paradox. Maris, Eric. (1998). Covariance Adjustment Versus Gain Scores--Revisited. Psychological Methods, 3(3) 309-327. apa link

3. Value-added analysis.
Value-added does New York City. New York schools release 'value added' teacher rankings     Formula uncovers the 'value added'    from the unions: THIS IS NO WAY TO RATE A TEACHER
Chap 9 in Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies. Howard Wainer (Author) amazon page    available in paper and Kindle
Other versions of the Chap 9 materials Value-Added Models to Evaluate Teachers: A Cry For Help H Wainer, Chance, 2011.         Journal of Consumer Research Vol. 32, No. 2, Sept 2005
More Value-added analysis. Journal of Educational and Behavioral Statistics Vol. 29, No. 1, Spring, 2004 Value-Added Assessment Special Issue
Value-Added Measures of Education Performance: Clearing Away the Smoke and Mirrors, PACE
LA Times Teacher Ratings, summer 2010        NEPC vs LATimes
Fitting Value-Added Models in R Harold C. Doran and J.R. Lockwood
Andrew Gelman on Value-added arithmetic: It's no fun being graded on a curve     more NY Principals rebel against 'value-added' evaluation

4. Interrupted time-series
Interrupted Time Series Quasi-Experiments Gene V Glass Arizona State University
Did fertility go up after the Oklahoma City bombing? An analysis of births in metropolitan counties in Oklahoma, 1990-1999. Demography, 2005.
original publication (ozone data): Box, G. E. P. and G. C. Tiao. 1975. "Intervention Analysis with Applications to Economic and Environmental Problems." Journal of the American Statistical Association. 70:70-79. SAS example for ozone data     another ozone analysis with data
Box-Tiao time series models for impact assessment Evaluation Quarterly 1979
Interrupted time-series analysis and its application to behavioral data Donald P. Hartmann, John M. Gottman, Richard R. Jones, William Gardner, Alan E. Kazdin, and Russell S. Vaught J Appl Behav Anal. 1980 Winter; 13(4): 543-559.
Segmented regression analysis of interrupted time series studies in medication use research. By: Wagner, A. K.; Soumerai, S. B.; Zhang, F.; Ross-Degnan, D. Journal of Clinical Pharmacy & Therapeutics, Aug 2002, Vol. 27 Issue 4, p299-309.
Interrupted Time Series Designs In Health Technology Assessment: Lessons From Two Systematic Reviews Of Behavior Change Strategies Craig R. Ramsay University Of Aberdeen, International Journal Of Technology Assessment In Health Care, 19:4 (2003), 613-623.

5. Measurement of Change, Correlates of Change, Growth Curve Analysis.   See also Stat222 website
Rogosa, D. R., & Willett, J. B. (1985). Understanding correlates of change by modeling individual differences in growth. Psychometrika, 50, 203-228. available from John Willett's pub page
A growth curve approach to the measurement of change. Rogosa, David; Brandt, David; Zimowski, Michele Psychological Bulletin. 1982 Nov Vol 92(3) 726-748 APA record   direct link
Longitudinal Data Analysis Examples with Random Coefficient Models. David Rogosa; Hilary Saner . Journal of Educational and Behavioral Statistics, Vol. 20, No. 2, Special Issue: Hierarchical Linear Models: Problems and Prospects. (Summer, 1995), pp. 149-170. Jstor
Demonstrating the Reliability of the Difference Score in the Measurement of Change. David R. Rogosa; John B. Willett Journal of Educational Measurement, Vol. 20, No. 4. (Winter, 1983), pp. 335-343. Jstor