Course introduction (slides and audio posted on main page)

Background readings (not required, but of interest if you haven't seen these before)

1. Correlation and Causation: A Comment, Stephen Stigler

2. Secret to Winning a Nobel Prize? Eat More Chocolate (Time)

Publication: Chocolate Consumption, Cognitive Function, and Nobel Laureates Franz H. Messerli, M.D. N Engl J Med 2012; 367:1562-1564 October 18, 2012

3.

From Association to Causation: Some Remarks on the History of Statistics;

Statistical Models for Causation: A critical review

Statistical Models and Shoe Leather,

Illustration using encouragement design representation in Holland (1988). copies of selected overheads.

Encouragement Designs. Potential outcomes formulation and IV parameter estimation in Holland (1988). Estimation handout

Do regression methods (path analysis) identify causal effects? Demonstrations of failure for Holland's encouragement design. class handout Encouragement design slides

Paul Holland, Causal Effects and Encouragement Designs. Causal Inference, Path Analysis, and Recursive Structural Equations Models

Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484. (Encouragement design results; sections 3-5)

Holland Appendix (esp pp. 475-480) presents the potential outcomes formulation.

A special quasi-experimental design, the encouragement design, is used to give concreteness to the discussion by focusing on the simplest problem that involves both direct and indirect causation.

It is shown that Rubin's model extends easily to this situation and specifies conditions under which the parameters of path analysis and recursive structural equations models have causal interpretations.

Gelman-Hill text sec 10.5; Data Analysis Using Regression and Multilevel/Hierarchical Models

Publication: Feasibility and efficacy of sodium reduction in the Trials of Hypertension Prevention, phase I. Trials of Hypertension Prevention Collaborative Research Group. S K Kumanyika, P R Hebert, J A Cutler, V I Lasser, C P Sugars, L Steffen-Batey, A A Brewer, MI. Hypertension 1993;22:502-512. doi: 10.1161/01.HYP.22.4.502

Historical (Baron-Kenny) methods David Kenny web page

R-implementations: mediating variables data analysis example data file

Baron-Kenny method via Sobel function in the multilevel package.

More extensive implementation (incl BCa bootstrapping) function

power and sample size calculations in package

Vignette for

Mediation Analysis David P. MacKinnon, Amanda J. Fairchild, and Matthew S. Fritz Department of Psychology, Arizona State University, Tempe, Arizona 85287-1104; Annu. Rev. Psychol. 2007. 58:593-614

Brader T, Valentino NA, Suhay E (2008). "What Triggers Public Opposition to Immigration? Anxiety, Group Cues, and Immigration Threat." American Journal of Political Science, 52(4), 959-978. jstor link

Data in

The irisin bench-science mediation example is discussed at the beginning of Week 2 lecture for recap and because I couldn't find it at the time.

NYTimes: How Exercise May Help Keep Our Memory Sharp.

Publication: Exercise-linked FNDC5/irisin rescues synaptic plasticity and memory defects in Alzheimer's models

Stanford Medicine: Common opioids less effective for patients on SSRI antidepressants. Publication: Predicting inadequate postoperative pain management in depressed patients: A machine learning approach. Arjun Parthipan, Imon Banerjee, Keith Humphreys, Steven M. Asch, Catherine Curtin, Ian Carroll, Tina Hernandez-Boussard. Published: February 6, 2019. https://doi.org/10.1371/journal.pone.0210575

New Yorker. December 23, 2013. The Power of the Hoodie-Wearing C.E.O. Publication: The Red Sneakers Effect: Inferring Status and Competence from Signals of Nonconformity Author(s): Silvia Bellezza, Francesca Gino, and Anat Keinan Source:

Mediators and Moderators of Treatment Effects in Randomized Clinical Trials Helena Chmura Kraemer; G. Terence Wilson; Christopher G. Fairburn; W. Stewart Agras Arch Gen Psychiatry. 2002;59:877-883

additional technical papers. Causal Mediation Analysis Using R K. Imai, L. Keele, D. Tingley, and T. Yamamoto American Political Science Review Vol. 105, No. 4 November 2011 Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104.

Chapter 14: Mediation and Moderation Alyssa Blair

Mediation and Moderation Analyses with R - OSF presentation slides

Question 1. Mediating Variable Computations: Class example continued

The data set shown in class example ss423 is linked above and in the legacy directory http://web.stanford.edu/~rag/stat209/ss423

The variables are predictor (IV) 'belong', outcome 'depress', and (potential) mediating variable 'master'. The class example showed you the Baron-Kenny analysis using functions from the multilevel and MBESS packages.

Here just use basic 'lm' regression and the recipes from the class handout to recreate the point estimates, asymptotic standard errors, and significance tests for the mediating variable effect.

Compare your result with the class example posting.

Extra: also try out the more 'sophisticated' functions in the mediation package.
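For the lm-only route in Question 1, here is a minimal sketch on artificial data. It does not read the ss423 file: the sample size and path values are made up for illustration, and only the variable names (belong, master, depress) come from the assignment. It uses the first-order (Sobel) standard error, which may differ in detail from the recipe in the class handout.

```r
# Artificial stand-in data; the path values below are assumptions, not the ss423 data.
set.seed(423)
n <- 500
belong  <- rnorm(n)
master  <- 0.5 * belong + rnorm(n)                   # a path (assumed 0.5)
depress <- -0.4 * master - 0.2 * belong + rnorm(n)   # b path (assumed -0.4), direct effect (assumed -0.2)
dat <- data.frame(belong, master, depress)

# The three regressions
fit_total <- lm(depress ~ belong, data = dat)            # total effect c
fit_a     <- lm(master ~ belong, data = dat)             # path a
fit_b     <- lm(depress ~ master + belong, data = dat)   # path b and direct effect c'

a    <- unname(coef(fit_a)["belong"])
b    <- unname(coef(fit_b)["master"])
se_a <- summary(fit_a)$coefficients["belong", "Std. Error"]
se_b <- summary(fit_b)$coefficients["master", "Std. Error"]

# Mediating variable (indirect) effect with first-order Sobel standard error
indirect <- a * b
se_sobel <- sqrt(b^2 * se_a^2 + a^2 * se_b^2)
z        <- indirect / se_sobel
p_value  <- 2 * pnorm(-abs(z))

# With OLS on a common sample, total effect = direct effect + indirect effect
c_total  <- unname(coef(fit_total)["belong"])
c_direct <- unname(coef(fit_b)["belong"])

print(c(indirect = indirect, se = se_sobel, z = z, p = p_value))
```

Swapping in the class data only requires replacing `dat`; the decomposition check at the end is a quick way to confirm the three fits used the same rows.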

Question 2. Potential Outcomes, Encouragement Design Estimation and (Causal) Mediation

Task 1. Create a potential outcomes dataset following the first ALICE specification in the posted slides (week 3): beta = 3, rho = 3, tau = 1, delta = 3. (I did n = 400; larger would be better, so I redid it with n = 6400.)

Task 2. Use the artificial data to show the results for the mediation (indirect) effect: by hand, doing the 3 regressions; using the multilevel package (sobel); using the MBESS package; and using the causal mediation estimate (ACME) from the mediation package. Compare each with rho*beta.

Task 3. Estimate beta by the Wald estimator (assuming tau = 0) and estimate the mediation effect.

Question 3. Sesame Street: Encouragement Design research example

Sesame Street research setting and data description given pdf p.30 of Lecture 1 (also Gelman text).

For this exercise use

Use the encouragement design formulation to estimate the effect on child cognitive development (postnumb here) of watching more Sesame Street.

What assumption is necessary for the IV estimation in this design?

Obtain a point and interval estimate for the effect of viewing (use

From simple descriptives reproduce this instrumental variables estimate (Wald estimator).

The second approach (path analysis) analyzed by Holland requires what assumption?

Obtain the path analyses (regression) estimate for the effect on child cognitive development (postnumb here) of watching more Sesame Street.

Compare with the IV estimate (which employs different assumptions).
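To see how the Wald estimator comes out of simple descriptives, here is a sketch on simulated encouragement-design data. This is not the Sesame Street file: Z, X, Y, U and every coefficient below are hypothetical, chosen so that the true effect of viewing is 3 and encouragement affects the outcome only through viewing. The naive regression of Y on X is included as a stand-in for a regression-style estimate, not as Holland's exact path-analysis specification.

```r
# Simulated encouragement design; all names and values are illustrative assumptions.
set.seed(1)
n <- 5000
Z <- rbinom(n, 1, 0.5)                 # randomized encouragement
U <- rnorm(n)                          # unobserved common cause of viewing and outcome
X <- 1 + 0.8 * Z + U + rnorm(n)        # viewing; encouragement raises it by 0.8
Y <- 20 + 3 * X + 2 * U + rnorm(n)     # outcome; no direct path from Z (exclusion restriction)

# Wald estimator from group means: effect of Z on Y divided by effect of Z on X
wald <- (mean(Y[Z == 1]) - mean(Y[Z == 0])) /
        (mean(X[Z == 1]) - mean(X[Z == 0]))

# Same estimate written as a ratio of covariances
iv_cov <- cov(Y, Z) / cov(X, Z)

# Naive regression of outcome on viewing, biased here by U
ols <- unname(coef(lm(Y ~ X))["X"])

print(c(wald = wald, iv_cov = iv_cov, ols = ols))
```

In this setup the two IV forms agree to rounding error, while the naive slope drifts away from 3 because U moves both X and Y.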

Moderating Variables in experimental studies (heterogeneous treatment effects)

0. Moderation, mediation recap slide

1. Review: formulation and purposes of analysis of covariance

basic (old) ancova exposition slides ancova and extensions, math notes

High School and Beyond (observational study) school means data example HSB ancova handout (ascii version) data for HSB ancova HSB ancova, scanned pdf

2. Moderating variables, Heterogeneous Treatment Effects (CATE).

Analyzing treatment effects as a function of covariate(s)

CNRL, including Johnson-Neyman technique cnrl data cnrl analysis (extended)

Rogosa, D. R. (1980). Comparing nonparallel regression lines.

R resources (below).

Aspirin may be less effective heart treatment for women than men

Publication: Aspirin Resistance in Patients with Stable Coronary Artery Disease, in the

Wash Post: Why smart people are better off with fewer friends.

Publication: Country roads, take me home... to my friends: How intelligence, population density, and friendship affect modern happiness.

Snow R.E. (1978) Aptitude-Treatment Interactions in Educational Research. In: Pervin L.A., Lewis M. (eds) Perspectives in Interactional Psychology. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-3997-7_10

Why Rich Parents Don't Matter UTexas press release: Being Poor Can Suppress Children's Genetic Potentials Publication: Emergence of a Gene x Socioeconomic Status Interaction on Infant Mental Ability Between 10 Months and 2 years DOI: 10.1177/0956797610392926 Psychological Science published online 17 December 2010 Elliot M. Tucker-Drob, Mijke Rhemtulla, K. Paige Harden, Eric Turkheimer and David Fask

package

package

Improving Present Practices in the Visual Display of Interactions Advances in Methods and Practices in Psychological Science

Covariance Adjustment in Randomized Experiments and Observational Studies Paul R. Rosenbaum

Some Aspects of Analysis of Covariance, A Biometrics Invited Paper with Discussion. D. R. Cox; P. McCullagh

Analysis of Covariance: Its Nature and Uses William G. Cochran

The Use of Covariance in Observational Studies W. G. Cochran

Estimation of the Slope and Analysis of Covariance when the Concomitant Variable is Measured with Error James S. Degracie; Wayne A. Fuller

Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods: A handbook for research on interactions. Irvington

Regions of Significant Criterion Differences in Aptitude-Treatment-Interaction Research Leonard S. Cahen; Robert L. Linn

Identifying Regions of Significance in Aptitude-by-Treatment-Interaction Research Ronald C. Serlin; Joel R. Levin

Defining Johnson-Neyman Regions of Significance in the Three-Covariate ANCOVA Using Mathematica Steve Hunka; Jacqueline Leighton

discussion of substantive issues: Trait-Treatment Interaction and Learning David C. Berliner; Leonard S. Cahen

Question 1. Background: standard analysis of covariance (no moderating variable).

A researcher is studying the effect of an incentive on the retention of subject matter and is also interested in the role of time devoted to study.

Subjects are randomly assigned to two groups, one receiving (C3 = 1) and the other not receiving (C3 = 0) an incentive. Within these groups, subjects are randomly assigned to 5, 10, 15, or 20 minutes of study (C2) of a passage specifically prepared for the experiment. At the end of the study period, a test of retention is administered.

Treat the study time as a covariate for investigating the differential effects of the incentive. Does using the covariate improve precision in estimating the effect of incentive?

Does the ancova assumption of a constant treatment effect at levels of StudyMin appear reasonable?

full data are in file retention.dat formerly located at http://statweb.stanford.edu/~rag/stat209/retention.dat

Linked materials resolve to rag.su.domains seamlessly, but reading data files into R requires using the new file location.

update: statweb file locations will

Question 2. Revisit High School and Beyond ancova from Week 2 lecture

In the class example we used school level (mean, gradient) outcomes and used school mean ses as a covariate. Investigate the usefulness of that covariate by comparing the ancova in class example with just a simple t-test (sector) on these school level outcomes. What is the difference in precision between using the covariate or not? As this is not an RCT (revisit in Unit 2), also look at differences in the estimate of the sector effect (bias?).

Question 3. Comparing Regressions (demonstration data, not an RCT)

Let's give recognition to the guys who made S (and R) and take some data from Venables, W. N. and Ripley, B. D. (1999) Modern Applied Statistics with S-PLUS. Third Edition. Springer (now up to 4th edition). Chap 6 section 1 considers analysis of the data set whiteside (available as part of MASS subset of VR package) to access

> library(MASS) # do need to load the library; MASS ships with the standard R distribution
> data(whiteside)
> ?whiteside

Description

Mr Derek Whiteside of the UK Building Research Station recorded the weekly gas consumption and average external temperature at his own house in south-east England for two heating seasons, one of 26 weeks before, and one of 30 weeks after cavity-wall insulation was installed. The object of the exercise was to assess the effect of the insulation on gas consumption.

Format: The whiteside data frame has 56 rows and 3 columns:

Insul A factor, before or after insulation.

Temp Purportedly the average outside temperature in degrees Celsius. (These values are far too low for any 56-week period in the 1960s in South-East England. It might be the weekly average of daily minima.)

Gas The weekly gas consumption in 1000s of cubic feet.

Source. A data set collected in the 1960s by Mr Derek Whiteside of the UK Building Research Station. Reported by Hand, D. J., Daly, F., McConway, K., Lunn, D. and Ostrowski, E. eds (1993) A Handbook of Small Data Sets. Chapman & Hall, p. 69.

carry out a comparing regressions analysis with Insul as the group variable, Gas as outcome, and Temp as within-group predictor.

construct a 95% confidence interval for the effect of Insul on Gas at Temp = 4 (pick-a-point procedure)

for what values of temp does there appear to be an effect of Insul on Gas (simultaneous region of significance)
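One way to set up the pick-a-point step in base R is to recenter Temp at the chosen point, so that the Insul coefficient is the group difference at Temp = 4. This is a sketch under the usual separate-lines model with a common residual variance; it gives a pointwise interval only and does not address the simultaneous region of significance asked for in the last part.

```r
library(MASS)      # whiteside data set
data(whiteside)

# Separate regression lines for Before and After insulation
fit <- lm(Gas ~ Insul * Temp, data = whiteside)

# Recenter Temp at 4: the InsulAfter coefficient is now the
# After-minus-Before difference in expected Gas at Temp = 4
fit4 <- lm(Gas ~ Insul * I(Temp - 4), data = whiteside)
est  <- coef(fit4)[["InsulAfter"]]
ci   <- confint(fit4, "InsulAfter", level = 0.95)

print(est)
print(ci)
```

The same point estimate can be recovered from the uncentered fit as the Insul coefficient plus 4 times the interaction coefficient; the recentering just gets the standard error for free.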

Question 4. R packages

In lecture there was short mention of these two R-packages whose main functions carry out the pick-a-point and Johnson-Neyman calculations, which are developed in Rogosa (1980).

Try out these functions using the cnrl dataset (also from Rogosa,1980) which we worked out in the lecture materials.

Solutions spoiler alert: no joy from these packages.

parta partb

1. Compliance background: Intent-to-treat analyses, CACE estimators, research examples

2. Compliance and Dose-response data analysis (Efron-Feldman)

3. Rubin-Holland approach via Booil Jo presentation: Potential Outcomes Approach: A Brief Introduction

Class handouts: Compliance examples Compliance overview Compliance math notes Little-Rubin Ann Rev Pub Health formulation

Potential outcomes formulation (CACE): Causal Effects in Clinical and Epidemiological Studies Via Potential Outcomes: Concepts and Analytical Approaches. Roderick J. Little and Donald B. Rubin. Annual Review of Public Health, Vol. 21: 121-145, May 2000.

Epidemiology exposition: An introduction to instrumental variables for epidemiologists, Sander Greenland,

Influence of adherence to treatment and response of cholesterol on mortality in the coronary drug project.

Compliance as an Explanatory Variable in Clinical Trials. B. Efron; D. Feldman

Joshua D. Angrist; Guido W. Imbens; Donald B. Rubin "Identification of Causal Effects Using Instrumental Variables"

David Freedman on Compliance Adjustments: Statistical Models for Causation: What Inferential Leverage Do They Provide? Evaluation Review 2006; 30: 691-713. On regression adjustments to experimental data Advances in Applied Mathematics vol. 40 (2008) pp. 180-93.

Intent-to-treat Analysis of Randomized Clinical Trials Michael P. LaValley Boston University ACR/ARHP Annual Scientific Meeting Orlando 10/27/2003

Intention to treat--who should use ITT? J. A. Lewis and D. Machin Br J Cancer. 1993 October; 68(4): 647-650.

Compliance analyses, R-implementations: Imai

What is meant by intention to treat analysis? Survey of published randomised controlled trials Sally Hollis and Fiona Campbell

Booil Jo, Dept of Psychiatry Estimation of Intervention Effects with Noncompliance

Compliance Publications based on Neyman-Rubin causal models:

Direct and Indirect Causal Effects via Potential Outcomes Donald B. Rubin

Imbens GW and Rubin DB (1997) Bayesian Inference for Causal Effects in Randomized Experiments with Noncompliance The Annals of Statistics, 25, 305-327.

Principal Stratification in Causal Inference Constantine E. Frangakis and Donald B. Rubin,

Addressing Complications of Intention-to-Treat Analysis in the Combined Presence of All-or-None Treatment-Noncompliance and Subsequent Missing Outcomes. Constantine E. Frangakis; Donald B. Rubin

Additional Case Studies

Principal Stratification Approach to Broken Randomized Experiments: A Case Study of School Choice Vouchers in New York City Barnard, Frangakis, Hill, and Rubin

The British Journal of Psychiatry (2003) 183: 323-331 Estimating psychological treatment effects from a randomised controlled trial with both non-compliance and loss to follow-up Graham Dunn and Mohammad Maracy

Non-random assignment on the basis of the covariate, such as regression discontinuity designs.

Regression Discontinuity handout Example from rdd manual ascii version

Rubin, D. B., (1977), "Assignment to a Treatment Group on the Basis of a Covariate",

Thistlethwaite, D., and D. Campbell (1960): "Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment," Journal of Educational Psychology, 51, 309-317.

In Rosenbaum,

Angrist-Lavy Maimonides (class size) data Angrist and Lavy, 1999. read data

R-package--rdd; Regression Discontinuity Estimation Author Drew Dimmery

Also Package

RJournal for rdrobust, rdrobust: An R Package for Robust Nonparametric Inference in Regression-Discontinuity Designs

Journal of Econometrics (special issue) Volume 142, Issue 2, February 2008, The regression discontinuity design: Theory and applications Regression discontinuity designs: A guide to practice, Guido W. Imbens, Thomas Lemieux

Also from Journal of Econometrics (special issue) Volume 142, Issue 2, February 2008, The regression discontinuity design: Theory and applications Waiting for Life to Arrive: A history of the regression-discontinuity design in Psychology, Statistics and Economics, Thomas D Cook

the original paper: Thistlethwaite, D., and D. Campbell (1960): "Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment," Journal of Educational Psychology, 51, 309-317.

Trochim W.M. & Cappelleri J.C. (1992). "Cutoff assignment strategies for enhancing randomized clinical trials." Controlled Clinical Trials, 13, 190-212. pubmed link

Capitalizing on Nonrandom Assignment to Treatments: A Regression-Discontinuity Evaluation of a Crime-Control Program Richard A. Berk; David Rauma

Berk, R.A. & de Leeuw, J. (1999). "An evaluation of California's inmate classification system using a generalized regression discontinuity design."

another econometric treatment

Question 1. Regression Discontinuity, classic "Sharp" design.

Replicate the package rdd toy example: cutpoint = 0, sharp design, with treatment effect of 3 units (instead of 10). Try out the analysis of covariance (Rubin 1977) estimate and compare with rdd output and plot. Pick off the observations used in the Half-BW estimate and verify using t-test or wilcoxon.

Extra: try out also the

Question 2. Systematic Assignment, "fuzzy design". Probabilistic assignment on the basis of the covariate.

i. Create artificial data with the following specification. 10,000 observations; premeasure (Y_uc in my session) gaussian mean 10 variance 1. Effect of intervention (rho) if in the treatment group is 2 (or close to 2) and uncorrelated with Y_uc. Probability of being in the treatment group depends on Y_uc but is not a deterministic step-function ("sharp design"):

ii. Try out analysis of covariance with Y_uc as covariate. Obtain a confidence interval for the effect of the treatment.

iii. Try out the fancy econometric estimators (using finite support) as in the rdd package. See if you find that they work poorly in this very basic fuzzy design example.

Extra: try out also the
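Steps i and ii of Question 2 could be sketched as follows. The source does not give the assignment rule here, so the logistic rule below is a hypothetical placeholder; the outcome noise and the small spread in rho are also my assumptions. Only n = 10,000, the N(10, 1) premeasure, and the effect of about 2 come from the question.

```r
set.seed(209)
n    <- 10000
Y_uc <- rnorm(n, mean = 10, sd = 1)        # premeasure: gaussian, mean 10, variance 1

# Hypothetical fuzzy rule: treatment probability rises smoothly with Y_uc
p_treat <- plogis(2 * (Y_uc - 10))
G <- rbinom(n, 1, p_treat)                 # 1 = treatment group

# Effect close to 2 and generated independently of Y_uc (spread is an assumption)
rho <- rnorm(n, mean = 2, sd = 0.1)

# Outcome = premeasure + effect if treated + noise (noise sd is an assumption)
Y <- Y_uc + rho * G + rnorm(n, sd = 0.5)

# Analysis of covariance with the premeasure as covariate
fit <- lm(Y ~ G + Y_uc)
est <- coef(fit)[["G"]]
ci  <- confint(fit, "G", level = 0.95)

print(est)
print(ci)
```

Because assignment depends only on the covariate that is in the model, the ancova estimate should land near 2 in this setup; that gives a baseline for judging the estimators in step iii.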

Question 3. Controlled Assignment (class example)

From Rubin, D. B., (1977), "Assignment to a Treatment Group on the Basis of a Covariate", linked on course page

From page 16 Rubin

7. A SIMPLE EXAMPLE

Table I presents the raw data from an evaluation of a computer-aided program designed to teach mathematics to children in fourth grade. There were 25 children in Program 1 (the computer-aided program) and 47 children in Program 2 (the regular program). All children took a Pretest and Posttest, each test consisting of 20 problems, a child's score being the number of problems correctly solved. These data will be used to illustrate the estimation methods discussed in Sections 4, 5, and 6. We do not attempt a complete statistical analysis nor do we question the assumption of no interference between units.

TABLE I
Raw Data for 25 Program 1 Children and 47 Program 2 Children

Pretest   Posttest Scores    Posttest Scores
Score     Program 1          Program 2
10        15                 6,7
9         16                 7,11,12
8         12                 5,6,9,12
7         8,11,12            6,6,6,6,7,8
6         9,10,11,13,20      5,5,6,6,6,6,6,6,6,8,8,8,9,10
5         5,6,7,16           3,5,5,6,6,7,8
4         5,6,6,12           4,4,4,5,7,11
3         4,7,8,9,12         0,5,7
2         4                  4
1         -                  -
0         -                  7

Does assignment appear to be random, or does this appear to be Assignment on the Basis of Pretest?

Try to estimate the assignment rule, presuming it is based on pretest. How does this differ from a regression discontinuity design (simplest version)?

Assuming that assignment to Program 1 or Program 2 was solely on the basis of pretest (plus perhaps a probabilistic component) estimate the effect of program (new vs regular).

note data in table 1 exist in a more convenient form in file hw5rubin.dat http://statweb.stanford.edu/~rag/stat209/hw5rubin.dat and data file included in the solutions

Question 4 Non-compliance. Class example week 3.

Adapted from (linked on class page): An introduction to instrumental variables for epidemiologists, Sander Greenland, International Journal of Epidemiology 2000;29:722-729

Additional Reference: Sommer and Zeger (1991). On Estimating Efficacy from Clinical Trials. Statistics in Medicine

Greenland discusses randomized trials with non-compliance where Z indicates treatment assignment, which is randomized; X indicates treatment received, which is affected but not fully determined by assignment Z.

To illustrate, Greenland presents in his Table 1 individual one-year mortality data from a cluster-randomized trial of vitamin A supplementation in childhood. Of 450 villages, 229 were assigned to a treatment in which village children received two oral doses of vitamin A; children in the 221 control villages were assigned none. This protocol resulted in 12,094 children assigned to the treatment (Z = 1) and 11,588 assigned to the control (Z = 0). Only children assigned to treatment received the treatment; that is, no one had Z = 0 and X = 1. Unfortunately, 2419 (20%) of those assigned to the treatment did not receive the treatment (had Z = 1 and X = 0), resulting in only 9675 receiving treatment (X = 1). Class handout has depiction and Greenland's table of results. Use as the outcome measure Y the Deaths per 100,000 within one year (labeled Risk in Greenland's Table 1).

Part 1, using data summary from class handout

a. Give the ITT (intent-to-treat) estimate of the effect of vitamin A on Risk

b. What is the compliance rate in the treatment group (Z=1)? In the control group (Z=0)?

c. What is the instrumental variables estimate (following Angrist Imbens Rubin) of the effect of vitamin A on Risk?

What interpretation is given to this estimate (c.f. Booil Jo presentation)? Compare with part (a) result and comment.

Don Rubin has a great overview talk For Objective Causal Inference, Design Trumps Analysis Don Rubin, posted at http://www.bristol.ac.uk/media-library/sites/cmm/migrated/documents/trumps.pdf

Starting pdf page 21 Rubin takes up noncompliance using the Vitamin A data (slightly different tabulated values than in the Greenland paper handout)

d. Recreate the calculations (ITT, As-treated, Per Protocol) shown on pdf p.23; refer to Booil Jo handout

e. also CACE estimate pdf p.24

The Bayesian estimates (Imbens and Rubin 1997) pdf page 25 onward are implemented in part in the
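For parts a through c, the pieces fit together as below. This is a sketch of the standard ratio form under the usual instrumental variables assumptions (randomized Z, no effect of assignment except through receipt, no one with Z = 0 and X = 1); the risks themselves are left symbolic because they come from Greenland's Table 1 in the class handout, not from this page. Only the counts 9675 and 12,094 are taken from the text above.

```latex
\begin{aligned}
\widehat{\mathrm{ITT}} &= \bar{Y}_{Z=1} - \bar{Y}_{Z=0} \\[4pt]
\hat{p}_c &= \widehat{\Pr}(X=1 \mid Z=1) - \widehat{\Pr}(X=1 \mid Z=0)
           = \frac{9675}{12094} - 0 \approx 0.80 \\[4pt]
\widehat{\mathrm{CACE}} &= \frac{\widehat{\mathrm{ITT}}}{\hat{p}_c}
\end{aligned}
```

Since the compliance proportion is below 1, the IV estimate is larger in magnitude than the ITT estimate by a factor of about 1/0.80 = 1.25.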

Question 5

From the Booil Jo presentation slides in lecture, consider the JHU PIRC Intervention Study: N=284

Estimate Intervention Effects With Noncompliance

The Johns Hopkins Public School Preventive Intervention Study was conducted by the Johns Hopkins University Preventive Intervention Research Center (JHU PIRC) in 1993-1994 (Ialongo et al., 1999). The study was designed to improve academic achievement and to reduce early behavioral problems of school children. Teachers and first-grade children were randomly assigned to intervention conditions. The control condition and the Family-School Partnership Intervention condition are compared in this example. In the intervention condition, parents were asked to implement 66 take-home activities related to literacy and mathematics over a six-month period. One of the major outcome measures in the JHU PIRC preventive trial was the TOCA-R (Teacher Observation of Classroom Adaptation)

• Completed at least 45 activities = compliers.

• Outcome: change score (baseline - followup) of anti-social behavior.

From the means and compliance data given in the class materials (also linked Booil talk) compute treatment effect estimate of change in anti-social behavior: give ITT estimate and CACE estimate

Question 6 Broken RCT: Compliance, measured or binary

Compliance as a measured variable. In Stat209 week 3 we examine compliance adjustments: both those based on a dichotomous compliance variable and the much, much more common measured compliance (often unwisely dichotomized to match the Rubin formulation). The Efron-Feldman study (handout description) used a continuous compliance measure. An artificial data set (a data frame containing Compliance, Group, and Outcome) for Stat209 is constructed so that ITT for cholesterol reduction is about 20 (compliance .6) and the effect of cholestyramine for perfect compliance is about 35.

Try out some IV estimators for CACE. Obtain ITT estimate of group (treatment) effect with a confidence interval. Try using G as an instrument for the Y ~ comp regression. What does that produce?

Alternatively use the Rubin formulation with a dichotomous compliance indicator defined as TRUE for compliance > .8 in these data. What is your CACE estimate? What assumptions did you make? Compare with ITT estimate. In this problem the

parta partb partc

MT woes of regression coefficients slides Class Handout. Coleman data: adjusted-variables multiple regression (ascii version) Coleman scanned pdf

Additional materials: data file, 20 schools using

Added (adjusted) Variable plots in various R-packages,

slide for regression recursion

Class handout: Measurement error: Basic Results handout Also Faraway book (linked below) Ch.4 single predictor case; Maindonald-Braun sec6.7 results and R-functions, Stigler example

Hooke's Law example in Statistical Models for Causation: A critical review

see Lab 1 legacy Stat209 Multiple regression basics for data analysis demonstration for standardized variables and regression coefficients (and regression from correlation matrix) (aside "beta weights" in Kool-Aid Psychology Scientific American, Jan 2010)

Weisberg, H. I. Statistical adjustments and uncontrolled studies. Psychological Bulletin, 1979, 86, 1149-1164.

Background piece: Correlation and Causation: A Comment, (Stanford access) Stephen Stigler

Freedman text Ch. 1 (esp. Yule on paupers, Snow on Cholera) Chap 1 exs also in From Association to Causation: Some Remarks on the History of Statistics;

Do Breast-Fed Baby Boys Grow Into Better Students? Publication: Breastfeeding Duration and Academic Achievement at 10 Years (Stanford access). Wendy H. Oddy, Jianghong Li, Andrew J. O. Whitehouse, Stephen R. Zubrick, Eva Malacova.

Ohio State breastfeeding study. Is breast truly best? Estimating the effects of breastfeeding on long-term child health and wellbeing in the United States using sibling comparisons Cynthia G. Colen, David M. Ramey Social Science & Medicine Volume 109, May 2014, Pages 55-65.

Ohio State press release. Breast-feeding Benefits Appear to be Overstated

Pediatrics 2006;117;1018-1027 Sexy Media Matter: Exposure to Sexual Content in Music, Movies, Television, and Magazines Predicts Black and White Adolescents' Sexual Behavior (Stanford access)

2008 uproar, RAND corp.

Sex on TV linked to teen pregnancies: Watching lots of racy shows can affect adolescents over time

Publication: Does Watching Sex on Television Predict Teen Pregnancy? Findings from a National Longitudinal Survey of Youth

The real truth on sex and rock-and-roll from Frank Zappa: Zappa on Crossfire 1987; Zappa vs Tipper Gore on Nightline 1985 with Ted Koppel

a. High fish consumption in pregnancy tied to brain benefits for kids Publication: Maternal Consumption of Seafood in Pregnancy and Child Neuropsychological Development: A Longitudinal Study Based on a Population With High Consumption Levels. American Journal of Epidemiology (2016) Vol 183(3), 169-182.

b. Eating lots of fish in pregnancy linked to obesity risk for kids Publication: Fish Intake in Pregnancy and Child Growth: A Pooled Analysis of 15 European and US Birth Cohorts. JAMA Pediatrics. Published online February 15, 2016. doi:10.1001/jamapediatrics.2015.4430

There's more. From the publication:

"

DAGitty resources. Drawing and Analyzing Causal DAGs with DAGitty Main website: DAGitty -- drawing and analyzing causal diagrams (DAGs)

a. Marriage and Cancer survival. Music: Love and Marriage

Marriage may help fight cancer Marriage is good for cancer patients

Publication: Martinez, M. E., Anderson, K., Murphy, J. D., Hurley, S., Canchola, A. J., Keegan, T. H. M., Cheng, I., Clarke, C. A., Glaser, S. L. and Gomez, S. L. (2016), Differences in marital status and mortality by race/ethnicity and nativity among California cancer patients.

b. Greenery and Longevity Music: Don't fence me in and Green Acres

Living Near Green Spaces Helps You Live Longer, New Study Shows Why living around nature could make you live longer Publication: Exposure to Greenness and Mortality in a Nationwide Prospective Cohort Study of Women Environ Health Perspect; DOI:10.1289/ehp.1510363 Also a Mediation analysis example (c.f.stat209 week 1). Gelman on mediation

c. A multi-decade example: Experiments vs Observational studies, Hormone Replacement Therapy

D.B. Petitti and D.A. Freedman. Invited commentary: How far can epidemiologists get with statistical adjustment? American Journal of Epidemiology vol. 162 (2005) pp. 415-18. Freedman handout page

Mosteller-Tukey, Chap 13 (

Practical Regression and Anova using R Julian J. Faraway, chapter 4. errors in predictors

MB 3rd ed Ch.6. esp 6.2.2 adjusted variables; 6.2 Interpreting regression coefficients; 6.7 errors in variables

Background info, errors in variables. Short primer on test reliability (Wm Trochim, Cornell) Informal exposition in

Some Effects of Errors of Measurement on Multiple Correlation, W. G. Cochran

An overview of

Covariance Adjustment in Randomized Experiments and Observational Studies Paul R. Rosenbaum

Some Aspects of Analysis of Covariance, A Biometrics Invited Paper with Discussion. D. R. Cox; P. McCullagh

Analysis of Covariance: Its Nature and Uses William G. Cochran

The Use of Covariance in Observational Studies W. G. Cochran

Intro IV (Disattenuation, omitted variables, "selection effects") and other IV applications for broken regression models

IV basics and measurement error example IV intro Stat266 Music: Wishin' and hopin'

Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Joshua D. Angrist; Alan B. Krueger,

See Angrist and Krueger, primary reading

and should instrumental variables (IV) provide the answer? Is Rain the magic IV?

A cautionary comment, including by Nobel-laureate Jim Heckman

Economists' Full paper: Does Television Cause Autism?

Now it's rainfall. Autism Prevalence and Precipitation Rates in California, Oregon, and Washington Counties Michael Waldman; Sean Nicholson; Nodir Adilov; John Williams Arch Pediatr Adolesc Med. 2008;162(11):1026-1034.

$320,000 Kindergarten Teachers Paper: How does your kindergarten classroom affect your earnings? evidence from project STAR Raj Chetty, Harvard University and NBER John N. Friedman, Harvard University and NBER Nathaniel Hilger, Harvard University Emmanuel Saez, UC Berkeley and NBER Diane Whitmore Schanzenbach, Northwestern University and NBER Danny Yagan, Harvard University March 2011.

Talks: How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project STAR Raj Chetty, Harvard

Other IV applications. The Effect of File Sharing on Record Sales An Empirical Analysis

Two-stage Least Squares in R (tsls in sem package) by John Fox. (older package: systemfit)

Question 1. Yule's Data via Freedman (deep review regression)

Yule (1899), "An investigation into the causes of changes in pauperism in England, chiefly during the last two intercensal decades."

File yuledoc.dat contains the data in Table 3, p.10 of Freedman text (1871 to 1881 comp), also described in sec. 4 (p.6) of class reading "Association to Causation" and elsewhere: http://statweb.stanford.edu/~rag/stat209/yuledoc.dat

(Note: there was some preamble text in the file. I commented out the preamble so the file will load without issue, but it is still good practice to always look at a file before opening it.)

I scanned and posted p10-11 of Freedman's text for reference on the variables and fit http://statweb.stanford.edu/~rag/stat209/DAFp10.pdf

a. Replicate Yule's regression equation for the metropolitan unions, 1871--81 (parameters a, b, c, d). Yule offered a regression equation: "DELTA"Paup = a + b * "DELTA"Out + c * "DELTA"Old + d * "DELTA"Pop + error.

In this equation, "DELTA" is percentage change over time, "Out" is the out-relief ratio N/D, N = number on welfare outside the poor-house, D = number inside, "Old" is the percentage of the population over 65, "Pop" is the population.

Data are from the English Censuses of 1871, 1881 (Subtract 100 from each entry to get the percentage changes cf Freedman text pp10-11.)

Arithmetic addendum. Example: a variable has value 50 in 1870 and 60 in 1880; that's an increase of 10 units, or 20% (the metric used in the regression equation).

The data in yuledoc.dat reside in a somewhat cryptic form: in this case the entry would be (60/50)*100 = 120. So we obtain the (desired) 20% entry by subtracting 100 from the value in yuledoc.dat.

Example 2. value 70 in 1870 and 56 in 1880 yuledoc.dat entry would be (56/70)*100 = 80. Subtract 100 to get -20 (20% decline)
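The two arithmetic examples above can be checked with a few lines of code. This is only an illustrative Python sketch (the course computing is in R); the function names are made up for the illustration and nothing here reads yuledoc.dat itself.

```python
def ratio_entry(earlier, later):
    """Entry in the cryptic form described above: later value as a percentage of the earlier one."""
    return 100.0 * later / earlier


def percent_change(entry):
    """Percentage change used in the regression: subtract 100 from the stored entry."""
    return entry - 100.0


# The two worked examples from the arithmetic addendum.
for earlier, later in [(50, 60), (70, 56)]:
    entry = ratio_entry(earlier, later)
    print(f"{earlier} -> {later}: entry {entry:.0f}, percent change {percent_change(entry):+.0f}")
```

The same subtraction applies to every column of the data before fitting the regression.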

b. More complex regression review, which you may have done before. Orig version problem Freedman p.63, problem 4 asks you to test that the regression parameters for the variables change in population over 65 (c) and change in population (d) are both 0 (i.e. null hypothesis c=d=0 in Yule's regression above )

Question 2. Revisit Coleman data example from week 4 lecture.

part a. In the discussion of the Coleman data in Chap 13 Green book (Mosteller and Tukey) their Example 9 commentary suggests trying one school resource variable and one family demographic variable (instead of the bunch of redundant variables) in predicting vach.

See what happens with momed with simpler two predictor models: predictors tverb momed or ssal momed.

Are these regressions "better" than the full model? Does what you get from regression critically depend on what else is in the model?

part b. As the Coleman data snippet used in the Green book (Mosteller and Tukey) is only 20 schools (with 5 predictors), for expository purposes I created a larger artificial data set, with 320 rows, for a population having the same means and covariances as the n=20 sample.

http://statweb.stanford.edu/~rag/stat209/coleman320.dat

Repeat the multiple regression and adjusted variables demonstration for momed.

Also, as was done in the class handout, plot outcome (or adjusted outcome) vs the adjusted predictor (residuals from momed on the other predictors), to obtain the scatterplot for the multiple regression weight (cf plot from week 1 materials).

Computation note, data generation for this example: to create the artificial data set with 320 rows I used the mvrnorm function in R, which requires the package MASS (part of the basic R distribution).

if empirical = TRUE then the sample moments would match exactly those in the "ed" dataset
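To see what "sample moments match exactly" means, here is a rough two-variable analogue written in plain Python. It is not mvrnorm and not the code used to build coleman320.dat; the function name, the seed, and the target values (means 0, SDs 1, correlation .65) are assumptions chosen only for illustration. The idea: center and orthogonalize two random vectors in the sample, rescale them to unit sample variance, then mix them so the sample correlation is exactly the target.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sample_corr(u, v):
    mu, mv = mean(u), mean(v)
    cu = [a - mu for a in u]
    cv = [b - mv for b in v]
    return dot(cu, cv) / math.sqrt(dot(cu, cu) * dot(cv, cv))


def exact_bivariate(n, mean_x, mean_y, sd_x, sd_y, rho, seed=0):
    """Two variables whose SAMPLE means, SDs and correlation equal the targets exactly."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0, 1) for _ in range(n)]
    z2 = [rng.gauss(0, 1) for _ in range(n)]
    # Center both vectors so the sample means are exactly zero.
    m1, m2 = mean(z1), mean(z2)
    z1 = [a - m1 for a in z1]
    z2 = [b - m2 for b in z2]
    # Remove from z2 its projection on z1: sample covariance becomes exactly zero.
    b = dot(z1, z2) / dot(z1, z1)
    z2 = [v - b * u for u, v in zip(z1, z2)]
    # Rescale each to sample variance 1 (divisor n - 1).
    s1 = math.sqrt(dot(z1, z1) / (n - 1))
    s2 = math.sqrt(dot(z2, z2) / (n - 1))
    z1 = [u / s1 for u in z1]
    z2 = [v / s2 for v in z2]
    # Mix the orthogonal pieces to hit the target correlation, then shift and scale.
    w = math.sqrt(1 - rho ** 2)
    x = [mean_x + sd_x * u for u in z1]
    y = [mean_y + sd_y * (rho * u + w * v) for u, v in zip(z1, z2)]
    return x, y


x, y = exact_bivariate(320, 0.0, 0.0, 1.0, 1.0, 0.65)
print(round(sample_corr(x, y), 6), round(mean(x), 6), round(mean(y), 6))
```

Without the centering, orthogonalizing and rescaling steps the sample moments would only be near the targets, which is the default (empirical = FALSE) behavior described above.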

extra bit, Coleman data avPlots

I mentioned in class that John Fox's

Question 3.

part a. From Class handout, top frame of "Math facts" , prove the result in the 2-predictor case that the multiple regression parameter-- coefficient of X_1 -- is identical to the coefficient for Y regressed on the adjusted variable. Hint: use the formulas on the 'regression recursion' slide
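Before writing the proof it can help to see the result hold numerically. The sketch below is in Python on made-up data (the variable names, seed, and generating coefficients are all assumptions, not the Coleman data). It computes the coefficient of X_1 from the two-predictor regression by the usual closed form, then the slope of Y on the adjusted variable (residuals of X_1 on X_2), and compares them.

```python
import random


def mean(v):
    return sum(v) / len(v)


def sxy(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def simple_slope(x, y):
    """Slope from the simple regression of y on x."""
    return sxy(x, y) / sxy(x, x)


def residuals(y, x):
    """Residuals from the simple regression of y on x."""
    b = simple_slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def coef_x1_two_predictors(y, x1, x2):
    """Coefficient of x1 in the least-squares regression of y on x1 and x2."""
    s11, s22, s12 = sxy(x1, x1), sxy(x2, x2), sxy(x1, x2)
    s1y, s2y = sxy(x1, y), sxy(x2, y)
    return (s1y * s22 - s2y * s12) / (s11 * s22 - s12 ** 2)


# Made-up data with correlated predictors.
rng = random.Random(209)
x2 = [rng.gauss(0, 1) for _ in range(50)]
x1 = [0.6 * v + rng.gauss(0, 1) for v in x2]
y = [1.0 + 2.0 * a - 1.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

b1_multiple = coef_x1_two_predictors(y, x1, x2)
x1_adjusted = residuals(x1, x2)           # X_1 adjusted for X_2
b1_adjusted = simple_slope(x1_adjusted, y)
print(b1_multiple, b1_adjusted)
```

The two printed values agree up to rounding error; the algebra behind that agreement is what part a asks you to write out.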

part b. The "Regression Recursion" slide (useful trick) is worth revisiting. Linked in week 4 materials.

Take the second version (using vars labelled 1 2 3), and use from the Coleman data vach as var1, momed as var2, and ses as var3.

Demonstrate that this relation holds in the sample for the parameter estimates from the corresponding regressions.

Question 4. Measurement error, single predictor linear regression

Construct a simple artificial data illustration of effects of measurement error in a single predictor variable on a regression slope. Result is shown in class handout week 4.

Set the reliability coefficient for the predictor variable to be .8. Set the slope for the perfectly measured predictor to be 1.5. Compare slope for perfectly measured predictor with the slope using the fallible predictor measurement.

A couple of ways of doing this exercise (your choice)

a. generate true predictor values, predictor error, and outcome variable and do the two regressions

b. use mvrnorm and generate (all at once) outcome, true predictor, fallible predictor, and do the two regressions

c. use the R-package DAAG function errorsINx (linked in week 4 materials)
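A minimal version of approach (a), sketched in Python rather than R. The seed, sample size, unit variance for the true predictor, and unit outcome error variance are assumptions for the illustration. With reliability .8 and true-score variance 1, the measurement error variance is (1 - .8)/.8 = .25, and the slope on the fallible predictor should come out near .8 * 1.5 = 1.2.

```python
import random


def slope(x, y):
    """Least-squares slope from the simple regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(209)
n = 20000                 # large n so the sample slopes sit close to their targets
reliability = 0.8
true_slope = 1.5
# reliability = var(true) / (var(true) + var(error)), with var(true) = 1
error_sd = ((1 - reliability) / reliability) ** 0.5

t = [rng.gauss(0, 1) for _ in range(n)]                   # perfectly measured predictor
x = [ti + rng.gauss(0, error_sd) for ti in t]             # fallible measurement
y = [true_slope * ti + rng.gauss(0, 1) for ti in t]       # outcome depends on the true predictor

slope_true = slope(t, y)
slope_fallible = slope(x, y)
print(slope_true, slope_fallible, reliability * true_slope)
```

The fallible-predictor slope is attenuated by roughly the reliability coefficient, which is the result shown in the week 4 handout.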

Question 5. Observational Studies: Regression Adjustments.

The display from lecture of the regression adjustments also has a numerical example (page 2 of pdf). Recreate the results shown for the Anderson et al Head Start example.

Also for lecture materials, Regression Adjustments with Non-equivalent groups Week 4, show that the Belson adjustment procedure (using control group slope) is equivalent to evaluating the vertical distance between the within-group regression fits at the mean of the treatment group. written out proof.
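A numerical check can guide the written proof. The Python sketch below assumes the Belson estimate takes the form (Ybar_T - Ybar_C) - b_C * (Xbar_T - Xbar_C), with b_C the control-group slope; check that form against the lecture materials. The data are made up and are not the Head Start example.

```python
import random


def mean(v):
    return sum(v) / len(v)


def fit(x, y):
    """Intercept and slope from the simple regression of y on x."""
    mx, my = mean(x), mean(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b * mx, b


# Made-up non-equivalent groups: different covariate means and different slopes.
rng = random.Random(4)
x_c = [rng.gauss(50, 10) for _ in range(40)]
y_c = [10 + 0.8 * x + rng.gauss(0, 5) for x in x_c]
x_t = [rng.gauss(45, 10) for _ in range(40)]
y_t = [15 + 0.6 * x + rng.gauss(0, 5) for x in x_t]

a_c, b_c = fit(x_c, y_c)
a_t, b_t = fit(x_t, y_t)
xbar_t = mean(x_t)

# Assumed form of the Belson adjustment (control-group slope).
belson = (mean(y_t) - mean(y_c)) - b_c * (xbar_t - mean(x_c))
# Vertical distance between the two within-group fits, evaluated at the treatment-group mean.
vertical = (a_t + b_t * xbar_t) - (a_c + b_c * xbar_t)
print(belson, vertical)
```

The key fact for the proof is that each within-group least-squares line passes through that group's point of means.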

Question 6. Mroz87 data analysis, "Panel Study of Income Dynamics" (PSID).

Extended Instrumental Variables data analysis examples in Lab 3 of legacy Stat209.

Lab 3 Instrumental Variables. Lab3, exposition and commands

Lab 3, Rogosa R-session Mroz87 data description Lab3

note:

Question 7.

Recreate the IV artificial data demo from class handout week 4, the "measurement error example" using mvrnorm to get the artificial data (n=1000) to match exactly the specified parameters of the data generation.
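For orientation, here is a rough Python sketch of an IV fix for measurement error. It uses ordinary random draws, so the sample moments only approximate the generating parameters; the question asks for an exact match via mvrnorm. Every parameter value below (slope 1.5, error variances, seed) is an assumption and not the handout's specification. The instrument here is a second fallible measurement of the same true predictor, with errors independent of everything else.

```python
import random


def cov(u, v):
    """Sample covariance (divisor n - 1)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


rng = random.Random(266)
n = 1000
beta = 1.5                                          # slope on the true predictor (assumed)

t = [rng.gauss(0, 1) for _ in range(n)]             # true predictor, not observed
x = [ti + rng.gauss(0, 0.5) for ti in t]            # fallible predictor used in the regression
z = [ti + rng.gauss(0, 0.5) for ti in t]            # instrument: second measurement, independent error
y = [beta * ti + rng.gauss(0, 1) for ti in t]       # outcome

ols = cov(x, y) / cov(x, x)                         # attenuated by measurement error in x
iv = cov(z, y) / cov(z, x)                          # simple IV estimator, one instrument
print(ols, iv)
```

With these settings the reliability of x is .8, so the OLS slope lands near 1.2 while the IV estimate lands near 1.5, with more sampling variability than OLS.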

Question 8. Simulation from Freedman.

Try to recreate the simulation described in Freedman text bottom p.191-top p.192 (note: page 199 revised version)

scan of pages 191-192 at

http://statweb.stanford.edu/~rag/stat209/dafp191.pdf

parta partb

1. Traditional Path Analysis introduction and examples (incl Blau-Duncan from Freedman chap 5). class handouts; basics and examples

[time permitting a little on Structural equation models: introduction and examples. old class handout]

2. Three-strikes against these

3. Traditional Path Analysis (regression) models are

1. From David Freedman

Freedman text Chap 5 (Chap 6 in revised ver).

2. David Rogosa. Casual Models Do Not Support Scientific Conclusions: A Comment in Support of Freedman.

Journal of Educational Statistics, Vol. 12, No. 2. (Summer, 1987), pp. 185-195. Jstor link

3. Revisit Week 1--Paul Holland: Encouragement design results; sections 3-5 Causal Inference, Path Analysis, and Recursive Structural Equations Models Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484.

Depression in girls linked to higher use of social media (Guardian) Social media linked to higher risk of depression in teen girls (Reuters). Publication: Social Media Use and Adolescent Mental Health: Findings From the UK Millennium Cohort Study EClinicalMedicine published by The Lancet, 2019 has multiple regression and path analysis, wow.

CNN: Violent video games linked to child aggression publication: Longitudinal Effects of Violent Video Games on Aggression in Japan and the United States Craig A. Anderson, Akira Sakamoto, Douglas A. Gentile, Nobuko Ihori, Akiko Shibuya, Shintaro Yukawa, Mayumi Naito, and Kumiko Kobayashi Pediatrics 2008; 122: e1067-e1072.

More using latent growth curve methods (structural equation models) Do video games fuel mental health problems? New Study Links Video Games and Mental Problems Publication: Douglas A. Gentile, Hyekyung Choo, Albert Liau, Timothy Sim, Dongdong Li, Daniel Fung, and Angeline Khoo Pathological Video Game Use Among Youths: A Two-Year Longitudinal Study

Life events, fitness, hardiness, and health: A simultaneous analysis of proposed stress-resistance effects. Roth, David L.; Wiebe, Deborah J.; Fillingim, Roger B.; Shay, Kathleen A. Journal of Personality and Social Psychology. Vol 57(1), Jul 1989, 136- 142.

There Is No Meaningful Relationship Between Television Exposure and Symptoms of Attention-Deficit/Hyperactivity Disorder. Tara Stevens and Miriam Mulsow

see D. Freedman, Statistical Models for Causation

See also Social Science and Psychometrics Task Views

John Fox sem exposition another talk also Sec.5 Stats with R

Structural Equation Models package in R, sem manual OpenMx - Advanced Structural Equation Modeling Using R for Structural Equation Model:

R-implementations: Graphical Models, Causal Diagrams. CRAN Task View: gRaphical Models in R . Peter Buehlmann and

Class Theme Song

Path analysis intros Useful classnotes: Notre Dame

Path Analysis: Sociological Examples. Otis Dudley Duncan The American Journal of Sociology, Vol. 72, No. 1. (Jul., 1966), pp. 1-16. Jstor link

D.A. Freedman, Comments on Standardizing Path Diagrams: What Are the Parameters?

A reconsideration by a wise psychologist: The Path Analysis Controversy: A new statistical approach to strong appraisal of verisimilitude Meehl, Paul E; Waller, Niels G

Path Analysis, special issue: Journal of Educational Statistics Publication Info Vol. 12, No. 2, Summer, 1987 Issue As Others See Us: A Case Study in Path Analysis(pp. 101-128) D. A. Freedman

Original publication on the longitudinal path analysis: Some Models for Analysing Longitudinal Data on Educational Attainment. Harvey Goldstein

Technical details on Rogosa longitudinal examples:

Rogosa, D. R. (1993). Individual unit models versus structural equations: Growth curve examples.

In Statistical modeling and latent variables, K. Haagen, D. Bartholomew, and M. Deistler, Eds. Amsterdam: Elsevier North Holland, 259-281.

Rogosa, D. R., & Willett, J. B. (1985). Satisfying a simplex structure is simpler than it should be.

Journal of Educational Statistics, 10, 99-107. Jstor link

Structural equation modeling is a major industry in social and behavioral science with many texts (such as Principles and Practice of Structural Equation Modeling 2nd Edition Rex B. Kline; here's a long list), specialized courses, dedicated journals (Structural Equation Modeling: A Multidisciplinary Journal), and specialized computer programs (e.g., LISREL, EQS, AMOS).

Maximum likelihood factor analysis: A General Method for Analysis of Covariance Structures, K. G. Joreskog,

Structural equation modeling from Scientific Software International home of LISREL Student editions, documentation, examples, etc

Original Epi exposition. Greenland S., Pearl J., and Robins J.M. Causal diagrams for epidemiologic research. Epidemiology, 10(1):37-48, 1999.

Richardson and Robins attempts at unification. Single World Intervention Graphs: A Primer Longer version

Graphical Markov Models: Overview Nanny Wermuth and D.R. Cox

C. Shalizi. Advanced Data Analysis from an Elementary Point of View, 2017; Chapter 24 (except 24.2)

Class handout: Third Variables (page 1)

1. Spurious Correlation: some historical notes; partial and part correlations. (class slides)

2. Simpson's paradox wiki page (dichotomous outcome slide)

From Week 0 intro: Secret to Winning a Nobel Prize? Eat More Chocolate (Time) Publication: Chocolate Consumption, Cognitive Function, and Nobel Laureates Franz H. Messerli, M.D. N Engl J Med 2012; 367:1562-1564 October 18, 2012

Are People Who Curse Actually More Honest? Research explores whether cursing might be a sign of greater honesty. Publication: Frankly, We Do Give a Damn: The Relationship Between Profanity and Honesty, Social Psychological and Personality Science.

Size does matter. Bigger is smarter: Overall, not relative, brain size predicts intelligence. Publication: Deaner RO, Isler K, Burkart J, van Schaik C: Overall Brain Size, and Not Encephalization Quotient, Best Predicts Cognitive Ability across Non-Human Primates. Brain Behav Evol 2007;70:115-124 (DOI: 10.1159/000102973)

perennial favorite Spurious Correlation examples

Correlations Genuine and Spurious in Pearson and Yule, John Aldrich

Spurious Correlation: A Causal Interpretation. Herbert A. Simon

Kidney stone example Confounding and Simpson's paradox, BMJ, vol309, 1480-1, 1994

UC Berkeley admissions, Racial bias in Death Penalty in wiki page.

R-Package ppcor October 29, 2012 Title Partial and Semi-partial (Part) correlation

R-package

Question 1. Freedman, Blau-Duncan example in class handout.

Freedman links "Stat Models for Causation" (pp3-4) or Freedman text Ch6 (revised)

Replicate class handout computations for the path analysis

Plus questions from Freedman text: scan of pp.80-81 (pp86-7 revised ed) at http://web.stanford.edu/~rag/stat209/DAFtextp8081.pdf (includes standardization material, Hooke's Law, on week 4 class handout).

Freedman pp80-1 (set A) prob 1 prob 5 prob 6 prob 8

pdf scan also includes Freedman Set E, p.97 prob 4(a,b) (p.103 revised)

Question 2. Causal Models of Publishing Productivity

Freedman p.101 prob 5 (page 107 in revised version)

This Homework problem considers one of the path analysis models from "Causal Models of Publishing Productivity in Psychology", Rogers & Maranto, J. Applied Psychology, 1989, 74(4), 636-649.

direct link to paper http://content.apa.org/journals/apl/74/4/636.pdf

The path analysis conducted by the authors from a sample of 86 men and 76 women is shown in p.101 of Freedman's text and on page 647 of the publication; that page also exists at http://www-stat.stanford.edu/~rag/stat209/pathpage647.pdf

You do have the correlation matrix from adding Table 7 fits and residuals. But here all the problem asks you to do is look at and consider the usefulness of this analysis. Note they don't display the disturbance paths so we don't get a look at Rsq values.

What are the predictors of Pubs (direct effects) in this picture?

What are the predictors of Cites (direct effects) in this picture?

The diagram provides estimates of supposed causal effects ("causal model of publishing" is the article title); it displays regression coefficients, with the estimates shown on the edges.

Consider a "productive researcher" to be defined in terms of the number of publications and the number of cites. The good news is that ability "affects" pubs and cites with a positive coefficient in each case. Therefore, higher ability leads to a more "productive researcher", according to the causal path gospel. Some bad news is that sex is a predictor of pubs with a large coefficient value. However, it is likely that there are confounding variables between sex and pubs.

Question 3. Longitudinal path analysis (based on the Goldstein example)

Apply the path analysis model taken from Goldstein (1979) (in class handouts week 5; also eq 2 of Rogosa (1988), "Casual models...") to verify results for path coefficients in eq 3 of Rogosa (1988) (also in handouts).

Data are given in http://statweb.stanford.edu/~rag/stat209/casualdat using the top frame of 40 observations for variables (perfectly measured) Xi(1) Xi(3) Xi(5) and taking the times of observation to be 1 3 5 respectively.

These data are in wide form--each row is a subject.

You can verify, if you like, that each subject's data lie on a straight line (constant rate of change).

Try

Obtain values for the path coefficients and the multiple correlations for the regression fits.

Can you obtain standard errors for the path coefficients for this small sample?

Any interpretations of the results from the path analysis?

Question 4. ENRICHMENT ITEM, Structural Equation Models, Method-of-moments for two-variable, two-indicator model

Problem 4 is an "enrichment" item, and you may want to look at the solution which is linked.

For latent variable models with multiple indicators: how do structural equation model (latent vars) methods provide a correction for measurement error?

Method-of-moments for two-variable, two-indicator model

For the Structural Equation Models handout from Joreskog book, which is linked in the week 5 lecture materials (class handout) but we did not take up in detail in class, obtain parameter estimates for the no-correlated error version (9 parameters, top covariance matrix) in terms of the sample variance and covariances among the four indicators (y_ij).

Brute force substitution will get you a non-optimal estimate, suffices for instructional purposes.

Question 5. Spurious correlation. Consider the spurious correlation (common cause type) discussed in class, week 5. Additional examples are in the class page links: Simon (1954) or Aldrich (1995, sec 7 "illusory").

The data for this problem at http://rogosateaching.com/stat209/spuriousRQ.dat

Is the association between X and Y a consequence of common cause Z? Give a point estimate, corresponding scatterplot, and 95% confidence interval for the appropriate partial correlation. Does the partial correlation coefficient settle the causal question?
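The computations requested here can be sketched as follows, in Python on made-up common-cause data; the class data file is not read, and the seed, sample size, and generating model are assumptions. The partial correlation is computed two ways, from the three pairwise correlations and as the correlation between residuals of X on Z and Y on Z. The interval uses the Fisher z transformation with standard error 1/sqrt(n - 3 - k), where k = 1 is the number of variables partialled out.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def corr(u, v):
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def residuals(y, x):
    """Residuals from the simple regression of y on x."""
    mx, my = mean(x), mean(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return [c - my - b * (a - mx) for a, c in zip(x, y)]


def partial_corr(x, y, z):
    """Partial correlation of x and y given z, from the pairwise correlations."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_ci(r, n, n_controls=1, z_crit=1.96):
    """Approximate 95% interval via Fisher's z, with n_controls variables partialled out."""
    se = 1 / math.sqrt(n - 3 - n_controls)
    center = math.atanh(r)
    return math.tanh(center - z_crit * se), math.tanh(center + z_crit * se)


# Made-up common-cause data: Z drives both X and Y, no direct X-Y link.
rng = random.Random(5)
n = 200
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_marginal = corr(x, y)
r_partial = partial_corr(x, y, z)
r_from_residuals = corr(residuals(x, z), residuals(y, z))   # same quantity; these residuals give the scatterplot
ci_low, ci_high = fisher_ci(r_partial, n)
print(r_marginal, r_partial, (ci_low, ci_high))
```

Plotting the X-on-Z residuals against the Y-on-Z residuals gives the scatterplot that corresponds to the partial correlation.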

One real-life multi-million-dollar invocation version is:

Spurious correlations have been used by tobacco companies to argue that the association between smoking and lung cancer may actually be a result of some other factor such as a genetic factor that predisposes people both to nicotine addiction and lung cancer.

If this is true, then smoking cannot be blamed for causing cancer?

## Also if you like try out the ppcor package; I did in solutions, doesn't give you much

parta partb

1. Background: nested data, ecological fallacy, aggregation bias, levels of analysis. levels of analysis handout

2. Traditional approaches to multilevel analysis: contextual effects, school effects.

Class handouts: Multilevel regressions NELS example (NELS data)

3. Modern multilevel analyses: mixed effects models, High School and Beyond (HSB) data. (2-level via lme4).

Legacy Stat209 Lab 2. Multilevel analysis (mixed-effects models)

complete Bryk dataset first pass, Bryk data: session plots

Lab2, exposition and commands provides a full write up (annotated) of the analyses

Lab2 (Rogosa session) using lme4, lmer (with additional plots)

Lecture slide, lme lmer for Bryk data side-by-side boxplots, SFYS analysis

D.A. Freedman. "Ecological inference and the ecological fallacy." International Encyclopedia for the Social and Behavioral Sciences. Elsevier (2001) vol. 6 pp. 4027-30. N. J. Smelser and Paul B. Baltes, eds. A one-page version: D.A. Freedman. "The ecological fallacy." In the Encyclopedia of Social Science Research Methods. Sage Publications (2004) Vol. 1 p. 293. M. Lewis-Beck, A. Bryman, and T. F. Liao, eds

Using R, lme, nlme. John Fox lme tutorial (HSB data) Fitting linear mixed models in R Using the lme4 package Douglas Bates (pp.27-30)

High School and Beyond (HSB) data. (2-level via lme4). Collection of HSB data analyses from various text sources

The original: Ecological Correlations and the Behavior of Individuals W. S. Robinson American Sociological Review Vol. 15, No. 3, Jun., 1950 .

One of many followups: Some Alternatives to Ecological Correlation Leo A. Goodman American Journal of Sociology Vol. 64, No. 6, May, 1959

A good sociological/medical overview. Ecological effects in multi-level studies. Blakely TA, Woodward AJ.

Klein, S. P. and Freedman, D. A. (1993), "Ecological regression in voting rights cases" Chance, 6, 38-43.

A Rule for Inferring Individual-Level Relationships from Aggregate Data, Glenn Firebaugh

The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology J. Michael Oakes

R-package eiPack: R x C Ecological Inference and Higher-Dimension Data Management. R News Oct 2007

Educational multilevel data.

The Analysis of Multilevel Data in Educational Research and Evaluation Leigh Burstein

Methodological Advances in Analyzing the Effects of Schools and Classrooms on Student Learning, Stephen W. Raudenbush; Anthony S. Bryk Review of Research in Education, Vol. 15. (1988 - 1989), pp. 423-475. Jstor link

Analyzing Multilevel Data in the Presence of Heterogeneous within-Class Regressions Leigh Burstein; Robert L. Linn; Frank J. Capell

Bias in ecological regression Stephen Ansolabehere and Douglas Rivers

David A. Freedman et al., "Ecological Regression and Voting Rights,"

D.A. Freedman, S.P. Klein, M. Ostland, and M.R. Roberts. "Review of 'A Solution to the Ecological Inference Problem.' "

Using SAS PROC mixed:

Judith Singer HLM/PROC Mixed papers: Multilevel Modelling Newsletter ; JEBS1998 Using SAS PROC MIXED to Fit Multilevel Models, Jstor

Freedman, D. A. (census adjustments). Hierarchical Linear Regression

Doug Bates draft book (Feb 2010) Doug Bates SASmixed package

London exam data example in Examples from Multilevel Software Comparative Reviews Douglas Bates

Regression diagnostics for lmer models. Package influence.ME

mlmRev data examples.

STATA does it also

lmer for SAS PROC MIXED Users Douglas Bates Department of Statistics University of Wisconsin Madison

1. Cross-sectional Data: Simultaneous equations (2SLS, IV in butter, peer aspirations, ed and fertility, Freedman), nonrecursive models

Simultaneous equations handouts Duncan et al ascii

2. Reciprocal effects and non-recursive models in longitudinal data.

Empirical research on reciprocal effects, including cross-lagged correlation. clc slides

An (old) review of reciprocal effects. Rogosa, D. R. (1985). Analysis of reciprocal effects. In International Encyclopedia of Education, T. Husen and N. Postlethwaite, Eds. London: Pergamon Press, 4221-4225. (reprinted in Educational Research,Methodology & Measurement: An international handbook, J. P. Keeves Ed. Oxford: Pergamon Press, 1988.)

Michelob ULTRA® Super Bowl LV Spot Online. Are You Happy Because You Win? Or Do You Win Because You're Happy?

Fox17 Nashville: Increased screen time in young children associated with developmental delays

Publication: Association Between Screen Time and Children's Performance on a Developmental Screening Test JAMA Pediatr. Published online January 28, 2019. doi:10.1001/jamapediatrics.2018.5056

Study links excessive internet use to depression Publication: The Relationship between Excessive Internet Use and Depression: A Questionnaire-Based Study of 1,319 Young People and Adults. Catriona M. Morrison, Helen Gore

Peer Influences on Aspirations: A Reinterpretation Otis Dudley Duncan, Archibald O. Haller, Alejandro Portes American Journal of Sociology, Vol. 74, No. 2 (Sep., 1968), pp. 119-137 Jstor

Rindfuss example (Freedman Chap 8; paper reprinted in Freedman text). Education and Fertility: Implications for the Roles Women Occupy Ronald R. Rindfuss; Larry Bumpass; Craig St. John American Sociological Review, Vol. 45, No. 3. (Jun., 1980), pp. 431-447. from Jstor

Eron LD, Huesmann LR, Lefkowitz MM, Walder LO. Does television violence cause aggression? Am Psychol. 1972;27:253–63. PubMed

Granger Causality. Nobel 2003. Complete Granger

Relationships--and the Lack Thereof--Between Economic Time Series, with Special Reference to Money and Interest Rates. David A. Pierce

Reciprocal effects: Rogosa, D. R. (1980). A critique of cross-lagged correlation.

Structural Equation Modeling With the sem Package in R. John Fox. Structural Equation Modeling, 13(3), 465-486. John Fox home page

Question 1. Grouping and multilevel regressions

Illustrate relations among individual level (ignoring groups) group-level, and relative standing regression results.

Part I groups formed on X

Create 200 individual level observations on X and Y having correlation around .65.

I started with x values 1:200 (simple integers) for convenience, but you can be fancier.

Do an individual level Y on X regression (i.e. "total", ignoring groups, which don't exist yet).

Group these 200 individuals into 10 groups of size 20 on the basis of the X-values (i.e. group 1 contains the individuals with the smallest 20 X-values, group 10 contains the individuals with the largest 20 X-values). So within-groups will be as homogeneous as possible on X, and between group differences on X will be largest.

Do a regression on group means (between groups regression) these may be classroom means for example, and you may not have individual level data.

Get a relative standing measure: individual score minus group mean for each individual.

Do a relative standing regression

Now do the multiple regression analyses (class handouts; Burstein, de Leeuw & Kreft)

1. "context" Y on X and X-bar (X-bar is an attribute of each individual)

2. "Cronbach" (Kreft's term) Y on X minus X-bar and X-bar (predictors uncorrelated)

Demonstrate the coefficients match the basic relations shown in lecture
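The Part I steps can be sketched as below. This is Python rather than R, and the noise SD (68, chosen to give a correlation in the neighborhood of .65 with x = 1:200), the seed, and the helper names are assumptions for the illustration. The last lines compare the "context" and "Cronbach" coefficients with the between-groups and pooled within-groups slopes.

```python
import random


def mean(v):
    return sum(v) / len(v)


def sxy(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def slope(x, y):
    return sxy(x, y) / sxy(x, x)


def two_predictor_slopes(y, x1, x2):
    """Both slope coefficients from the least-squares regression of y on x1 and x2."""
    s11, s22, s12 = sxy(x1, x1), sxy(x2, x2), sxy(x1, x2)
    s1y, s2y = sxy(x1, y), sxy(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


rng = random.Random(6)
x = [float(i) for i in range(1, 201)]
y = [xi + rng.gauss(0, 68) for xi in x]
# x is already sorted, so consecutive blocks of 20 are the groups formed on X.
group = [i // 20 for i in range(200)]

xbar_g = [mean(x[20 * g:20 * g + 20]) for g in range(10)]
ybar_g = [mean(y[20 * g:20 * g + 20]) for g in range(10)]
xbar = [xbar_g[g] for g in group]               # group mean attached to each individual
ybar = [ybar_g[g] for g in group]
x_dev = [a - b for a, b in zip(x, xbar)]        # relative standing on X
y_dev = [a - b for a, b in zip(y, ybar)]        # relative standing on Y

total = slope(x, y)                             # individual level, ignoring groups
between = slope(xbar_g, ybar_g)                 # regression on the 10 group means
within = slope(x_dev, y_dev)                    # relative standing (pooled within) regression

context_x, context_xbar = two_predictor_slopes(y, x, xbar)
cronbach_dev, cronbach_xbar = two_predictor_slopes(y, x_dev, xbar)
print(total, between, within)
print(context_x, context_xbar, cronbach_dev, cronbach_xbar)
```

In this sketch the coefficient on X-bar in the "context" model equals the between slope minus the within slope, and the "Cronbach" coefficients equal the within and between slopes. For Part II, shuffle the group labels (for example with rng.shuffle applied to a copy of group) and recompute the group means by label.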

Part II groups formed independent of X (random)

Repeat the analyses of Part I using a different (as different as can be) mechanism for assigning individuals to groups. Form the 10 groups of size 20 at random, making the groups heterogeneous on X within group and similar between groups.

Question 2. Contextual Effects Coefficient

Use the regression recursion relation from week 4 to show that the contextual effects coefficient defined in week 6 handouts is equal as stated in the handouts (and literature) to the between groups slope minus the within-pooled slope.

Question 3. Simplified version of HSB analysis

The ubiquitous analyses of the HSB data use a level 2 model, with meanses as a covariate in addition to the 'group treatment' indicator sector (P/C).

For intro instruction use of these multilevel methods for comparing 'effects' of Public vs Catholic, it would be cleaner just to do a 't-test' in the level 2 model-- i.e. the only predictor of level and gradient being sector.

Try out that simpler model and compare with the standard analysis. Note that the side-by-side boxplots are still relevant for this reduced model, as the boxplots only reflect the Level 1 specifications.

Question 4. Enrichment problem (better to spend time on HSB analyses etc)

Ecological fallacy: Is Radon good for you?

Treat this as an extended example of ecological bias.

At one time I went through the Robins paper in class...

Solutions show you data generation procedures and illustrate the sometimes very large effects of aggregation bias. If the topic interests you, read through the G-R paper to see the point.

Consider the artificial data example described in Ex 3 p.750 Greenland and Robins American Journal of Epidemiology Vol. 139, No. 8: 747-760 Ecologic Studies—Biases, Misconceptions, and Counterexamples (article linked on class page, week 6 under additional resources)

intro their Example 3

Suppose that our study data are limited to regional values of mean radon, mean smoking (in packs per day), and lung-cancer rates among males aged 70-74 years, for 41 regions indexed by r = 0, . . . , 40.

follow their example set up and create your own artificial data example and produce the regression function and plot in their figure 1 for the effect of radon levels on lung cancer rates

from G&R you are demonstrating the ecological fallacy because "the regressions yield an inverse association of radon and lung cancer, despite the fact that radon is a positive risk factor in the underlying model used to generate the data,"

"Even though the lung-cancer rates show the strong upward relation to smoking one would expect from model 1, and the ecologic correlation between radon and smoking is only 0.01, there is a significant negative ecologic association of radon with lung cancer rates."

Question 5. Simultaneous effects.

For the Duncan Haller Portes occupational aspiration example from class handout (cf Fox Soc Meth 1979 paper) replicate the 2SLS (IV) analysis of this non-recursive model from the class handout.

Extra item: Can you fit a model which adds a path from Friend's family SES to respondent's occupational aspiration?

Weeks 7 and 8 materials are designed as an introduction to matching methods with computing using the

Students with experience, especially in computing, in matching methods may want to turn to the more advanced materials from the Spring qtr course: Epi292/Stat266 available on rogosateaching.com.

The

0. Review: Matching for increased precision, Randomized block designs (see Review Questions) package

1. Traditional matching methods: subclassification, pair matching. Case-control studies.

handout for smoking ex, Cochran subclassification

2. Modern Implementations of matching methods The advent/onslaught of propensity score matching methodology for treatment-control comparisons

propensity score intro checking balance, aspirin ex

Case-control studies: Case-control overview from Encyclopedia of Public Health

Non-technical matching overviews: Donald Rubin Nonrandomized Comparative Clinical Studies another version,[Lane library from campus]

Cochran's smoking, subclassification and Rubin's Breast Cancer example also discussed in Rubin "Design Trumps Analysis" Rubin paper . also set of slides

Another Rubin overview of matching, Matching Methods for Causal Inference Elizabeth Stuart Donald Rubin [does Lalonde example]

Joffe, Marshall M. and Paul R. Rosenbaum. 1999. "Invited Commentary: Propensity Scores." American Journal of Epidemiology 150(4):327-33.

Rosenbaum and Rubin, Reducing Bias in Observational Studies Using Subclassification on the Propensity Score, JASA 79[387], September 1984, 516-524. JStor [one of the original technical papers]

Carbonated Soft Drink Consumption and Risk of Esophageal Adenocarcinoma JNCI: Journal of the National Cancer Institute, Volume 98, Issue 1, 4 January 2006, Pages 72-75,

Aspirin use and all-cause mortality among patients being evaluated for known or suspected coronary artery disease: A propensity analysis. Gum PA, Thamilarasan M, Watanabe J, Blackstone EH, Lauer MS. JAMA. 2001 Sep 12;286(10):1187-94.

Optmatch application paper: Hansen, Ben B. Full matching in an observational study of coaching for the SAT.(Scholastic Assessment Test)

Breastfeeding May Not Lead to Smarter Preschoolers Breastfeeding does NOT boost a baby's IQ: Nourishing infants the natural way only makes them less hyper Breast-feeding study sheds light on benefits for babies

Publication: Breastfeeding, Cognitive and Noncognitive Development in Early Childhood: A Population Study. Lisa-Christine Girard, Orla Doyle, Richard E. Tremblay. PEDIATRICS Volume 139, number 4, April 2017.

Strategies for Using Propensity Scores Well. A Workshop given by Thomas E. Love, Ph. D., Case Western Reserve University Love workshop ASA

A broad review of matching and bias-reduction methods. Opiates for the Matches: Matching Methods for Causal Inference Jasjeet S. Sekhon.

UNC, Chapel Hill Social Work: Introduction to Propensity Score Matching: A Review and Illustration Propensity Score Matching: A New Device for Program Evaluation UNC, Chapel Hill Social Work 2004 flash version

An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies Peter C. Austin Multivariate Behav Res. 2011 May; 46(3): 399-424.

Methods to assess intended effects of drug treatment in observational studies are reviewed

Average causal effects from nonrandomized studies: A practical guide and simulated example. Schafer, Joseph L.; Kang, Joseph Psychological Methods, Vol 13(4), Dec 2008, 279-313.

A Primer for Applying Propensity-Score Matching Office of Strategic Planning and Development Effectiveness, Inter-American Development Bank

Tutorial in biostatistics: Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group Statist. Med. 17, 2265-2281 (1998)

Additional exercises (checking balance) using the nuclearplants data (class handout ex) from Mark Fredrickson here

JSS May 2011 exposition: MatchIt: Nonparametric Preprocessing for Parametric Causal Inference more R-fun from Gary King, WhatIf: Software for Evaluating Counterfactuals

Another application (including matchit): Attributing Effects to a Get-Out-The-Vote Campaign Using Full Matching and Randomization Inference Jake Bowers and Ben Hansen. Data archive and computing resources for the New Haven get-out-the-vote

Also:

Rosenbaum, P. R. and D. B. Rubin, 1983, The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika 70[1], April 1983, 41-55. JStor

P. Rosenbaum, Chapters 2 and 3 (on exact inference for treatment effects) in Observational Studies, New York: Springer, 1995.

Dropping out of High School in the United States: An Observational Study Paul R. Rosenbaum

Paul R. Rosenbaum; Donald B. Rubin. "Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score"

D. Rubin, Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies, Statistical Science 5[4], November 1990, 472-480. JStor

Rubin, D. B., 1974, Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies, Journal of Educational Psychology, 66, 688-701.

Rubin, D. B., 1978, Bayesian Inference for Causal Effects: The Role of Randomization, Annals of Statistics 6[1], January 1978, 34-58. JStor

Case-control overview (shown in class) from Encyclopedia of Public Health

Breslow NE. Statistics in epidemiology: the case-control study. J Am Stat Assoc. 1996 Mar;91(433):14-28

Carbonated Soft Drink Consumption and Risk of Esophageal Adenocarcinoma JNCI: Journal of the National Cancer Institute, Volume 98, Issue 1, 4 January 2006, Pages 72-75,

Smoking and Lung Cancer in Chap 18 of HSAUR3 (Handbook of Statistical Analysis Using R). Also driving and backpain data in Chap 7 HSAUR2

Some R-packages and resources: SensitivityCaseControl: Sensitivity Analysis for Case-Control Studies; multipleNCC: Inverse Probability Weighting of Nested Case-Control Data; Two-phase designs in epidemiology (Thomas Lumley) ; Exact McNemar's Test and Matching Confidence Intervals

Question 1. Matching and Paired t-test example from lecture

(Stat 141 exam problem (circa 2005))

An experiment on treating depression with Imipramine, an antidepressant drug, employed a matched-pairs design. A total of 60 patients were paired on a combination of age, sex, and time of entry in the study to form 30 matched pairs. That is, each pair consisted of patients who entered the study within a month of each other, were of the same sex, and were similar in age. One member of each pair was randomly assigned to receive Imipramine and the other to receive a placebo. The outcome measure was the score on the Hamilton rating scale for depression (higher score = more severe depression) after 5 weeks of treatment.

The file http://web.stanford.edu/~rag/stat209/depressdata contains the outcome scores for each of the 30 pairs (Imipramine vs Placebo).

a. Carry out a statistical test of the equality of treatment outcomes. That is, test the null hypothesis that Imipramine and Placebo produce equivalent outcomes versus a non-directional alternative. Use Type 1 error rate .05. State the result of the statistical test.

b. Pretend that an erstwhile graduate assistant lost all records of the matched pairs before the data analysis could be completed. Consequently, all the investigator has available is the 30 scores for the patients receiving Imipramine and the 30 scores for the patients receiving Placebo (but not the information on the matching). Carry out a statistical test of the hypothesis in part a using the available information. Is the result of the test the same? Explain why or why not.

c. Regard part (b) as a bad dream and return to the data set with full matching information. But now you are told that the differences between Hamilton scale scores shouldn't be regarded as having numerical value. Comparing two Hamilton scores only indicates relative standing, that is, which of the two patients in the matched pair is showing greater symptoms of depression. Under that limitation of the data, carry out an appropriate statistical test of the hypothesis in part (a). Explain why the result is the same as or different from the result in part (a).
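The three analyses in parts a-c can be sketched in base R as follows. This is only a sketch under stated assumptions, not the course solution: the scores are simulated stand-ins for the 30 pairs (the vector names `imipramine` and `placebo`, the seed, and every numeric setting are hypothetical), and the sign test is one reasonable reading of "only relative standing within a pair".

```r
# Simulated stand-in for the 30 matched pairs (hypothetical values, NOT the course data).
set.seed(209)
n_pairs <- 30
pair_effect <- rnorm(n_pairs, mean = 0, sd = 5)   # component shared within a matched pair
placebo     <- 20 + pair_effect + rnorm(n_pairs, sd = 2)
imipramine  <- 17 + pair_effect + rnorm(n_pairs, sd = 2)

# (a) Paired t-test: based on the 30 within-pair differences.
paired <- t.test(imipramine, placebo, paired = TRUE)

# (b) Matching information lost: treat the scores as two independent samples
#     (t.test uses the Welch version by default).
unpaired <- t.test(imipramine, placebo)

# (c) Only the ordering within each pair is meaningful: sign test via the binomial.
d <- imipramine - placebo
d <- d[d != 0]                                    # drop tied pairs, if any
sign_test <- binom.test(sum(d < 0), length(d), p = 0.5)

# Collect the three p-values for comparison.
c(paired = paired$p.value, unpaired = unpaired$p.value, sign = sign_test$p.value)
```

Both t-tests have the same numerator (the difference in means); they differ in the standard error, which is where positive within-pair correlation helps the paired analysis.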

Question 2. Matching to increase precision: Factorial Randomized blocks designs

Example from lecture, Neter-Wasserman problem DENTAL PAIN.

An anesthesiologist made a comparative study of the effects of acupuncture and codeine on postoperative dental pain in male subjects. The four treatments were (1) placebo treatment--a sugar capsule and two inactive acupuncture points; (2) codeine treatment only--a codeine capsule and two inactive acupuncture points; (3) acupuncture only--a sugar capsule and two active acupuncture points; (4) both codeine and acupuncture. These 4 conditions have a 2x2 factorial structure.

Thirty-two subjects were grouped into 8 blocks of four according to an initial evaluation of their level of pain tolerance. The subjects in each block were then randomly assigned to the 4 treatments. Pain relief scores were obtained 2 hours after dental treatment. Data were collected on a double-blind basis.

Data in file: http://statweb.stanford.edu/~rag/stat209/dental.dat

c1 is pain relief score (higher means more pain relief); c2 is block; c3 is codeine; c4 is acupuncture--for c3 and c4, 1=no.

a. obtain cell means for the 2x2 factorial design

b. carry out the randomized blocks analysis of variance; factors are Block, main effects for Codeine and Acup, and the interaction term Codeine*Acup

c. Give a measure of the relative efficiency of blocking on pain tolerance: how much better, in terms of precision or number of subjects needed, is the analysis using blocking versus a 2x2 factorial design that ignores pain tolerance?
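A possible outline of parts a-c in base R is given below. It is a sketch, not the class solution: the data frame `dental` is simulated (hypothetical values, with factor labels "no"/"yes" instead of the 1/2 coding in the file), and the relative efficiency formula is one common textbook measure that compares the randomized blocks error mean square with the error variance expected from a completely randomized design.

```r
# Simulated stand-in for dental.dat (hypothetical values, NOT the course data):
# 8 blocks x 4 treatments arranged as a 2x2 factorial.
set.seed(209)
dental <- expand.grid(block = 1:8, codeine = c("no", "yes"), acup = c("no", "yes"))
block_effect <- rnorm(8, sd = 0.5)[dental$block]  # pain-tolerance block effects
dental$relief <- 1 + block_effect +
  0.4 * (dental$codeine == "yes") + 0.6 * (dental$acup == "yes") +
  rnorm(nrow(dental), sd = 0.2)
dental$block <- factor(dental$block)

# (a) Cell means for the 2x2 factorial.
cell_means <- tapply(dental$relief,
                     list(codeine = dental$codeine, acup = dental$acup), mean)

# (b) Randomized blocks ANOVA: blocks plus the factorial treatment structure.
fit <- aov(relief ~ block + codeine * acup, data = dental)
tab <- summary(fit)[[1]]
ms  <- tab[["Mean Sq"]]       # row order: block, codeine, acup, codeine:acup, Residuals
ms_block <- ms[1]
ms_error <- ms[5]

# (c) Relative efficiency of blocking versus a completely randomized design.
n_b <- nlevels(dental$block)  # number of blocks
r   <- 4                      # treatments per block
rel_eff <- ((n_b - 1) * ms_block + n_b * (r - 1) * ms_error) /
           ((n_b * r - 1) * ms_error)
rel_eff
```

A value of `rel_eff` near 2, for example, would suggest that a design ignoring pain tolerance needs roughly twice as many subjects per treatment for the same precision.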

Question 3.

Recreate the matching demonstration for Ben Hansen's "gender equity" example (done in the week 7 class handout; posted, not hard copy), an example of optimal full matching with only one matching variable. This is Example 2 in Hansen's talk, about p. 48 in the linked pdf. Here are the data in cut-and-paste form:

> geneq
  Grant gender
1   5.7      W
2   4.0      W
3   3.4      W
4   3.1      W
5   5.5      M
6   5.3      M
7   4.9      M
8   4.9      M
9   3.9      M
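As a starting point for the demonstration, the sketch below enters the data and builds the women-by-men distance matrix on Grant in base R. The object names (`dist_wm`, `nearest_m`) are hypothetical, and this is not the optimal full match itself; it only shows the distances that a matching routine such as optmatch's fullmatch would work from.

```r
# Gender equity data, entered from the listing above.
geneq <- data.frame(
  Grant  = c(5.7, 4.0, 3.4, 3.1, 5.5, 5.3, 4.9, 4.9, 3.9),
  gender = c("W", "W", "W", "W", "M", "M", "M", "M", "M")
)

w <- geneq$Grant[geneq$gender == "W"]
m <- geneq$Grant[geneq$gender == "M"]

# Absolute differences in Grant: rows are women (1-4), columns are men (5-9).
dist_wm <- abs(outer(w, m, "-"))
dimnames(dist_wm) <- list(paste0("W", 1:4), paste0("M", 5:9))
dist_wm

# Closest man for each woman, chosen separately for each row.
nearest_m <- colnames(dist_wm)[apply(dist_wm, 1, which.min)]
nearest_m
```

Row-by-row nearest neighbors send three of the four women to the same man (M9), so they do not form a valid pair match. Full matching instead allows matched sets of varying size, which is the point of the example.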

Question 4. Multivariate matching

The example shown in lecture, from Anderson et al.

Example 6.5 Multivariate caliper matching: Consider a hypothetical study comparing two therapies effective in reducing blood pressure, where the investigators want to match on three variables: previously measured diastolic blood pressure (DBP), age, and sex. Such confounding variables can be divided into two types: categorical variables, such as sex, for which the investigators may insist on a perfect match (e = 0); and numerical variables, such as age and blood pressure, which require a specific value of the caliper tolerances. Let the blood pressure tolerance be specified as 5 mm Hg and the age tolerance as 5 years. The data contain measurements of these three confounding variables. (The subjects are grouped by sex to make it easier to follow the example.)

Data with columns DBP age sex and Grp (Treatment Group or Comparison Reservoir) http://statweb.stanford.edu/~rag/stat209/matchex.dat

Table 6.6 Hypothetical Measurements on Confounding Variables

Treatment Group
Subject   Diastolic Blood       Age   Sex
Number    Pressure (mm Hg)
   1            94              39     F
   2           108              56     F
   3           100              50     F
   4            92              42     F
   5            65              45     M
   6            90              37     M

Comparison Reservoir
Subject   Diastolic Blood       Age   Sex
Number    Pressure (mm Hg)
   1            80              35     F
   2           120              37     F
   3            85              50     F
   4            90              41     F
   5            90              47     F
   6            90              56     F
   7           108              53     F
   8            94              46     F
   9            78              32     F
  10           105              50     F
  11            88              43     F
  12           100              42     M
  13           110              56     M
  14           100              46     M
  15           100              54     M
  16           110              48     M
  17            85              60     M
  18            90              35     M
  19            70              50     M
  20            90              49     M

a. show preexisting difference between comparison and treatment, no matching.

b. try to do a match by hand, finding a best match for each of the treatment subjects.

c. use the 3 confounding variables to compute a propensity score (for membership in the treatment); match subjects on the propensity scores (i.e. nearest comparison to each treatment subject) by hand, or use optmatch functions to do optimal matching either 1:1 or 1:2. See which provides better (less bad) balance in the covariates.
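For part b, the caliper rule in Example 6.5 (exact match on sex, DBP within 5 mm Hg, age within 5 years) can be checked mechanically. The base R sketch below enters the Table 6.6 values and lists the eligible comparison subjects for each treatment subject; the object and function names (`trt`, `cmp`, `caliper_candidates`) are hypothetical, and picking a final match among the candidates is left to the reader.

```r
# Table 6.6 entered as two data frames (values transcribed from the table).
trt <- data.frame(
  id  = 1:6,
  dbp = c(94, 108, 100, 92, 65, 90),
  age = c(39, 56, 50, 42, 45, 37),
  sex = c("F", "F", "F", "F", "M", "M")
)
cmp <- data.frame(
  id  = 1:20,
  dbp = c(80, 120, 85, 90, 90, 90, 108, 94, 78, 105, 88,
          100, 110, 100, 100, 110, 85, 90, 70, 90),
  age = c(35, 37, 50, 41, 47, 56, 53, 46, 32, 50, 43,
          42, 56, 46, 54, 48, 60, 35, 50, 49),
  sex = c(rep("F", 11), rep("M", 9))
)

# Comparison subjects inside the calipers for treatment subject i:
# same sex, |DBP difference| <= dbp_tol, |age difference| <= age_tol.
caliper_candidates <- function(i, dbp_tol = 5, age_tol = 5) {
  ok <- cmp$sex == trt$sex[i] &
    abs(cmp$dbp - trt$dbp[i]) <= dbp_tol &
    abs(cmp$age - trt$age[i]) <= age_tol
  cmp$id[ok]
}

candidates <- lapply(trt$id, caliper_candidates)
names(candidates) <- paste0("T", trt$id)
candidates
```

Treatment subjects 1 and 4 both have comparison subject 4 inside their calipers. Subject 1 has no other candidate, so a by-hand match has to give comparison 4 to treatment 1 and choose among the remaining candidates for treatment 4.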

Question 5. Extended Example: Propensity scores versus regression adjustment, single confounder

Artificial data construction

1. start with 10000 subjects-- outcome measure Y

2. subjects belong to groups (G=0,1) based on probabilistic assignment on a single unobserved variable X (normal, mean 10, variance 4): Prob(G = 1|X) = 1 - (1/(1 + 1/exp(-5 + .5*X)))

3. Outcome measure Y also highly correlated with X. Y = 1.2*G + X + u (u is Normal, mean 0, variance 1.69) treatment effect is built in as 1.2 (about half a sd)

4. Besides Y and G, the observable that is available is a version of X obscured by measurement error; let Z be a fallible version of X with reliability about .72 (i.e. correlation about .85).

a. compare group differences on Z (preexisting diffs)

b. try out regression adjustment estimate for treatment effect-- Use observable Z as covariate. Compare with using X as covariate.

c. use Z to compute a propensity score for each of the 10000 subjects. Stratify into quintiles on propensity (as in Rubin Arch Int Med from lecture), and compute a treatment/control comparison within each of the 5 propensity strata. Also get the overall comparison from the main effect in the 2x5 anova.

d. repeat part c using the unobservable X. Does X give better results?

e. which works better in the 1-dimensional case, propensity matching or regression adjustment?
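The construction in steps 1-4 and the comparisons in parts a-d could be set up along the following lines in base R. This is a sketch, not the official solution. Two choices are assumptions: Z is built as X plus independent normal error with variance chosen to give reliability .72, and the overall stratified estimate is taken as the G coefficient in an additive G-plus-stratum model. The seed and object names are likewise hypothetical. Note that the assignment probability in step 2 simplifies to 1/(1 + exp(-5 + .5*X)), so group 1 tends to have lower X.

```r
set.seed(209)
n <- 10000
X <- rnorm(n, mean = 10, sd = 2)                  # unobserved confounder, variance 4
p <- 1 - (1 / (1 + 1 / exp(-5 + 0.5 * X)))        # Prob(G = 1 | X), as written in step 2
G <- rbinom(n, size = 1, prob = p)
Y <- 1.2 * G + X + rnorm(n, sd = sqrt(1.69))      # built-in treatment effect 1.2

# Assumption: Z = X + independent error, error variance set for reliability .72.
rel <- 0.72
Z <- X + rnorm(n, sd = sqrt(4 * (1 - rel) / rel))

# (a) Preexisting group difference on Z.
diff_z <- mean(Z[G == 1]) - mean(Z[G == 0])

# (b) Regression adjustment with the fallible Z versus the true X.
est_z <- unname(coef(lm(Y ~ G + Z))["G"])
est_x <- unname(coef(lm(Y ~ G + X))["G"])

# (c), (d) Propensity score from one covariate, stratified into quintiles.
quintile_effects <- function(covariate) {
  ps <- fitted(glm(G ~ covariate, family = binomial))
  stratum <- cut(ps, breaks = quantile(ps, probs = 0:5 / 5),
                 include.lowest = TRUE, labels = 1:5)
  # Treatment minus control mean of Y within each propensity stratum.
  within <- sapply(split(data.frame(Y, G), stratum),
                   function(d) mean(d$Y[d$G == 1]) - mean(d$Y[d$G == 0]))
  # Overall comparison: G coefficient in the additive G + stratum model.
  overall <- unname(coef(lm(Y ~ G + stratum))["G"])
  list(within = within, overall = overall)
}
strat_z <- quintile_effects(Z)
strat_x <- quintile_effects(X)

c(reg_Z = est_z, reg_X = est_x,
  strat_Z = strat_z$overall, strat_X = strat_x$overall)
```

Comparing the four estimates with the built-in 1.2 gives a first answer to part e for this one simulated data set; repeating over several seeds would show how stable that answer is.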

Question 6. Cochran subclassification for confounding variable

Week 7 class example, age as a confounder on effects of (cig) smoking.

Use the lalonde data as play data to show a simple implementation of subclassification adjustment, with re78 as outcome, a treatment group comparison, and age as the only confounder.

> library(MatchIt)
> data(lalonde)
> head(lalonde)
     treat age educ black hispan married nodegree re74 re75       re78
NSW1     1  37   11     1      0       1        1    0    0  9930.0460
NSW2     1  22    9     0      1       0        1    0    0  3595.8940
NSW3     1  30   12     1      0       0        0    0    0 24909.4500
NSW4     1  27   11     1      0       0        1    0    0  7506.1460
NSW5     1  33    8     1      0       0        1    0    0   289.7899
NSW6     1  22    9     1      0       0        1    0    0  4056.4940
> attach(lalonde)
> table(treat)  # 185 in job training
treat
  0   1
429 185
> tapply(re78, treat, mean)  # oh my, seems better off with no job training, can the Republicans be right?
       0        1
6984.170 6349.144
> tapply(age, treat, mean)  # there is a mean age diff
       0        1
28.03030 25.81622
> tapply(age, treat, fivenum)  # same medians, but some controls older
$`0`
[1] 16 19 25 35 55

$`1`
[1] 17 20 25 29 48
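One simple way to implement the subclassification adjustment is sketched below in base R. The helper `subclass_effect` is hypothetical (not a MatchIt or optmatch function), the choice of quantile-based strata weighted by stratum size is an assumption about how the adjustment is set up, and the demonstration data are simulated so the block runs without any package.

```r
# Subclassification on one confounder: treated-minus-control difference in mean
# outcome within quantile strata of the confounder, combined with weights
# proportional to stratum size.
subclass_effect <- function(y, treat, confounder, n_strata = 5) {
  breaks <- unique(quantile(confounder,
                            probs = seq(0, 1, length.out = n_strata + 1)))
  stratum <- cut(confounder, breaks = breaks, include.lowest = TRUE)
  diffs <- tapply(seq_along(y), stratum, function(idx)
    mean(y[idx][treat[idx] == 1]) - mean(y[idx][treat[idx] == 0]))
  sizes <- as.vector(table(stratum))
  d  <- as.vector(diffs)
  ok <- !is.na(d)           # a stratum lacking one of the groups is dropped
  list(by_stratum = diffs,
       adjusted   = sum(d[ok] * sizes[ok]) / sum(sizes[ok]))
}

# Simulated play data (hypothetical): age drives both treatment and outcome,
# and the true treatment effect is 2.
set.seed(209)
n     <- 2000
age   <- sample(18:55, n, replace = TRUE)
treat <- rbinom(n, 1, plogis((age - 35) / 5))
y     <- 2 * treat + 0.5 * age + rnorm(n)

raw <- mean(y[treat == 1]) - mean(y[treat == 0])  # unadjusted comparison
adj <- subclass_effect(y, treat, age)
c(raw = raw, adjusted = adj$adjusted)
```

With lalonde attached as in the session above, the analogous call would be `subclass_effect(re78, treat, age)`. The `unique()` on the breaks guards against tied quantiles, which can occur with integer-valued ages.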

Matching Methods for Observational Data: Part II

optmatch exs, nuclear plants, gender ascii version for some Ben Hansen matching exs using MatchIt/optmatch

Pair matching--nuclear plants data. 1:2 optimal pair matching using MatchIt and pairmatch in optmatch plus balance diagnostics.

Lalonde NSW data. Subclassification/Stratification and Full matching.

Lalonde data class handout

Rogosa R-session (using R 3.3.3) 4/1/18 redo in R 3.4.4 (sparse)

2019 lalonde Matchit: full matching, balance with cobalt love.plot and bal.tab

2019 lalonde optmatch: fullmatch with outcome analysis

Legacy Stat209 Lab 4, Lalonde Data, is arranged in pieces

a. Lab4, exposition and commands

b. Lab 4, Rogosa R-session, Base (sections 1-3)

c. Lab 4, Rogosa R-session, additional matching exercises (incl secs 4-6)

d. Lab 4, Rogosa R-session: not done until ancova is run

MatchIt: Nonparametric Preprocessing for Parametric Causal Inference Daniel Ho, Kosuke Imai, Gary King, Elizabeth Stuart

MatchIt vignette

JSS May 2011 exposition: MatchIt: Nonparametric Preprocessing for Parametric Causal Inference

optmatch:fullmatch vignette optmatch another version another good tutorial optmatch Functions for Optimal Matching

Hansen presentation: Flexible, Optimal Matching for Comparative Studies Using the optmatch package

Additional exercises (checking balance) using the nuclearplants data (class handout ex) from Mark Fredrickson here

Optmatch application paper: Full matching in an observational study of coaching for the SAT (Scholastic Assessment Test).

Another optmatch example presentation: Attributing Effects to a Get-Out-The-Vote Campaign Using Full Matching and Randomization Inference Jake Bowers and Ben Hansen. Data archive and computing resources for the New Haven get-out-the-vote

Time-1, Time-2 (Longitudinal) Data in Experimental Designs and Observational Studies

Primary reading: Laird-Ware text slides (pdf pages 135-150).

Crossover design data from slide 137, anova for crossover design ex ascii version, anova for crossover design ex

R-resources for crossover designs. package

Primary reading: Comparative Analyses of Pretest-Posttest Research Designs, Donna R. Brogan; Michael H. Kutner,

urea synthesis, BK data data, long-form

BK plots (by group) BK overview

2017 Analysis handout Extended BK lmer analysis

Additional stuff

BK repeated measures analysis pdf version

Stat141 analysis

archival example analyses. SAS and minitab

class slide

A very popular subject these days. Pretty good Wiki page LSE slides

Austin Nichols slides. Causal inference with observational data A brief review of quasi-experimental methods July 2009

Angrist Ch 5, MHE. Card and Krueger (1994) data, minimum wage ex

paper On the Use of Linear Fixed Effects Regression Models for Causal Inference(sec 3.2)

R-package

Lord notes

Primary readings: Publication: Lord, F. M. (1967). A paradox in the interpretation of group comparisons.

Wainer, H. (1991). Adjusting for differential base rates: Lord's Paradox again. Psychological Bulletin, 109, 147-151.

Time-1,time-2 data analysis examples Measurement of change: time-1,time-2 data

data example for handout scan of regression handout ascii version of data analysis handout

Extra material for Correlates and predictors of change: time-1,time-2 data

Rogosa R-session to replicate handout, demonstrate wide-to-long data set conversion, and descriptive fitting of individual growth curves. Some useful plots from Rogosa R-session

Technical results: Section 3.2.2 esp Equation 27 in Rogosa, D. R., & Willett, J. B. (1985). Understanding correlates of change by modeling individual differences in growth. Psychometrika, 50, 203-228. Talk slides

Interrupted Time-series designs

Gene Glass overview Time Series Analysis with R section 4.6 R package

Current implementations of

American Statistical Association Statement on Using Value-Added Models for Educational Assessment

J.R. Lockwood, Harold Doran, and Daniel F. McCaffrey. Using R for estimating longitudinal student achievement models. R News, 3(3):17-23, December 2003.

a. Get moving Can't Focus? 10 Minutes Of Exercise Gives Brain Burst Of Energy Short-term exercise equals big-time brain boost.

Publication: Executive-related oculomotor control is improved following a 10-min single-bout of aerobic exercise: Evidence from the antisaccade task

b. When Adolescents Give Up Pot, Their Cognition Quickly Improves

c. Stents? A Controversial Experiment Upends The Conventional Wisdom On Heart Stents Publication: Percutaneous coronary intervention in stable angina (ORBITA): a double-blind, randomised controlled trial The Lancet.

d. Mere Visual Perception of Other People's Disease Symptoms Facilitates a More Aggressive Immune Response

e. Guns and testosterone. Guns Up Testosterone, Male Aggression

Guns, Testosterone, and Aggression: An Experimental Test of a Mediational Hypothesis Klinesmith, Jennifer; Kasser, Tim; McAndrew, Francis T,

a. This time with 3 conditions For Exercise, Nothing Like the Great Outdoors Publication: Niedermeier M, Einwanger J, Hartl A, Kopp M (2017) Affective responses in mountain hiking-- randomized crossover trial focusing on differences between indoor and outdoor activity. PLoS ONE 12(5): e0177719. https://doi.org/10.1371/journal.pone.0177719

b. Does nutrition science know anything? Is white or whole wheat bread 'healthier?' Depends on the person Publication: Bread Affects Clinical Parameters and Induces Gut Microbiome-Associated Personal Glycemic Responses Cell Metabolism, Korem et al DOI: 10.1016/j.cmet.2017.05.002

c. RCT (cross-over design). Damn right! The secret of success is swearing: How shouting four letter words can help make you stronger Swearing can help you boost your physical performance The full power of swearing is starting to be discovered

d. One thing at a time. Why listening to a podcast while running could harm performance Publication: A trade-off between cognitive and physical performance, with relative preservation of brain function Scientific Reports 7, Article number: 13709 (2017) nature.com.

1. Repeated measures analysis of variance

Models for Pretest-Posttest Data: Repeated Measures ANOVA Revisited Earl Jennings

A good R-primer on repeated measures (and lots else). Notes on the use of R for psychology experiments and questionnaires Jonathan Baron, Yuelin Li. Another version

Multilevel package has behavioral sciences applications including estimates of within-group agreement, and routines using random group resampling (RGR) to detect group effects.

More repeated measures resources: Background primer on analysis of variance (with R); see sections 6.8, 6.9 of

2. Lord's Paradox, pre-post group comparisons.

Lord, F. M. (1967). A paradox in the interpretation of group comparisons.

Wainer, H. (1991). Adjusting for differential base rates: Lord's Paradox again. Psychological Bulletin, 109, 147-151.

or Wainer and Brown Three Statistical Paradoxes in the Interpretation of Group Differences: Illustrated with Medical School Admission and Licensing Data

a quick low-level read: Lord's Paradox and the Assessment of Change During College Journal of College Student Development, May/Jun 2004 by Pike, Gary R

Another time1-time2 reading covering old-fashioned ground including Lord's paradox. Maris, Eric. (1998). Covariance Adjustment Versus Gain Scores--Revisited.

3. Value-added analysis.

Value-added does New York City. New York schools release 'value added' teacher rankings Formula uncovers the 'value added' from the unions: THIS IS NO WAY TO RATE A TEACHER

Chap 9 in Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies. Howard Wainer (Author) amazon page available in paper and Kindle

Other versions of the Chap 9 materials Value-Added Models to Evaluate Teachers: A Cry For Help H Wainer, Chance, 2011. Journal of Consumer Research Vol. 32, No. 2, Sept 2005

More Value-added analysis. Journal of Educational and Behavioral Statistics Vol. 29, No. 1, Spring, 2004 Value-Added Assessment Special Issue

Value-Added Measures of Education Performance: Clearing Away the Smoke and Mirrors, PACE

LA Times Teacher Ratings, summer 2010 NEPC vs LATimes

Fitting Value-Added Models in R Harold C. Doran and J.R. Lockwood

Andrew Gelman on Value-added arithmetic: It's no fun being graded on a curve more NY Principals rebel against 'value-added' evaluation

4. Interrupted time-series

Interrupted Time Series Quasi-Experiments Gene V Glass Arizona State University

Did fertility go up after the Oklahoma City bombing? An analysis of births in metropolitan counties in Oklahoma, 1990-1999. Demography, 2005.

original publication (ozone data): Box, G. E. P. and G. C. Tiao. 1975. "Intervention Analysis with Applications to Economic and Environmental Problems." Journal of the American Statistical Association. 70:70-79. SAS example for ozone data another ozone analysis with data

Box-tiao time series models for impact assessment Evaluation Quarterly 1979

Interrupted time-series analysis and its application to behavioral data Donald P. Hartmann, John M. Gottman, Richard R. Jones, William Gardner, Alan E. Kazdin, and Russell S. Vaught J Appl Behav Anal. 1980 Winter; 13(4): 543-559.

Segmented regression analysis of interrupted time series studies in medication use research. By: Wagner, A. K.; Soumerai, S. B.; Zhang, F.; Ross-Degnan, D. Journal of Clinical Pharmacy & Therapeutics, Aug 2002, Vol. 27 Issue 4, p299-309.

Interrupted Time Series Designs In Health Technology Assessment: Lessons From Two Systematic Reviews Of Behavior Change Strategies Craig R. Ramsay University Of Aberdeen, International Journal Of Technology Assessment In Health Care, 19:4 (2003), 613-623.

5. Measurement of Change, Correlates of Change, Growth Curve Analysis. See also Stat222 website

Rogosa, D. R., & Willett, J. B. (1985). Understanding correlates of change by modeling individual differences in growth. Psychometrika, 50, 203-228. available from John Willett's pub page

A growth curve approach to the measurement of change. Rogosa, David; Brandt, David; Zimowski, Michele Psychological Bulletin. 1982 Nov Vol 92(3) 726-748 APA record direct link

Longitudinal Data Analysis Examples with Random Coefficient Models. David Rogosa; Hilary Saner . Journal of Educational and Behavioral Statistics, Vol. 20, No. 2, Special Issue: Hierarchical Linear Models: Problems and Prospects. (Summer, 1995), pp. 149-170. Jstor

Demonstrating the Reliability of the Difference Score in the Measurement of Change. David R. Rogosa; John B. Willett Journal of Educational Measurement, Vol. 20, No. 4. (Winter, 1983), pp. 335-343. Jstor