Statistics 209 / HRP 239 / Education 260A
Winter 2016
Statistical Methods for Group Comparisons and Causal Inference
David Rogosa
Lecture: WF, 1:30 - 3 Sequoia 200
course web page at http://rogosateaching.com/stat209/
To see full course materials from Winter 2015 go here
Instructor. David Rogosa, Sequoia 224, rag {AT} stanford {DOT} edu .
Office hours W,F 3 - 3:45.
TA Lucas Janson ljanson {AT} stanford {DOT} edu
Office Hours M 12:30 - 2, Sequoia Library (rm 105). Holiday workaround: Tues 1/19, 10:30-12 in Bowker.
Registrar's Information
STATS 209: Statistical Methods for Group Comparisons and Causal Inference (EDUC 260A, HRP 239)
Description
Critical examination of statistical methods in social science and life sciences applications, especially for cause and effect determinations.
Topics: mediating and moderating variables, potential outcomes framework, encouragement designs, multilevel models,
matching and propensity score methods, analysis of covariance, instrumental variables, compliance, path analysis and graphical models,
group comparisons with longitudinal data.
See http://rogosateaching.com/stat209/. Prerequisite: intermediate-level statistical methods.
Terms: Win | Units: 3 | Grading: Letter or Credit/No Credit
2015-2016 Winter
STATS 209 | 3 units | Class # 35191 | Section 01 | Grading: Letter or Credit/No Credit | LEC
01/04/2016 - 03/11/2016 Wed, Fri 1:30 PM - 3:20 PM at Sequoia Hall 200
Course Overview
For students who have had intermediate-level instruction in statistical methods, including multiple regression, logistic regression, and log-linear models.
At the very least, the content of the course should provide some consolidation of previous instruction in statistical methods.
The goal is also to instill some introspection about, and critical analysis of, the uses of statistical methods common in social science and medical applications, especially for observational studies.
The focus of the course is on understanding what useful information statistical modeling can provide in experimental and especially non-experimental social science settings.
Quick Course Outline
Week 1. Course Introduction; properties of regression models
Week 2. Experiments vs observational studies; Neyman-Rubin-Holland formulation
Week 3. Path analysis and causal modeling, multiple regression with pictures. Graphical models.
Week 4. Multilevel data. Contextual effects, aggregation bias, random effects models
Week 5. The many uses and forms of analysis of covariance (including regression discontinuity designs)
Week 6. Instrumental variable methods, simultaneous equations, reciprocal effects
Week 7. Compliance and experimental protocols; encouragement designs; intent to treat
Week 8. Matching and propensity score methods
Week 9. Time-1, Time-2 group comparisons for experimental and non-experimental designs.
Dead Week. Overflow and course summary.
Course Readings, Files and Examples
Texts (optional).
Statistical Models: Theory and Practice, David Freedman (2005); revised edition (2009).
The course was created around David Freedman's text, and covers that material using auxiliary texts and online materials.
One intent of this course is for students to read some statistical literature and actual research reports to augment the texts (on that theme, Freedman's text includes reprints of four published empirical research papers, which are also available through JSTOR).
Primary resource for R and data analysis.
Data Analysis and Graphics Using R, J. Maindonald and J. Braun, Cambridge. 2nd edition 2007; 3rd edition 2010. Short draft version in CRAN.
Text resource page; UCLA DAAG page; R-packages for text data sets etc.: R-Package DAAG, R-Package DAAGxtras
Auxiliary texts,
Design of observational studies. Rosenbaum, Paul R. New York : Springer, c2010. Stanford access
Regression Analysis : A Constructive Critique Richard A Berk (2003). Table of contents
Jan de Leeuw, Preface to Berk's "Regression Analysis: A Constructive Critique"
Data analysis and regression: A second course in statistics. Mosteller, F. and Tukey, J. W. (1977) (the green book)
Matched Sampling for Causal Effects, Donald B. Rubin Cambridge University Press 2006
Observational Studies Paul R. Rosenbaum, Publisher: Springer; 2 edition (January 8, 2002)
Statistical Models and Causal Inference, David Freedman, Cambridge 2010, ISBN 978-0-521-19500-3
Grading, Homework and Exams.
Weekly homework assignments following class content will be posted, along with solutions. Homeworks are not graded.
Assessment. Two take home problem sets will be scheduled:
TH1 covering content weeks 1-4.
TH2 covering content weeks 5-8.
In-class exam: Exam 3, scheduled by the Registrar during exam week. My best reading of the Registrar's chart indicates Monday, March 14, 12:15 (in our classroom). If needed, Exam 3 can be taken remotely.
See also class calendar
Course Assignments Page
Note to auditors. We should have plenty of room in Sequoia 200 for auditors.
The Registrar does have a form (no-fee) for faculty, staff, post-docs: Application for Auditor or Permit to Attend (PTA) Status
Statistical computing
Class presentation will be in R, and students are encouraged to use it (with occasional reference to SAS, Mathematica, and Matlab).
1/7/09. NYTimes endorses R: Data Analysts Captivated by R's Power
We have a set of 4 computer labs to supplement lecture materials (weeks 2, 4, 6, 8).
Lab 1. Multiple regression basics Lab1 posted 1/18/16
Lab 2. Multilevel analysis (mixed-effects models) High School and Beyond example.
Lab 2 has evolved in three pieces.
a. Lab2, exposition and commands provides a full write up (annotated) of the analyses
b. Lab 2, Rogosa R-session (nlme legacy version)
c. Lab2 (abbreviated version) using lme4, lmer (with additional plots). Lecture slide: lme and lmer for Bryk data
For those who are strapped for time or otherwise saturated, I provide a full single Bryk dataset that skips over the data manipulation portion of the activity
Additional materials for HSB analyses are posted in Week 4 Lecture topics, sec 3(iii)
Lab 2 posted 1/30/16
Lab 3, Instrumental Variables.
Lab3, exposition and commands
Lab 3, Rogosa R-session. Mroz87 data description. Lab3 posted 2/11/16
Note: I triple-checked; the dataset is where the description indicates, and read.table("http://statweb.stanford.edu/~rag/stat209/Mroz87.dat", header = TRUE) reads in the 753 cases.
Lab 4 Matching and propensity scores. Lalonde job training data
This lab is arranged in pieces
a. Lab4, exposition and commands posted 2/28
b. Lab 4, Rogosa R-session, Base (sections 1-3) posted 2/28
c. Lab 4, Rogosa R-session, additional matching exercises (incl secs 4-6) posted 2/28
d. Lab 4, Rogosa R-session: not done until ancova is run posted 2/28
Current version of R is R version 3.2.3 (Wooden Christmas-Tree), released on 2015-12-10 (Rogosa is running 3.2.2). For references and software: The R Project for Statistical Computing. Closest download mirror is Berkeley.
The CRAN Task View: Statistics for the Social Sciences provides an overview of relevant R packages. Also of interest are CRAN Task View: Psychometric Models and Methods and CRAN Task View: Design of Experiments (DoE) and Analysis of Experimental Data
This past fall quarter I taught a short 5-week intro R course intended for users of other statistical packages; see the Ed401 page: http://rogosateaching.com/ed401/ Older introductory materials on R: 2007 Stat141 site, especially the Course Files and Examples page.
Among the infinite number of introduction-to-R resources is John Verzani's page. A good R-primer on various applications (repeated measures and lots else): Notes on the use of R for psychology experiments and questionnaires, Jonathan Baron, Yuelin Li. Another version
Even more stuff: According to Peter Diggle: "The best resource for R that I have found is Karl Broman's Introduction to R page." And a remarkably useful set of R-resources from Murray State
Wm. Revelle who develops the psych package also has a draft text which covers standard statistics plus specialized measurement topics (plus other R intros)
For those with a life sciences background a useful resource may be the book Analysis of epidemiological data using R and Epicalc and the Epicalc package.
An additional R resource that is efficient if you are experienced with another statistical package is a presentation An Introduction to R, John Verzani.
For categorical data, especially if you've had a course using Agresti, the lengthy guide by Laura Thompson has more than you want to know.
Case Studies in Cause and Effect
Freedman text includes a series of older social science publications as case studies.
1. Smart Babies
Upside
a. Breastfeeding Boosts Kids' Brains, Especially Boys'; Do Breast-Fed Baby Boys Grow Into Better Students? Publication: Breastfeeding Duration and Academic Achievement at 10 Years.
Wendy H. Oddy, Jianghong Li, Andrew J. O. Whitehouse, Stephen R. Zubrick, Eva Malacova. Pediatrics; Vol 127, Numb 1, Jan 2011
b. Extended Womb Time Makes Better Students
More time in womb tied to better academic performance later in life. Publication: Noble KG, et al. Academic achievement varies with gestational age among children born at term. Pediatrics 2012; DOI: 10.1542/peds.2011-2157.
Downside
c. Moderate drinking in pregnancy 'harms IQ' Just one glass of wine a week while pregnant 'can harm a baby's IQ'
Publication: Fetal Alcohol Exposure and IQ at Age 8: Evidence from a Population-Based Birth-Cohort Study. PLoS ONE 7(11): e49407. doi:10.1371/journal.pone.0049407
2. Is TV bad or is it bad parenting? Attention Deficit Disorder and TV
and should the question be answered with LISREL (structural equation models)
2004 version : Pediatrics. 2004;113:708-713. Christakis DA, Zimmerman FJ, DiGiuseppe DL, McCarty CA. Early television exposure and subsequent attentional problems in children.
Publication
press release
audio NPR interview
2006 reversal? (with LISREL) Pediatrics. March 2006. Stevens T and Mulsow M. There is no meaningful relationship between television exposure and symptoms of attention-deficit hyperactivity disorder. Pediatrics. 2006; 117(3):665-672.
News Reports: TV may not cause kids' attention disorders
Researchers say TV is not to blame for ADHD
Good general commentary in Slate Feb '06 The Benefits of Bozo Proof that TV doesn't harm kids.
Or maybe it is what you eat?
Diet May Help ADHD Kids More Than Drugs
Food additives and hyperactive behaviour in 3-year-old and 8/9-year-old children in the community: a randomised, double-blinded, placebo-controlled trial.
Or maybe it is genetic?
Study finds first evidence that ADHD is genetic
Clues to the Genetic Roots of ADHD
Publication: Rare chromosomal deletions and duplications in attention-deficit hyperactivity disorder: a genome-wide analysis
The Lancet, Volume 376, Issue 9750, Pages 1401 - 1408, 23 October 2010
Or Domestic Violence?
Childhood ADHD, Conduct Disorder Linked to Intimate Partner Violence
Auxiliary notes. This research example raises an important theme of this course: the similarities (often indistinguishability) between social science and medical research. Is the TV and ADHD question child development or medical research? (The point being that the division is often unclear or not useful.)
Further aside, ADHD medication: Prescribing of hyperactivity drugs is out of control
ADHD on the Rise: Almost One in 10 Children Diagnosed, Says CDC
As with most important issues, definitive wisdom is provided by South Park via Cartman: here, episode 404 (4/19/2000), episode summary and episode video
3. Money and Happiness
Would You Be Happier If You Were Richer? A Focusing Illusion
Science 30 June 2006: Vol. 312, no. 5782, pp. 1908-1910.
Daniel Kahneman, Alan B. Krueger, David Schkade, Norbert Schwarz, Arthur A. Stone
Or is it age-dependent? The Midlife Happiness Crisis. Is Well-being U-Shaped over the Life Cycle?
David G. Blanchflower, Andrew Oswald. NBER Working Paper No. 12935, February 2007.
Same for Apes? Evidence for a midlife crisis in great apes consistent with the U-shape in human well-being PNAS, Proceedings of the National Academy of Sciences, 2012
It's not TV? Unhappy People Watch TV, Happy People Read/Socialize. Publication: What Do Happy People Do?
John P. Robinson and Steven Martin, Soc Indic Res (2008) 89:565-571
Is Happiness Overrated? Study Finds Physical Benefits to Some (Not All) Good Feelings
Or, instead, does money cause evil? Shame on the Rich Science News, February 2012. Higher social class predicts increased unethical behavior PNAS 2012
4. Kindergarten and Money
$320,000 Kindergarten Teachers
Paper: How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project STAR. Raj Chetty, Harvard University and NBER; John N. Friedman, Harvard University and NBER; Nathaniel Hilger, Harvard University; Emmanuel Saez, UC Berkeley and NBER; Diane Whitmore Schanzenbach, Northwestern University and NBER; Danny Yagan, Harvard University. March 2011. Policy Brief, Kennedy School of Government
Talks: How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project STAR Raj Chetty, Harvard another version
Using R. Tennessee's Student Teacher Achievement Ratio (STAR) data, from Creating an R data set from STAR, Douglas Bates
Other Studies: Long-Term Effects of Class Size. Peter Fredriksson, Bjorn Ockert and Hessel Oosterbeek. The Quarterly Journal of Economics (2012)
Effect of Class Size in Grades K-3 on Adult Earnings, Employment, and Disability Status: Evidence from a Multi-center Randomized Controlled Trial. Elizabeth Ty Wilde, PhD; Jeremy Finn, PhD; Gretchen Johnson, MA; Peter Muennig, MD, MPH. Journal of Health Care for the Poor and Underserved 22 (2011): 1424-1435.
More economists on Early Education: Jim Heckman. It Pays to Invest in Early Education Says a Nobel Economist Who Boosts Kids' IQ
5. Beer and productivity
NY Times: For Scientists, a Beer Test Shows Results as a Litmus Test. Slashdot: Scientists' Success Or Failure Correlated With Beer, but Beer-Drinking Scientist Debunks Productivity Correlation. 21/03: In Defense of Beer-Drinking Scientists
Publication. A possible role of social activity to explain differences in publication output among ecologists. Thomas Grim, Oikos, 2008. Abstract. Beer Scatterplot
Collection: Beer vs. science -- first laugh, then think (what to drink:-) Selection of media reports, interviews and commentaries on probably the most discussed ecological paper of the year 2008
Not too late: Starting Drinking in Middle Age Reduces Cardiovascular Risk
6. God and IQ
High IQ turns academics into atheists
Why people who believe in God 'are more likely to have a lower IQ'
Publication. Lynn, Richard; John Harvey and Helmuth Nyborg. "Average intelligence predicts atheism rates across 137 nations". Intelligence 2008. Elsevier Inc. doi:10.1016/j.intell.2008.03.004
Wikipedia page Religiosity and intelligence
7. Media and Teen Vice
2006 version.
Pediatrics 2006;117;1018-1027 Sexy Media Matter: Exposure to Sexual Content in Music, Movies, Television, and Magazines Predicts Black and White Adolescents' Sexual Behavior UNC Teen Media Center (NICHD funded)
2008 uproar, RAND corp.
Sex-saturated TV shows making teens pregnant
Sex on TV linked to teen pregnancies: Watching lots of racy shows can affect adolescents over time
Publication: Does Watching Sex on Television Predict Teen Pregnancy? Findings from a National Longitudinal Survey of Youth Pediatrics, v. 122, no. 5, Nov. 2008, p. 1047-1054
January 2012: Or are they just clueless??? CDC: Many teen moms didn't think they would get pregnant; Half of Teen Moms Don't Use Birth Control -- Why That's No Surprise
CDC Report, Jan 2012: Prepregnancy Contraceptive Use Among Teens with Unintended Pregnancies Resulting in Live Births - Pregnancy Risk Assessment Monitoring System (PRAMS), 2004-2008
the real truth on sex and rock-and-roll from Frank Zappa: Zappa on Crossfire 1987;
Zappa vs Tipper Gore on Nightline 1985 with Ted Koppel