Statistical Methods for Longitudinal Research

David Rogosa Sequoia 224, rag{AT}stanford{DOT}edu

To see full course materials from Autumn 2018 go here

For recreation of in-classroom experience, linked below are youtube versions of the music I play

before starting lecture and after lecture concludes. Some may wish to reverse that ordering.

Registrar's informationSTATS 222 (Same as EDUC 351A): Statistical Methods for Longitudinal Research Units: 2 Grading Basis: Letter or Credit/No Credit Course Description: STATS 222: Statistical Methods for Longitudinal Research (EDUC 351A) Research designs and statistical procedures for time-ordered (repeated-measures) data. The analysis of longitudinal panel data is central to empirical research on learning, development, aging, and the effects of interventions. Topics include: measurement of change, growth curve models, analysis of durations including survival analysis, experimental and non-experimental group comparisons, reciprocal effects, stability. See http://rogosateaching.com/stat222/. Prerequisite: intermediate statistical methods Terms: Aut | Units: 2 | Grading: Letter or Credit/No Credit Instructors: Rogosa, D. (PI)

Week 1. Course Overview, Longitudinal Research; Analyses of Individual Histories and Growth Trajectories

Week 2. Introduction to Data Analysis Methods for assessing Individual Change for Collections of Growth Curves (mixed-effects models)

Week 3. Analysis of Collections of growth curves: linear, generalized linear and non-linear mixed-effects models

Week 4. Special case of time-1, time-2 data; Traditional measurement of change for individuals and group comparisons

Week 5. Assessing Group Growth and Comparing Treatments: Traditional Repeated Measures Analysis of Variance and Linear Mixed-effects Models

Week 6. Comparing group growth continued: Power calculations, Cohort Designs, Cross-over Designs, Methods for missing data, Observational studies.

Week 7. Analysis of Durations: Introduction to Survival Analysis and Event History Analysis

Weeks 8-9. Further topics in analysis of durations: Diagnostics and model modification; Interval censoring, Time-dependence, Recurrent Events, Frailty Models, Behavioral Observations and Series of Events (renewal processes)

Dead Week. Assorted Special Topics: Assessments of Stability (including Tracking), Reciprocal Effects, (mis)Applications of Structural Equation Models, Longitudinal Network Analysis

1. Garrett M. Fitzmaurice Nan M. Laird James H. Ware Applied Longitudinal Analysis (Wiley Series in Probability and Statistics; 2nd ed 2011)

Text Website second edition website Text lecture slides

2. Judith D. Singer and John B. Willett . Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence New York: Oxford University Press, March, 2003.

Text web page Text data examples at UCLA IDRE Powerpoint presentations good gentle intro to modelling collections of growth curves (and survival analysis) is Willett and Singer (1998)

3. Douglas M. Bates. lme4: Mixed-effects modeling with R February 17, 2010 Springer (chapters). A merged version of Bates book: lme4: Mixed-effects modeling with R January 11, 2010 has been refound

Manual for R-package lme4 and mlmRev, Bates-Pinheiro book datasets.

Additional Doug Bates materials. Collection of all Doug Bates lme4 talks Mixed models in R using the lme4 package Part 2: Longitudinal data, modeling interactions Douglas Bates 8th International Amsterdam Conference on Multilevel Analysis 2011-03-16 another version

Original Bates-Pinheiro text (2000). Mixed-Effects Models in S and S-PLUS (Stanford access). Appendix C has non-linear regression models.

Fitting linear mixed-effects models using lme4,

4. A handbook of statistical analyses using R (second edition). Brian Everitt, Torsten Hothorn CRC Press, Index of book chapters Stanford access Longitudinal chapters: Chap11 Chap12 Chap13. Data sets etc Package 'HSAUR2' August 2014, Title A Handbook of Statistical Analyses Using R (2nd Edition)

There is now a third edition of HSAUR, but full text not yet available in crcnetbase.com. CRAN HSAUR3 page with Vignettes (chapter pieces) and data in reference manual

5. Peter Diggle , Patrick Heagerty, Kung-Yee Liang , Scott Zeger. Analysis of Longitudinal Data 2nd Ed, 2002

Amazon page Peter Diggle home page Book data sets

A Short Course in Longitudinal Data Analysis Peter J Diggle, Nicola Reeve, Michelle Stanton (School of Health and Medicine, Lancaster University), June 2011 earlier version associated exercises: Lab 1 Lab2 Lab3

6. Longitudinal and Panel Data: Analysis and Applications for the Social Sciences by Edward W. Frees (2004). Full book available and book data and programs (mostly SAS).

7. Growth Curve Analysis and Visualization Using R. Daniel Mirman Chapman and Hall/CRC 2014 Print ISBN: 978-1-4665-8432-7 Stanford Access Mirman web page (including data links).

8.

9. Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer Series in Statistics. New-York: Springer. Extended presentation: Introduction to Longitudinal Data Analysis A shorter exposition: Methods for Analyzing Continuous, Discrete, and Incomplete Longitudinal Data

10. Survival analysis Rupert G. Miller. Available as Stanford Tech Report

11. Event History Analysis with R (Stanford access). Goran Brostrom CRC Press 2012. R-package

12. John D. Kalbfleisch , Ross L. Prentice The Statistical Analysis of Failure Time Data 2nd Ed

Amazon page online from Wiley

13. Klein J, Moeschberger M (2003). Survival Analysis, 2nd edition. New York: Springer.

14. Therneau TM, Grambsch PM (2000). Modeling Survival Data: Extending the Cox Model. New York: Springer.

15. Advanced survival analysis topics.

Interval-Censored Time-to-Event Data Methods and Applications Chapman and Hall/CRC 2012 (esp Chap 14--glrt).

Recurrent Events: Chapter 9 of Kalbfleisch and Prentice (2nd edition), "Modeling and Analysis of Recurrent Event Data".

Cook, R. J. and Lawless, J. F. (2007). The Statistical Analysis of Recurrent Events. (Stanford access) Springer, New. York.

Joint Models for Longitudinal and Time-to-Event Data. With Applications in R. Dimitris Rizopoulos. Chapman and Hall/CRC 2012(Stanford access) Book website

Additional Specialized Resources

Harvey Goldstein. The Design and Analysis of Longitudinal Studies: Their Role in the Measurment of Change (1979). Elsevier

Amazon page Goldstein Chap 6 Repeated measures data Multilevel Statistical Models by Harvey Goldstein with data sets

David Roxbee Cox, Peter A. W. Lewis The statistical analysis of series of events. Chapman and Hall, 1966

Google books Poisson process computing program

David J Bartholomew. Stochastic Models for Social Processes, Chichester 3rd edition: John Wiley and Sons.

David J Bartholomew web page

Stat222/Ed351A is listed as Letter or Credit/No Credit grading for 2-units

For Autumn 2020 grading for the 2-units will be based on a 'take home'(i.e. do at home) Problem Set.

Each week are posted a few exercises for that week's content--towards the end of the qtr I will identify a subset of those exercises to be turned in.

Those selected problems will constitute the graded Problem Set.

Also as you will see, for each week's content a number of Review Questions with Solutions are posted.

Course Problem Set 2020 partial posting Oct 30; complete posting Nov 10

Class presentation will be in, and students are encouraged to use, R (occasionally, some references to SAS and Mathematica).

Current version of R is R version 4.0.2 (Taking Off Again) released 2020-06-22.

For references and software: The R Project for Statistical Computing Closest download mirrors in the past, UCLA and Berkeley, seem no longer avaliable, pick your fave anywhere in the world.

The CRAN Task View: Statistics for the Social Sciences provides an overview of some relevant R packages. Also the new CRAN Task View: Psychometric Models and Methods and CRAN Task View: Survival Analysis and CRAN Task View: Computational Econometrics.

A good R-primer on various applications (repeated measures and lots else). Notes on the use of R for psychology experiments and questionnaires Jonathan Baron, Yuelin Li. Another version

A Stat209 text, Data analysis and graphics using R (2007) J. Maindonald and J. Braun, Cambridge 2nd edition 2007. 3rd edition 2010 has available a short version in CRAN .

According to Peter Diggle: "The best resource for R that I have found is Karl Broman's Introduction to R page."

A. Longitudinal research overview

B. Examples, illustrations for longitudinal research overview, taken from course resources above:

Laird,Ware (#1) slides 1-16; Diggle (#5) slides 4-14, 22-28 Verbeke (#9) slides from Ch 2 and Sec3.3

C. Data Analysis Examples of Model Fitting for Individual Trajectories and Histories.

ascii version of class handout annotated version pdf version with plots datasets

Starting up R-addendum: installing packages and obtaining data (sleepstudy in lme4)

For Count Data (glm) example. Link functions for generalized linear mixed models (GLMMs), Bates slides (pdf pages 11-18)

AIDS in Belgium example, (from Simon Wood) single trajectory, count data using glm. Rogosa R session for aids data

aditional expositions of AIDS data, Poisson regression: Duke Kentucky

A

Non-linear models, esp logistic. From week 1, also week 3 Self-Starting Logistic model SSlogis help page, do

Trend in Proportions: College fund raising example prop.trend.test help page

Trend in proportions, group growth, Cochran-Armitage test. Expository paper: G. Salanti and K. Ulm (2003): Tests for Trend in Binary Response (SU access)

1. For the straight-line (constant rate of change) fit example to subj 372 in the sleepstudy data. Obtain a confidence interval for the rate of change from the OLS fit. Now compare the OLS fit with day-to-day differences. Under the constant rate of change model these 9 day to day differences also estimate the rate of change. Obtain a estimate of the mean and a confidence interval for rate of change from these first differences. Compare with OLS results.

For reference, Self-Starting Logistic model SSlogis help page, do

North Carolina, female math performance (also in Rogosa-Saner) North Carolina data (wide format); NC data (long)

For that female, what is the rate of improvement over grades 1 through 8? Compare the observed improvement for grades 1 through 8 (the

Seperately, consider three observations at taken at equally spaced time intervals: What is a simple expression for the OLS slope (rate of change)?

Music to accompany long-distance truck driver data: Flying Burrito Brothers "Six Days on the Road"

Source Publication: Belenky, G., Wesensten, N. J., Thorne, D. R., Thomas, M. L., Sing, H. C., Redmond, D. P., Russo, M., & Balkin, T. (2003). Patterns of performance degradation and restoration during sleep restriction and subsequent recovery: A sleep dose-response study. Journal of Sleep Research, 12(1), 1-12.

Sleepstudy data analysis from Doug Bates lme4 book lme4: Mixed-effects modeling with R February 17, 2010 (draft chapters) Chapter 4: Sleepstudy example or a set of Bates slides for the sleepstudy example

Why lmer (lme4) does not provide p-values for fixed effects : Doug Bates lmer, p-values and all that

A number of add-on packages seek to provide lmer p-values.(We show afex package in Review Question 1 with 2020 update).

Individual plots (frame-by-frame) Plot of straight-line fits Initial descriptive analyses (SFYS)

Sleepstudy, Bates Ch 4, lme4 analyses, ascii Sleepstudy class handout, pdf scan

more Doug Bates Slides (pdf pages 8-28)

Individual effects: fixed and random (and BLUPs). sleepstudy Rsession session plots

North Carolina data (wide format); NC data (long) plots for NC data

Data formatting:

The UCLA archive has a tutorial using built-in

North Carolina example. wide-form descriptives, background, plots Initial SFYS analyses of NC data, ascii

Model Comparisons for North Carolina, female math performance ascii version NC class handout, pdf scan

model ncCon2 without redundent model term NC bootstrap results (SAS)

North Carolina example (week 2 handouts), plotting residuals , ascii session.

Fitting linear mixed-effects models using lme4, Journal of Statistical Software Douglas Bates Martin Machler Ben Bolker also Rnews_2005 pp.27-30

Bates book, Chapter 5, Computational Methods. Bates talk slides: Mixed models in R using the lme4 package Part 4: Theory of linear mixed models

Plots and diagnostics: Package

Non-Gaussian modelling. Hierarchical Generalized Linear Models, Package

Extensions of lme4 modeling: Package

Package

Package

North Carolina Data also in (with full development of the modelling) Longitudinal Data Analysis Examples with Random Coefficient Models. David Rogosa; Hilary Saner . Journal of Educational and Behavioral Statistics, Vol. 20, No. 2, Special Issue: Hierarchical Linear Models: Problems and Prospects. (Summer, 1995), pp. 149-170. Jstor

Douglas Bates class resource item #3, Texts and Resources. Other Doug Bates materials: Three packages, "SASmixed", "mlmRev" and "MEMSS" with examples and data sets for mixed effect models

North Carolina Data also in (with full development of the modelling) Longitudinal Data Analysis Examples with Random Coefficient Models. David Rogosa; Hilary Saner . Journal of Educational and Behavioral Statistics, Vol. 20, No. 2, Special Issue: Hierarchical Linear Models: Problems and Prospects. (Summer, 1995), pp. 149-170. Jstor Data sets for Rogosa-Saner

Additional talk materials: An Assortment of Longitudinal Data Analysis Examples and Problems 1/97, Stanford biostat. Overview and Implementation for Basic Longitudinal Data Analysis CRESST Sept '97. Another version (short) of the expository material is from the Timepath '97 (old SAS progranms) site: Growth Curve models ; Data Analysis and Parameter Estimation ; Derived quantities for properties of collections of growth curves and bootstrap inference procedures

I start by fitting the lmer model for the collection of growth curves:

Then try out

Then look at the

a. identify the fastest and slowest growth among the 277 females. Compare medians of growth rates for females with verbal ability (Z) at or above 106 with that for females with verbal ability below 106. Show side-by-side boxplots.

b. In the class handout version of the NC analyses (and other postings, but not all) the first thing to do was make the 'time' variable have intitial value = 0 (making the intercept of a straight line fit correspond to level at initial time): i.e. 1 to 8 becomes 0 to 7. Obtain lmList results and fit the ncUnc lmer model (straight-line growth, no Z) using time 1 to 8. Comment on differences of these analyses with those using timeInt in the class handout. In particular, look at the correlation of change and initial status. The correlation between observed change and observed initial status using timeInt was .279 from lmer (

North Carolina data has 277 subjects, a frame-by-frame display of individuals requires subsampling. Construct a plot for 24 (arbitrary) individuals data trajectories.

A subsample of data from the National Youth Survey is obtained in long-form by

and in wide form by

Yearly observations from ages 11 to 15 on the tolerance measure (tolerance to deviant behavior e.g. cheat, drug, steal, beat; larger values indicates more tolerance on a 1to4 scale). Also in this data set are gender (is_male) and an

i. obtain individual OLS fits (tolerance over time) and plot the collection of those straight-lines. Provide descriptive statistic summaries for the rate of change in tolerance and initial level.

ii. fit a mixed effects model for tolerance over time (unconditional) for this collection of individuals. Obtain interval estimates for the fixed and random effects. Show that the fixed effects estimates correspond to quantities obtained in part i. Explain.

iii. Investigate whether the

Consider the sleepstudy and Ramus examples, collections of growth trajectories with no exogenous variable. Ramus Data example. Example consists of 4 longitudinal observations on each of 20 cases. The measurement is the height of the mandibular ramus bone (in mm) for boys each measured at 8, 8.5, 9, 9.5 years of age. These data, which have been used by a number of authors (e.g., Elston and Grizzle 1962), can be found in Table 4.1 of Goldstein (1979). Ramus data example long form for Ramus data tutorial on creating long form data manipulation .

Fitting the lmer models with

Data on early childhood cognitive development described in Doug Bates talk materials (pdf pages 49-52). Obtain these data from the R-package "mlmRev" or the Willett-Singer book site (in our week 1 intro links). Data are in long form and consist of 3 observations 58 treatment and 45 control children; see the Early entry in the mlmRev package docs. Produce the plot of individual trajectories shown pdf p.49, Bates talk. (note:Bates does connect-the-dots, we have done straight-line fit, your choice). Show five-number summaries of rates of impovement in cognitive scores for treatment and control groups. Develop and fit the

Artificial data example from Review Question 3 (used in Myths chapter to illustrate time-1,time-2 data analysis) Start out with the "X" data, and standardize (i.e. transform to mean 0, var 1) at each of the 3 time points. Note "scale" will do this for you (in wide form). For the standardized data obtain the plot showing each subject's data and straight-line fit. What do you have here? Compare the results the mixed-effects models fitting the collection of straight-line growth curves for the measured and standardized data.

lm and gee alternatives (ignore individual growth). ascii session

also resource item #3, Douglas M. Bates lme4: Mixed-effects modeling with R, section 1.4

note: you may need to install HSAUR3 package (instead of 2) for the

Logistic Curve Example: Orange Tree growth. Data from MEMSS package Data sets and sample analyses from Pinheiro and Bates, Mixed effects Models in S and S-PLUS (Springer, 2000).

Plots and nlmer analysis, Orange tree data

Doug Bates Slides Orange trees analysis (pdf pages 8-16), Logistic SS (pdf p.6), pharmacokinetics ex (pdf pages 7, 17-24) Bates NLMM.Rnw From week 1 SSlogis (Self-Starting Logistic model) links and materials. another analysis of Orange Trees in the ASReml package manual section 8.9

Also LDA book Chapter 5. Chapter 5. Non-linear mixed-effects models Marie Davidian

additional tools in the grofit package and nlmeODE package Title Non-linear mixed-effects modelling in nlme using differential equations

A somewhat common practice (which I'm not that fond of) is to constrain individuals differences in an lmer model--e.g. force all individuals to have the same rate of change. For purposes here use the

The UCLA data archive has a comma delimited file (access by

Measurements on 82 adolescents (initial age 14) included 3 time-ordered observations on alcohol use and two background (exogenous) variables: dichotomous

a. Construct an lmer model with the individual growth curve a quadratic function of grade (year), most convenient to use uncorrelated predictors

b. Investigate (via lmer model) gender differences (isMale) in vocabulary growth. Fit appropriate lmer models and interpret results,

1. Properties of Collections of Growth Curves. class handout

2. Time-1, time-2 data. (paired data)

The R-package PairedData has some interesting plots and statistical summaries for "before and after" data;

here is a McNeil plot for Xi.1, Xi.5 in data example

Paired dichotomous data, McNemar's test (in R, mcnemar.test {stats}), Agresti (2nd ed) sec 10.1

Also see R-package

3. Issues in the Measurement of Change. Class lecture covers Myths 1-6+.

Slides from Myths talk . Class Handout, Companion for Myths talk

4. Examples for Exogenous Variables and Correlates of Change (use of lagged dependent variables)

Time-1,time-2 data analysis examples Measurement of change: time-1,time-2 data

data example for handout scan of regression handout ascii version of data analysis handout

Extra material for Correlates and predictors of change: time-1,time-2 data

Rogosa R-session to replicate handout, demonstrate wide-to-long data set conversion, and descriptive fitting of individual growth curves. Some useful plots from Rogosa R-session

Technical results: Section 3.2.2 esp Equation 27 in Rogosa, D. R., & Willett, J. B. (1985). Understanding correlates of change by modeling individual differences in growth. Psychometrika, 50, 203-228. Talk slides

5. Comparing groups on time-1, time-2 measurements: repeated measures anova vs lmer OR the t-test

Comparative Analyses of Pretest-Posttest Research Designs, Donna R. Brogan; Michael H. Kutner,

urea synthesis, BK data data, long-form

BK plots (by group) BK overview

2017 Analysis handout Extended BK lmer analysis

Additional stuff

BK repeated measures analysis pdf version

Stat141 analysis

archival example analyses. SAS and minitab

Myths Chapter. Rogosa, D. R. (1995). Myths and methods: "Myths about longitudinal research," plus supplemental questions. In The analysis of change, J. M. Gottman, Ed. Hillsdale, New Jersey: Lawrence Erlbaum Associates, 3-65.

Myths Talk. Rogosa, D. R. (1983)

I noticed John Gottman did a pub rewriting the myths: Journal of Consulting and Clinical Psychology 1993, Vol. 61, No. 6,907-910 The Analysis of Change: Issues, Fallacies, and New Ideas

Also John Willett did a rewrite of the Myths 'cuz I didn't want to reprint it again (or write a new version): Questions and Answers in the Measurement of Change REVIEW OF RESEARCH IN EDUCATION 1988 15: 345

Reliability Coefficients: Background info. Short primer on test reliability Informal exposition in

A growth curve approach to the measurement of change. Rogosa, David; Brandt, David; Zimowski, Michele Psychological Bulletin. 1982 Nov Vol 92(3) 726-748 APA record direct link

Rogosa, D. R., & Willett, J. B. (1985). Understanding correlates of change by modeling individual differences in growth. Psychometrika, 50, 203-228.

available from John Willet's pub page

Demonstrating the Reliability of the Difference Score in the Measurement of Change. David R. Rogosa; John B. Willett Journal of Educational Measurement, Vol. 20, No. 4. (Winter, 1983), pp. 335-343. Jstor

Maris, Eric. (1998). Covariance Adjustment Versus Gain Scores--Revisited.

A good R-primer on repeated measures (a lots else). Notes on the use of R for psychology experiments and questionnaires Jonathan Baron, Yuelin Li. Another version

Multilevel package has behavioral scienes applications including estimates of within-group agreement, and routines using random group resampling (RGR) to detect group effects.

Application publications, time-1, time-2 Experimental Group Comparisons:

a. Mere Visual Perception of Other People's Disease Symptoms Facilitates a More Aggressive Immune Response

b. Guns and testosterone. Guns Up Testosterone, Male Aggression

Guns, Testosterone, and Aggression: An Experimental Test of a Mediational Hypothesis Klinesmith, Jennifer; Kasser, Tim; McAndrew, Francis T,

Repeat the handout demonstration regressions using the fallible measures (the X's) from the bottom half of the linked data page. The X's are simply error-in-variable versions of the Xi's: X = Xi + error, with error having mean 0 and variance 10. Compare 5-number summaries for the amount of change from the earliest time "1" to the final observation "5" using the "Xi" measurements (upper frame) and the fallible "X" observations (lower frame).

Consider a population with true change between time1 and time2 distributed Uniform [99,101] and measurement error Uniform [-1, 1]. If you used discrete Uniform in this construction then you could say measurement of change is accurate to 1 part in a hundred.

Calculate the reliability of the difference score.

Also try error Uniform [-2,2], accuracy one part in 50.

A similar demonstration can be found in my

a. Demonstrate the Brogan-Kutner Section 5 equivalences (from paper, shown in class) for repeated measures anova and/or BK lmer analyses.

b. Is amount of gain/decline related to initial status? For the 8 new procedure patients and for the 13 old procedure patients, seperately, estimate the correlation between change and initial status and obtain a confidence interval if possible.

c. Analysis of Covariance. For the Brogan-Kutner data carry out an analysis of covariance (using premeasure as covariate) for the relative effectiveness of the surgery methods. Compare with class analyses.

Slides 203-204 in the Laird-Ware text materials purport to demonstrate that analysis of covariance produces a more precise treatment effect estimate than difference scores (repeated measures anova). What

Use lmer (or lme) to determine the comparative efficacy of the surgical methods on liver function. Investigate whether a model allowing for pretest differences is helpful.

The file captopril.dat contains the data shown in Section 2.2 of Verbeke, Introduction to Longitudinal Data Analysis, slides. Captopril is an angiotensin-converting enzyme inhibitor (ACE inhibitor) used for the treatment of hypertension.

a. Smart First Year Student analyses. Use the before and after Spb measurements to examine the improvement (i.e. decrease) in blood pressure. Obtain a five-number summary for observed improvement. What is the correlation between change and initial blood pressure measurement? Obtain a confidence interval for the correlation and show the corresponding scatterplot. What special challenges are present in this analysis?

b. lmer analyses. Try to obtain a good confidence interval for the amount of decline. Obtain a point and interval estimate for the correlation beween initial status and change in Spb.

In the "HistData" or "psych" packages reside the "galton" dataset, the primordial regression toward mean example.

Description: Galton (1886) presented these data in a table, showing a cross-tabulation of 928 adult children born to 205 fathers and mothers, by their height and their mid-parent's height. A data frame with 928 observations on the following 2 variables. parent Mid Parent heights (in inches) child Child Height. Details: Female heights were adjusted by 1.08 to compensate for sex differences. (This was done in the original data set)

Consider "parent" as time1 data and "child" as time2 data and investigate whether these data indicate

Aside: if you like odd plots, look at the

Let's use again the 40 subjects in the Review Question 1 "X" data.

a. Measured data. Take the time1 and time5 observations and obtain a 95% Confidence Interval for the amount of change. Compare the width of that interval with a confidence interval for the difference beween the time5 and time1 means if we were told a different group of 40 subjects was measured at each of the time points (data no longer paired).

b. Dichotomous data. Instead look at these data with the criterion that a score of 50 or above is a "PASS" and below that is "FAIL". Carry out McNemar's test for the paired dichotomous data, and obtain a 95% CI for the difference between dependent proportions. Compare that confidence interval with the "unpaired" version (different group of 40 subjects was measured at each of the time points) for independent proportions.

Data in wide form:

Investigate the effectiveness of Beat the Blues from these 2-wave data. Follow the various descriptive and modelling strategies shown in the BK class example.

The 3 billion dollar (and counting) change score

(items below clipped from 2017 various press reports; we do not have the data)

Sage Therapeutics (NASDAQ:SAGE) surged in response to its announcement of positive results from a Phase 2 clinical trial assessing SAGE-217 for the treatment of adult patients with moderate-to-severe major depressive disorder (MDD), a Fast Track indication in the U.S.[2020 note: name, Zuranolone]

It is estimated that there are around 16 million people in the United States with MDD.

SAGE-217, a neuroactive steroid, is next-generation GABA modulator. The GABA system, the major inhibitory signaling pathway in the brain and central nervous system (CNS), plays a key role in regulating CNS function. The company intends to advance the program into Phase 3 development.

The phase 2 looked at the effect of the positive allosteric modulator of the gamma-aminobutyric acid (GABA) receptor as compared to placebo in 89 patients with MDD.

About the Placebo-controlled Phase 2 trial of SAGE-217 in MDD:

In the randomized, double-blind, parallel-group, placebo-controlled trial, 89 eligible patients (with a minimum total score of 22 on the Hamilton Rating Scale for Depression) were stratified based on use of antidepressant treatment (current/stable or not treated/withdrawn >= 30 days) and randomized in a 1:1 ratio to receive SAGE-217 Capsules (30mg) (n=45) or matching placebo (n=44). All doses of study drug were administered at night with food. The study consisted of a 14-day treatment period, and a 4-week follow-up period. The mean HAM-D total scores at baseline were 25.2 for the SAGE-217 group and 25.7 for the placebo group (overall range 22-33), representing patients with moderate to severe MDD. Approximately 90 percent of patients in each group completed the study.

Sage Therapeutics (NASDAQ: SAGE), a clinical-stage biopharmaceutical company developing novel medicines to treat life-altering central nervous system (CNS) disorders, today announced positive top-line results from the Phase 2, double-blind, placebo-controlled clinical trial of SAGE-217 in the treatment of 89 adult patients with moderate to severe major depressive disorder (MDD). In the trial, treatment for 14 days with SAGE-217 was associated with a statistically significant mean reduction in the Hamilton Rating Scale for Depression (HAM-D) 17-Item total score from baseline to Day 15 (the time of the primary endpoint) of 17.6 points for SAGE-217, compared to 10.7 for placebo (p<0.0001). Statistically significant improvements were observed in the HAM-D compared to placebo by the morning following the first dose through Week 4 and the effects of SAGE-217 remained numerically greater than placebo through the end of follow-up at Week 6. At Day 15, 64 percent of patients who received SAGE-217 achieved remission, defined as a score of 7 or less on the HAM-D scale, compared with 23 percent of patients who received placebo (p=0.0005).

The 89-subject study met its primary endpoint of a statistically significant average reduction in HAM-D score from baseline to day 15 (p<0.0001) versus placebo. HAM-D is a rating scale for depression. At day 15, 64% of patients in the treatment group achieved remission compared to 23% for placebo (p=0.0005).

There were a total of 89 patients recruited into the study who were either given SAGE-217 or a placebo compound. Patients were treated for a 14 day period and were then measured for clinical outcome using the Hamilton Rating Scale for Depression or HAM-D 17-item total score from baseline. It was shown that SAGE-217 achieved a statistically significant improvement over placebo according to the HAM-D scale. Patients that took SAGE-217 were shown to achieve a 17.6 point improvement at day 15, compared to only a 10.7 point improvement for placebo. That meant that the drug achieved a statistically significant p-value of p < 0.0001. It was also noted that 64% of patients who took SAGE-217 achieved MDD remission, compared to only 23% of placebo patients. MDD remission was classified as patients having a HAM-D score of 7 or less. This was the secondary endpoint of the study, which was also achieved.

Investigators saw a statistically significant improvement in SAGE-217 patients on a depression scale the day after the first dose. By the time the two-week treatment period came to an end, the mean score in the SAGE-217 arm had dropped 17.6 points, as compared to a 10.7 point decline in the control group. That seven-point placebo-adjusted improvement was enough for the trial to hit its primary endpoint with a p value of less than 0.0001.

The positive results continued beyond the end of the treatment period. The mean reduction on the depression scale in the treatment arm remained statistically superior to that of the placebo group two weeks after dosing stopped.

--------------------------------------------------------

Consider the remission outcome (secondary) at day 15 (after 14 days of dose).

part a. For these time1-time2 dichotomous data (remission or not), explain what I did below to approximate the results reported by SAGE.

part b. In week 4 (time1-time2 data) materials we introduced some more advanced capabilities for time1-time2 dichotomous data, such as mcnemar.test from base R and diffpropci.mp from package PropCIs. Comment on the applicability of those functions to the remission study and whether those are preferable here to the basic analysis in part a.

--------------- > sage2 = matrix(c(29, 10, 16, 34), nr=2) # remission counts for the two groups > sage2 [,1] [,2] [1,] 29 16 [2,] 10 34 > prop.test(sage2) 2-sample test for equality of proportions with continuity correction data: sage2 X-squared = 14.078, df = 1, p-value = 0.0001754 alternative hypothesis: two.sided 95 percent confidence interval: 0.2079003 0.6264431 sample estimates: prop 1 prop 2 0.6444444 0.2272727 > chisq.test(sage2) Pearson's Chi-squared test with Yates' continuity correction data: sage2 X-squared = 14.078, df = 1, p-value = 0.0001754-----------------------------------------------------------

part c. Consider the primary outcome, change in depression score (HAM-D).

In weeks 4 and 5 we conducted analysis of time1- time2 (and multiwave) outcome data for comparisons of experimental groups. For the SAGE study pretend we have long form data, with time coded 0 for baseline and 1 for Day 15 endpoint, and outcome HAM-D score at the timepoints (0,1) and group indicating T/P. So we have 178 rows, and columns HAM-D group time subj.

If we fit the model in R syntax

from the information you have, give the point estimate for the fixed effects, time and time:group .

Write out the level 1, level 2 model corresponding to the combined model in the lmer statement.

From the

When Adolescents Give Up Pot, Their Cognition Quickly Improves

From 2017. Stents? A Controversial Experiment Upends The Conventional Wisdom On Heart Stents Publication: Percutaneous coronary intervention in stable angina (ORBITA): a double-blind, randomised controlled trial The Lancet.

1. Does nutrition science know anything? Is white or whole wheat bread 'healthier?' Depends on the person Publication: Bread Affects Clinical Parameters and Induces Gut Microbiome-Associated Personal Glycemic Responses Cell Metabolism, Korem et al DOI: 10.1016/j.cmet.2017.05.002

2. This time with 3 conditions For Exercise, Nothing Like the Great Outdoors Publication: Niedermeier M, Einwanger J, Hartl A, Kopp M (2017) Affective responses in mountain hiking-- randomized crossover trial focusing on differences between indoor and outdoor activity. PLoS ONE 12(5): e0177719. https://doi.org/10.1371/journal.pone.0177719

3. One thing at a time. Why listening to a podcast while running could harm performance Publication: A trade-off between cognitive and physical performance, with relative preservation of brain function Scientific Reports 7, Article number: 13709 (2017) nature.com.

4. Another crossover design (from Stat266). RCT (cross-over design). Damn right! The secret of success is swearing: How shouting four letter words can help make you stronger Swearing can help you boost your physical performance The full power of swearing is starting to be discovered. Publication: Stephens, R, Spierer, DK and Katehis, E (2018) Effect of swearing on strength and power performance. Psychology of Sport and Exercise, 35. pp. 111-117. ISSN 1469-0292 .

R-resources for crossover designs. package

also see slides 5-14 Repeated Measures Design Mark Conaway

Link functions for generalized linear mixed models (GLMMs), Bates slides (pdf pages 11-18)

Note:

A Handbook of Statistical Analyses Using R, Second Edition Torsten Hothorn and Brian S . Everitt Chapman and Hall/CRC 2009. Analysing Longitudinal Data II -- Generalised Estimation Equations and Linear Mixed Effect Models: Treating Respiratory Illness and Epileptic Seizures (Stanford access)

Data sets etc Package 'HSAUR2' August 2014, Title A Handbook of Statistical Analyses Using R (2nd Edition)

Note:

Recap Group Comparisons, Epilepsy example. Comparison of lmer models

For SAS (and GEE) fans another analysis

R-package

R-package: powerlmm

Background pubs: Power for linear models of longitudinal data with applications to Alzheimer's Disease Phase II study design Michael C. Donohue, Steven D. Edland, Anthony C. Gamst

Sample Size Planning for Longitudinal Models: Accuracy in Parameter Estimation for Polynomial Change Parameters Ken Kelley Notre Dame Joseph R. Rausch

basic R analogues,

Nontechnical overviews:

Phil Lavori et al. Psychiatric Annals, Volume 38, Issue 12, December 2008 Missing Data in Longitudinal Clinical Trials, Part A Part B

Robin Henderson, Missing Data in Longitudinal Studies pdf pages 89-93

Missing data wide-form imputation:

New package:

Vignette from

Rogosa R-session for vignette

Technical review: Missing data methods in longitudinal studies: a review Joseph G. Ibrahimcorresponding author and Geert Molenberghs

More on Missing data and imputation, including

Multiple Imputation. van Buuren S and Groothuis-Oudshoorn K (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. see also multiple imputation online Flexible Imputation of Missing Data. Stef van Buuren Chapman and Hall/CRC 2012. Chapter 9, Longitudinal Data Sec 3.8 Multilevel data. He is the originator of

CHAPTER 17 Incomplete data: Introduction and overview.

Handling drop-out in longitudinal studies (pages 1455-1497) Joseph W. Hogan, Jason Roy and Christina Korkontzelou, Statistics in Medicine 15 May 2004 Volume 23, Issue 9. (SAS implementations)

Bayesian approach. Missing Data in Longitudinal Studies. Strategies for Bayesian Modeling and Sensitivity Analysis Joseph W . Hogan and Michael J . Daniels Chapman and Hall/CRC 2008 Ch 5 Missing Data Mechanisms and Longitudinal Data Corresponding talk, A Brief Tour of Missing Data in Longitudinal Studies Mike Daniels

Overview and applications paper: Assessing missing data assumptions in longitudinal studies: an example using a smoking cessation trial Xiaowei Yanga, Steven Shoptawb. Drug and Alcohol Dependence Volume 77, Issue 3, 7 March 2005, Pages 213-225

R resources. Multivariate Analysis Task View,

a. Longpower package (vignette). Reconstruct the sample size calculation for the Alzheimer's disease trial (7 waves) on p.4 of the vignette.

b. MBESS package. Recreate the sample size calculation for width of confidence interval for differential growth using

a. try to do lmList on these data to get odds(good) for each of the each 111 subjects. Investigate effectiveness of treatment.

b Use lmer analyses to compare treament and placebo. Obtain a confidence interval for effectiveness of treament. Investigate gender differences in response to the intervention (i.e. the treatment)

c. Extend the lmer model in part b by adding the age and baseline measurements to the level 2 model. Compare with part b results.

To supplement the longitudinal texts (HSAUR, ALA etc) full model for the epilepsy data, lets try to build up the analysis from basic description comparing placebo vs drug up through some basic some basic glmer models.

A somewhat similar effort was made in the second class posting "Recap group comparisons (epcomp)" linked above. In this exercise treat period as a time measurement (1,2,3,4) rather than an ordered factor.

How many subjects in placebo and drug groups? Use lmList to obtain slopes and intercepts for fits of time trends to seizures for each subject and compare drug and placebo groups.

Fit and compare glmer models with treatment as the only level 2 predictor (for intercept) without and with a time trend. Compare.

Add the baseline to the glmer models above (in level 2 model for intercept; is effect of the drug significant (use confint)? Does adding age help this model?

Revisit and extend analysis of ep3 model in RQ3 and lecture.

Obtain confidence interval for effect of drug, look at residaul plots and identify any anomolous individuals, try out

These data are from a study of reading from Purdue. We use the data to compare two methods: Basal, traditional method of teaching; DRTA, an innovative method; coded 1 and 2 respectively in the data. Random assignment placed twenty-two students in each group; reading test measures were obtained pre and post instruction.

The Directed Reading Thinking Activity (DRTA) is a strategy that guides students in asking questions about a text, making predictions, and then reading to confirm or refute their predictions. The DRTA process encourages students to be active and thoughtful readers, enhancing their comprehension.

Use descriptive and inferential statistical methods to assess the relative efficacy DRTA method.

Start out by just using the subset of the longitudinal data Lead Level Week 0 and Week 6. Carry out the repeated measures anova for the relative effectiveness of chelation treatment with succimer or placebo (A,P). Show the three equivalences in the Brogan-Kutner paper between the repeated measures anova results and simple t-tests for these data. Next compare with a lmer fit following the B-K class example (posted). Finally use all 4 longitudinal measures (weeks 0,1,4,6) for a Active vs Placebo comparison using lmer. Compare with the results that use only 2 observations.

Data set is available at http://www.hsph.harvard.edu/fitzmaur/ala/ecg.txt (needs to be cut-and-paste into editor). Carry out the basic analysis of variance for this crossover design following week 5 Lecture topic 2. You may want to use glm to take into account the binary outcome. Does the treatment increase the probability of abnormal ECG? Give a point estimate and significance test for the treatment effect.

Data in long form and a wide-form version

Description: The data are from a longitudinal clinical trial of contracepting women.In this trial women received an injection of either 100 mg or 150 mg of depot-medroxyprogesterone acetate (DMPA) on the day of randomization and three additional injections at 90-day intervals. There was a final follow-up visit 90 days after the fourth injection, i.e., one year after the first injection.

Throughout the study each woman completed a menstrual diary that recorded any vaginal bleeding pattern disturbances. The diary data were used to determine whether a women experienced amenorrhea, the absence of menstrual bleeding for a specified number of days. A total of 1151 women completed the menstrual diaries and the diary data were used to generate a binary sequence for each woman according to whether or not she had experienced amenorrhea in the four successive three month intervals.

In clinical trials of modern hormonal contraceptives, pregnancy is exceedingly rare (and would be regarded as a failure of the contraceptive method), and is not the main outcome of interest in this study. Instead, the outcome of interest is a binary response indicating whether a woman experienced amenorrhea in the four successive three month intervals. A feature of this clinical trial is that there was substantial dropout. More than one third of the women dropped out before the completion of the trial. In the linked data, missing data are designated by "." [note: in the week 6 terminology consider the dropouts to be

The purpose of this analysis is to assess the influence of dosage on the risk of amenorrhea and any individual differences in the risk of amenorrhea.

Show your model for these data and the results. Provide significance tests and/or interval estimates for the odds of amenorrhea as a function of dose. Display and interpret individual differences in response by showing the random effects within each experimental group.

Artificial data example from week 2 RQ3 and Week 4 Lecture item 4 (used in Myths examples to illustrate time-1,time-2 data analysis) Two part artificial data example. The top frame (the Xi's) is 40 subjects each with three equally spaced time observations (here in wide form). For these these perfectly measured "Xi" measurements each subject's observation fall on a straight-line.

a. Use data set W6prob1a , for which about 15% of the observations have been made missing. Use these data (with lm) to recreate the multiple regression demonstration in Week 4 lecture, part 4: "Correlates and predictors of change: time-1,time-2 data" . Compare with the results for the full data on 40 subjects. What does

b. Repeat part a with data set W6prob1b. Can you find any reason to doubt a "missing at random" assumption for this data set?

Note: if we don't get to it in Week 5, then in Week 10 (DW) we will demonstrate multiple imputation procedures (

Technical resource: Weisberg, H. I. Statistical adjustments and uncontrolled studies. Psychological Bulletin, 1979, 86, 1149-1164. class handout

Publications: Lord, F. M. (1967). A paradox in the interpretation of group comparisons.

Wainer, H. (1991). Adjusting for differential base rates: Lord's Paradox again. Psychological Bulletin, 109, 147-151.

A very popular subject these days. Pretty good Wiki page LSE slides

Austin Nichols slides. Causal inference with observational data A brief review of quasi-experimental methods July 2009

Angrist Ch 5, MHE. Card and Krueger (1994) data, minumimum wage ex

paper On the Use of Linear Fixed Effects Regression Models for Causal Inference (sec 3.2)

R-package

Intros: Interrupted Time Series Quasi-Experiments Gene V Glass Arizona State University.

Assessing impact of stewardship: the why, when, and how of interrupted time series

Time Series Analysis with R section 4.6 December 2012 Handbook of Statistics 30(1):661-712

ITSpages for closing time example Class example: Closing time (glm kludge)

Rogosa R-session

Original publication (ozone data):

Box, G. E. P. and G. C. Tiao. 1975. Intervention Analysis with Applications to Economic and Environmental Problems." Journal of the American Statistical Association. 70:70-79. SAS example for ozone data

Applications:

Did fertility go up after the Oklahoma City bombing? An analysis of births in metropolitan counties in Oklahoma, 1990-1999. Demography, 2005.

Box-tiao time series models for impact assessment Evaluation Quarterly 1979

Interrupted time-series analysis and its application to behavioral data Donald P. Hartmann, John M. Gottman, Richard R. Jones, William Gardner, Alan E. Kazdin, and Russell S. Vaught J Appl Behav Anal. 1980 Winter; 13(4): 543-559.

Segmented regression analysis of interrupted time series studies in medication use research. By: Wagner, A. K.; Soumerai, S. B.; Zhang, F.; Ross-Degnan, D.. Journal of Clinical Pharmacy & Therapeutics, Aug2002, Vol. 27 Issue 4, p299-309,

R-packages:

New resource,

Propensity Scores for Repeated Treatments A Tutorial for the iptw Function in the TWANG Package

Using cobalt with Longitudinal Treatments

J.R. Lockwood, Harold Doran, and Daniel F. McCaffrey. Using R for estimating longitudinal student achievement models. R News, 3(3):17-23, December 2003.

Value-added does New York City. New York schools release 'value added' teacher rankings

from the unions: THIS IS NO WAY TO RATE A TEACHER

Value-Added Models to Evaluate Teachers: A Cry For Help H Wainer, Chance, 2011.

American Statistical Association Statement on Using Value-Added Models for Educational Assessment

Robinson, K., Schmidt, T. and Teti, D. M. (2008) Issues in the Use of Longitudinal and Cross-Sectional Designs, in Handbook of Research Methods in Developmental Science (ed D. M. Teti), Blackwell Publishing Ltd, Oxford, UK

vignette for Panel Data Econometrics in R: The plm Package Yves Croissant Giovanni Millo (esp. section 7. "plm versus nlme/lme4" ).

R-package

Create a version of the its 'closing time' example presented in class (example linked above) with the 50 months before intervention having mean fatality = 1 and after intervention mean fatality = 2.

Carry out the glm approximation to the time series analysis.

Part 1. Lord's paradox example

a. construct a two-group pre-post example with 20 observations in each group that mimics the description in Lord (1967):

statistician 1 (difference scores) obtains 0 group effect

statistician 2 (analysis of covariance) obtains large group effect for the group higher on the pre-existing differences in pretest

b. construct second example for which

statistician 1 (difference scores) obtains large group effect

statistician 2 (analysis of covariance) obtains 0 group effect

c. construct a third example (if possible) for which

statistician 1 (difference scores) obtains large postive group effect

statistician 2 (analysis of covariance) obtains large negative group effect

Part 2. Group Comparisons by repeated measures analysis of variance or lmer

For the examples in part 1, (a and c), carry out the group comparison (i.e. is there differential change?) for the artificial data using a repeated measures anova (one within, one between factor) or lmer equivalent.

Demonstrate the equivalence from Brogan-Kutner paper that testing the groupXtime interaction term is equivalent to a t-test between groups on individual improvement (i.e. a statistician 1 analysis).

Solution to Question 2 RQ2 solution

The display from lecture of the regression adjustments also has a numerical example (page 2 of pdf). Recreate the results shown for the Anderson et al Head Start example. Also for lecture materials, Regression Adjustments with Non-equivalent groups Week 6, show the Belson adjustment procedure (using control group slope) is equivalent to evaluating the vertical distance between the within-group regression fits at the mean of the treatment group. written out proof.

We reuse some time-1, time-2 observational data generated to illustrate Lord's paradox (RQ2) -- gender differences in weight gain. (The 'paradox' is solved by Holland, Wainer, Rubin using potential outcomes.) The set up for these artificial data is females gain, males no change

corr .7 within gender, equal vars time1 time 2 within gender means M F X (t1) 170 120 Y (t2) 170 130comparison of "gains" 170 - 170 - (130 - 120) = -10 negative effect males (females gain more).

ancova: 170 - 130 - .7*(170 - 120) = 5 positive male effect

So: does being male cause a student to gain weight or lose weight? Illustrate forms of diffs-in-diffs analyses.

wide form for these data long form for these data

'data.frame': 6402 obs. of 15 variables: $ id : int 31 31 31 31 31 31 31 31 36 36 ... $ lnw : num 1.49 1.43 1.47 1.75 1.93 ... $ exper : num 0.015 0.715 1.734 2.773 3.927 ... $ ged : int 1 1 1 1 1 1 1 1 1 1 ... $ postexp : num 0.015 0.715 1.734 2.773 3.927 ... $ black : int 0 0 0 0 0 0 0 0 0 0 ... $ hispanic : int 1 1 1 1 1 1 1 1 0 0 ... $ hgc : int 8 8 8 8 8 8 8 8 9 9 ... $ hgc.9 : int -1 -1 -1 -1 -1 -1 -1 -1 0 0 ... $ uerate : num 3.21 3.21 3.21 3.29 2.9 ... $ ue.7 : num -3.79 -3.79 -3.79 -3.71 -4.11 ... $ ue.centert1 : num 0 0 0 0.08 -0.32 ... $ ue.mean : num 3.21 3.21 3.21 3.21 3.21 ... $ ue.person.cen: num 0 0 0 0.08 -0.32 ... $ ue1 : num 3.21 3.21 3.21 3.21 3.21 ...Variables we will use are

a. How many individuals in this data set? Give a five-number summary of the number of observations per person. How many of the individuals in these data have black = 1?

b. SFYS descriptive analyses. We are interested in wages (measured by lnw) as a function of experience (lnw ~ exper). Show a five-number summary of the gradient (slope; i.e. change in log-wage for unit change in exper) and level (here fit for exper = 0, initial status) for the set of individuals. Then stratify on black = 1 vs black = 0 (combining the white and hispanic high school drop-outs). Also show side-by-side boxplots for gradient and initial level stratifying on

c. Use a formal mixed-effects model analysis to obtain random and fixed effects for the

d. Are there differences in the lnw ~ exper relation for students black = 1 vs black = 0? Show by estimates and confidence intervals from mixed-effects models.

e. Does inclusion of

For this problem consider gender differences in Vocabulary growth. Obtain the means (over persons) and plot the group growth curves, separately by gender. Does there appear to be curvature (i.e. deceleration in vocabulary skill growth) for both males and females? Construct an lmer model with the individual growth curve a quadratic function of grade (year), most convenient to use uncorrelated predictors

The class handout on regression adjustments shown in class and linked in RQ3 above contained summary statistics for the Head Start data considered in Anderson et al (1980)

Try out the various regression adjustments described on the handout for these pretest-posttest data. (Handout shows some approximate estimates). Also show the result for the basic diffs-in-diffs estimator from Week 6.

Lifespan is continuing to increase regardless of socioeconomic factors Publication: Advancing front of old-age human survival Wenyun Zuo, Sha Jiang, Zhen Guo, Marcus W. Feldman, and Shripad Tuljapurkar PNAS October 30, 2018 115 (44) 11209-11214;

John Fox tutorial: Cox Proportional-Hazards Regression for Survival Data

Survival analysis text by Rupert G. Miller (Ch 2,3,4,6). Available as Stanford Tech Report

Cox Regression exposition from NCSU. handout source Ch.6 st745

A concise, thorough Q&A overview from Columbia: Time-To-Event Data Analysis

CHAPTER 11 (9) Survival Analysis: Glioma Treatment and Breast Cancer Survival A handbook of statistical analyses using R (second edition). Brian Everitt, Torsten Hothorn CRC Press, Complete version (through Stanford access)

An Introduction to Survival Analysis Mark Stevenson EpiCentre, IVABS, Massey University. Author R-package

From Lausanne a very nice survival analysis intro plus analyses of Rossi recidivism Data

An R overview from openintro.org

Quick overview Survival analysis in clinical trials: Basics and must know areas Perspect Clin Res. 2011 Oct-Dec; 2(4): 145-148. Ritesh Singh and Keshab Mukhopadhyay

CHAPTER 11 Survival Analysis: Retention of Heroin Addicts in Methadone Maintenance Treatment. Handbook of Statistical Analyses Using Stata, Second Edition. (Stanford access) Sophia Rabe-Hesketh Chapman and Hall/CRC 2000.

Event History Analysis with R. (Stanford access) Goran Brostrom CRC Press 2012. R-package

Slides on renewal processes and hazard functions

Set of Slides An introduction to survival analysis, Geert Verbeke

Main R-package: survival; Terry Therneau, Stanford Stat Ph.D

CRAN Task View: Survival Analysis . Survival analysis, also called event history analysis in social science, or reliability analysis in engineering, deals with time until occurrence of an event of interest. However, this failure time may not be observed within the relevant time period, producing so-called censored observations. This task view aims at presenting the useful R packages for the analysis of time to event data.

KM bootstrap in Hmisc package,

Class handouts (scanned) week 7

1. Miller leukemia data (Kaplan-Meier); pdf p.42 in online version class example in R, data in package

extensions of leukemia example (week 7) 2017 update of exact tests

Cox fits, zph plot

Using eha package for aml

Legacy versions SAS Minitab

2. Heroin (addict) data. Source: D.J. Hand, (et al.) Handbook of Small Data Sets. Properly formatted version Analyses in Stevenson and Stata expositions above. Lou Reed, heroin

ascii version Rogosa R-session class handout

Additional analyses for heroin: Bootstrapping, Math 159 Pomona

Publication Source: Caplehorn, J., Bell, J. 1991. Methadone dosage and the retention of patients inmaintenance treatment. The Medical Journal of Australia,154,195-199.

Additional survival data.

3. Recidivism data from John Fox tutorial. Source: Rossi PH, Berk RA, Lenihan KJ (1980). Money, Work, and Crime: Some Experimental Results. Academic Press, New York. Another data location (UCLA)

4. Kalbfleisch and Prentice (1980) rat survival Data and description plus SAS analysis (Cox regression). Also best subsets Cox regression example, myeloma

5. R Textbook Examples. Applied Survival Analysis Chapter 3: Regression Models for Survival Data

Part b. In file teachb.dat are the more realistic data: censored versions of the 75 "survival times" in part a. Column 1 has the times (career) and Column 2 has the censoring indicator (Note these data have status = 1 if censored). Compute naive answers (ignoring censoring) to the questions in part a: what is the median survival time? what proportion of teachers are still in the district after 2 years? 4 years? 6 years?

Use the Kaplan-Meier product-limit estimate to answer the questions in part a for these censored data: what is the median survival time? what proportion of teachers are still in the district after 2 years? 4 years? 6 years? Plot a survival curve with 95% confidence intervals. Obtain bootstrap (percentile) confidence intervals for the median survival time, and for the lower quartile (25th) of the survival time distribution.

try ggsurvplot as substitute for plot survfit plot cox.zph

Part a. Social Security Life Tables. Use the 2007 Actuarial Life Table, useful discussion on benefits. Plot the hazard functions for males and females. Do these hazard functions appear to be exponential? Also plot the corresponding survival curves. Can you verify (approximately numerically) the relation between surival curve and integrated hazard from the week 7 handout-- S(t) = exp(-H(t)) ?

Part b. Refer to the hazard function shown in class for Alcohol and Incidence of Total Stroke ( Publication: Alcohol Consumption and Risk of Stroke in Women, Stroke, March 2012. Nurses' Health Study). (figure underneath Table 2).

What is the increase in hazard between 2 drinks/day and 3 drinks/day?

> str(melanom) 'data.frame': 205 obs. of 6 variables: $ no : int 789 13 97 16 21 469 685 7 932 944 ... $ status: int 3 3 2 3 1 1 1 1 3 1 ... $ days : int 10 30 35 99 185 204 210 232 232 279 ... $ ulc : int 1 2 2 2 1 1 1 1 1 1 ... $ thick : int 676 65 134 290 1208 484 516 1288 322 741 ... $ sex : int 2 2 2 1 2 2 2 2 1 1 ... ,We are interested in

days: time on study after operation for malignant melanoma

status: the patient's status at the end of study

Documentation shows the possible values of status are: 1: dead from malignant melanoma 2: alive at end of study 3: dead from other causes. Consider 'dead from other causes' as censored (along with alive). Thus, status vector should be status == 1 and the survival object is

a. How many survival times are censored? Obtain an estimate of the survival curve at each event time (along with CI) using the Kaplan-Meier estimate and plot the survival curve and confidence interval.

b. Does survival differ in men and women? Compare asymptotic (log-rank) and exact tests for gender differences? Compare the exact test with a bootstrap approximation. Plot the male and female survival curves.

c. Use Cox regression to carry out the gender comparison of the survival curves in part b. Obtain a confidence interval for the effect of gender on the hazard.

For example subject 1 died from AIDS 5 months after being seen in the HMO clinic while subject 2 was not known to have died from AIDS at the conclusion of the study and had been followed for 6 months.

a. Give a point and interval estimate of the survival times of these AIDS patients. Compare survival times for IV drug users and non-IV drug users.

b. Use Cox regression to investigate age and drug use as predictors of survival time. Check the proportional hazards assumptions and interpret the coefficients from the fit.

c. Repeat part b using the age information recoded into four categories

is comprised of survival times of patients with locally advanced, nonresectable gastric carcinoma. The patients were either treated with chemotherapy plus radiation or chemotherapy alone.

Usage: data(gastric) A data frame with 90 observations on 4 variables.

Source: Stablein DM, Carter J, Novak JW. (1981) Analysis of Survival Data with Nonproportional Hazard Functions. Controlled Clinical Trials 2:149-159.

References Gastrointestinal Tumor Study Group. (1982) A Comparison of Combination Chemotherapy and Combined Modality Therapy for Locally Advanced Gastric Carcinoma. Cancer 49:1771-7.

What proportion of the observations are right-censored in this experiment? For this experiment, provide description including graphics and then inferential analyses on the relative efficacy (in terms of survival) for chemotherapy alone versus chemotherapy plus radiation.

1. Continue Cox Regression examples. class handout from Ch.6 st745

Recidivism data from John Fox tutorial. . Another data location (UCLA)

Fox recidivism example (Rossi data) Rogosa R-session(ascii)

(c.f. week 7, review question 3 and week 8, review question 1a for full output).

See John Fox tutorial (week 7); also in his Sociology 761 notes.

From Lausanne a very nice survival analysis intro plus analyses of Rossi Data

Source: Rossi PH, Berk RA, Lenihan KJ (1980). Money, Work, and Crime: Some Experimental Results. Academic Press, New York.

2. Interval Censoring; breast cancer data.

Note: to run the

R Package for Analyzing Interval-Censored Survival Data. Exact and Asymptotic Weighted Logrank Tests for Interval Censored Data: The interval R Package

Interval Censoring: Tutorial on methods for interval-censored data and their implementation in R Statistical Modelling 2009; 9(4): 259-297.

Interval-Censored Time-to-Event Data Methods and Applications Chapman and Hall/CRC 2012 (esp Chap 14--glrt).

Finkelstein-Wolfe dataset is

package

Also

another new package:

Extension: Bayesian Regression Models for Interval-censored Data in R (R Journal Dec 2017)

3. Discrete-time survival analysis. Teacher and first-sex examples (ascii) from Willet and Singer (Chap 10,11). Note data locations at UCLA repository have moved from

[links in text resource number 2, text data examples] Willet-Singer presentation version Slides for Ch.11

4. Background: renewal processes and hazard functions. Slides on renewal processes and hazard functions

Multistate models: Package 'etm' Empirical Transition Matrix etm presentation

tutorial Computing Cumulative Incidence Functions with the etmCIF Function, with a view Towards Pregnancy Applications .

5. Design: sample size and power calculations for survival analysis. See

Function

6. Parametric Survival Models,

Also package

1. Fox Recidivism data and Additional Diagnostics for Cox Regression.

Use

a. From the lecture materials for the Cox Regression try out the reduced model that omits the

b. try out the diagnostic plots from

a. In the class example, after seeing a problem with proportional hazards assumption for the clinic variable, we used strata(clinic) which allows different baseline hazards in each clinic with predictors prison and dose. Look at this further by fitting the coxhernSt model seperately within each clinic (i.e. dropping the strata(clinic) term). Are the effects of clinic and dose similar within each clinic?

b. In class question asked whether some model-modification (e.g. interaction terms) in the coxhern model might mitigate the proportional hazards violation in the coxhern model. Try out some augmented models using interactions between the predictors in coxhern.

c. The

Data Example: Grade at First Intercourse. Research Question: Whether, and when, adolescent males experience heterosexual intercourse for the first time? Citation: Capaldi, et al. (1996). Sample: 180 high-school boys. Research Design: Event of interest is the first experience of heterosexual intercourse. Boys tracked over time, from 7th thru 12th grade. 54 (30% of sample) were virgins (censored) at end of data collection.

The Willet-Singer displays show lifetables and logistic regression estimates for survival analyses.

a. investigate time-to-event as a function of parental transitions (pt = 1, 1 or more transitions) using Kaplan-Meier and cox regression methods shown in class (not really correct for these interval censored data). Compare with the logistic regression results in the Willet-Singer materials.

b. clearly the firstsex data are really interval censored, rather than inately discrete-time. Sex was had sometime during the reported grade. Indicate how you would set up these data for the proper interval censored analysis. See class examples, week 8.

The Week 8 class example and RQ1 did crude stratification on age (to adress violations of proportional hazards). As an exercise, carry out the Cox Regression, stratifying on age in thirds (more carefully than I did in the class examples) and also stratifying on age split in fifths. Check the proportional hazards assumptions and compare results. It would be ugly/concerning if small changes in the age stratification kludge for proportional hazards determined statistical significance of the intervention variable (fin).

For these data 'status': 0=censored, 1=liver transplant, 2=death; so status = 2 represents observed values of

a. Use Kaplan-Meier methods to carry out a simple two-group comparison of the effectiveness of the drug, along with any useful plots.

b. Extend the two-group comparison with a Cox regression using additional predictors (chosen as you wish)

a. Repeat the gender comparison in parts b or c in Ex 2, week 7, stratifying on ulceration of the tumor (or not). Compare with the result in Ex 2 week 7 and interpret.

b. Carry out a Cox regression using predictors log(thick) and the gender indicator, stratifying on ulceration. Interpret the results. Check the viability of the proportional hazards assumption for this cox model.

R> library("interval") R> data("bcos", package = "interval")Class examples show parametric and non-parametric survival analyses for these interval censored data. Before these methods were available, various Kludges (imputations) existed. One is to take the midpoint of the interval for any observed event in [left, right] or if right is NA (censored) treat as left+ and carry out a survival analysis for right censored data. Repeat the breast cancer example Cox regression using this strategy and compare with the results from week 8 using the interval censoring.

Use these variables in the uis dataset to investigate influences on time to relapse. Formulate a useful Cox regression model, check the proportional hazards assumption, and provide interpretations of the parameter estimates.

Channing House is a retirement centre in Palo Alto, California. These data were collected between the opening of the house in 1964 until July 1, 1975. In that time 97 men and 365 women passed through the centre. For each of these, their age on entry and also on leaving or death was recorded. A large number of the observations were censored mainly due to the resident being alive on July 1, 1975 when the data was collected. Over the time of the study 130 women and 46 men died at Channing House. Differences between the survival of the sexes, taking age into account, was one of the primary concerns of this study.

sex A factor for the sex of each resident ("Male" or "Female"). entry The residents age (in months) on entry to the centre exit The age (in months) of the resident on death, leaving the centre or July 1, 1975 whichever event occurred first. time Length of time (in months) that the resident spent at Channing House.(time=exit-entry) cens Indicator of right censoring. 1 indicates that the resident died at Channing House, 0 indicates that they left the house prior to July 1, 1975 or that they were still alive and living in the centre at that date. > head(channing) sex entry exit time cens 1 Male 782 909 127 1 2 Male 1020 1128 108 1 3 Male 856 969 113 1 4 Male 915 957 42 1 5 Male 863 983 120 1 6 Male 906 1012 106 1 > 780/12 [1] 65 # in years just for reference #some quick descriptives > attach(channing) > table(cens, sex) sex cens Female Male 0 235 51 1 130 46 > tapply(time, sex, quantile) $Female $Male 0% 25% 50% 75% 100% 0% 25% 50% 75% 100% 0 36 87 137 137 0 33 72 120 137 > tapply(entry, sex, quantile) $Female $Male 0% 25% 50% 75% 100% 0% 25% 50% 75% 100% 733 851 899 948 1140 751 863 919 967 1073 #Kaplan-Meier fits and tests > kmfit = survfit(Surv(time, cens) ~ sex, data = channing) > kmfit Call: survfit(formula = Surv(time, cens) ~ sex, data = channing) records n.max n.start events median 0.95LCL 0.95UCL sex=Female 365 365 365 130 NA 124 NA sex=Male 97 97 97 46 108 82 NA > plot(kmfit, conf.int=TRUE) ## SEE PLOT > survdiff(Surv(time, cens) ~ sex, data = channing) Call: survdiff(formula = Surv(time, cens) ~ sex, data = channing) N Observed Expected (O-E)^2/E (O-E)^2/V sex=Female 365 130 143.1 1.19 6.41 sex=Male 97 46 32.9 5.18 6.41 Chisq= 6.4 on 1 degrees of freedom, p= 0.0113 > surv_test(Surv(time, cens) ~ sex, data = channing, distribution=approximate(B = 4000)) Approximative Logrank Test data: Surv(time, cens) by sex (Female, Male) Z = -2.4251, p-value = 0.015 alternative hypothesis: two.sided #Cox regressions > coxfit1 = coxph(Surv(time, cens) ~ sex, data = channing) > coxfit1 Call: coxph(formula = Surv(time, cens) ~ sex, data = channing) coef exp(coef) se(coef) z p sexMale 0.432 1.54 0.172 2.52 0.012 Likelihood ratio test=5.89 on 1 df, p=0.0153 n= 462, number of events= 176 > summary(coxfit1) Call: coxph(formula = Surv(time, cens) ~ sex, data = channing) n= 462, number of events= 176 coef exp(coef) se(coef) z Pr(>|z|) sexMale 0.4321 1.5406 0.1718 2.516 0.0119 * --- exp(coef) exp(-coef) lower .95 upper .95 sexMale 1.541 0.6491 1.1 2.157 Concordance= 0.541 (se = 0.016 ) Rsquare= 0.013 (max possible= 0.985 ) Likelihood ratio test= 5.89 on 1 df, p=0.01526 Wald test = 6.33 on 1 df, p=0.01187 Score (logrank) test = 6.43 on 1 df, p=0.01123 > coxfit2 = coxph(Surv(time, cens) ~ sex + entry, data = channing) > coxfit2 Call: coxph(formula = Surv(time, cens) ~ sex + entry, data = channing) coef exp(coef) se(coef) z p sexMale 0.38004 1.46 0.17191 2.21 2.7e-02 entry 0.00724 1.01 0.00105 6.90 5.4e-12 Likelihood ratio test=50.9 on 2 df, p=8.67e-12 n= 462, number of events= 176 > summary(coxfit2) Call: coxph(formula = Surv(time, cens) ~ sex + entry, data = channing) n= 462, number of events= 176 coef exp(coef) se(coef) z Pr(>|z|) sexMale 0.380045 1.462350 0.171914 2.211 0.0271 * entry 0.007239 1.007266 0.001050 6.896 5.36e-12 *** --- exp(coef) exp(-coef) lower .95 upper .95 sexMale 1.462 0.6838 1.044 2.048 entry 1.007 0.9928 1.005 1.009 Concordance= 0.65 (se = 0.023 ) Rsquare= 0.104 (max possible= 0.985 ) Likelihood ratio test= 50.94 on 2 df, p=8.671e-12 Wald test = 53.05 on 2 df, p=3.02e-12 Score (logrank) test = 54.2 on 2 df, p=1.703e-12 > anova(coxfit1, coxfit2) Analysis of Deviance Table Cox model: response is Surv(time, cens) Model 1: ~ sex Model 2: ~ sex + entry loglik Chisq Df P(>|Chi|) 1 -972.57 2 -950.04 45.056 1 1.915e-11 *** > cox.zph(coxfit2) rho chisq p sexMale -0.0304 0.162 0.687 entry 0.0481 0.400 0.527 GLOBAL NA 0.553 0.759 > plot(cox.zph(coxfit2)) > k = 1:462 > channing$ID = k > head(channing) sex entry exit time cens ID 1 Male 782 909 127 1 1 2 Male 1020 1128 108 1 2 3 Male 856 969 113 1 3 4 Male 915 957 42 1 4 5 Male 863 983 120 1 5 6 Male 906 1012 106 1 6 > coxfit4 = coxme(Surv(time, cens) ~ sex + entry + (1|ID), data = channing) > print(coxfit4) Cox mixed-effects model fit by maximum likelihood Data: channing events, n = 176, 462 Iterations= 100 503 NULL Integrated Fitted Log-likelihood -975.5084 -950.0293 -947.0135 Chisq df p AIC BIC Integrated loglik 50.96 3.00 4.9937e-11 44.96 35.45 Penalized loglik 56.99 4.97 4.8849e-11 47.05 31.29 Model: Surv(time, cens) ~ sex + entry + (1 | ID) Fixed coefficients coef exp(coef) se(coef) z p sexMale 0.382989266 1.466662 0.173129158 2.21 2.7e-02 entry 0.007286551 1.007313 0.001058856 6.88 5.9e-12 Random effects Group Variable Std Dev Variance ID Intercept 0.13172664 0.01735191 > fixed.effects(coxfit4) > exp(fixed.effects(coxfit4)) sexMale entry sexMale entry 0.382989266 0.007286551 1.466662 1.007313 > stem(exp(ranef(coxfit4)[[1]])) The decimal point is 2 digit(s) to the left of the | 96 | 34 96 | 8 97 | 1234 97 | 6689 98 | 01111222244 98 | 5555566666666677777778888888999999999999 99 | 00000000000011111111111111112222222222222222223333333333333333333444+4 99 | 55555555555555555666666666666666666667777777777777777777777778888888+33 100 | 00000000000000000000000000000000000011222223334444 100 | 555666677777778888999999999 101 | 00000000111111111111111222222222222222233333333333333344444444444444 101 | 5555555555555555556666666666666667777777777777777777777777 > quantile(exp(ranef(coxfit4)[[1]])) 0% 25% 50% 75% 100% 0.9633244 0.9928323 0.9986529 1.0106724 1.0174679

a. Compare median survival times for males and females obtained by ignoring censoring, with median survival times from Kaplan-Meier. Comment on similarities or differences. Why does KM show NA for female median survival time?

b. Are gender differences in survival statistically significant? Can differences in survival be attributed to gender or Channing House?

c. For the Cox regressions, cox2 adds entry age to the model in cox1. Is entry age a significant predictor in this model?

d. Does the proportional hazards assumption appear to be satisfactory for cox2?

e. What is is the point estimate of the proportional increase or decrease in hazard (give direction) for an increase of 1 year in entry age in the cox2 model?

f. Explain what I am doing in the formulation of the cox4 model.

g. Does cox4 appear to improve the fit compared to cox2? Give a rough test statistic.

h. For the cox4 fit, what is the magnitude of the individual effects--i.e. how much do individual effects impact the estimated hazards?

Time-dependent Covariates, Mixed Effects Cox models (Nested data, Recurrent Events, Frailty Models)

Joint Modelling of Longitudinal and Time-to-Event Data

1.

2. That Daily Glass Of Wine May Lead To An Earlier Death Publication: Daily Drinking Is Associated with Increased Mortality Alcoholism: Clinical and Experimental Research, October 2018.

3. Longitudinal Reciprocal Associations Between Anxiety, Depression, and Alcohol Use in Adolescent Girls Alcoholism: Clinical and Experimental Research, November 2018.

a. Using Time Dependent Covariates and Time Dependent Coefficients in the Cox Model Terry Therneau Cindy Crowson Mayo Clinic October 2018. function

b. Recidivism data, weekly employment measure (Section 4, Fox tutorial).

week 9 class handout (ascii), recidivism data longer version shown in class (ascii) John Fox script from Soc761

fold/unfold functions (wide-to-long): Package

coxme Class handout plain text version

Additional materials. frailtyHL: A Package for Fitting Frailty Models with H-likelihood by Il Do Ha, Maengseok Noh and Youngjo Lee; R Journal December 2012

Comparing Survivival curves

Class example (ascii) R-session, tumor recurrence in patients with bladder cancer. plots for bladder recurrence

additional R-packages,

Recurrent Events: Chapter 9 of Kalbfleisch and Prentice (2nd edition), "Modeling and Analysis of Recurrent Event Data"

Cook, R. J. and Lawless, J. F. (2007). The Statistical Analysis of Recurrent Events. (Stanford access) Springer, New. York.

Additional examples: Frailty models (individual differences, random effects) and Recurrent events (observe multiple on/off transitions and timing). Asthma data example from Duchateau et al (2003). Evolution of Recurrent Asthma Event Rate over Time in Frailty Models Journal of the Royal Statistical Society. Series C (Applied Statistics) 355-363. see also Ch 3 in Frailty Models in Survival Analysis Andreas Wienke Chapman and Hall/CRC 2010

Additional R-packages:

Class handout. JM for aids example (ascii).

R-package

Book (Stanford access) Joint Models for Longitudinal and Time-to-Event Data. With Applications in R. Dimitris Rizopoulos. Chapman and Hall/CRC 2012. Book Table of Contents Book website (scripts)

joineR package joineRML package JoineR website presentation

Use the fox.long dataset created and shown in Week 9 lecture and run the original coxph in the Fox tutorial and class example. Show that the same results are obtained in the original wide form and long form.

a. Class example; eortc breast cancer data

Compare results ignoring center, stratifying on centers, and using coxme (in class example) for patients nested within centers.

b. Rat data (see R Journal December 2012).

The data set presented by Mantel et al. (1977) is based on a tumorigenesis study of 50 (q = 50) litters of female rats. For each litter, one rat was selected to receive the drug and the other two rats were placebo treated controls (ni = 3). Here each litter is treated as a cluster. The survival time (time) is the time to development of tumor, measured in weeks. Death before occurrence of tumor yields a right-censored observation; forty rats developed a tumor, leading to censoring of about 73%. The survival times for rats in a given litter may be correlated due to a random effect representing shared genetic or environmental effects.

Compare Cox regression models that investigate effect of the drug (rx)

(i) ignoring litters and (ii) mixed effects models incorporating the nested structure of these data (rats within litters)

Class questions motivated extending the anlaysis allowing treatment effects to vary over the 37 centers (random effects in the survival analysis). In the class example model

Data on 6805 hemodialysis patients in all federally funded clinics (67 centers) in Rio de Janeiro State, Brazil.

Format A data frame with 6805 observations on the following 7 variables. center a numeric code indicating in which of 67 centers the patient was treated. age of the patient. begin The month in which treatment began, with 1 representing January 1998. end The month in which observation terminated, either because of death or censoring. The study ended in month 44 (August, 2000). event 1, death, or 0, censoring. time the difference between end and begin. disease a factor with levels congen, (congenital); diabetes; hypert (hypertension); other; and renal.Use Cox regression with age and disease as predictors to study survival time of patients. Follow the in-class analysis of the eortc data to compare results ignoring centers and using a mixed-effects models with random factor

Data are available in the

1. Observational Studies, Survival analysis. Marriage and Cancer Survival example (from weeks 5 and 6, Stat266 2016)

2. Stability over time (Myth 8). Change and Sameness

3. Structural equation models for longitudinal data (don't do it; Myth 7) (handout in week 10 handout collection)

4. Reciprocal effects ( Myth 9) Rogosa, Encyclopedia of Social Science

5. Longitudinal Network Data

Resources

David Rogosa. Casual Models Do Not Support Scientific Conclusions: A Comment in Support of Freedman. Journal of Educational Statistics, Vol. 12, No. 2. (Summer, 1987), pp. 185-195. Jstor link Theme Song

Rogosa, D. R. (March 1994). Longitudinal reasons to avoid structural equation models, UC Berkeley.

Rogosa, D. R. (1993). Individual unit models versus structural equations: Growth curve examples. In Statistical modeling and latent variables, K. Haagen, D. Bartholomew, and M. Diestler, Eds. Amsterdam: Elsevier North Holland, 259-281.

original publication on the longitudinal path analysis: Some Models for Analysing Longitudinal Data on Educational Attainment. Harvey Goldstein

Rogosa, D. R., & Willett, J. B. (1985). Satisfying a simplex structure is simpler than it should be. Journal of Educational Statistics, 10, 99-107. Jstor link Follow-up paper: Two Aspects of the Simplex Model: Goodness of Fit to Linear Growth Curve Structures and the Analysis of Mean Trends. Frantisek Mandys; Conor V. Dolan; Peter C. M. Molenaar. Journal of Educational and Behavioral Statistics, Vol. 19, No. 3. (Autumn, 1994), pp. 201-215. Jstor link

J.H. Ware Tracking in S. Kotz, N.L. Johnson (Eds.), The Encyclopedia of Statistical Sciences (13th Edn.), Vol. 9 John Wiley, New York (1988)

Rogosa, D. R., Floden, R. E., & Willett, J. B. (1984). Assessing the stability of teacher behavior. Journal of Educational Psychology, 76, 1000-1027. APA link also available from John Willet's pub page

Rogosa, D. R., & Willett, J. B. (1983). Comparing two indices of tracking. Biometrics, 39, 795-6. JStor link

Rogosa, D. R. Stability section of Individual unit models versus structural equations (link above)

Rogosa, D. R. Stability of school scores from educational assessments: Confusions about Consistency in Improvement David Rogosa, June 2003 ; Education Writers Association April 2004

A non-parametric index of tracking James A. Koziol Pages 181-191 Published online: 02 Aug 2010 Journal of Applied Statistics Volume 25, 1998 - Issue 2

Personality research. Stability versus change, dependability versus error: Issues in the assessment of personality over time David Watson Journal of Research in Personality 38 (2004) 319-350.

Some applications: A Stochastic Model for Analysis of Longitudinal AIDS Data J.M.G. Taylor, W.G. Cumberland, Sy J.P.; Journal of the American Statistical Association, Vol. 89, 1994

Body mass index from childhood to middle age: a 50-y follow-up V A Casey J T Dwyer K A Coleman I Valadian The American Journal of Clinical Nutrition, Volume 56, Issue 1, 1 July 1992, Pages 14-18

Tracking and its applicability to Physical Education and Sport

Tracking of objectively measured physical activity from childhood to adolescence: The European youth heart study. Scandinavian Journal of Medicine & Science in SportsVolume 18, Issue 2, 2007.

Factors Associated With Tracking of BMI: A Meta-Regression Analysis on BMI Tracking. Obesity (2011) 19 5, 1069-1076. doi:10.1038/oby.2010.250

Long-term tracking of cardiovascular risk factors among men and women in a large population-based health system The Vorarlberg Health Monitoring and Promotion Programme. European Heart Journal (2003) 24, 1004-1013.

Journal of Traumatic Stress. Reliability of Reports of Violent Victimization and Posttraumatic Stress Disorder Among Men and Women With Serious Mental Illness Volume 12 Issue4 587 - 599 1999-10-01 Lisa A. Goodman Kim M. Thompson Kevin Weinfurt Susan Corl Pat Acker Kim T. Mueser Stanley D. Rosenberg

Computing: Foulkes-Davis gamma (not in R). A GAUSS program for computing the Foulkes-Davis tracking index for polynomial growth curves TRACK: A FORTRAN program for calculating the Foulkes-Davis tracking index Gerard E. Dallal Computers in Biology and Medicine Volume 19, Issue 5, 1989, Pages 367-371

Rogosa, D. R. (1980). A critique of cross-lagged correlation.

An old review of reciprocal effects. Rogosa, D. R. (1985). Analysis of reciprocal effects. In International Encyclopedia of Education, T. Husen and N. Postlethwaite, Eds. London: Pergamon Press, 4221-4225. (reprinted in Educational Research,Methodology & Measurement: An international handbook, J. P. Keeves Ed. Oxford: Pergamon Press, 1988.)

Granger Causality. Nobel 2003. Complete Granger

Relationships--and the Lack Thereof--Between Economic Time Series, with Special Reference to Money and Interest Rates. David A. Pierce

R-package

Huisman, M. E. and Snijders, T. A. B. (2003). Statistical analysis of longitudinal network data with changing composition. Sociological Methods and Research, 32:253-287.

Application: Kids' friends influence physical activity levels Publication: The Distribution of Physical Activity in an After-school Friendship Network Sabina B. Gesell, Eric Tesdahl, Eileen Ruchman, Pediatrics; originally published online May 28, 2012.

For the Ramus data (week 2 exercise, 20 individuals, 4 time points), the Foulkes-Davis (gamma) index of tracking has point estimate .83 and (bootstrap) standard error .06 for the 18-month time interval 8yrs to 9.5 years. Compare that estimate of consistency of individual differences with time1-time2 correlations for the time intervals [8, 9.5] and [9, 9.5].