Multilevel Modeling Using R

David Rogosa Sequoia 224, rag{AT}stanford{DOT}edu Office hours: ???

Course web page: http://rogosateaching.com/stat196/

3/29 Audio message (.mp3) on Spring 2020

**new!** 4/23 mid-quarter update msg

For recreation of classroom experience here are youtube versions of the music I play before starting lecture and after lecture concludes. Some may wish to reverse that ordering.

To see full course materials from Spring 2019 go here

From explorecourses

STATS 196A (EDUC 401D): Multilevel Modeling Using R

Multilevel data analysis examples using R. Topics include: two-level nested data, growth curve modeling, generalized linear models for counts and categorical data, nonlinear models, three-level analyses. For more information, see course website: http://rogosateaching.com/stat196/

Terms: Spr | Units: 1 | Grading: Satisfactory/No Credit

Instructors: Rogosa, D. (PI)

STATS 196A | 1 units | Class # 17080 | Section 01 | Grading: Satisfactory/No Credit | WKS | Wed 3:30 PM - 5:20 PM at Sequoia Hall 200 with Rogosa, D. (PI)

Notes: Class meets on April 8, April 15, April 22, April 29, May 13.

For the 1-unit enrollment in this short course, students are expected to engage in the four presentation class sessions, and for the fifth session each student makes a short (~10 min) presentation of a relevant data analysis they have conducted.

Course Schedule

Five (2hr) mtgs W 3:30 - 5:20 on April 8, April 15, April 22, April 29, May 13. Sequoia 200

Weeks 1 - 4.

a. Introduction: Basic analyses for two-level nested data, normal models (UK Exam data)

b. Additional two-level (normal) models: experimental designs (Dyestuff), longitudinal data (growth curves, sleepstudy), observational data (High School and Beyond)

c. Generalized linear mixed models for counts and categorical outcomes

d. Three-level analyses (nested data and longitudinal data)

e. Specialized applications (as time permits): regression diagnostics, power calculations and design, ecological inference, survival analysis, nonlinear functional forms, mediation analysis, propensity scores and matching, imputation, item response theory

Week 5. Student presentations of multilevel data analyses

The Registrar does have a form (no-fee) for faculty, staff, post-docs: Application for Auditor or Permit to Attend (PTA) Status

Many of the examples presented in this short course are described in Examples from Multilevel Software Comparative Reviews Douglas Bates. Code version of MlmSoftRev

R-package containing mlmRev data examples. Bates talk on mlmRev U Bristol documentation Additional examples in core package lme4

Another set of examples: lmer for SAS PROC MIXED Users Douglas Bates Department of Statistics University of Wisconsin Madison Data sets from SAS System for Mixed Models

Overviews and additional examples from Doug Bates:

lme4: Mixed-effects modeling with R February 17, 2010 Springer (book chapters). A merged version of the Bates book: [broken 2/18] lme4: Mixed-effects modeling with R January 11, 2010

R Journal intro Fitting linear mixed models in R Using the lme4 package Douglas Bates (pp.27-30)

Collection of all Doug Bates lme4 talks Fitting linear mixed-effects models using lme4,

Technical topics: Mixed models in R using the lme4 package Part 4: Theory of linear mixed models

HSB and growth curve examples in John Fox lme tutorial

Another nice lmer exposition with life sciences examples: Mixed-effects models, Remko Duursma, Jeff Powell Hawkesbury Institute for the Environment, Western Sydney University. September 2016. HIE Datasets

Current version of R is R version 3.6.3 (Holding the Windsock), released on 2020-02-29. For references and software: The R Project for Statistical Computing

Closest download mirror is Berkeley. If Berkeley is offline, choose a mirror from the main R page (first link).

A recent text (potentially) provides more infrastructure for this short course, but sadly it has many shortcomings. This text has free access at Stanford via crcnetbase.com

Multilevel Modeling Using R http://www.crcpress.com/product/isbn/9781466515857 Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences. Published: June 23, 2014 by CRC Press. Journal of Statistical Software Book Review. Book website, including data

1. Introductory Example. Nested data, two-levels. Goldstein Exam Data.

a. Introductory descriptive approaches for gender gap analysis (Smart First Year Student analyses using lmList, additional plots).

b. Various lmer analyses for gender gap.

Rogosa R-session basic plots models used 2020 isSingular fix Stat 209 Gender gap data analysis. scanned class handout

c. Residual plots, add-on regression diagnostics: packages

Rogosa session with Exam data (week 1) (ascii) resulting plots

d. more P-values, tests add-ons to lmer.

afex package with Exam data ggaplmer2

Faraway text addendums: Inferential Methods for Linear Mixed Models

e. Plus Plots for random, fixed effects.
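A minimal sketch of the gender gap steps in items a and b, assuming the Exam data frame shipped with the mlmRev package (variables normexam, standLRT, sex, school). The class handouts use lmList for the per-school descriptive step; here a plain tapply of school-by-sex means is swapped in to keep the sketch short, and the particular model formulas are illustrative choices, not necessarily those in the Rogosa sessions.

```r
library(lme4)
library(mlmRev)  # assumed source of the Exam data (Goldstein exam scores)
data(Exam)

# Descriptive step: mean normexam by school and sex, then the per-school gap.
# Single-sex schools have only one cell filled, so their gap is NA.
levels(Exam$sex)
means <- with(Exam, tapply(normexam, list(school, sex), mean))
gap <- means[, 2] - means[, 1]   # second sex level minus first sex level
summary(gap)

# Random-intercept models, fit by ML so anova() can compare them directly
fm0 <- lmer(normexam ~ sex + (1 | school), data = Exam, REML = FALSE)
fm1 <- lmer(normexam ~ standLRT + sex + (1 | school), data = Exam, REML = FALSE)
anova(fm0, fm1)    # likelihood-ratio comparison of the two fits
fixef(fm1)         # the sex coefficient is the adjusted gender gap
isSingular(fm1)    # check for a boundary (singular) fit
```

Adding a random slope for sex, e.g. (sex | school), is the natural next model; that is where a singular-fit check becomes more informative.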

2. Matrix Formulation for Mixed Effects Models (growth curves and nested data).
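The matrix form of the linear mixed model is y = X beta + Z b + e, with X the fixed-effects design matrix and Z the random-effects design matrix. A small sketch of pulling those pieces out of a fitted lmer object with getME(), using the sleepstudy growth-curve model from lme4 as an assumed example (the course may use a different one):

```r
library(lme4)
library(Matrix)  # Z is stored as a sparse matrix

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

X <- getME(fm, "X")   # fixed-effects design matrix (intercept, Days)
Z <- getME(fm, "Z")   # sparse random-effects design matrix
dim(X)
dim(Z)                # one intercept column and one slope column per subject

beta <- fixef(fm)     # estimated fixed effects
b <- getME(fm, "b")   # conditional modes of the random effects

# Fitted values rebuilt from the matrix formulation: X beta + Z b
fit_manual <- as.vector(X %*% beta) + as.vector(Z %*% b)
all.equal(fit_manual, unname(fitted(fm)))
```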

1. Recap Introductory Example. Nested data, two-levels. Goldstein Exam Data.

Add-on package

prediction with lmer :

Rogosa session. plots
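A sketch of the two kinds of prediction available from predict() on an lmer fit, assuming the Exam data from mlmRev and a simple random-intercept model (an illustrative model, not necessarily the one in the session): re.form = NA gives population-level predictions from the fixed effects only, and re.form = NULL adds the school's estimated random effect.

```r
library(lme4)
library(mlmRev)  # assumed source of the Exam data
data(Exam)

fm <- lmer(normexam ~ standLRT + (1 | school), data = Exam)

# New cases: three intake scores, all placed in the first school
sch <- levels(Exam$school)[1]
new <- data.frame(standLRT = c(-1, 0, 1),
                  school = factor(sch, levels = levels(Exam$school)))

pred_pop <- predict(fm, newdata = new, re.form = NA)       # fixed effects only
pred_school <- predict(fm, newdata = new, re.form = NULL)  # plus school effect

# The two differ by that school's estimated random intercept
pred_school - pred_pop
ranef(fm)$school[sch, 1]
```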

2. Common/canonical two-level examples (measured outcome)

A. Growth Curve models and analysis. Bates Sleepstudy example (week2 Stat222).

Chap. 4 Bates book [more Doug Bates Slides (pdf pages 8-28) ]

Sleepstudy class handout, pdf scan Sleepstudy, 2018 clean ascii Individual plots (frame-by-frame) Plot of straight-line fits

Reduced/constrained models: growth curve example
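A minimal sketch of the sleepstudy growth-curve analysis, following the form in the Bates book: separate straight-line fits per subject, then a mixed model with correlated random intercept and slope, then a reduced model that constrains the intercept-slope correlation to zero. Fitting by ML (REML = FALSE) is a choice made here so the two models can be compared with anova().

```r
library(lme4)

# Separate straight-line fit for each subject (descriptive step)
indiv <- lmList(Reaction ~ Days | Subject, data = sleepstudy)
head(coef(indiv))

# Full model: correlated random intercept and slope for each subject
fm_full <- lmer(Reaction ~ Days + (Days | Subject),
                data = sleepstudy, REML = FALSE)

# Reduced model: random intercept and slope constrained to be uncorrelated
fm_red <- lmer(Reaction ~ Days + (1 | Subject) + (0 + Days | Subject),
               data = sleepstudy, REML = FALSE)

fixef(fm_full)
VarCorr(fm_full)
anova(fm_red, fm_full)   # likelihood-ratio test of the correlation parameter
```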

B. Two-level nested (normal) data recap. Brief overview of HSB (High School and Beyond) analysis (from Stat209): plots and model.

a full single Bryk dataset (longform) (abbreviated) Rogosa R-session Bryk data plots, Rogosa R-session

Caution from a prior year: side-by-side boxplot creation and lmList subset issue
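A sketch of an HSB model, using the Hsb82 data frame in mlmRev as a stand-in for the Bryk dataset linked above (an assumption; the class file may differ in variable names and coding). The formula follows the one in the Bates software-review examples: math achievement on centered SES, school mean SES, and sector, with a random intercept and random cses slope by school.

```r
library(lme4)
library(mlmRev)  # assumed stand-in source for the HSB data
data(Hsb82)

str(Hsb82)   # school, minrty, sx, ses, mAch, meanses, sector, cses

# cses is student SES centered at the school mean; sector is Public/Catholic
fm_hsb <- lmer(mAch ~ meanses * cses + sector * cses + (cses | school),
               data = Hsb82)
fixef(fm_hsb)
VarCorr(fm_hsb)
```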

A nice teaching document from Indiana that does HSB data with every known statistical package (including

3. Growth Curve Modeling exercise, Brain Volume Data Analysis. analyses from "Variation in longitudinal trajectories of regional brain volumes of healthy men and women (ages 10 to 85 years) measured with atlas-based parcellation of MRI" cartoon plot of Lateral Ventricles data; actual data plot of Lateral Ventricles data; development of lmer (mixed effect) growth models

4. Data from designed experiments (basics).

a. Dyestuff data, Bates book, Chapter 1 (sec 1.2, 1.3) Rogosa Dyestuff session
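A minimal sketch of the Dyestuff one-way random effects model from Chapter 1 of the Bates book, using the Dyestuff data frame in lme4. The REML fit is the lmer default; the ML refit is included for comparison.

```r
library(lme4)

str(Dyestuff)   # Yield for 5 preparations from each of 6 batches

# Single random effect for Batch, REML fit (the default)
fm_reml <- lmer(Yield ~ 1 + (1 | Batch), data = Dyestuff)
# Same model refit by maximum likelihood
fm_ml <- update(fm_reml, REML = FALSE)

fixef(fm_reml)     # overall mean yield
VarCorr(fm_reml)   # between-batch and residual standard deviations
VarCorr(fm_ml)
```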

b. Penicillin data (also Pastes, ratbrain), Bates book, Chapter 2.

From Doug Bates presentation Rogosa R-session
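A sketch of the Penicillin model from Chapter 2 of the Bates book, using the Penicillin data frame in lme4: two crossed random effects, one for plate and one for sample.

```r
library(lme4)

# Each of the 6 samples is assayed once on each of the 24 plates,
# so plate and sample are completely crossed
design <- xtabs(~ sample + plate, data = Penicillin)
all(design == 1)

fm_pen <- lmer(diameter ~ 1 + (1 | plate) + (1 | sample), data = Penicillin)
VarCorr(fm_pen)
ngrps(fm_pen)   # number of levels of each grouping factor
```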

Random effects anova recap (see Bates book Chap1, Chap2).

Main topic: Generalized Linear Mixed Models: counts and proportions.

nice overview: Generalized Linear Mixed Models from

1. Dichotomous outcomes,

a. Respiratory clinical trial from HSAUR. lmList does logistic, introducing glmer lmList, glmer for respiration data (placebo group)

b. Contraception (Bangladesh) use from Bates review Rogosa R-session glmer model slide
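A sketch of a glmer logistic model for the Contraception data, assuming the Contraception data frame in mlmRev (use, age, urban, livch, district). The formula, with a quadratic in age and a random intercept for district, follows the form in the Bates review examples; the model shown on the class slide may differ.

```r
library(lme4)
library(mlmRev)  # assumed source of the Contraception data
data(Contraception)

# Dichotomous outcome (use: N/Y), logit link, random intercept by district
gm <- glmer(use ~ age + I(age^2) + urban + livch + (1 | district),
            data = Contraception, family = binomial)
fixef(gm)
VarCorr(gm)
```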

c. Test scores (pass/fail outcome) from Ch 8, Multilevel Modeling Using R. Rogosa R-session

2. Count outcome, GLMM Poisson models.

a. Count data: Contagious bovine pleuropneumonia, data(cbpp) in lme4.

Rogosa R-session herd plots
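A sketch for the cbpp data. The outline lists this under count outcomes; the example on the lme4 help page for cbpp treats the counts as binomial (cases out of herd size), and that is the form shown here. Whether the class session used this form or a Poisson form is not stated, so treat the choice as an assumption.

```r
library(lme4)

str(cbpp)   # herd, incidence (new cases), size (herd size), period

# Cases out of herd size, by period, with a random intercept for herd
gm_cbpp <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial)
fixef(gm_cbpp)
VarCorr(gm_cbpp)
```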

b. Factorial design, Count outcome. From HIE Sydney. EucFACE ground cover data Rogosa glmer session.

c. Another count data example from mlmRev package, data(Mmmec): Malignant Melanoma Mortality in the European Community associated with the impact of UV radiation exposure.

Rogosa glmer session more Rogosa Mmmec session
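A sketch of a Poisson GLMM for the Mmmec data, assuming the Mmmec data frame in mlmRev (deaths, expected, uvb, region, nation). The log of expected deaths enters as an offset, so the model is for the mortality rate relative to expectation; the single random intercept for region is an illustrative choice.

```r
library(lme4)
library(mlmRev)  # assumed source of the Mmmec data
data(Mmmec)

# Death counts with log(expected) as offset, random intercept by region
gm_mm <- glmer(deaths ~ uvb + (1 | region), data = Mmmec,
               family = poisson, offset = log(expected))
fixef(gm_mm)
VarCorr(gm_mm)
```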

3. hglm -- a different package for fitting hierarchical generalized linear models.

R Journal December 2010. manual vignette

Measured outcomes

a. Achieve data from

b. example from mlmRev package, data(Chem97): Scores on A-level Chemistry in 1997. Rogosa R-session
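A sketch of a three-level variance components model for Chem97, assuming the Chem97 data frame in mlmRev with students nested in schools nested in local education authorities (lea). The intercept-only formula is an illustrative starting point, not necessarily the model in the session.

```r
library(lme4)
library(mlmRev)  # assumed source of the Chem97 data
data(Chem97)

# (1 | lea/school) expands to a random intercept for lea and one for
# school within lea, giving student / school / lea levels
fm_chem <- lmer(score ~ 1 + (1 | lea/school), data = Chem97)
VarCorr(fm_chem)
ngrps(fm_chem)
```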

Count outcome.

c. data(Mmmec): Malignant Melanoma Mortality in the European Community associated with the impact of UV radiation exposure. Rogosa 3-level session

Vignette: Analyzing Imputed Data with Multilevel Models and merTools

Rogosa R-session for vignette

Missing data wide-form imputation:

Missing data background. Multiple Imputation. Nhanes data example (mice primer) in van Buuren S and Groothuis-Oudshoorn K (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67.
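A minimal sketch of the multiple imputation workflow on the nhanes data, in the style of the mice primer: impute, fit the analysis model in each completed dataset, then pool. The seed value and the regression formula are arbitrary choices made here for illustration.

```r
library(mice)

str(nhanes)   # age, bmi, hyp, chl, with missing values

# Step 1: create m = 5 imputed datasets by chained equations
imp <- mice(nhanes, m = 5, seed = 2020, printFlag = FALSE)

# Step 2: fit the same regression in each completed dataset
fit <- with(imp, lm(chl ~ age + bmi))

# Step 3: pool the estimates across imputations (Rubin's rules)
pooled <- pool(fit)
summary(pooled)
```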

See also Flexible Imputation of Missing Data. Stef van Buuren Chapman and Hall/CRC 2012. Chapter 9, Longitudinal Data Sec 3.8 Multilevel data. He is the originator of

R resources. Multivariate Analysis Task View,

New package: