Data analysis examples using R

David Rogosa Sequoia 224, rag{AT}stat{DOT}stanford{DOT}edu

Course web page: http://rogosateaching.com/ed401//

For 2014 course materials go here

From the Registrar:

Data analysis examples using R. Ed401C Aut 2015 (1 unit)
Description: We will do basic and intermediate level data analysis examples, like those that students will have seen in their courses, in R. Examples include: descriptive statistics and plots, analysis of variance, correlation and regression, categorical variables, multilevel data. See http://rogosateaching.com/ed401/
Terms: Aut | Units: 1 | Grading: Satisfactory/No Credit
Instructors: Rogosa, D. (PI)

EDUC 401C: Data Analysis Examples Using R
2015-2016 Autumn
EDUC 401C | 1 units | Class # 17377 | Section 01 | Grading: Satisfactory/No Credit | WKS
09/28/2015 - 11/02/2015 Mon 3:30 PM - 5:20 PM at Lathrop 282 with Rogosa, D. (PI) (note: this is the old GSB building)
Axess Enrollment will open for students on August 1st.
Notes: Class meets on Sept 28, Oct 5, Oct 12, Oct 19, Nov 2.

Course Schedule
Five (2hr) mtgs, M 3:30-5:20, Lathrop 282

Sept 28  1. Descriptive stats; analysis of means (up through anova, factorial designs)
Oct 5    2. Correlation and regression (up through multiple regression, variable selection etc)
Oct 12   3. Categorical variables (tables, logistic regression)
Oct 19   4. Overflow and additional regression topics. Missing Data (mice); Generalized linear models for counts; Smoothers (loess); Multilevel data (descriptives, plots, and intro to mixed-effects models)
Nov 2    5. Student analyses (students present a small analysis of their own)

1/7/09. NY Times endorses R: Data Analysts Captivated by R's Power

Current version of R is version 3.2.2 (Fire Safety) 2015-08-14.

For references and software: The R Project for Statistical Computing. Closest download mirror is Berkeley.

Many students employ RStudio to enhance their R-enjoyment. I won't use it, but it serves very well especially on a single screen (e.g. portable) machine. "RStudio IDE is a powerful and productive user interface for R. It's free and open source, and works great on Windows, Mac, and Linux." A short R-intro that includes RStudio (and much more)

The greatest challenge here is not being overwhelmed by all the options.

0. Reference Cards and other short documents section of CRAN page

1. When I taught the introductory course Stat141, the text for computing was

An online version is available from John Verzani's page. Alternate version, single pdf. UsingR R-package.

2. In Stat209 a primary resource for R and data analysis is

3.

4. From CRAN central: An Introduction to R Notes on R: A Programming Environment for Data Analysis and Graphics Version 3.0.1 (2013-05-16) W. N. Venables, D. M. Smith and the R Core Team

manual for package

i. correlation and scatterplots platelet session platelet plots platelet data extensions,fixes

extra stat141 example Brain and Body Weights for 62 Species of Land Mammals
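A minimal sketch of the correlation-and-scatterplot steps for this example. It assumes the data are the `mammals` data frame in the MASS package (whose help title matches the one above); the log transformation is a common choice for these data, not necessarily what the class session used.

```r
library(MASS)   # recommended package shipped with R; provides mammals

data(mammals)   # 62 species; columns: body (kg) and brain (g)

# Correlation on the raw scale and on the log scale.
# The weights are strongly right-skewed, so logs are usually examined too.
r_raw <- cor(mammals$body, mammals$brain)
r_log <- cor(log(mammals$body), log(mammals$brain))
c(raw = r_raw, log = r_log)

# Scatterplot on the log scale with the least-squares line added
plot(log(brain) ~ log(body), data = mammals,
     xlab = "log body weight (kg)", ylab = "log brain weight (g)")
abline(lm(log(brain) ~ log(body), data = mammals))
```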

ii. Straight-line regression single subject Sleepstudy example R session plots and handout version
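A short sketch of a single-subject straight-line fit. It assumes the `sleepstudy` data come from the lme4 package (which must be installed); subject "308" is picked only because it is the first subject in the file.

```r
# Assumes the lme4 package is installed; it supplies the sleepstudy data
data("sleepstudy", package = "lme4")

# Keep one subject: 10 daily reaction-time measurements (Days 0 to 9)
one <- subset(sleepstudy, Subject == "308")

# Straight-line regression of reaction time on days of sleep deprivation
fit <- lm(Reaction ~ Days, data = one)
coef(fit)       # intercept and slope (ms per day)

# Data with the fitted line
plot(Reaction ~ Days, data = one, main = "Subject 308")
abline(fit)
```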

more Coleman. using

Mroz87 data; Mroz87 data description; IV data analysis session; Wooldridge stata ivreg
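A hedged sketch of an instrumental-variables fit for returns to schooling. It assumes the `Mroz87` data from the sampleSelection package and `ivreg()` from the AER package (both must be installed); the specification, with parents' education as instruments for `educ`, is the familiar textbook one and may differ from the class session.

```r
library(AER)                                  # assumed installed; provides ivreg()
data("Mroz87", package = "sampleSelection")   # assumed installed; provides Mroz87

# Wages are only meaningful for women in the labor force
working <- subset(Mroz87, lfp == 1)

# OLS for comparison
ols <- lm(log(wage) ~ educ + exper + I(exper^2), data = working)

# Two-stage least squares: regressors before the bar, instruments after it.
# exper and exper^2 serve as their own instruments; motheduc and fatheduc
# are the excluded instruments for educ.
iv <- ivreg(log(wage) ~ educ + exper + I(exper^2) |
              exper + I(exper^2) + motheduc + fatheduc,
            data = working)

cbind(OLS = coef(ols), IV = coef(iv))
summary(iv, diagnostics = TRUE)   # adds weak-instrument, Wu-Hausman, Sargan tests
```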

10/15 Background exposition for IV and returns to schooling: Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Joshua D. Angrist; Alan B. Krueger,

i. single variable. Stef van Buuren example

ii. traditional bivariate multivariate data methods, correlation and regression example

ssdat missing data example

iii. Multiple Imputation.

nhanes data in package
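A minimal multiple-imputation sketch, assuming the mice package is installed and using its small `nhanes` data frame. The analysis model (`chl` on `age` and `bmi`), the number of imputations, and the seed are illustrative choices, not taken from the class session.

```r
library(mice)   # assumed installed; ships the nhanes example data

# How much is missing in each variable?
colSums(is.na(nhanes))

# Create m = 5 imputed data sets (default methods; seed fixed for repeatability)
imp <- mice(nhanes, m = 5, seed = 401, printFlag = FALSE)

# Fit the same regression in each imputed data set
fit <- with(imp, lm(chl ~ age + bmi))

# Combine the five sets of estimates with Rubin's rules
summary(pool(fit))
```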

Background materials, Multiple Imputation in R. van Buuren S and Groothuis-Oudshoorn K (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. see also multiple imputation online Flexible Imputation of Missing Data. Stef van Buuren Chapman and Hall/CRC 2012. Book contents online book extras He is the originator of

So I gathered together some quick resources, esp for use within RStudio where use of

RStudio help. Using Sweave and knitr; also Using Sweave and LaTeX with R 3.0.2. RStudio support queries: 1 2

Some additional intro docs. San Diego State UW Montana Wharton,UPenn Germany Minnesota

Also the

Addendum on scripts. Introduction to the R Project for Statistical Computing for use at ITC Appendix B ; A (very) short introduction to R scripts section; Kickstarting R - Writing R scripts
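For orientation, a script is just a plain text file of R commands that can be run in one go with `source()`. The tiny sketch below writes a two-line script to a temporary file (the file contents and names are made up for illustration) and runs it.

```r
# Write a small script to a temporary .R file (illustrative contents)
script <- tempfile(fileext = ".R")
writeLines(c("x <- c(2, 4, 6)",
             "m <- mean(x)"),
           script)

# Run every command in the file; objects it creates land in the workspace
source(script)
m
```

From a terminal, the same file could be run non-interactively with `Rscript` followed by the file name.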

Multiple Imputation example. nhanes data in package


High School and Beyond data. complete Bryk dataset Data construction from files in the MEMSS

First pass, Bryk data: session plots Additional plots for Multilevel data. R session xyplots
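A rough sketch of a first pass at multilevel data. It assumes the student-level High School and Beyond data are taken from `MathAchieve` in the nlme package (shipped with R); the class sessions build the data from other files, so variable names and details may differ. The random-intercept model is only an entry point.

```r
library(nlme)      # recommended package shipped with R; MathAchieve, lme()
library(lattice)   # recommended package shipped with R; xyplot()

data(MathAchieve)
hsb <- as.data.frame(MathAchieve)   # plain data frame: School, SES, MathAch, ...

# Descriptives: mean math achievement in each school
school_means <- aggregate(MathAch ~ School, data = hsb, FUN = mean)
summary(school_means$MathAch)

# Per-school scatterplots with fitted lines, for the first 8 school levels
some <- levels(hsb$School)[1:8]
print(xyplot(MathAch ~ SES | School,
             data = subset(hsb, School %in% some),
             type = c("p", "r")))

# Intro mixed-effects model: random intercept for each school
fit <- lme(MathAch ~ SES, random = ~ 1 | School, data = hsb)
summary(fit)
```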

Publication and Display continued (student question).

Resources: Package

also, Package prettyR