week5 RQ3.

This numerical example illustrates the results presented in class
for the longitudinal path analysis (aka the Goldstein example); the
handouts are under Lecture topics, item 2.

A clever read statement (not mine) follows;
I would just save the text file and edit it to make the data set.

# read in the portion of the web page you need and rename the columns
> casual = read.table("http://www-stat.stanford.edu/~rag/stat209/casualdat",  header=T, skip=1, nrows=40)
# these days use statweb, especially if off-campus
> casual = read.table("http://statweb.stanford.edu/~rag/stat209/casualdat",  header=T, skip=1, nrows=40)
> head(casual)
     Xi.1.    Xi.3.    Xi.5.        W
1 37.55913 49.29053 61.02193 15.97247
2 45.65429 51.58451 57.51472 15.37724
3 40.93881 52.87978 64.82076 11.47902
4 47.35937 55.44879 63.53822 16.88944
5 52.70511 62.70351 72.70191 19.17834
6 30.45231 46.34082 62.22934 11.81822

> attach(casual)
> xi1 = casual[,1]
> xi3 = casual[,2]
> xi5 = casual[,3]
# or use names() to simplify the variable names read from the file
# path regressions (see handouts) exactly match the population
# coefficients stated in the handout.
> lm1 = lm(xi3 ~ xi1)

> lm2 = lm(xi5 ~ xi3 + xi1)
> summary(lm2)

Call:
lm(formula = xi5 ~ xi3 + xi1)

Residuals:
       Min         1Q     Median         3Q        Max 
-1.974e-05 -5.407e-06 -3.446e-07  5.819e-06  1.489e-05 

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)    
(Intercept) -7.385e-06  9.768e-06 -7.560e-01    0.454    
xi3          2.000e+00  3.309e-07  6.045e+06   <2e-16 ***
xi1         -1.000e+00  3.308e-07 -3.023e+06   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 8.061e-06 on 37 degrees of freedom
Multiple R-squared:     1,      Adjusted R-squared:     1 
F-statistic: 2.563e+13 on 2 and 37 DF,  p-value: < 2.2e-16 
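
The names() route mentioned above can be sketched as follows. This is only an illustration: it types in the six rows printed by head(casual) so it runs without the web page, and it passes data= to lm() instead of using attach(). The object names casual6 and lm2_six are made up for this sketch.

```r
# The six rows shown by head(casual), typed in so the sketch is self-contained
# (casual6 is a hypothetical name; the full file has 40 rows)
casual6 <- data.frame(
  Xi.1. = c(37.55913, 45.65429, 40.93881, 47.35937, 52.70511, 30.45231),
  Xi.3. = c(49.29053, 51.58451, 52.87978, 55.44879, 62.70351, 46.34082),
  Xi.5. = c(61.02193, 57.51472, 64.82076, 63.53822, 72.70191, 62.22934),
  W     = c(15.97247, 15.37724, 11.47902, 16.88944, 19.17834, 11.81822)
)

# Rename the columns once instead of copying each column into its own vector
names(casual6) <- c("xi1", "xi3", "xi5", "W")

# data= keeps the variables inside the data frame, so attach() is not needed
lm2_six <- lm(xi5 ~ xi3 + xi1, data = casual6)
print(coef(lm2_six))
```

Even with only these six rows the coefficients on xi3 and xi1 should come out very close to 2 and -1, up to the rounding in the printed data.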
------------------------

Another equivalent version (just because I have it).
My strategy was to cut out the 40 rows in an editor.
Can you have parens () in a variable name? I cut those out to be safe.
The point here was just to have students create a data set without much typing.
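
On the parens question, a small sketch. It assumes the header in the file reads Xi(1), Xi(3), Xi(5), W, which is a guess based on the Xi.1. style names that read.table produced in the first version. By default read.table uses check.names=TRUE, which passes the header through make.names() and turns invalid characters into dots.

```r
# Assumed raw header names (a guess, see the note above)
raw_names <- c("Xi(1)", "Xi(3)", "Xi(5)", "W")

# make.names() is what read.table applies when check.names = TRUE
clean_names <- make.names(raw_names)
print(clean_names)

# Parens can be kept if name checking is turned off, but then the
# variable has to be quoted with backticks wherever it is used
d <- data.frame(`Xi(1)` = c(1, 2), check.names = FALSE)
print(names(d))
print(d$`Xi(1)`)
```

So parens are possible but awkward; cutting them out in an editor, or letting read.table convert them to dots, avoids the backticks.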

> casual = read.table("{path}/casualdat",  header=T)
> cor(casual)
          Xi1       Xi3       Xi5
Xi1 1.0000000 0.8421714 0.5357932
Xi3 0.8421714 1.0000000 0.9065112
Xi5 0.5357932 0.9065112 1.0000000

> attach(casual)
> lm1 = lm(Xi3 ~ Xi1)
> lm2 = lm(Xi5 ~ Xi3 + Xi1)
> summary(lm2)

Call:
lm(formula = Xi5 ~ Xi3 + Xi1)

Residuals:
       Min         1Q     Median         3Q        Max 
-1.974e-05 -5.407e-06 -3.446e-07  5.819e-06  1.489e-05 

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)    
(Intercept) -7.385e-06  9.768e-06 -7.560e-01    0.454    
Xi3          2.000e+00  3.309e-07  6.045e+06   <2e-16 ***
Xi1         -1.000e+00  3.308e-07 -3.023e+06   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 8.061e-06 on 37 degrees of freedom
Multiple R-squared:     1,      Adjusted R-squared:     1 
F-statistic: 2.563e+13 on 2 and 37 DF,  p-value: < 2.2e-16 

> # so with R^2 = 1 (perfect fit) you get 2 and -1 for the coefficients of
> # Xi3 and Xi1, which match the (5-1)/(3-1) and (3-5)/(3-1) results;
> # the standard errors are essentially zero.
# To take the HW question literally: yes, you can compute standard
# errors, but it seems odd that from a sample of n=40 they should be
# zero. Oh yeah, also this path analysis has R^2 = 1 (perfect fit)
# with meaningless (data-free) coefficients.
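
A sketch of why the coefficients are data free, assuming each individual follows an exact straight line, Xi(t) = level + rate*t. The data above are consistent with that: the change from time 1 to 3 equals the change from 3 to 5 in every printed row. Then Xi(5) = 2*Xi(3) - Xi(1) for everyone, whatever the levels and rates are. The simulated values and the names level, rate, x1, x3, x5 are made up for illustration and are not the course data.

```r
# Hypothetical simulated data: exact straight-line growth for n = 40 individuals
set.seed(1)
n <- 40
level <- rnorm(n, mean = 40, sd = 8)  # arbitrary individual intercepts
rate  <- rnorm(n, mean = 5,  sd = 2)  # arbitrary individual rates of change

# Observations at times 1, 3, 5 with no measurement error
x1 <- level + rate * 1
x3 <- level + rate * 3
x5 <- level + rate * 5

# Same path regression as in the transcript
fit <- lm(x5 ~ x3 + x1)
b <- unname(coef(fit)[c("x3", "x1")])

# Coefficients implied by the time points alone
expected <- c((5 - 1) / (3 - 1), (3 - 5) / (3 - 1))

print(b)
print(expected)
```

Changing the seed, the means, or the standard deviations should leave the two coefficients at 2 and -1, since they depend only on the observation times 1, 3, 5.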