Week 1 Problem 3

R version 2.15.2 (2012-10-26) -- "Trick or Treat"

> w1p4 = read.table(file = "http://www-stat.stanford.edu/~rag/stat222/rosvocabdata" , header = T) # change to statweb.stanford.edu
> w1p4
    Age Vocab
1  0.67     0
2  0.83     1
3  1.00     3
4  1.25    19
5  1.50    22
6  1.75   118
7  2.00   272
8  2.50   446
9  3.00   896
10 3.50  1222
11 4.00  1540
12 4.50  1870
13 5.00  2072
14 5.50  2289
15 6.00  2562
> attach(w1p4)
> plot( Age, Vocab)
 
 # to my eye curve was about at inflection at 6 or 7, turns out algorthm disgrees
> Asym = 6000; xmid = 7; scal = 3.5
> vlog = nls(Vocab ~ SSlogis(Age, Asym, xmid, scal), data = w1p4)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'x'
> ?nls
starting httpd help server ... done

# jitter (or prob just delete first obs), because logistic curve doesn't know
  to start at 0 (maybe baby knows something at age = 0)

> w1p4[1,2] = .25
> w1p4
    Age   Vocab
1  0.67    0.25
2  0.83    1.00
3  1.00    3.00
4  1.25   19.00
5  1.50   22.00
6  1.75  118.00
7  2.00  272.00
8  2.50  446.00
9  3.00  896.00
10 3.50 1222.00
11 4.00 1540.00
12 4.50 1870.00
13 5.00 2072.00
14 5.50 2289.00
15 6.00 2562.00
> vlog = nls(Vocab ~ SSlogis(Age, Asym, xmid, scal),  w1p4)
> summary(vlog)

Formula: Vocab ~ SSlogis(Age, Asym, xmid, scal)

Parameters:
      Estimate Std. Error t value Pr(>|t|)    
Asym 2.520e+03  9.461e+01   26.64 4.81e-12 ***
xmid 3.625e+00  9.730e-02   37.26 8.98e-14 ***
scal 7.465e-01  6.364e-02   11.73 6.23e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 89.02 on 12 degrees of freedom

Number of iterations to convergence: 0 
Achieved convergence tolerance: 3.004e-06 

# much diff starting values give same fit
> Asym = 3000; xmid = 4; scal = 2
> vlog2 = nls(Vocab ~ SSlogis(Age, Asym, xmid, scal),  w1p4)
> summary(vlog2)

Formula: Vocab ~ SSlogis(Age, Asym, xmid, scal)

Parameters:
      Estimate Std. Error t value Pr(>|t|)    
Asym 2.520e+03  9.461e+01   26.64 4.81e-12 ***
xmid 3.625e+00  9.730e-02   37.26 8.98e-14 ***
scal 7.465e-01  6.364e-02   11.73 6.23e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 89.02 on 12 degrees of freedom

Number of iterations to convergence: 0 
Achieved convergence tolerance: 3.004e-06 

> cbind(Vocab, predict(vlog, Time), predict(vlog2, Time))
Error in lapply(X = X, FUN = FUN, ...) : object 'Time' not found
> cbind(Vocab, predict(vlog, Age), predict(vlog2, Age))
      Vocab                      
 [1,]     0   47.22158   47.22158
 [2,]     1   58.24831   58.24831
 [3,]     3   72.71519   72.71519
 [4,]    19  100.48749  100.48749
 [5,]    22  138.26769  138.26769
 [6,]   118  189.14175  189.14175
 [7,]   272  256.71689  256.71689
 [8,]   446  457.16172  457.16172
 [9,]   896  761.46190  761.46190
[10,]  1222 1154.91912 1154.91912
[11,]  1540 1570.16877 1570.16877
[12,]  1870 1924.27986 1924.27986
[13,]  2072 2175.37671 2175.37671
[14,]  2289 2331.05927 2331.05927
[15,]  2562 2419.68882 2419.68882
> # lousy fit in early days (grows faster than logistic?) to asymp 2520

===========================================
# try starting values algorithm from Bates-Pinheiro App C.7

> voc = read.table(file = "http://statweb.stanford.edu/~rag/stat222/rosvocabdata" , header = T)
> voc
    Age Vocab
1  0.67     0
2  0.83     1
3  1.00     3
4  1.25    19
5  1.50    22
6  1.75   118
7  2.00   272
8  2.50   446
9  3.00   896
10 3.50  1222
11 4.00  1540
12 4.50  1870
13 5.00  2072
14 5.50  2289
15 6.00  2562

> voc[1,2] = .25
> voc$Vprime = voc$Vocab/4000  # put response in (0,1)
> voc
    Age   Vocab    Vprime
1  0.67    0.25 0.0000625
2  0.83    1.00 0.0002500
3  1.00    3.00 0.0007500
4  1.25   19.00 0.0047500
5  1.50   22.00 0.0055000
6  1.75  118.00 0.0295000
7  2.00  272.00 0.0680000
8  2.50  446.00 0.1115000
9  3.00  896.00 0.2240000
10 3.50 1222.00 0.3055000
11 4.00 1540.00 0.3850000
12 4.50 1870.00 0.4675000
13 5.00 2072.00 0.5180000
14 5.50 2289.00 0.5722500
15 6.00 2562.00 0.6405000

> attach(voc)
> z = log(Vprime/(1 - Vprime))  #transform scaled response

> init = lm(Age ~ z) # run this 'inverse' regression
> coef(init)
(Intercept)           z 
  4.3272362   0.4802443 
> # so phi2 = 4.33, phi3 = .48 for starting values  (phi2 Xmid and phi3 scale)