Week 1 Problem 3 R version 2.15.2 (2012-10-26) -- "Trick or Treat" > w1p4 = read.table(file = "http://www-stat.stanford.edu/~rag/stat222/rosvocabdata" , header = T) # change to statweb.stanford.edu > w1p4 Age Vocab 1 0.67 0 2 0.83 1 3 1.00 3 4 1.25 19 5 1.50 22 6 1.75 118 7 2.00 272 8 2.50 446 9 3.00 896 10 3.50 1222 11 4.00 1540 12 4.50 1870 13 5.00 2072 14 5.50 2289 15 6.00 2562 > attach(w1p4) > plot( Age, Vocab) # to my eye curve was about at inflection at 6 or 7, turns out algorthm disgrees > Asym = 6000; xmid = 7; scal = 3.5 > vlog = nls(Vocab ~ SSlogis(Age, Asym, xmid, scal), data = w1p4) Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'x' > ?nls starting httpd help server ... done # jitter (or prob just delete first obs), because logistic curve doesn't know to start at 0 (maybe baby knows something at age = 0) > w1p4[1,2] = .25 > w1p4 Age Vocab 1 0.67 0.25 2 0.83 1.00 3 1.00 3.00 4 1.25 19.00 5 1.50 22.00 6 1.75 118.00 7 2.00 272.00 8 2.50 446.00 9 3.00 896.00 10 3.50 1222.00 11 4.00 1540.00 12 4.50 1870.00 13 5.00 2072.00 14 5.50 2289.00 15 6.00 2562.00 > vlog = nls(Vocab ~ SSlogis(Age, Asym, xmid, scal), w1p4) > summary(vlog) Formula: Vocab ~ SSlogis(Age, Asym, xmid, scal) Parameters: Estimate Std. Error t value Pr(>|t|) Asym 2.520e+03 9.461e+01 26.64 4.81e-12 *** xmid 3.625e+00 9.730e-02 37.26 8.98e-14 *** scal 7.465e-01 6.364e-02 11.73 6.23e-08 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 89.02 on 12 degrees of freedom Number of iterations to convergence: 0 Achieved convergence tolerance: 3.004e-06 # much diff starting values give same fit > Asym = 3000; xmid = 4; scal = 2 > vlog2 = nls(Vocab ~ SSlogis(Age, Asym, xmid, scal), w1p4) > summary(vlog2) Formula: Vocab ~ SSlogis(Age, Asym, xmid, scal) Parameters: Estimate Std. Error t value Pr(>|t|) Asym 2.520e+03 9.461e+01 26.64 4.81e-12 *** xmid 3.625e+00 9.730e-02 37.26 8.98e-14 *** scal 7.465e-01 6.364e-02 11.73 6.23e-08 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 89.02 on 12 degrees of freedom Number of iterations to convergence: 0 Achieved convergence tolerance: 3.004e-06 > cbind(Vocab, predict(vlog, Time), predict(vlog2, Time)) Error in lapply(X = X, FUN = FUN, ...) : object 'Time' not found > cbind(Vocab, predict(vlog, Age), predict(vlog2, Age)) Vocab [1,] 0 47.22158 47.22158 [2,] 1 58.24831 58.24831 [3,] 3 72.71519 72.71519 [4,] 19 100.48749 100.48749 [5,] 22 138.26769 138.26769 [6,] 118 189.14175 189.14175 [7,] 272 256.71689 256.71689 [8,] 446 457.16172 457.16172 [9,] 896 761.46190 761.46190 [10,] 1222 1154.91912 1154.91912 [11,] 1540 1570.16877 1570.16877 [12,] 1870 1924.27986 1924.27986 [13,] 2072 2175.37671 2175.37671 [14,] 2289 2331.05927 2331.05927 [15,] 2562 2419.68882 2419.68882 > # lousy fit in early days (grows faster than logistic?) to asymp 2520 =========================================== # try starting values algorithm from Bates-Pinheiro App C.7 > voc = read.table(file = "http://statweb.stanford.edu/~rag/stat222/rosvocabdata" , header = T) > voc Age Vocab 1 0.67 0 2 0.83 1 3 1.00 3 4 1.25 19 5 1.50 22 6 1.75 118 7 2.00 272 8 2.50 446 9 3.00 896 10 3.50 1222 11 4.00 1540 12 4.50 1870 13 5.00 2072 14 5.50 2289 15 6.00 2562 > voc[1,2] = .25 > voc$Vprime = voc$Vocab/4000 # put response in (0,1) > voc Age Vocab Vprime 1 0.67 0.25 0.0000625 2 0.83 1.00 0.0002500 3 1.00 3.00 0.0007500 4 1.25 19.00 0.0047500 5 1.50 22.00 0.0055000 6 1.75 118.00 0.0295000 7 2.00 272.00 0.0680000 8 2.50 446.00 0.1115000 9 3.00 896.00 0.2240000 10 3.50 1222.00 0.3055000 11 4.00 1540.00 0.3850000 12 4.50 1870.00 0.4675000 13 5.00 2072.00 0.5180000 14 5.50 2289.00 0.5722500 15 6.00 2562.00 0.6405000 > attach(voc) > z = log(Vprime/(1 - Vprime)) #transform scaled response > init = lm(Age ~ z) # run this 'inverse' regression > coef(init) (Intercept) z 4.3272362 0.4802443 > # so phi2 = 4.33, phi3 = .48 for starting values (phi2 Xmid and phi3 scale)