Usual Honor Code procedures: You may use any of your own inanimate resources--no collaboration or assistance from others. This work is done under Stanford's Honor Code.

If any issues come up (wording, interpretation), I will post a note here, so it would be good to check this page periodically.

Given that these problems are untimed, some care should be taken in presentation, clarity, format.

Especially important is to give full and clear answers to questions, not just to submit unannotated computer output, although relevant output should be included.

PLEASE check that you have answered all the parts and subparts. Please start each problem on a new page and keep all material for a problem contiguous.

In prior years, solutions for these problem sets were submitted in hard-copy form at the final class meeting.

Things are different this year.

I certainly will accept hard copy either mailed to my Palo Alto home or left on my front porch (as a student did last spring).

I expect most students will seek some form of electronic submission.

Last spring quarter, students used a variety of methods: posting to their Google Drive (e.g. pdf files) and giving me access, or mailing me rendered html files. The most popular method, posting an html file on rpubs, was great for a project course but doesn't provide adequate security for an exam file.

We will address submission options further toward the end of the quarter. I am quite interested in any useful suggestions.

Of course, and I probably don't need to say this, do not mail me a Word binary file (you shouldn't do that to anyone).

Course Problem Set, Item 1.

Change over time, measured outcome

A small subsample of data (16 respondents) from the National Youth Survey is obtained in long-form by

and in wide form by

Yearly observations from ages 11 to 15 on the tolerance measure (tolerance to deviant behavior, e.g. cheat, drug, steal, beat; larger values indicate more tolerance on a 1 to 4 scale). Also in this data set are gender (is_male) and an

Course Problem Set, Item 2.

Time1-Time2 Clinical Trial

The 3 billion dollar (and counting) change score

(items below clipped from 2017 various press reports; we do not have the data)

Sage Therapeutics (NASDAQ:SAGE) surged in response to its announcement of positive results from a Phase 2 clinical trial assessing SAGE-217 for the treatment of adult patients with moderate-to-severe major depressive disorder (MDD), a Fast Track indication in the U.S. [2020 note: name, Zuranolone]

It is estimated that there are around 16 million people in the United States with MDD.

SAGE-217, a neuroactive steroid, is a next-generation GABA modulator. The GABA system, the major inhibitory signaling pathway in the brain and central nervous system (CNS), plays a key role in regulating CNS function. The company intends to advance the program into Phase 3 development.

The phase 2 looked at the effect of the positive allosteric modulator of the gamma-aminobutyric acid (GABA) receptor as compared to placebo in 89 patients with MDD.

About the Placebo-controlled Phase 2 trial of SAGE-217 in MDD:

In the randomized, double-blind, parallel-group, placebo-controlled trial, 89 eligible patients (with a minimum total score of 22 on the Hamilton Rating Scale for Depression) were stratified based on use of antidepressant treatment (current/stable or not treated/withdrawn >= 30 days) and randomized in a 1:1 ratio to receive SAGE-217 Capsules (30mg) (n=45) or matching placebo (n=44). All doses of study drug were administered at night with food. The study consisted of a 14-day treatment period, and a 4-week follow-up period. The mean HAM-D total scores at baseline were 25.2 for the SAGE-217 group and 25.7 for the placebo group (overall range 22-33), representing patients with moderate to severe MDD. Approximately 90 percent of patients in each group completed the study.

Sage Therapeutics (NASDAQ: SAGE), a clinical-stage biopharmaceutical company developing novel medicines to treat life-altering central nervous system (CNS) disorders, today announced positive top-line results from the Phase 2, double-blind, placebo-controlled clinical trial of SAGE-217 in the treatment of 89 adult patients with moderate to severe major depressive disorder (MDD). In the trial, treatment for 14 days with SAGE-217 was associated with a statistically significant mean reduction in the Hamilton Rating Scale for Depression (HAM-D) 17-Item total score from baseline to Day 15 (the time of the primary endpoint) of 17.6 points for SAGE-217, compared to 10.7 for placebo (p<0.0001). Statistically significant improvements were observed in the HAM-D compared to placebo by the morning following the first dose through Week 4 and the effects of SAGE-217 remained numerically greater than placebo through the end of follow-up at Week 6. At Day 15, 64 percent of patients who received SAGE-217 achieved remission, defined as a score of 7 or less on the HAM-D scale, compared with 23 percent of patients who received placebo (p=0.0005).

The 89-subject study met its primary endpoint of a statistically significant average reduction in HAM-D score from baseline to day 15 (p<0.0001) versus placebo. HAM-D is a rating scale for depression. At day 15, 64% of patients in the treatment group achieved remission compared to 23% for placebo (p=0.0005).

There were a total of 89 patients recruited into the study who were either given SAGE-217 or a placebo compound. Patients were treated for a 14 day period and were then measured for clinical outcome using the Hamilton Rating Scale for Depression or HAM-D 17-item total score from baseline. It was shown that SAGE-217 achieved a statistically significant improvement over placebo according to the HAM-D scale. Patients that took SAGE-217 were shown to achieve a 17.6 point improvement at day 15, compared to only a 10.7 point improvement for placebo. That meant that the drug achieved a statistically significant p-value of p < 0.0001. It was also noted that 64% of patients who took SAGE-217 achieved MDD remission, compared to only 23% of placebo patients. MDD remission was classified as patients having a HAM-D score of 7 or less. This was the secondary endpoint of the study, which was also achieved.

Investigators saw a statistically significant improvement in SAGE-217 patients on a depression scale the day after the first dose. By the time the two-week treatment period came to an end, the mean score in the SAGE-217 arm had dropped 17.6 points, as compared to a 10.7 point decline in the control group. That seven-point placebo-adjusted improvement was enough for the trial to hit its primary endpoint with a p value of less than 0.0001.

The positive results continued beyond the end of the treatment period. The mean reduction on the depression scale in the treatment arm remained statistically superior to that of the placebo group two weeks after dosing stopped.

--------------------------------------------------------

Consider the remission outcome (secondary) at day 15 (after 14 days of dose).

> sage2 = matrix(c(29, 10, 16, 34), nr=2)  # remission counts for the two groups
> sage2
     [,1] [,2]
[1,]   29   16
[2,]   10   34
> prop.test(sage2)

        2-sample test for equality of proportions with continuity correction

data:  sage2
X-squared = 14.078, df = 1, p-value = 0.0001754
alternative hypothesis: two.sided
95 percent confidence interval:
 0.2079003 0.6264431
sample estimates:
   prop 1    prop 2
0.6444444 0.2272727

> chisq.test(sage2)

        Pearson's Chi-squared test with Yates' continuity correction

data:  sage2
X-squared = 14.078, df = 1, p-value = 0.0001754
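As a quick arithmetic check, the counts in the 2x2 table can be tied back to the group sizes (45 and 44) and the remission percentages (64 and 23 percent) quoted in the press reports. This is only a base-R sketch; the row and column labels are added here for readability and are not part of the original output.

```r
# Remission counts from the table above: rows = groups, columns = remission yes/no.
# The dimnames are illustrative labels, not part of the original sage2 object.
sage2 <- matrix(c(29, 10, 16, 34), nrow = 2,
                dimnames = list(group = c("SAGE-217", "placebo"),
                                remission = c("yes", "no")))

n_per_group <- rowSums(sage2)                 # patients randomized to each arm
prop_remit  <- sage2[, "yes"] / n_per_group   # remission proportion in each arm

n_per_group
round(100 * prop_remit)                       # percentages to compare with the reports
```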

In weeks 4 and 5 we conducted analysis of time1-time2 (and multiwave) outcome data for comparisons of experimental groups. For the SAGE study pretend we have long form data, with time coded 0 for baseline and 1 for Day 15 endpoint, and outcome HAM-D score at the timepoints (0,1) and group indicating T/P. So we have 178 rows, and columns HAM-D, group, time, subj.
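To make the pretend long-form layout concrete, here is a skeleton of such a data frame in base R. It is a sketch of the structure only: the outcome column is left as NA because the patient-level data are not available, the column name HAMD is an assumption (HAM-D is not a syntactic R name), and the ordering of subjects within groups is arbitrary.

```r
# Skeleton of the long-form layout: 89 subjects x 2 timepoints = 178 rows.
# HAM-D values are NA on purpose; we do not have the patient-level data.
n_T <- 45   # SAGE-217 arm
n_P <- 44   # placebo arm

sage_long <- data.frame(
  subj  = rep(seq_len(n_T + n_P), each = 2),           # subject id, two rows each
  time  = rep(c(0, 1), times = n_T + n_P),             # 0 = baseline, 1 = Day 15
  group = rep(c("T", "P"), times = 2 * c(n_T, n_P)),   # treatment / placebo label
  HAMD  = NA_real_                                     # outcome placeholder
)

dim(sage_long)
head(sage_long, 4)
```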

If we fit the model in R syntax

from the information you have, give the point estimates for the fixed effects time and time:group.

Write out the level 1, level 2 model corresponding to the combined model in the lmer statement.

Course Problem Set, Item 3.

Comparison of experimental groups--placebo-controlled, randomized study

note: problem text reformatted for clarity

Treatment of Lead Exposed Children (TLC) Trial. Data (wide form) and description reside at Laird-Ware text site

Just wide-form data with no column headers

(i) Carry out a statistical analysis for estimating the relative effectiveness of chelation treatment (succimer) compared with placebo (A,P) using mixed-effects models or repeated measures anova.

(ii) Show the equivalence from the Brogan-Kutner paper between the simple t-test on improvement (change) and your analysis in part (i).

Course Problem Set, Item 4.

Basic Survival analysis, right censoring.

> str(melanom)
'data.frame':   205 obs. of  6 variables:
 $ no    : int  789 13 97 16 21 469 685 7 932 944 ...
 $ status: int  3 3 2 3 1 1 1 1 3 1 ...
 $ days  : int  10 30 35 99 185 204 210 232 232 279 ...
 $ ulc   : int  1 2 2 2 1 1 1 1 1 1 ...
 $ thick : int  676 65 134 290 1208 484 516 1288 322 741 ...
 $ sex   : int  2 2 2 1 2 2 2 2 1 1 ...

We are interested in

days: time on study after operation for malignant melanoma

status: the patient's status at the end of study

Documentation shows the possible values of status are: 1: dead from malignant melanoma 2: alive at end of study 3: dead from other causes. Consider 'dead from other causes' as censored (along with alive). Thus you can either recode status as (1,0) or use a logical for status vector to be status == 1 and the survival object is

a. How many survival times are censored? Obtain an estimate of the survival curve at each event time (along with CI) using the Kaplan-Meier estimate and plot the survival curve and confidence interval.

b. Does survival differ in men and women? Plot the male and female survival curves. Compare asymptotic (log-rank) and exact tests [see note below] for gender differences. Compare the exact test with a bootstrap approximation.

--------------------------------

In Week 7 class examples (aml/leukemia data from miller) and in Week 7 RQ 2 (rat data) we showed the use of an exact test for 2-group survival comparisons. These were relatively small data sets (rat has 40 subjects). The melanoma data is larger-- 205 subjects, not a giant data set. But depending on the size of your machine (such as a laptop) you may well not have enough memory to carry out the exact test. Even with a moderately large machine, the default maximum memory limit set in R may be too low (easy enough to change, but maybe not worth the trouble for this exercise). So if you run into problems conducting the exact test, it is entirely adequate to show the memory failure and then revert to the approximate bootstrap option shown in the class example and in Week 7 RQ2 (that's kind of what it is there for). Try the bootstrap approximation option with 1000 and then 10000 replications and see if the results match.
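When comparing the 1000- and 10000-replication runs, it may help to keep in mind how much two Monte Carlo p-values can differ by chance alone. A resampling p-value estimated from B replications behaves roughly like a binomial proportion, so its standard error is about sqrt(p(1-p)/B). The sketch below is plain base R; the value p = 0.05 is a hypothetical placeholder and not a result for the melanoma data.

```r
# Approximate Monte Carlo standard error of a resampling p-value
# estimated from B replications (binomial-proportion approximation).
mc_se <- function(p, B) sqrt(p * (1 - p) / B)

p_hypothetical <- 0.05            # placeholder value, NOT the melanoma answer
B_values <- c(1000, 10000)

se <- mc_se(p_hypothetical, B_values)
names(se) <- paste0("B=", B_values)
signif(se, 3)                     # the larger B shrinks the error by sqrt(10)
```

Two runs that differ by no more than a couple of these standard errors can reasonably be said to match.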

------------------------------

c. Use Cox regression to carry out the gender comparison of the survival curves in part b. Obtain a confidence interval for the effect of gender on the hazard.

a. Repeat the gender comparison in parts b or c in Ex 2, week 7, stratifying on ulceration of the tumor (or not). Compare with the result in Ex 2 week 7 and interpret.

b. Carry out a Cox regression using predictors log(thick) and the gender indicator, stratifying on ulceration. Interpret the results. Check the viability of the proportional hazards assumption for this Cox model.

Course Problem Set, Item 5.

Channing House is a retirement centre in Palo Alto, California. These data were collected between the opening of the house in 1964 until July 1, 1975. In that time 97 men and 365 women passed through the centre. For each of these, their age on entry and also on leaving or death was recorded. A large number of the observations were censored mainly due to the resident being alive on July 1, 1975 when the data was collected. Over the time of the study 130 women and 46 men died at Channing House. Differences between the survival of the sexes, taking age into account, was one of the primary concerns of this study.

sex: A factor for the sex of each resident ("Male" or "Female").

entry: The residents age (in months) on entry to the centre.

exit: The age (in months) of the resident on death, leaving the centre or July 1, 1975, whichever event occurred first.

time: Length of time (in months) that the resident spent at Channing House (time = exit - entry).

cens: Indicator of right censoring. 1 indicates that the resident died at Channing House, 0 indicates that they left the house prior to July 1, 1975 or that they were still alive and living in the centre at that date.

> head(channing)
   sex entry exit time cens
1 Male   782  909  127    1
2 Male  1020 1128  108    1
3 Male   856  969  113    1
4 Male   915  957   42    1
5 Male   863  983  120    1
6 Male   906 1012  106    1
> 780/12   # in years just for reference
[1] 65

# some quick descriptives
> attach(channing)
> table(cens, sex)
    sex
cens Female Male
   0    235   51
   1    130   46
> tapply(time, sex, quantile)
$Female
  0%  25%  50%  75% 100%
   0   36   87  137  137

$Male
  0%  25%  50%  75% 100%
   0   33   72  120  137

> tapply(entry, sex, quantile)
$Female
  0%  25%  50%  75% 100%
 733  851  899  948 1140

$Male
  0%  25%  50%  75% 100%
 751  863  919  967 1073

# Kaplan-Meier fits and tests
> kmfit = survfit(Surv(time, cens) ~ sex, data = channing)
> kmfit
Call: survfit(formula = Surv(time, cens) ~ sex, data = channing)

           records n.max n.start events median 0.95LCL 0.95UCL
sex=Female     365   365     365    130     NA     124      NA
sex=Male        97    97      97     46    108      82      NA
> plot(kmfit, conf.int=TRUE)   ## SEE PLOT
> survdiff(Surv(time, cens) ~ sex, data = channing)
Call:
survdiff(formula = Surv(time, cens) ~ sex, data = channing)

             N Observed Expected (O-E)^2/E (O-E)^2/V
sex=Female 365      130    143.1      1.19      6.41
sex=Male    97       46     32.9      5.18      6.41

 Chisq= 6.4  on 1 degrees of freedom, p= 0.0113
> surv_test(Surv(time, cens) ~ sex, data = channing, distribution=approximate(B = 4000))

        Approximative Logrank Test

data:  Surv(time, cens) by sex (Female, Male)
Z = -2.4251, p-value = 0.015
alternative hypothesis: two.sided

# Cox regressions
> coxfit1 = coxph(Surv(time, cens) ~ sex, data = channing)
> coxfit1
Call:
coxph(formula = Surv(time, cens) ~ sex, data = channing)

         coef exp(coef) se(coef)    z     p
sexMale 0.432      1.54    0.172 2.52 0.012

Likelihood ratio test=5.89  on 1 df, p=0.0153  n= 462, number of events= 176
> summary(coxfit1)
Call:
coxph(formula = Surv(time, cens) ~ sex, data = channing)

  n= 462, number of events= 176

          coef exp(coef) se(coef)     z Pr(>|z|)
sexMale 0.4321    1.5406   0.1718 2.516   0.0119 *
---

        exp(coef) exp(-coef) lower .95 upper .95
sexMale     1.541     0.6491       1.1     2.157

Concordance= 0.541  (se = 0.016 )
Rsquare= 0.013   (max possible= 0.985 )
Likelihood ratio test= 5.89  on 1 df,   p=0.01526
Wald test            = 6.33  on 1 df,   p=0.01187
Score (logrank) test = 6.43  on 1 df,   p=0.01123

> coxfit2 = coxph(Surv(time, cens) ~ sex + entry, data = channing)
> coxfit2
Call:
coxph(formula = Surv(time, cens) ~ sex + entry, data = channing)

           coef exp(coef) se(coef)    z       p
sexMale 0.38004      1.46  0.17191 2.21 2.7e-02
entry   0.00724      1.01  0.00105 6.90 5.4e-12

Likelihood ratio test=50.9  on 2 df, p=8.67e-12  n= 462, number of events= 176
> summary(coxfit2)
Call:
coxph(formula = Surv(time, cens) ~ sex + entry, data = channing)

  n= 462, number of events= 176

            coef exp(coef) se(coef)     z Pr(>|z|)
sexMale 0.380045  1.462350 0.171914 2.211   0.0271 *
entry   0.007239  1.007266 0.001050 6.896 5.36e-12 ***
---

        exp(coef) exp(-coef) lower .95 upper .95
sexMale     1.462     0.6838     1.044     2.048
entry       1.007     0.9928     1.005     1.009

Concordance= 0.65  (se = 0.023 )
Rsquare= 0.104   (max possible= 0.985 )
Likelihood ratio test= 50.94  on 2 df,   p=8.671e-12
Wald test            = 53.05  on 2 df,   p=3.02e-12
Score (logrank) test = 54.2  on 2 df,   p=1.703e-12

> anova(coxfit1, coxfit2)
Analysis of Deviance Table
 Cox model: response is  Surv(time, cens)
 Model 1: ~ sex
 Model 2: ~ sex + entry
   loglik  Chisq Df P(>|Chi|)
1 -972.57
2 -950.04 45.056  1 1.915e-11 ***
> cox.zph(coxfit2)
            rho chisq     p
sexMale -0.0304 0.162 0.687
entry    0.0481 0.400 0.527
GLOBAL       NA 0.553 0.759
> plot(cox.zph(coxfit2))

> k = 1:462
> channing$ID = k
> head(channing)
   sex entry exit time cens ID
1 Male   782  909  127    1  1
2 Male  1020 1128  108    1  2
3 Male   856  969  113    1  3
4 Male   915  957   42    1  4
5 Male   863  983  120    1  5
6 Male   906 1012  106    1  6
> coxfit4 = coxme(Surv(time, cens) ~ sex + entry + (1|ID), data = channing)
> print(coxfit4)
Cox mixed-effects model fit by maximum likelihood
  Data: channing
  events, n = 176, 462
  Iterations= 100 503
                    NULL Integrated    Fitted
Log-likelihood -975.5084  -950.0293 -947.0135

                  Chisq   df          p   AIC   BIC
Integrated loglik 50.96 3.00 4.9937e-11 44.96 35.45
 Penalized loglik 56.99 4.97 4.8849e-11 47.05 31.29

Model:  Surv(time, cens) ~ sex + entry + (1 | ID)
Fixed coefficients
               coef exp(coef)    se(coef)    z       p
sexMale 0.382989266  1.466662 0.173129158 2.21 2.7e-02
entry   0.007286551  1.007313 0.001058856 6.88 5.9e-12

Random effects
 Group Variable  Std Dev    Variance
 ID    Intercept 0.13172664 0.01735191
> fixed.effects(coxfit4)
    sexMale       entry
0.382989266 0.007286551
> exp(fixed.effects(coxfit4))
 sexMale    entry
1.466662 1.007313
> stem(exp(ranef(coxfit4)[[1]]))

  The decimal point is 2 digit(s) to the left of the |

   96 | 34
   96 | 8
   97 | 1234
   97 | 6689
   98 | 01111222244
   98 | 5555566666666677777778888888999999999999
   99 | 00000000000011111111111111112222222222222222223333333333333333333444+4
   99 | 55555555555555555666666666666666666667777777777777777777777778888888+33
  100 | 00000000000000000000000000000000000011222223334444
  100 | 555666677777778888999999999
  101 | 00000000111111111111111222222222222222233333333333333344444444444444
  101 | 5555555555555555556666666666666667777777777777777777777777

> quantile(exp(ranef(coxfit4)[[1]]))
       0%       25%       50%       75%      100%
0.9633244 0.9928323 0.9986529 1.0106724 1.0174679
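For readers who want to see where two of the numbers in the output above come from, the following base-R sketch recomputes them by hand from the printed values. It assumes the usual Wald interval (estimate plus or minus a normal quantile times the standard error, then exponentiated) and the usual likelihood-ratio statistic (twice the difference in log-likelihoods); the inputs are rounded values copied from the printout, so the results should land close to, not exactly on, the printed figures.

```r
# Values copied (rounded) from summary(coxfit1) and anova(coxfit1, coxfit2) above.
b_sex  <- 0.4321      # coef for sexMale in coxfit1
se_sex <- 0.1718      # se(coef) for sexMale in coxfit1

# Wald 95% interval for the hazard ratio; compare with "lower .95 / upper .95".
ci_hr <- exp(b_sex + c(-1, 1) * qnorm(0.975) * se_sex)
round(ci_hr, 3)

# Likelihood-ratio statistic for adding entry; compare with Chisq in the anova table.
loglik1 <- -972.57
loglik2 <- -950.04
lrt <- 2 * (loglik2 - loglik1)
lrt
pchisq(lrt, df = 1, lower.tail = FALSE)   # upper-tail chi-square p-value on 1 df
```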

a. Compare median survival times for males and females obtained by ignoring censoring, with median survival times from Kaplan-Meier. Comment on similarities or differences. Why does KM show NA for female median survival time?

b. Are gender differences in survival statistically significant? Can differences in survival be attributed to gender or Channing House?

c. For the Cox regressions, cox2 (coxfit2 in the output) adds entry age to the model in cox1 (coxfit1). Is entry age a significant predictor in this model?

d. Does the proportional hazards assumption appear to be satisfactory for cox2?

e. What is the point estimate of the proportional increase or decrease in hazard (give direction) for an increase of 1 year in entry age in the cox2 model?

f. Explain what I am doing in the formulation of the cox4 model (coxfit4 in the output).

g. Does cox4 appear to improve the fit compared to cox2? Give a rough test statistic.

h. For the cox4 fit, what is the magnitude of the individual effects, i.e. how much do individual effects impact the estimated hazards?