Problem 2
Right-heart catheterization (RHC)
In September 1996, a NY Times headline proclaimed "Safety of Catheter Into the Heart Is Questioned, Startling Doctors" (NYT is good on science)
https://www.nytimes.com/1996/09/18/us/safety-of-catheter-into-the-heart-is-questioned-startling-doctors.html
Research article
Pubmed link: https://www.ncbi.nlm.nih.gov/pubmed/8782638
The effectiveness of right heart catheterization in the initial care of critically ill patients. JAMA. 1996 Sep 18;276(11):889-97
One posting of the JAMA paper: http://www.zirkin.com/em/articles/General%20Critical%20Care/Core/PAC/Connors1996.pdf
The publication (using SAS at the time) computed propensity scores from a logistic regression on the 60 patient characteristics.
From those propensity scores, 1008 matched pairs were constructed (using approximately half the RHC group and a third of the non-RHC group).
The data, for 5735 critically ill adult patients receiving care in an ICU at five US teaching hospitals between 1989 and 1994,
are available from the Hmisc package and directly by
rhc <- read.csv("http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/rhc.csv")
A variable listing is at: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/rhc.html
The JAMA publication has a quick listing (with medical acronyms) on p. 891.
[aside: on checking yesterday there seemed to be an issue with installing Hmisc (likely temporary);
if you already have this collection of Frank Harrell routines installed, you can extract the rhc dataset by
Hmisc::getHdata(rhc, what = "all")
which should be identical to his online posting ]
In these data the treatment variable is swang1, listed as: Swang1 Right Heart Catheterization (RHC)
The outcome variable is death, listed as: Death Death at any time up to 180 Days
In our dataset we match the publication's numbers on treatment (RHC vs. no RHC), but the outcome indicator (death) totals are a little
different (perhaps a revised definition).
Problem 2
a. If you could assume that critically ill patients at these hospitals chose rhc (swang1) or not
completely at random (i.e. unassociated with any characteristics, especially those that would influence death),
what would your assessment of the effect of rhc on survival be? Express as a multiplicative effect on the odds of death (an odds ratio), with a confidence interval.
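For part a, the unadjusted comparison reduces to a 2x2 table of treatment by death. A minimal base-R sketch of an odds ratio with a Wald interval follows; the function name `odds_ratio_wald` and its arguments are hypothetical, and the counts in the usage line are toy numbers, not the rhc data.

```r
# Unadjusted odds ratio with a Wald confidence interval from 2x2 counts.
# Hypothetical helper; argument names are illustrative only.
#   d1, n1: deaths and group size among treated (RHC)
#   d0, n0: deaths and group size among controls (no RHC)
odds_ratio_wald <- function(d1, n1, d0, n0, level = 0.95) {
  s1 <- n1 - d1                       # treated survivors
  s0 <- n0 - d0                       # control survivors
  log_or <- log((d1 * s0) / (s1 * d0))
  se <- sqrt(1 / d1 + 1 / s1 + 1 / d0 + 1 / s0)   # SE of the log odds ratio
  z <- qnorm(1 - (1 - level) / 2)
  c(or = exp(log_or),
    lower = exp(log_or - z * se),
    upper = exp(log_or + z * se))
}

# Toy counts only (NOT the rhc data)
toy <- odds_ratio_wald(d1 = 30, n1 = 50, d0 = 20, n0 = 50)
print(toy)

# With the course data, the counts could come from table(rhc$swang1, rhc$death);
# a logistic regression of death on swang1 with exp(confint.default(fit))
# should give essentially the same Wald interval.
```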
b. Try out 1:1 pair matching for the full set of 2184 rhc patients. The full data set has 60 patient characteristics available
for matching methods. For some level of convenience in this exercise I reduced (with no medical wisdom) that list to 22 which I implemented by
p2data = rhc[, c("death", "swang1", "age" , "sex" , "edu" , "das2d3pc" , "dnr1", "ca", "surv2md1", "wtkilo1", "temp1", "meanbp1", "resp1", "hrt1", "pafi1", "crea1", "bili1", "resp", "card", "neuro", "gastr", "hema", "seps", "chfhx")]
Compare the 'success' in balancing these 22 characteristics using either traditional nearest neighbor or optimal matching. Implement via optmatch or through the MatchIt wrapper.
For the more successful of those two methods conduct the outcome analysis using these matched pairs; give a point and interval estimate (for multiplicative odds of death).
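One possible shape for part b is sketched below, under assumptions: that MatchIt (plus optmatch for method = "optimal") is installed, and that swang1 is coded "RHC" / "No RHC" as in rhc.csv. The helper names `compare_pair_matching` and `paired_or` are hypothetical. The first function is only defined here, not run; the second is a base-R conditional odds ratio for 1:1 pairs (ratio of discordant pairs), which is one of several reasonable outcome analyses for matched pairs.

```r
# Sketch: fit nearest-neighbor and optimal 1:1 matches and return balance tables.
# Assumes MatchIt and optmatch are installed; not executed in this block.
compare_pair_matching <- function(p2data) {
  stopifnot(requireNamespace("MatchIt", quietly = TRUE))
  d <- p2data
  d$treat <- as.integer(d$swang1 == "RHC")   # assumed coding of swang1
  covs <- setdiff(names(d), c("death", "swang1", "treat"))
  f <- reformulate(covs, response = "treat")
  m_nn  <- MatchIt::matchit(f, data = d, method = "nearest", distance = "glm", ratio = 1)
  m_opt <- MatchIt::matchit(f, data = d, method = "optimal", distance = "glm", ratio = 1)
  # sum.matched holds standardized mean differences etc. after matching
  list(nearest = summary(m_nn)$sum.matched,
       optimal = summary(m_opt)$sum.matched,
       fits = list(nearest = m_nn, optimal = m_opt))
}

# Conditional odds ratio for 1:1 matched pairs from the discordant pairs.
#   death, treat: 0/1 vectors; pair: pair identifier (e.g. the subclass
#   column returned by MatchIt::match.data()).
paired_or <- function(death, treat, pair, level = 0.95) {
  y1 <- tapply(death[treat == 1], pair[treat == 1], sum)  # treated outcome per pair
  y0 <- tapply(death[treat == 0], pair[treat == 0], sum)  # control outcome per pair
  y0 <- y0[names(y1)]
  n10 <- sum(y1 == 1 & y0 == 0, na.rm = TRUE)  # treated died, control survived
  n01 <- sum(y1 == 0 & y0 == 1, na.rm = TRUE)  # control died, treated survived
  z <- qnorm(1 - (1 - level) / 2)
  se <- sqrt(1 / n10 + 1 / n01)
  c(or = n10 / n01,
    lower = exp(log(n10 / n01) - z * se),
    upper = exp(log(n10 / n01) + z * se))
}

# Toy pairs only (NOT the rhc data): six pairs, three with only the treated
# member dying and one with only the control member dying.
toy_death <- c(1, 0,  1, 0,  0, 1,  1, 1,  0, 0,  1, 0)
toy_treat <- rep(c(1, 0), times = 6)
toy_pair  <- rep(1:6, each = 2)
toy_pairs <- paired_or(toy_death, toy_treat, toy_pair)
print(toy_pairs)
```

The discordant-pair estimate ignores concordant pairs by design; a conditional logistic regression stratified on pair would be an equivalent model-based route.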
c. A popular alternative to actual matching is IPW, aka propensity score weighting or Inverse Probability of Treatment Weighting (IPTW).
For the part b data, use these methods to give a point and interval estimate (for multiplicative odds of death) for the ATT.
Compare with part b. Why do I ask for ATT when comparing with part b?
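For part c, ATT weighting gives each treated patient weight 1 and each control weight ps / (1 - ps), where ps is the estimated propensity score. The sketch below uses simulated data (not rhc) and hypothetical helper names; for the interval it swaps in a simple percentile bootstrap so that everything stays in base R, whereas a weighted logistic regression with robust (sandwich) standard errors would be a common alternative.

```r
# ATT weights: treated get 1, controls get the odds of treatment ps / (1 - ps).
att_weights <- function(treat, ps) ifelse(treat == 1, 1, ps / (1 - ps))

# Marginal odds ratio of death from weighted proportions in each arm.
ipw_att_or <- function(death, treat, ps) {
  w <- att_weights(treat, ps)
  p1 <- weighted.mean(death[treat == 1], w[treat == 1])
  p0 <- weighted.mean(death[treat == 0], w[treat == 0])
  (p1 / (1 - p1)) / (p0 / (1 - p0))
}

# Simulated data only (NOT the rhc data); one covariate x stands in for the 22.
set.seed(1)
n <- 500
x <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.5 * x))
death <- rbinom(n, 1, plogis(-0.5 + 0.4 * treat + 0.8 * x))
sim <- data.frame(death, treat, x)

# Re-estimate the propensity score inside each fit so the bootstrap
# reflects the uncertainty in the weights as well as in the outcome.
ipw_att_fit <- function(d) {
  ps <- fitted(glm(treat ~ x, family = binomial, data = d))
  ipw_att_or(d$death, d$treat, ps)
}

est <- ipw_att_fit(sim)
boot <- replicate(200, ipw_att_fit(sim[sample(n, replace = TRUE), ]))
ci <- quantile(boot, c(0.025, 0.975))
print(est)
print(ci)
```

With p2data, the propensity model would instead regress the treatment indicator on the 22 selected characteristics.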
d. Again with the reduced p2data in part b, try out full matching via optmatch or through the MatchIt wrapper.
Compare the 'success' in balancing these 22 characteristics with the 1:1 matching in part b.
How many subclasses does the full matching produce? What is the typical size of a subclass, and what is the largest subclass?
Conduct the outcome analysis using these subclasses from full matching; give a point and interval estimate (for multiplicative odds of death),
and compare with part b results.
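A rough sketch for part d follows, assuming MatchIt and optmatch are installed and that swang1 and death are coded "RHC" / "No RHC" and "Yes" / "No" as in rhc.csv. The names `subclass_sizes` and `full_match_p2` are hypothetical. The full-matching function is defined but not run here; the weighted quasibinomial fit inside it gives a point estimate, and its default standard errors ignore the subclass structure, so a cluster-robust or bootstrap interval would be the more defensible choice.

```r
# Summarize subclass sizes from a vector of subclass labels.
subclass_sizes <- function(subclass) {
  sizes <- as.vector(table(droplevels(factor(subclass))))
  c(n_subclasses = length(sizes),
    median_size = median(sizes),
    max_size = max(sizes))
}

# Sketch: full matching on the reduced covariate set, then a weighted
# outcome model. Assumes MatchIt and optmatch are installed; not executed here.
full_match_p2 <- function(p2data) {
  stopifnot(requireNamespace("MatchIt", quietly = TRUE),
            requireNamespace("optmatch", quietly = TRUE))
  d <- p2data
  d$treat <- as.integer(d$swang1 == "RHC")   # assumed coding of swang1
  d$died  <- as.integer(d$death == "Yes")    # assumed coding of death
  covs <- setdiff(names(d), c("death", "swang1", "treat", "died"))
  m <- MatchIt::matchit(reformulate(covs, response = "treat"), data = d,
                        method = "full", distance = "glm", estimand = "ATT")
  md <- MatchIt::match.data(m)               # adds 'weights' and 'subclass' columns
  fit <- glm(died ~ treat, family = quasibinomial, data = md, weights = weights)
  list(balance = summary(m)$sum.matched,
       sizes = subclass_sizes(md$subclass),
       odds_ratio = exp(coef(fit)[["treat"]]),
       fit = fit)
}

# Toy labels only, to show the size summary
toy_sizes <- subclass_sizes(c("a", "a", "b", "b", "b", "c"))
print(toy_sizes)
```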
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
END Problem 2