| rotterdam {survival} | R Documentation |
The rotterdam data set includes 2982 primary breast cancer patients
whose records were included in the Rotterdam tumor bank.
data("rotterdam")
A data frame with 2982 observations on the following 15 variables.
pid      patient identifier
year     year of cancer incidence
age      age
meno     menopausal status (0 = premenopausal, 1 = postmenopausal)
size     tumor size, a factor with levels <=20 20-50 >50
grade    tumor grade
nodes    number of positive lymph nodes
pgr      progesterone receptors (fmol/l)
er       estrogen receptors (fmol/l)
hormon   hormonal treatment (0 = no, 1 = yes)
chemo    chemotherapy
rtime    days to recurrence or last follow-up
recur    0 = no recurrence, 1 = recurrence
dtime    days to death or last follow-up
death    0 = alive, 1 = dead
The rotterdam and gbsg data sets are used in the paper by Royston and Altman. The Rotterdam data are used to fit a model, and the GBSG data are used to validate that model. The paper gives references for the data source.
Patrick Royston and Douglas Altman, External validation of a Cox prognostic model: principles and methods. BMC Medical Research Methodology 2013, 13:33
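library(survival)  # provides rotterdam, Surv(), coxph() and pspline()
# Recurrence-free survival: time to the first of recurrence or death
rfstime <- pmin(rotterdam$rtime, rotterdam$dtime)
status <- pmax(rotterdam$recur, rotterdam$death)
# Cox model fitted to node-positive patients only
fit1 <- coxph(Surv(rfstime, status) ~ pspline(age) + meno + size +
                  pspline(nodes) + er,
              data = rotterdam, subset = (nodes > 0))
# Royston and Altman used fractional polynomials for the nonlinear terms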
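
As a quick check before fitting a model, the recurrence-free survival endpoint can be summarized by tumor size group. The following is a minimal sketch rather than part of the original analysis; the names rd and km are hypothetical and chosen only for illustration, and the endpoint is built the same way as in the example above.

```r
library(survival)

# Work on a copy so the packaged data set is left untouched
rd <- rotterdam

# Recurrence-free survival: time to the first of recurrence or death,
# with the event indicator set if either event was recorded
rd$rfstime <- pmin(rd$rtime, rd$dtime)
rd$status  <- pmax(rd$recur, rd$death)

# Kaplan-Meier estimate for each level of the size factor
km <- survfit(Surv(rfstime, status) ~ size, data = rd)
print(km)
```

Storing the derived columns in the data frame, instead of as free-standing vectors, keeps the endpoint aligned with the covariates if the data are later subsetted.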