16 Death and Failure Data

Time-to-death data, and data on failure times, are often encountered in statistical modelling. The main problem is that the variance in such data is almost always non-constant, so standard methods that assume constant variance are inappropriate. If the errors are gamma distributed, then the variance is proportional to the square of the mean (recall that with Poisson errors, the variance is equal to the mean). It is straightforward to deal with such data using a generalized linear model (GLM) with gamma errors.
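The mean-variance relationship for gamma errors can be checked with a short simulation. This is only a sketch: the shape value and the three means are arbitrary illustrative choices, not values from the case study. It relies on the standard result that a gamma distribution with shape k and mean m has variance m^2/k.

```r
# Sketch: with a fixed shape, gamma variance grows with the square of the mean.
# The shape and means are illustrative assumptions, not values from the rat data.
set.seed(1)
shape <- 4
means <- c(2, 4, 8)

# rgamma() is parameterised by shape and rate, where mean = shape / rate
sims <- sapply(means, function(m) rgamma(100000, shape = shape, rate = shape / m))

sim_mean <- colMeans(sims)
sim_var  <- apply(sims, 2, var)

# Theory: variance = mean^2 / shape, so variance / mean^2 is constant at 1 / shape
theory_var <- means^2 / shape
ratio <- sim_var / sim_mean^2
print(round(rbind(sim_mean, sim_var, theory_var, ratio), 3))
```

Doubling the mean should roughly quadruple the simulated variance, while the ratio of variance to squared mean stays close to 1/shape in every column.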

This case study has 50 replicate rats in each of three treatments: an untreated control, a low dose and a high dose of a novel cancer treatment. The response is the rats' age at death (expressed as an integer number of months):

mortality <- read.csv("c:\\temp\\deaths.csv")
attach(mortality)
names(mortality)
[1] "death"     "treatment"    
tapply(death,treatment,mean)
control    high     low 
   3.46    6.88    4.70 

The animals receiving the high dose lived roughly twice as long as the untreated controls. The low dose increased life expectancy by more than 35%. The variance in age at death, however, is not constant:

tapply(death,treatment,var)
  control      high       low 
0.4167347 2.4751020 0.8265306 
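As a rough check on whether gamma errors are plausible here, the group variances can be compared with the squared group means. The sketch below uses only the summary values printed by the two tapply() calls; the object names (means, vars, var_ratio, scaled) are introduced purely for illustration.

```r
# Group means and variances copied from the tapply() output above
means <- c(control = 3.46, high = 6.88, low = 4.70)
vars  <- c(control = 0.4167347, high = 2.4751020, low = 0.8265306)

# Largest variance relative to the smallest: a measure of how non-constant it is
var_ratio <- max(vars) / min(vars)
print(var_ratio)

# If the variance is proportional to mean^2, this should be roughly constant
scaled <- vars / means^2
print(round(scaled, 4))

# The same check on the standard-deviation scale (coefficient of variation)
print(round(sqrt(vars) / means, 4))
```

The raw variances differ about sixfold across treatments, whereas variance divided by squared mean differs by a factor of only about 1.5. With just three groups this is suggestive rather than conclusive, but it is broadly consistent with the gamma assumption.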

The variance is much greater for the longer-lived individuals, so we should not use standard statistical models which assume constant variance and normal errors. But we can use a GLM with gamma errors:

model <- glm(death~treatment,Gamma)
summary(model)
Coefficients:
	 Estimate Std. Error t value ...
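When reading the coefficients of a model like this, it helps to know that R's Gamma family uses the inverse link by default, so the estimates are on the scale of 1/mean rather than the mean itself. The following sketch illustrates this with simulated data, not with deaths.csv. The data frame sim, the group means and the shape parameter are all hypothetical, so the numbers it prints are not the book's results.

```r
# Sketch on simulated data (not the deaths.csv file): fit a gamma GLM with one
# three-level factor and recover the group means from the coefficients.
set.seed(42)
sim <- data.frame(treatment = factor(rep(c("control", "high", "low"), each = 50)))
true_mean <- c(control = 3.5, high = 7, low = 4.5)  # illustrative values only
shape <- 25                                         # assumed, arbitrary
sim$death <- rgamma(nrow(sim), shape = shape,
                    rate = shape / true_mean[as.character(sim$treatment)])

# Gamma's default link is the inverse, so coefficients are on the 1 / mean scale
fit <- glm(death ~ treatment, family = Gamma, data = sim)
b <- coef(fit)

# The intercept is 1 / mean(control); the other coefficients are differences
# from the intercept on the reciprocal scale
fitted_means <- c(control = 1 / b[["(Intercept)"]],
                  high    = 1 / (b[["(Intercept)"]] + b[["treatmenthigh"]]),
                  low     = 1 / (b[["(Intercept)"]] + b[["treatmentlow"]]))

group_means <- tapply(sim$death, sim$treatment, mean)
print(rbind(fitted_means, group_means))
```

Because the model has one parameter per treatment, the back-transformed coefficients should match the sample means of the groups. A negative coefficient for a treatment therefore corresponds to a longer mean time to death than in the control.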
