Sunday, July 19, 2009

Exponential density

I've been reading (and re-reading) Introduction to Probability by Grinstead and Snell. It is a wonderful book, freely available as a PDF. It goes slowly, has lots of explanation and many problems, as well as interesting historical perspective. I like it so much that I bought a hard copy.

I'm trying to understand the exponential density better. In example 2.17 of the book they pose the problem of modeling the time-to-breakdown of a hard drive by the exponential density:



If the average time-to-breakdown is 30 months, and we have already run the computer for 15 months with no breakdown, what is the current expected time-to-breakdown?

We use R to explore the question. The R function rexp gives random samples from the exponential density with a rate parameter r (the inverse of lambda above). With an average time-to-breakdown of 30 months, the rate is r = 1/30. Think of each sample as a possible lifetime for our drive. Since we know that the lifetime exceeds 15 months, filter the vector x for values > 15 and save the result in y.

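r = 1/30                    # rate = 1 / (mean lifetime of 30 months)
x = rexp(100000,rate=r)     # 100,000 simulated lifetimes, in months
sel = x > 15                # TRUE where the lifetime exceeds 15 months
y = x[sel]                  # lifetimes of drives that survived past 15 months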
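
As a quick sanity check on the filter, the fraction of simulated lifetimes that exceed 15 months should be close to exp(-15/30), roughly 0.61. The sketch below uses only base R. The seed and the names n, frac_sim and frac_exact are my own additions for illustration, not part of the original session.

```r
# Sketch: compare the simulated survival fraction with the exact value.
# The seed and the variable names are illustrative assumptions.
set.seed(1)
r <- 1/30
n <- 100000
x <- rexp(n, rate = r)

frac_sim   <- mean(x > 15)                            # share of samples exceeding 15 months
frac_exact <- pexp(15, rate = r, lower.tail = FALSE)  # P(X > 15) = exp(-15 * r)

print(c(simulated = frac_sim, exact = frac_exact))
```

The two numbers should agree to about two decimal places at this sample size.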


Plot histograms of the density (not counts, which R calls freq).

hist(y,breaks=100,xlim=c(0,150),freq=FALSE)              # lifetimes known to exceed 15 months
hist(x,breaks=100,col='gray70',freq=FALSE,add=TRUE)      # all lifetimes, overlaid in gray


The histograms show that the distribution of y has the same shape as that of x, just shifted to the right by 15.
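
This shift is what is usually called the memoryless property of the exponential density. A short derivation, as a sketch: I write r for the rate (1/30 here), X for the lifetime, and s for the time already survived (15 here). These symbols are my notation and may not match the book's.

```latex
\begin{align*}
P(X > t) &= \int_t^\infty r e^{-r u}\,du = e^{-r t}, \\
P(X > s + t \mid X > s) &= \frac{P(X > s + t)}{P(X > s)}
  = \frac{e^{-r(s+t)}}{e^{-r s}} = e^{-r t} = P(X > t), \\
E[X - s \mid X > s] &= E[X] = \frac{1}{r} = 30 \text{ months}.
\end{align*}
```

So, given survival to time s, the remaining lifetime X - s has the same distribution as the lifetime of a new drive.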



We confirm this by looking at summary statistics (adjusting y first by subtracting 15). They are essentially identical, so the expected remaining time-to-breakdown after 15 months is still about 30 months:

> summary(x)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
4.915e-05 8.580e+00 2.075e+01 3.001e+01 4.170e+01 3.780e+02
> summary(y-15)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
1.309e-05 8.627e+00 2.087e+01 3.004e+01 4.185e+01 3.630e+02
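
The simulated means can also be checked against exact values by numerical integration. This is a minimal sketch using base R's integrate, dexp and pexp. The names mean_all, tail_prob and mean_given_15 are mine, chosen for illustration.

```r
# Sketch: exact expected lifetimes by numerical integration (names are illustrative).
r <- 1/30

# E[X]: integrate t * f(t) over [0, Inf)
mean_all <- integrate(function(t) t * dexp(t, rate = r),
                      lower = 0, upper = Inf)$value

# E[X | X > 15] = (integral of t * f(t) over [15, Inf)) / P(X > 15)
tail_prob     <- pexp(15, rate = r, lower.tail = FALSE)
mean_given_15 <- integrate(function(t) t * dexp(t, rate = r),
                           lower = 15, upper = Inf)$value / tail_prob

print(mean_all)            # expected lifetime of a new drive
print(mean_given_15 - 15)  # expected remaining lifetime after 15 months
```

Both printed values should come out close to 30, in line with the Mean column of the two summaries.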