I thought I’d show some examples of solving common statistical word problems using Python. Today I’ll look at the exponential random variable, a continuous random variable used to model the waiting time between independent events. Sometimes this is posed as the waiting time for the first event in a Poisson process.
Suppose a clerk is helping people in a line, one at a time. Let X be the number of minutes needed to help each person. Assume that X has an exponential distribution, with a mean consultation time of 4 minutes.
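To make the setup concrete, here is a minimal sketch that builds this distribution as a "frozen" scipy.stats object and confirms its mean. The names consult_time, mean_minutes and rate_per_minute are my own, chosen only for illustration.

```python
import scipy.stats

# Frozen exponential distribution with a mean of 4 minutes.
# In scipy.stats.expon, `scale` is the mean (1 / rate); `loc` defaults to 0.
consult_time = scipy.stats.expon(scale=4)

mean_minutes = consult_time.mean()      # expected minutes per person
rate_per_minute = 1 / mean_minutes      # people served per minute, on average

print(mean_minutes, rate_per_minute)
```

Freezing the distribution is optional; passing scale=4 to each call works just as well.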
Find the probability that a wait time is in some range
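Find the probability that the clerk spends 3 to 5 minutes with any given person. This is the CDF evaluated at 5 minus the CDF evaluated at 3. In the scipy.stats implementation of the exponential CDF, the scale parameter is the mean of the distribution (the reciprocal of the rate), which here is the expected time for each consultation.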
import scipy
import scipy.stats
scipy.stats.expon.cdf(5, scale=4) - scipy.stats.expon.cdf(3, scale=4)
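As a sanity check, the same probability can be computed by hand from the exponential CDF, F(x) = 1 - exp(-x / mean). The sketch below compares the two; the variable names are mine.

```python
import math
import scipy.stats

mean_minutes = 4

# Probability from scipy: CDF(5) - CDF(3)
p_scipy = (scipy.stats.expon.cdf(5, scale=mean_minutes)
           - scipy.stats.expon.cdf(3, scale=mean_minutes))

# Same probability from the closed form: exp(-3/4) - exp(-5/4)
p_closed = math.exp(-3 / mean_minutes) - math.exp(-5 / mean_minutes)

print(p_scipy, p_closed)  # both are about 0.186
```

So the clerk spends between 3 and 5 minutes with roughly 19% of people.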
Find the maximum wait time experienced 50% of the time
Find the 50th percentile, meaning the time such that, in 50% of interactions, the clerk spends that many minutes or less. To solve this, we use the percent point function, which is the inverse of the cumulative distribution function. It takes a probability between 0 and 1 (here 0.5) and returns the corresponding time.
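The call looks something like the sketch below, assuming the same scale=4 as before. I've also included the closed form for the exponential median, mean * ln(2), as a check; the variable names are mine.

```python
import math
import scipy.stats

# Percent point function (inverse CDF) at probability 0.5
median_scipy = scipy.stats.expon.ppf(0.5, scale=4)

# Closed form for the median of an exponential distribution: mean * ln(2)
median_closed = 4 * math.log(2)

print(median_scipy, median_closed)  # both are about 2.77
```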
So this means that half of the interactions take about 2.8 minutes or less.
Fit parameters to observed data
For this exercise we’ll fake some data:
X = scipy.stats.expon.rvs(scale=4, size=2000)
scipy.stats.expon.fit(X)  # returns (loc, scale)
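By default the fit also estimates loc, which shifts the distribution away from zero. If we know the waiting times start at zero, one option is to pin the location with floc=0, in which case the fitted scale should match the sample mean. The following is a sketch under that assumption; the seed and the names rng, sample, loc_hat and scale_hat are mine, added only to make the run reproducible.

```python
import numpy as np
import scipy.stats

# Seeded generator so the fake data is the same on every run
rng = np.random.default_rng(0)
sample = scipy.stats.expon.rvs(scale=4, size=2000, random_state=rng)

# Fix loc at 0 so only the scale (the mean wait time) is estimated
loc_hat, scale_hat = scipy.stats.expon.fit(sample, floc=0)

print(loc_hat, scale_hat, sample.mean())
```

With 2000 samples the estimated scale should land close to the true value of 4, though not exactly on it.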