In this post I’ll discuss how to perform incremental updates to a simple statistical model using PyMC. The short answer is that you have to create a new model each time. In this example, I’ll use a Bernoulli random variable from scipy.stats to generate coin flips, and I will use PyMC to model a prior and likelihood distribution, and produce a posterior distribution as output.
First we import some modules and initialize our variables. We’ll set the parameters to the beta prior distribution to one, which is equivalent to a uniform distribution between zero and one. The actual probability that the coin will come up heads is 2/3, or 66.66%. I chose this because it’s close enough to 50% to look like a fair coin after a few flips.
import pymc, scipy.stats, numpy as np # initialize variables flips = 0 heads = 0 alpha = 1.0 beta = 1.0 # the actual value of the proportion theta, i.e., # the probability that the coin will come up heads prob = 2.0/3.0
Next, we’ll perform 18 flips, updating the Beta distribution prior with the new information using the parameters alpha and beta after each flip. (An explanation of Beta priors may be found here.) Each model will use 50,000 iterations, and will burn-in (throw out) the first 10,000 iterations.
for i in range(18):
# flip a coin
flip = scipy.stats.bernoulli( prob ).rvs()
# update variables
flips += 1
heads += flip
# create a prior distribution for the parameter theta, the proportion of heads
theta = pymc.Beta( name="theta", alpha=alpha, beta=beta )
# create a likelihood distribution
# n - the number of trials/flips,
# p - prior distribution of theta
# value - the number of observed heads
like = pymc.Binomial( name="N", n=flips, p=theta, value=heads, observed=True )
# create a model object
model = pymc.Model( [ theta, like ] )
# pass the model to the Markov Chain Monte Carlo simulation
mcmc = pymc.MCMC( model )
# run the simulation for 50K iterations,
# "burning-in" (tossing out) the first 10K iterations
mcmc.sample( iter=50000, burn=10000 )
# update the parameters for the next prior
alpha = heads
if alpha == 0: alpha = 0.5
beta = flips - heads
if beta == 0: beta = 0.5
We can visualize the posterior, the distribution of the values of theta, using the trace() method.
hist( theta.trace(), bins=np.linspace( 0, 1, 21 ), facecolor=(0,0,1,0.5), edgecolor=(0.25,0.25,1), linewidth=3 ) ;
title('Iterative Bayesian Coin Flipping')
ylabel('Flips') ; xlabel('$\\theta$')
savefig('iterative_bayesian_coin_flipping_example.png', dpi=200 )

We can call the summary() method of theta to obtain percentiles of the posterior, the 95% highest density interval, the variance, and some quantiles.
theta.summary()
theta:
MC
Mean SD Error 95% HPD interval
-------------------------------------
0.704 0.068 0.0 [ 0.568 0.834]
Posterior quantiles:
2.5 25 50 75 97.5
|------|======|======|------|
0.562 0.66 0.708 0.753 0.828
We can also access specific parts of the summary using the dictionary returned by the stats() method of the theta distribution.
theta.stats()
{'95% HPD interval': array([ 0.56802781, 0.83376255]),
'mc error': 0.00033181076170894026,
'mean': 0.70441527658710223,
'n': 40000,
'quantiles': {2.5: 0.56198261307785313,
25: 0.65963730749639737,
50: 0.70767338838336302,
75: 0.75303639985853832,
97.5: 0.82848204538592407},
'standard deviation': 0.068484753569964044}
References
- Abraham Flaxman has some great PyMC resources in his blog, Healthy Algorithms
- Cameron Davidson-Pilon has a very thorough and interesting (free) book on Bayesian analysis using PyMC: Probabilistic Programming & Bayesian Methods for Hackers
- There is also some example code by the PyMC developers available here