How Many Trials Before the Nth Success

TLDR: the negative binomial counts the number of trials needed before the Nth success.

I had this problem where we were considering running some very expensive tests that had a known success rate, and we wanted to know, given the success rate and the cost, whether we should run them at all. To make things more interesting, we were only interested in a set number n of successes, and we could stop all testing after the first n successes. My initial thought was to use the binomial distribution, but the binomial doesn’t “cut off” after a set number of successes. It turns out that we needed to use a version of the negative binomial distribution.

Continue reading

Setting Up Sphinx Documentation

It took me a while to get Sphinx documentation set up correctly. Since it is highly configurable, it is highly easy to not configure correctly. In this guide I’ll assume that you’re using a Python virtual environment, and that you’ve placed the source code that you want to document in a directory called src/. I’ll walk through installing and configuring what you need to create documentation from inline comments using the Google or NumPy style, and create API documentation for a Flask server. I’ll be extra-explicit about what directory I’m in when I make calls that make assumptions about the working directory.

Continue reading

Estimation and the Approximate Geometric Mean

When estimating a value, it is often easier to start with an upper and lower bound on that value. Once you have an upper and lower bound, you can pick a representative point estimate in that interval. The first and most obvious candidate is the (arithmetic) mean of the upper and lower bounds, but this is only valid if the upper and lower bounds are close together, or have the same order of magnitude. If the upper and lower bounds span multiple orders of magnitude, then it is better to use the geometric mean.

Continue reading

Reinforcement Learning with Monte Carlo and Tabulation Methods

I was reading another blog post about reinforcement learning using Monte Carlo and tabulation methods that provided an example of the technique using Blackjack. I decided to implement my own method using Tic-Tac-Toe as an example. The basic idea is that you generate a random state of a game, and you generate a random action based on that state. Then you play the rest of the game through until the end and record the outcome. Then you should be able to store the state, action, and outcome as a key in a dictionary that refers to a count. Each time that state-action-outcome occurs again, you update the count by one. Over time, your dictionary will encode information about the relative strengths of different actions for a given state.

Continue reading

The Right Way to Summarize Performance

Summarizing the average performance of a set of things under different loads is a particularly tricky thing. The correct way to summarize performance is to use the geometric mean instead of the arithmetic mean. The tricky part is that the difference between the arithmetic and geometric mean is only significant under a certain condition, so the impact of using the arithmetic mean instead of the geometric may not be painfully obvious. Let’s start with an example.

Continue reading