# Modeling with Beta Distributions

The beta distribution takes two parameters, usually referred to as a and b, or alpha and beta. If you are considering a Bernoulli process (a sequence of binary outcomes, success or failure, with a constant probability of success), then you could model that probability with a beta distribution, setting the parameter `a` equal to the number of successes and the parameter `b` equal to the number of failures. The neat thing about the beta distribution is that the greater the total number of trials (the sum of the successes and failures), the more peaked, or narrow, the distribution becomes.
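To make the narrowing concrete, here is a minimal sketch that uses the standard closed-form mean and variance of a beta distribution. The helper names `beta_mean` and `beta_std` are my own, and both parameters are assumed to be positive. It holds the success rate at 70% and increases the number of trials.

```python
import math


def beta_mean(a, b):
    """Mean of Beta(a, b); assumes a > 0 and b > 0."""
    return a / (a + b)


def beta_std(a, b):
    """Standard deviation of Beta(a, b); assumes a > 0 and b > 0."""
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return math.sqrt(variance)


# Same 70% success rate, ten times more trials at each step.
for successes, failures in [(7, 3), (70, 30), (700, 300)]:
    trials = successes + failures
    mean = beta_mean(successes, failures)
    spread = beta_std(successes, failures)
    print(f"trials={trials:5d}  mean={mean:.3f}  std={spread:.4f}")
```

The mean stays put while the standard deviation shrinks as the trial count grows, which is the peaking described above.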

# JavaScript Notes

In this post I will share some running notes I’m compiling on JavaScript. I recommend the following free online resources for JavaScript. The first is a basic introduction; the second picks up where the first leaves off.

# Concatenating and Visualizing Data in Pandas

One of my favorite things about pandas is that you can easily combine temporal data sets that use different time scales. Behind the scenes, pandas will fill in the gaps with null values, and then quietly ignore those null values when you want to make a scatter plot or do some other computation, like a rolling mean. It takes so much tedious bookkeeping out of the data analysis process.
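Here is a small sketch of that behavior, using made-up data: one series sampled daily and another sampled every 12 hours. The series names and values are purely illustrative. One caveat worth knowing: a rolling window with an integer size only skips nulls if you lower `min_periods`, since by default it requires every slot in the window to be filled.

```python
import pandas as pd

# Hypothetical data: a daily series and a 12-hourly series.
daily = pd.Series(
    [10.0, 11.0, 12.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    name="daily",
)
half_daily = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0],
    index=pd.to_datetime([
        "2024-01-01 00:00", "2024-01-01 12:00",
        "2024-01-02 00:00", "2024-01-02 12:00",
        "2024-01-03 00:00",
    ]),
    name="half_daily",
)

# Align both series on the union of their timestamps; gaps become NaN.
combined = pd.concat([daily, half_daily], axis=1).sort_index()
print(combined)

# Reductions such as mean() skip NaN by default.
print(combined["daily"].mean())

# A rolling mean skips NaN only when min_periods allows partial windows.
smoothed = combined["daily"].rolling(window=2, min_periods=1).mean()
print(smoothed)
```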

# Working with Excel Files using Python

I explained in a previous post how to quickly and easily grab data from Excel files using pandas. I use this approach when I know that there’s a ton of data in the Excel file and I want it in a pandas DataFrame. If I am only extracting a handful of values, I like to use a lower level module, `xlrd`.

# Two Dimensional Sequential Gaussian Simulation in Python

In this post I will discuss an implementation of sequential Gaussian simulation (SGS) from the field of geostatistics. Geostatistics is simply a statistical consideration of spatially distributed data. Sequential Gaussian simulation is a technique used to “fill in” a grid representing the area of interest using a smattering of observations, and a model of the observed trend. The basic workflow incorporates three steps:

1. Modeling the measured variation using a semivariogram
2. Using the semivariogram to perform interpolation by kriging
3. Running simulations to estimate the spatial distribution of the variable(s) of interest
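As a rough illustration of the first step only, here is a sketch of an experimental semivariogram: half the average squared difference between all pairs of observations separated by roughly a given lag. The function name, the lag tolerance, and the sample observations are my own choices for illustration, and this does not cover model fitting, kriging, or the simulation itself.

```python
import math


def experimental_semivariogram(points, lag, tolerance):
    """Semivariance of all pairs whose separation is within lag +/- tolerance.

    points: list of (x, y, value) tuples.
    Returns None when no pair falls inside the lag bin.
    """
    squared_diffs = []
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            distance = math.hypot(xi - xj, yi - yj)
            if abs(distance - lag) <= tolerance:
                squared_diffs.append((zi - zj) ** 2)
    if not squared_diffs:
        return None
    return sum(squared_diffs) / (2 * len(squared_diffs))


# Hypothetical observations: (x, y, measured value).
observations = [
    (0.0, 0.0, 1.0), (1.0, 0.0, 1.5), (2.0, 0.0, 2.5),
    (0.0, 1.0, 1.2), (1.0, 1.0, 1.9), (2.0, 1.0, 2.8),
]
for lag in (1.0, 2.0):
    print(lag, experimental_semivariogram(observations, lag, tolerance=0.25))
```

Plotting these values against the lag gives the cloud of points that a semivariogram model is then fitted to.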

# Controlling an RGB LED with a Potentiometer

In this post, I’ll describe how to change the color of a common-anode RGB LED with a potentiometer. I’ll be using an Arduino UNO, and components from this RadioShack components kit. The motivation for this post was to have an LED change color in response to the reading from a thermistor next to my stove, but when I read about how I’d first need to calibrate the thermistor with some kind of thermometer, my motivation scurried under the sofa like a terrier in a thunderstorm. As a compromise I substituted the thermistor with a trim-pot, reasoning that a variable resistance was a variable resistance.

# Using Matrices in Go(lang)

In this post I’ll describe how to get started with the gonum/matrix package, which provides matrices for math and stats applications. (Documentation here.) I’ll begin with a bit about setting up the Go environment, drawn from the How to Write Code page on the Go website. (I highly recommend reading this if you’re unfamiliar with Go.) Next I’ll provide a commented use case.

# Z-score Transform for Geostatistics

In this post I’ll present the z-score forward and backward transforms used in Sequential Gaussian Simulation, to be discussed at a later date. Some geostatistical algorithms assume that data is normally distributed, but interesting data rarely is. The solution: force normality, or quasi-normality. All of this is loosely based on Clayton V. Deutsch’s work on the GSLIB library, and his books.
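One common way to force normality is a rank-based normal score transform, and the sketch below assumes that is the kind of transform meant here. The function names are hypothetical and this is not GSLIB code. The forward transform maps each value to the standard normal quantile of its rank, and the backward transform interpolates linearly in the resulting lookup table.

```python
from statistics import NormalDist


def forward_transform(values):
    """Map each value to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly between 0 and 1.
        scores[i] = NormalDist().inv_cdf((rank + 0.5) / n)
    return scores


def backward_transform(scores, values, original_scores):
    """Map normal scores back to data units by linear interpolation
    in the (score, value) table produced by the forward transform."""
    table = sorted(zip(original_scores, values))
    result = []
    for s in scores:
        if s <= table[0][0]:
            result.append(table[0][1])  # clamp below the smallest score
        elif s >= table[-1][0]:
            result.append(table[-1][1])  # clamp above the largest score
        else:
            for (s0, v0), (s1, v1) in zip(table, table[1:]):
                if s0 <= s <= s1:
                    t = (s - s0) / (s1 - s0)
                    result.append(v0 + t * (v1 - v0))
                    break
    return result


# Hypothetical skewed data.
data = [3.0, 1.0, 10.0, 2.0, 50.0]
normal_scores = forward_transform(data)
recovered = backward_transform(normal_scores, data, normal_scores)
print(normal_scores)
print(recovered)
```

Clamping at the ends of the table is the simplest possible tail handling; a real workflow would likely want an explicit tail extrapolation rule.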

# Traversing a Directory Tree in Python and Go(lang)

In this post I’ll discuss the basics of walking through a directory tree in Python and Go. If you are dealing with a smaller directory, it may be more convenient to use Python. If you are dealing with a larger directory containing hundreds of subdirectories and thousands of files, you may want to look into using Go, or another compiled language. I enjoy using Go because it compiles quickly, and it doesn’t use pointer arithmetic.
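On the Python side, the standard library's `os.walk` does most of the work. Here is a minimal sketch that walks a tree and counts files by extension; the function name and the choice to count extensions are just for illustration.

```python
import os


def count_files_by_extension(root):
    """Walk the tree under `root` and count files per lowercased extension."""
    counts = {}
    # os.walk yields one (directory, subdirectories, files) tuple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower() or "(none)"
            counts[extension] = counts.get(extension, 0) + 1
    return counts
```

Calling `count_files_by_extension(".")` returns a dictionary that maps each extension found under the current directory to its file count.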

# An Iterative Closest Point Algorithm

In this post I’ll demonstrate an iterative closest point (ICP) algorithm that works reasonably well. An ICP algorithm seeks to find a transformation between two sets of points that minimizes the error between them, i.e., you are trying to find a transformation that will lay one set of points exactly on top of another.
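To give a feel for the loop, here is a bare-bones 2D sketch under some simplifying assumptions of mine: the transformation is rigid (rotation plus translation), matches are found by brute-force nearest neighbor, and the loop runs a fixed number of iterations. It is not necessarily the approach used in the full post, and all function names are hypothetical.

```python
import math


def nearest(point, targets):
    """Brute-force nearest neighbor of `point` among `targets`."""
    return min(
        targets,
        key=lambda t: (t[0] - point[0]) ** 2 + (t[1] - point[1]) ** 2,
    )


def best_rigid_transform(source, matched):
    """Least-squares rotation angle and translation taking source onto matched."""
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    mx = sum(q[0] for q in matched) / n
    my = sum(q[1] for q in matched) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, matched):
        ax, ay = px - sx, py - sy  # centered source point
        bx, by = qx - mx, qy - my  # centered matched point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the matched one.
    tx = mx - (c * sx - s * sy)
    ty = my - (s * sx + c * sy)
    return theta, tx, ty


def apply_transform(points, theta, tx, ty):
    """Rotate each point by theta about the origin, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp(source, target, iterations=20):
    """Repeatedly match nearest neighbors and re-fit the rigid transform."""
    current = list(source)
    for _ in range(iterations):
        matched = [nearest(p, target) for p in current]
        theta, tx, ty = best_rigid_transform(current, matched)
        current = apply_transform(current, theta, tx, ty)
    return current


# Hypothetical example: the source is the target, slightly rotated and shifted.
target = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 5.0)]
source = apply_transform(target, 0.1, 0.2, -0.1)
aligned = icp(source, target)
max_error = max(math.dist(p, q) for p, q in zip(aligned, target))
print(max_error)
```

Like any ICP variant, this only works when the starting misalignment is small enough that the nearest-neighbor matches are mostly correct.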