The Pearson Chi-Squared Test with Python and R

In this post I’ll discuss how to use Python and R to calculate the Pearson Chi-Squared Test for goodness of fit. The chi-squared test for goodness of fit determines how well the observed counts of a categorical variable fit some hypothesized distribution. We assume that the categories are mutually exclusive and completely cover the sample space. That means that every outcome we can think of fits into exactly one category, with no exceptions. For example, suppose we flip a coin to determine if it is fair. The outcomes of this experiment fit into exactly two categories, heads and tails. The same goes for rolling a die to determine its fairness; rolls of the die will result in (exactly) one of (exactly) six outcomes or categories. This test is only meaningful with mutually exclusive categories.
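
To make the coin example concrete, here is a minimal sketch of the Pearson statistic in plain Python. The counts (60 heads and 40 tails in 100 flips) are made up for illustration, and the helper name `chi_squared_statistic` is my own, not a library function. The critical value 3.841 is the standard 5% cutoff for one degree of freedom (two categories minus one).

```python
def chi_squared_statistic(observed, expected):
    """Pearson statistic: sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 coin flips, 60 heads and 40 tails.
observed = [60, 40]
n = sum(observed)

# Under the null hypothesis of a fair coin, each category expects n / 2.
expected = [n * 0.5, n * 0.5]

statistic = chi_squared_statistic(observed, expected)

# Standard chi-squared critical value for df = 1 at the 5% level.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_fairness = statistic > CRITICAL_VALUE_DF1_ALPHA_05

print(statistic, reject_fairness)
```

With these made-up counts the statistic is (10² / 50) + (10² / 50) = 4.0, which is just past the cutoff, so fairness would be rejected at the 5% level. If SciPy is installed, `scipy.stats.chisquare(observed, f_exp=expected)` should return the same statistic along with a p-value.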

Creating a Flowchart with TikZ and LaTeX

In this post I’ll discuss how to make simple flowcharts in LaTeX using TikZ. Probably the best collection of TikZ examples can be found at TeXample.net, but there are other helpful examples like these two PDFs, here and here. In case you’re wondering, TikZ is a recursive acronym for “TikZ ist kein Zeichenprogramm,” a reminder (in German) that it is not an interactive drawing program.

Adding Data Points to Shapefiles

In this post I’ll discuss how to plot data points on a shapefile. In a previous post I discussed how to install basemap using pip, the package manager for Python. Since basemap is an extension of matplotlib, we have a lot of familiar plotting functions and options at our disposal. Of particular importance is the ability to use projection data when plotting both the shapefile and the data points.