When estimating a value, it is often easier to start with an upper and lower bound on that value. Once you have an upper and lower bound, you can pick a representative point estimate in that interval. The first and most obvious candidate is the (arithmetic) mean of the upper and lower bounds, but this is only valid if the upper and lower bounds are close together, or have the same order of magnitude. If the upper and lower bounds span multiple orders of magnitude, then it is better to use the geometric mean.
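As a quick illustration of the difference, here is a minimal sketch. The bounds are made-up numbers chosen to span four orders of magnitude, and the function names are mine, not from the post.

```python
import math


def arithmetic_mean(lower, upper):
    """Midpoint of the interval on a linear scale."""
    return (lower + upper) / 2.0


def geometric_mean(lower, upper):
    """Midpoint of the interval on a logarithmic scale."""
    if lower <= 0 or upper <= 0:
        raise ValueError("the geometric mean needs positive bounds")
    return math.sqrt(lower * upper)


# Hypothetical bounds: somewhere between ten and a hundred thousand.
lower, upper = 10.0, 100_000.0

am = arithmetic_mean(lower, upper)  # dominated by the upper bound
gm = geometric_mean(lower, upper)   # two orders of magnitude from each bound

print(f"arithmetic mean: {am:g}")
print(f"geometric mean:  {gm:g}")
```

The arithmetic mean lands at roughly half the upper bound, which effectively ignores the lower bound. The geometric mean sits at the same multiplicative distance from both.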
Continue reading Estimation and the Approximate Geometric Mean →
Summarizing the average performance of a set of things under different loads is tricky. The correct way to summarize performance is to use the geometric mean instead of the arithmetic mean. The catch is that the difference between the two means is only significant under a certain condition, so the impact of using the arithmetic mean instead of the geometric mean may not be obvious. Let’s start with an example.
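One way to see the problem is the sketch below. It assumes performance is reported as ratios against a baseline machine, which the excerpt does not state, and the run times are invented for illustration.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # Average the logs, then exponentiate; requires positive values.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical run times in seconds for two machines on two workloads.
times_a = [1.0, 100.0]
times_b = [10.0, 10.0]

# Normalize each machine against the other.
b_relative_to_a = [b / a for a, b in zip(times_a, times_b)]  # 10x and 0.1x
a_relative_to_b = [a / b for a, b in zip(times_a, times_b)]  # 0.1x and 10x

# The arithmetic mean says each machine is about 5x slower than the other.
am_b = arithmetic_mean(b_relative_to_a)
am_a = arithmetic_mean(a_relative_to_b)

# The geometric mean gives the same verdict whichever baseline is used.
gm_b = geometric_mean(b_relative_to_a)
gm_a = geometric_mean(a_relative_to_b)

print(am_b, am_a)
print(gm_b, gm_a)
```

With the arithmetic mean the conclusion depends on which machine was picked as the baseline. With the geometric mean the two summaries are reciprocals of each other, so the ranking does not flip.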
Continue reading The Right Way to Summarize Performance →
In this post we’ll cover solving a system of linear equations using Swift and Accelerate. It gets a little bit hairy, but it’s not so bad once you get the hang of it.
Continue reading Solving Linear Equations with Swift and Accelerate →
I found out how to invert a matrix on SO, but I didn’t understand the solution, so I thought I’d talk more about it here. First of all, there isn’t a one-off inverse function in the Accelerate framework. You need to calculate the LU factorization first using
dgetrf_(), and then plug that data into
dgetri_() to calculate the inverse.
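To make the two-step structure concrete, here is a small pure Python sketch. It does not call Accelerate or LAPACK; `lu_factor` and `lu_inverse` are hypothetical helpers that play the roles of `dgetrf_()` and `dgetri_()` respectively, and the inverse is built by solving one system per column, which is not necessarily how LAPACK does it internally.

```python
def lu_factor(a):
    """LU factorization with partial pivoting (the role of dgetrf_()).

    Returns (lu, perm). lu packs the unit lower triangle L below the
    diagonal and U on and above it; perm[i] is the original row that
    ended up in position i.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Pick the largest remaining entry in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_inverse(lu, perm):
    """Build the inverse from the factors (the role of dgetri_())."""
    n = len(lu)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        # Right-hand side: column `col` of the identity, rows permuted.
        y = [1.0 if perm[i] == col else 0.0 for i in range(n)]
        # Forward substitution with the unit lower triangle L.
        for i in range(n):
            for j in range(i):
                y[i] -= lu[i][j] * y[j]
        # Back substitution with the upper triangle U.
        for i in reversed(range(n)):
            for j in range(i + 1, n):
                y[i] -= lu[i][j] * y[j]
            y[i] /= lu[i][i]
        for i in range(n):
            inv[i][col] = y[i]
    return inv


a = [[4.0, 3.0], [6.0, 3.0]]
lu, perm = lu_factor(a)
a_inv = lu_inverse(lu, perm)
print(a_inv)
```

The point of the split is that the expensive factorization happens once, and the factors can then be reused, whether for an inverse or for solving against several right-hand sides.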
Continue reading Matrix Inversion Using Swift and Accelerate →
In this post I’ll describe how to get started with the gonum/matrix package for working with matrices in math and stats applications. (Documentation here.) I’ll begin with a bit about setting up the Go environment, drawn from the How to Write Code page on the Go website. (I highly recommend reading this if you’re unfamiliar with Go.) Next I’ll provide a commented usage example.
Continue reading Using Matrices in Go(lang) →
In this post I will walk through the computation of principal components from a data set using Python. A number of languages and modules implement principal components analysis (PCA), but implementations can vary slightly, which may lead to confusion if you are trying to follow someone else’s code or you are using multiple languages. Perhaps more importantly, as a data analyst you should at all costs avoid using a tool if you do not understand how it works. I will use data from The Handbook of Small Data Sets to illustrate this example. The data sets can be found in a zipped directory on the site linked above.
Continue reading Computing Principal Components in Python →
In this post I will present a Python implementation of a new technique for fractal interpolation derived from a paper by Manousopoulos, Drakopoulos, and Theoharis. You may find my code here on GitHub. Fractal interpolation is useful for data sets that exhibit self-similarity at multiple scales, which are difficult to interpolate with polynomials.
Continue reading Fractal Interpolation →
In this post I’ll present a recipe for taking an integral over an arbitrary triangular region using the SciPy
integrate.dblquad() function. This is an important operation for implementing the finite element method for solving partial differential equations. In school we are taught to perform a change of variables, which involves splitting the triangle into two regions and performing the double integration on the simpler sub-domains after carefully calculating new limits of integration. This recipe maps the triangle to the unit square, and then calculates the double integral on the domain [0, 1] × [0, 1]. I pieced this together after looking at this discussion on the MATLAB Central message board regarding the transformation of the triangle to the unit square, this post on Paul’s Online Notes that touched on the calculation of the Jacobian, and this post by John D. Cook about choosing the correct error limits for quadrature integration.
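The idea of mapping the triangle to the unit square can be sketched in plain Python. Two assumptions to flag: the particular mapping below (a collapsed-square, or Duffy-style, transformation) is one common choice and may not be the one used in the post, and a simple midpoint rule stands in for the quadrature routine so the example has no dependencies. The function name is hypothetical.

```python
def integrate_over_triangle(f, p0, p1, p2, n=200):
    """Approximate the integral of f(x, y) over the triangle p0, p1, p2.

    The unit square (u, v) is mapped onto the triangle by
        s = u,  t = v * (1 - u)
        (x, y) = p0 + s * (p1 - p0) + t * (p2 - p0)
    whose Jacobian determinant is |det[p1 - p0, p2 - p0]| * (1 - u).
    """
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    # Absolute determinant of the edge vectors, equal to twice the area.
    det = abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h  # midpoint of cell i in u
        for j in range(n):
            v = (j + 0.5) * h  # midpoint of cell j in v
            s, t = u, v * (1.0 - u)
            x = x0 + s * (x1 - x0) + t * (x2 - x0)
            y = y0 + s * (y1 - y0) + t * (y2 - y0)
            total += f(x, y) * det * (1.0 - u)
    return total * h * h


# Area of a triangle with base 2 and height 1.
area = integrate_over_triangle(lambda x, y: 1.0, (0, 0), (2, 0), (0, 1))

# Integral of x over the unit right triangle (exact value is 1/6).
moment = integrate_over_triangle(lambda x, y: x, (0, 0), (1, 0), (0, 1))

print(area, moment)
```

Because the limits of integration are now constants, the same transformed integrand (including the Jacobian factor) could be handed to a library routine over the unit square instead of the midpoint loop.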
Continue reading Integrals Over Arbitrary Triangular Regions for FEM →
In this post I will present a technique for generating a one-dimensional (quasi) fractal data set using a modified Matérn point process, perform a simple box-counting procedure, and then calculate the lacunarity and fractal dimension using linear regression. Lacunarity is actually a pretty large topic, and we will only cover one accepted interpretation here. This material was motivated by an interesting paper on the fractal modelling of fractures in tight gas reservoirs. Tight gas reservoirs are reservoirs with very low permeability. To provide a sense of perspective, oil reservoirs typically have a permeability of ten to a hundred millidarcies, whereas shale gas reservoirs are usually less than 0.1 microdarcies, which is about the same permeability as a granite countertop.
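The box-counting and regression steps can be sketched as follows. To keep the example checkable, it uses points from the middle-thirds Cantor construction rather than a Matérn process, since the Cantor set has a known dimension of log 2 / log 3. Lacunarity is not covered, and all function names are mine.

```python
import math


def box_count(points, eps):
    """Number of boxes of width eps that contain at least one point."""
    return len({math.floor(p / eps) for p in points})


def fit_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def box_counting_dimension(points, sizes):
    """Regress log(box count) on log(1 / box size); the slope is the estimate."""
    log_inv_size = [math.log(1.0 / eps) for eps in sizes]
    log_count = [math.log(box_count(points, eps)) for eps in sizes]
    return fit_slope(log_inv_size, log_count)


def cantor_midpoints(level):
    """Midpoints of the 2**level intervals left after `level` Cantor steps."""
    lefts, width = [0.0], 1.0
    for _ in range(level):
        width /= 3.0
        lefts = [x for left in lefts for x in (left, left + 2.0 * width)]
    return [left + 0.5 * width for left in lefts]


points = cantor_midpoints(8)
sizes = [3.0 ** -j for j in range(1, 7)]
dimension = box_counting_dimension(points, sizes)

print(dimension, math.log(2) / math.log(3))
```

On real data the log-log points will not fall on a perfect line, so the range of box sizes used in the regression matters: boxes much smaller than the typical point spacing simply count points, and boxes near the full extent count almost nothing.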
Continue reading Fractal Dimension and Box Counting →
Here, I’ll introduce some ideas regarding spatial point processes using Python. First I’ll present the Poisson point process, and then I’ll cover two other processes: the Thomas point process and the Matérn point process. I’ll use these tools in two future posts regarding measuring fractal dimension, and kriging. An excellent resource for spatial statistics is the R package
spatstat. The manual is a really great read. The
spatstat package implements the Thomas and Matérn point processes as
Continue reading Spatial Point Processes →
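As a rough illustration of the first two processes named in the excerpt above, here is a minimal standard-library sketch on the unit square. The function names and parameter values are hypothetical, and the Thomas process here ignores edge effects (parents outside the window are not simulated, and offspring may land outside it).

```python
import random


def poisson_sample(mean, rng):
    """Draw a Poisson count by counting unit-rate arrivals before `mean`."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def poisson_process(intensity, rng, width=1.0, height=1.0):
    """Homogeneous Poisson process: Poisson count, then uniform locations."""
    n = poisson_sample(intensity * width * height, rng)
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]


def thomas_process(parent_intensity, mean_offspring, sigma, rng):
    """Thomas process: Poisson parents, each with a Poisson number of
    offspring displaced from the parent by isotropic Gaussian noise."""
    points = []
    for px, py in poisson_process(parent_intensity, rng):
        for _ in range(poisson_sample(mean_offspring, rng)):
            points.append((rng.gauss(px, sigma), rng.gauss(py, sigma)))
    return points


rng = random.Random(42)  # fixed seed so the run is repeatable
uniform_points = poisson_process(100.0, rng)
clustered_points = thomas_process(10.0, 5.0, 0.02, rng)

print(len(uniform_points), len(clustered_points))
```

Only the offspring are kept in the Thomas process; the parents serve as cluster centers and are discarded. A Matérn cluster process would follow the same pattern with offspring placed uniformly in a disc around each parent instead of using Gaussian displacements.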