One of my coworkers asked how to filter a list, but return the “pass” and “fail” items in separate lists. For example, filter takes some data and a condition, and returns a list of the items that pass the condition, but what if we returned two lists? One of my other coworkers came up with a neat solution using reduce, and I wrote something using generics. Here’s the operation I’m calling “fork”, implemented in Swift3.

import Foundation
func fork<T>(_ data: [T], completion: (T)->Bool) -> ([T],[T]) {
var pass = [T]()
var fail = [T]()
for item in data {
completion(item) ? pass.append(item) : fail.append(item)
}
return (pass,fail)
}
var (pass, fail) = fork([0,1,2,3,4]) { $0 % 2 == 0 }

I’ve been reading Understanding Computation by Tom Suart, and it reminds me of my first CS class in grad school where I learned about and struggled to implement finite automata. I think one major lesson from that experience was to forget about the pictures of circles and loops on the blackboard, and figure out the bare minimum needed to represent the idea. In the case of finite automata, the bare minimum was that there was a state, and a set of rules that modified the state; each rule was composed of a current state, an input, and a next state. I decided to implement a simple finite automata in Swift using a list of Ints as states, and a single Int as input.

This material was a teaching aid for a crash course I gave at work about cosine similarity. Cosine similarity is a blunt instrument used to compare two sets of text. If two the two texts have high numbers of common words, then the texts are assumed to be similar. The ultimate goal is to plug two texts into a function and get an easy to understand number out that describes how similar the texts are, and cosine similarity is one way to skin that cat.

Please note, there are plenty of other very fast implementations for cosine similarity, but this one was written for educational purposes.

C++ has changed a lot since I was first learned about it in college, so I’m going to do a series of posts outlining some of those changes as they come up for me. First off: range-based for-loops using auto type-inference. Here’s the TL;DR:

Use auto i when you want to work with copies of the array items

Use auto &i when you want to work with the original items and maybe modify them place

Use auto const &i when you want to work with the original items and not modify them

TLDR: the negative binomial counts the number of trials needed before the Nth success.

I had this problem where we were considering running some very expensive tests that had a known success rate, and we wanted to know, given the success rate and the cost, whether we should run them at all. To make things more interesting, we were only interested in a set number of successes, and we could stop all testing after the first successes. My initial thought was to use the binomial distribution, but the binomial doesn’t “cut off” after a set number of successes. It turns out that we needed to use a version of the negative binomial distribution.

It took me a while to get Sphinx documentation set up correctly. Since it is highly configurable, it is highly easy to not configure correctly. In this guide I’ll assume that you’re using a Python virtual environment, and that you’ve placed the source code that you want to document in a directory called src/. I’ll walk through installing and configuring what you need to create documentation from inline comments using the Google or NumPy style, and create API documentation for a Flask server. I’ll be extra-explicit about what directory I’m in when I make calls that make assumptions about the working directory.

I was working with a friend to grab comments from YouTube. We’d initially thought of using lynx or w3m, but the comments section always showed up as “Loading…”. Next, we tried using BeautifulSoup, but that didn’t work either, for similar reasons. Finally, we tried using Selenium, because it allows one to interact with the JavaScript on the page.

When estimating a value, it is often easier to start with an upper and lower bound on that value. Once you have an upper and lower bound, you can pick a representative point estimate in that interval. The first and most obvious candidate is the (arithmetic) mean of the upper and lower bounds, but this is only valid if the upper and lower bounds are close together, or have the same order of magnitude. If the upper and lower bounds span multiple orders of magnitude, then it is better to use the geometric mean.

I was reading another blog post about reinforcement learning using Monte Carlo and tabulation methods that provided an example of the technique using Blackjack. I decided to implement my own method using Tic-Tac-Toe as an example. The basic idea is that you generate a random state of a game, and you generate a random action based on that state. Then you play the rest of the game through until the end and record the outcome. Then you should be able to store the state, action, and outcome as a key in a dictionary that refers to a count. Each time that state-action-outcome occurs again, you update the count by one. Over time, your dictionary will encode information about the relative strengths of different actions for a given state.

In this post, I’ll be (slowly) building a UITableView application programmatically in a Playground. First we’ll add a simple UITableView, then we’ll add a UINavigationBar. In future edits, I’ll add some more views.