Earlier this year, when I was looking for work, I got the same recursive maze problem three interviews in a row. Recursion is cute and clever, but you generally want to use iterative solutions in production.

# Tag Archives: Python

# Very Very Verbose Cosine Similarity

This material was a teaching aid for a crash course I gave at work about cosine similarity. Cosine similarity is a blunt instrument used to compare two sets of text. If two the two texts have high numbers of common words, then the texts are assumed to be similar. The ultimate goal is to plug two texts into a function and get an easy to understand number out that describes how similar the texts are, and cosine similarity is one way to skin that cat.

Please note, there are plenty of other very fast implementations for cosine similarity, but this one was written for educational purposes.

# How Many Trials Before the Nth Success

TLDR: the negative binomial counts the number of trials needed before the Nth success.

I had this problem where we were considering running some very expensive tests that had a known success rate, and we wanted to know, given the success rate and the cost, whether we should run them at all. To make things more interesting, we were only interested in a set number of successes, and we could stop all testing after the first successes. My initial thought was to use the binomial distribution, but the binomial doesn’t “cut off” after a set number of successes. It turns out that we needed to use a version of the negative binomial distribution.

# Setting Up Sphinx Documentation

It took me a while to get Sphinx documentation set up correctly. Since it is highly configurable, it is highly easy to not configure correctly. In this guide I’ll assume that you’re using a Python virtual environment, and that you’ve placed the source code that you want to document in a directory called `src/`

. I’ll walk through installing and configuring what you need to create documentation from inline comments using the Google or NumPy style, and create API documentation for a Flask server. I’ll be extra-explicit about what directory I’m in when I make calls that make assumptions about the working directory.

# Use Selenium to Scrape YouTube Comments

I was working with a friend to grab comments from YouTube. We’d initially thought of using lynx or w3m, but the comments section always showed up as “Loading…”. Next, we tried using BeautifulSoup, but that didn’t work either, for similar reasons. Finally, we tried using Selenium, because it allows one to interact with the JavaScript on the page.

# Reinforcement Learning with Monte Carlo and Tabulation Methods

I was reading another blog post about reinforcement learning using Monte Carlo and tabulation methods that provided an example of the technique using Blackjack. I decided to implement my own method using Tic-Tac-Toe as an example. The basic idea is that you generate a random state of a game, and you generate a random action based on that state. Then you play the rest of the game through until the end and record the outcome. Then you should be able to store the state, action, and outcome as a key in a dictionary that refers to a count. Each time that state-action-outcome occurs again, you update the count by one. Over time, your dictionary will encode information about the relative strengths of different actions for a given state.

# Binary Search in Python

I’m reading Data Structures and Algorithms in Python, by Goodrich, Tamassia, and Goldwasser, and I think that it is very, very good. I particularly like the explanation of why binary search over a sorted list is .

# Sorting Algorithms in Python

Tonight I’m looking at some sorting algorithms in Python. First up are Bubble Sort and Selection sort, both with average and worst case runtime, and memory. Finally, I’ll look at an iterative and recursive implementation of Merge Sort.

# Simple Directed Graph in Python

This is a work in progress, there’s a lot of complex questions you can ask about graphs, but I though it was neat that you could produce an actual graphy looking thing in so few lines of code. This is a directed graph, and you use the graph object to first create nodes, and then define (unweighted) edges between them or to themselves. The `__repr__()`

method just lists the node data, whatever that is, with an ASCII arrow pointing to another node’s data.

# Another Simple Tree in Python

I’ve posted before about creating a tree in Python, but I like this implementation better. It uses a nested class to represent the nodes of the tree, and an interesting construction (line 11) that is a result of that nested class. Also, I do a simple pre-order traversal. I’ll flesh this guy out in later posts.