In this post I’ll look at creating a presentation using the R ecosystem. I’ve used beamer before, and I love it, but I haven’t used the knitr R package yet. Incidentally, the creator of knitr, Yihui Xie, does not like beamer. This is fine; I have been wrong about technology before. I recall thinking in college that Facebook was for losers and that it would never catch on. Anyway, Yihui’s work is really impressive and I strongly suggest checking it out.
Monthly Archives: August 2014
Installing Debian Packages
If you are running a Debian-based Linux system, like Ubuntu or CrunchBang, you’ll occasionally need to install a .deb file, which is a binary file used to install packages. Usually you can simply run:
sudo dpkg -i ./path/to/file.deb
Modeling in R with the caret Package
Decision Trees in R using the C50 Package
In this post I’ll walk through an example of using the C50 package for decision trees in R. The package implements the C5.0 algorithm, an extension of the C4.5 algorithm. We’ll use some totally unhelpful credit data from the UCI Machine Learning Repository that has been sanitized and anonymized beyond all recognition.
tidyr and pandas: Gather and Melt
In this post I’ll look at replicating Hadley Wickham’s gather() tool from his tidyr package using the pandas melt() function. Why would anyone want to do this? Well, Dr. Wickham’s work is beautiful, and the pandas.melt() function is not as elegant as the tidyr::gather() function. You may read Dr. Wickham’s pre-print paper here.
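As a rough illustration of the idea, here is a minimal sketch of reshaping a wide table into long form with pandas.melt(). The data frame and its column names (name, 2013, 2014) are made up for this example and are not taken from the post.

```python
import pandas as pd

# Hypothetical wide-format data: one row per person, one column per year.
wide = pd.DataFrame({
    "name": ["Ann", "Bob"],
    "2013": [10, 20],
    "2014": [11, 21],
})

# Roughly what tidyr's gather(wide, year, value, -name) does in R:
# keep "name" as the identifier and stack every other column into
# a key column ("year") and a value column ("value").
long = pd.melt(wide, id_vars=["name"], var_name="year", value_name="value")

print(long)
```

Every column not listed in id_vars is melted, which mirrors excluding a column with a minus sign in gather().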
Updating R from the Command Line
This is a tiny post, but if I lumped it as an aside into a longer post I might never find it again. If you’re trying to keep up with Hadley Wickham you might need to update R from time to time. The installr package is there to help you keep up with the Wickhams. To update R, just run the following:
install.packages("installr"); library(installr); updateR();
For further information, check out this r-statistics post on the topic.
Getting Started with MongoDB and Python
In this post I’ll walk through getting started with MongoDB using the Python PyMongo module. I’ll go through the installation process, and then walk through an example of entering data into a MongoDB database through Python. (In a future post I’ll cover querying documents.) For the installation, I’ll assume that you’re running Ubuntu, but there are instructions for all major operating systems at the link that I have provided.
A Quick Note on Using Git
In this post I will summarize what I gathered from the Atlassian Git tutorial, which has lots of really great examples and explanations. I’ll assume you’ve already configured your Git installation and account, but aren’t sure where to go from there. I’ll also assume you’re using Linux.
Hypothesis Testing in R
In this post I’ll look at different statistical hypothesis tests in R. Statistical tests can be tricky because they all have different assumptions that must be met before you can use them. Some tests require samples to be normally distributed, others require two samples to have the same variance, while others are not as restrictive.
We’ll begin with testing for normality. Then we’ll look at testing for equality of variance, with and without an assumption of normality. Finally, we’ll look at testing for equality of means, under different assumptions regarding normality and equal variance.
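The post itself works in R. Purely as a sketch of the same sequence of checks, here is one way the decision flow could look in Python with SciPy. The choice of tests (Shapiro-Wilk, Bartlett, Levene, Student/Welch t-test, Mann-Whitney U), the 0.05 threshold, the helper name compare_samples, and the simulated data are all assumptions made for illustration and are not necessarily the tests the post uses.

```python
import numpy as np
from scipy import stats


def compare_samples(a, b, alpha=0.05):
    """Hypothetical helper: pick a two-sample test based on earlier checks."""
    # Step 1: normality of each sample (Shapiro-Wilk).
    normal = all(stats.shapiro(x).pvalue > alpha for x in (a, b))

    # Step 2: equality of variance. Bartlett's test assumes normality;
    # Levene's test is less sensitive to departures from it.
    variance_test = stats.bartlett if normal else stats.levene
    equal_var = bool(variance_test(a, b).pvalue > alpha)

    # Step 3: compare the samples. With normal data use a t-test
    # (Welch's version when variances differ); otherwise fall back to the
    # rank-based Mann-Whitney U test, which compares distributions
    # rather than means.
    if normal:
        result = stats.ttest_ind(a, b, equal_var=equal_var)
    else:
        result = stats.mannwhitneyu(a, b, alternative="two-sided")

    return {"normal": normal, "equal_var": equal_var, "pvalue": float(result.pvalue)}


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=0.0, scale=1.0, size=50)
sample_b = rng.normal(loc=0.5, scale=2.0, size=50)

summary = compare_samples(sample_a, sample_b)
print(summary)
```

Chaining tests like this is a simplification: each preliminary test has its own error rate, so the result is a guide to which assumptions look plausible rather than a rigorous procedure.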