In this post I’ll walk through getting started with MongoDB using the Python PyMongo module. I’ll go through the installation process, and then work through an example of inserting data into a MongoDB database from Python. (In a future post I’ll cover querying documents.) For the installation, I’ll assume that you’re running Ubuntu, but the MongoDB page I refer to below has instructions for all major operating systems.
Installation
My first inclination was to just run sudo apt-get install mongodb
from the command line. This installed something, but it wasn’t what I needed to reproduce the Python tutorials I had found on the internet. Then I found this page on the MongoDB website that did have what I needed. I’ll reproduce their installation process here.
Import the public key used by the package management system
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
Create a list file for MongoDB
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
Reload the local package database
sudo apt-get update
Install the MongoDB packages
sudo apt-get install mongodb-org
Start/Stop/Restart MongoDB
sudo service mongod start
sudo service mongod stop
sudo service mongod restart
Check the MongoDB log file
less /var/log/mongodb/mongod.log
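Besides reading the log, one quick way to confirm that mongod is accepting connections is to try opening a TCP socket to it. The sketch below uses only the Python standard library and assumes the default host and port (localhost, 27017). The helper name port_is_open is my own, not part of any MongoDB tooling.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if the connection is refused or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling port_is_open() right after sudo service mongod start should return True. Keep in mind that this only shows that some process is listening on the port, not that it is definitely mongod.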
An Example with Python
First, we’ll need to install PyMongo using pip.
sudo pip install pymongo
Now, we can create a database, and a collection, which is like a table in a traditional RDBMS. First we’ll start a MongoDB instance by running the following line at the terminal.
sudo service mongod start
Next, we’ll fire up python (or ipython, or an ipython notebook), import MongoClient, and connect to the mongod instance we just started. This can be done several ways: the first line after the import connects to the default host and port, and the next two lines show alternate ways to specify the default host and port explicitly.
from pymongo import MongoClient
client = MongoClient()
# alternatively..
client = MongoClient("localhost", 27017)
client = MongoClient("mongodb://localhost:27017/")
At this point we can create a database using attribute notation or dictionary-style notation. Here, the MongoDB database is called “mydb”, and the Python variable referring to that database is “db”. (Binding a short variable like this is useful if you’re opening up a database and it has a ridiculously long name.)
db = client.mydb
db = client["mydb"]
Next we can create collections in the database. Collections are analogous to tables in a traditional RDBMS. (Collections and databases aren’t actually created until you start adding documents, which are analogous to rows or records.) Again, we have the attribute and dictionary styles for creating collections.
coll = db.mycollection
coll = db['mycollection']
We can create a document as a Python dict. A document can hold strings, numbers, and lists. We can insert this into a collection using the insert_one() method. (Older PyMongo releases used insert() for this; it was deprecated in PyMongo 3 and removed in PyMongo 4.)
document = {"fname": "connor", "weight": 170.5, "height": [5, 10]}
coll.insert_one(document)
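One thing to watch for in PyMongo 3 and later: insert_one() adds an "_id" key to the dict you pass in when it doesn't already have one, so inserting the very same dict a second time fails with a duplicate key error. A minimal sketch of one way around that is to insert a copy. The helper name insert_copy is hypothetical, and it only assumes that coll has an insert_one() method returning an object with an inserted_id attribute, as PyMongo collections do.

```python
def insert_copy(coll, document):
    """Insert a shallow copy of document and return the _id it was stored under.

    Copying first leaves the caller's dict without an "_id" key,
    so it can be reused as a template for further inserts.
    """
    to_store = dict(document)
    result = coll.insert_one(to_store)
    return result.inserted_id
```

With the coll and document from above, insert_copy(coll, document) can be called repeatedly, and each call stores a new document with its own _id.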
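Since databases and collections aren’t created until we insert a document, we can now see the collection by calling the list_collection_names() method on the database. (This is the current name for the older collection_names() method, which was removed in PyMongo 4.)
db.list_collection_names()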
If we have a lot of documents we’d like to put into a collection, we can insert them all at once by passing a list of dictionaries to insert_many() (the PyMongo 3+ replacement for passing a list to insert()). Remember, since we’re dealing with documents and not tables, we can have all sorts of fields.
connor_doc = {"fname": "connor", "weight": 170.5, "height": [5, 10]}
roger_doc = {"name": "roger", "species": "dog", "breed": "awesome", "weight": 20.2}
docs = [connor_doc, roger_doc]
coll.insert_many(docs)
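Because nothing forces the documents in a collection to share fields, it can help to check which fields actually overlap before inserting a batch. Here is a small pure-Python sketch (no database needed). The helper field_counts is my own, not a PyMongo function.

```python
from collections import Counter


def field_counts(docs):
    """Count how many documents in docs contain each top-level field."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.keys())
    return counts


connor_doc = {"fname": "connor", "weight": 170.5, "height": [5, 10]}
roger_doc = {"name": "roger", "species": "dog", "breed": "awesome", "weight": 20.2}

counts = field_counts([connor_doc, roger_doc])
# Fields that appear in both documents.
shared = sorted(field for field, n in counts.items() if n == 2)
print(shared)
```

For these two documents only "weight" is shared. Note that one document uses "fname" and the other "name", which is exactly the kind of inconsistency a schemaless store will happily accept.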