I am taking a course on MongoDB development with Node.js from MongoDB University. In the second week we covered something I found very interesting: how to grab the JSON data behind a Reddit page. Reddit serves a page's data as JSON if you append .json to the URL path. Here is the CoffeeScript that produces the code provided in the course.
MongoClient = require('mongodb').MongoClient
request = require('request')

# Connect to the local server and use the "reddit" database
MongoClient.connect 'mongodb://localhost:27017/reddit', (err, db) ->
  throw err if err
  # Fetch the subreddit's front page as JSON
  request 'http://www.reddit.com/r/technology/.json', (err, response, body) ->
    if !err && response.statusCode == 200
      obj = JSON.parse body
      # Each child wraps the actual story under its data key
      stories = obj.data.children.map (story) ->
        story.data
      # Insert all stories into the "reddit" collection
      db.collection('reddit').insert stories, (err, data) ->
        throw err if err
        console.dir data
        db.close()
To run this, you need a MongoDB server running in another terminal. Start one with:
mongod --dbpath /data/db
The script grabs the data from the front page of the technology subreddit and stores each story as a document in a collection named reddit, inside a database that is also named reddit.
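To make the mapping step easier to see, here is a minimal JavaScript sketch that runs without a network connection or a database. It assumes the listing has the shape the script relies on (a top-level data object whose children array wraps each story under its own data key). The sample payload and the helper name extractStories are made up for illustration and are not part of the course code.

```javascript
// Hypothetical, trimmed-down sample of a Reddit listing response.
// The ids, titles, and scores are invented; real payloads carry many more fields.
const sampleBody = JSON.stringify({
  kind: 'Listing',
  data: {
    children: [
      { kind: 't3', data: { id: 'abc123', title: 'First example story', score: 42 } },
      { kind: 't3', data: { id: 'def456', title: 'Second example story', score: 7 } }
    ]
  }
});

// extractStories is a hypothetical helper name used only in this sketch.
function extractStories(body) {
  const obj = JSON.parse(body);
  // Unwrap each child so only the story fields remain.
  return obj.data.children.map((story) => story.data);
}

const stories = extractStories(sampleBody);
console.log(stories.length);   // 2
console.log(stories[0].title); // First example story
```

The resulting array of plain objects is what gets passed to the insert call, so each story ends up as one document in the collection.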