Chapter 4. Databases

In this chapter, we present a few examples of Tornado web applications that make use of a database. We’ll begin with a simple RESTful API example, then move on to creating a fully functional version of the Burt’s Books website introduced in Templates in Practice: Burt’s Books.

The examples in this chapter use MongoDB as the database, and pymongo as the Python driver to connect to MongoDB. There are, of course, many database systems that make sense for use in a web application: Redis, CouchDB, and MySQL are a few well-known options, and Tornado itself ships with a library for wrapping MySQL requests. We choose to use MongoDB due to its simplicity and convenience: it’s easy to install and integrates well with Python code. Its schemaless nature makes it unnecessary to predefine your data structures, which is great for prototyping.

We’re assuming in this chapter that you have a MongoDB installation running on the machine where you’re running the example code. If you don’t want to install MongoDB on your machine, or if there isn’t a MongoDB binary for your operating system, there are a number of hosted MongoDB services you can use instead. We recommend MongoHQ. It’s easy to adapt the code to use MongoDB running on a remote server (including MongoHQ).

We’re also assuming you have some experience with databases, though not necessarily any experience with MongoDB in particular. Of course, we’re only able to scratch the surface of what’s possible with MongoDB here; be sure to consult the MongoDB documentation (http://www.mongodb.org/display/DOCS/Home) for more information. Let’s begin!

Basic MongoDB Operations with PyMongo

Before we can write a web application that uses MongoDB, we need to learn how to use MongoDB from Python. In this section, you’ll learn how to connect to MongoDB with PyMongo, then how to use PyMongo to create, retrieve, update, and delete documents in a MongoDB collection.

PyMongo is a simple Python library that wraps the MongoDB client API. You can download it here: http://api.mongodb.org/python/current/. Once you have it installed, open an interactive Python session and follow along.

Establishing a Connection

First of all, you need to import the PyMongo library and create a connection to a MongoDB database.

>>> import pymongo
>>> conn = pymongo.Connection("localhost", 27017)

The preceding example shows how to connect to a MongoDB server running on your local machine, on the default MongoDB port (27017). If you’re using a remote MongoDB server, replace localhost and 27017 as appropriate. You can also connect to MongoDB using a MongoDB URI, like so:

>>> conn = pymongo.Connection(
... "mongodb://user:password@staff.mongohq.com:10066/your_mongohq_db")

The preceding code would connect to a database called your_mongohq_db hosted on MongoHQ, using user as the username and password as the password. Read more about MongoDB URIs here: http://www.mongodb.org/display/DOCS/Connections.
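
To make the parts of such a URI concrete, here is a minimal sketch that takes the example URI apart using Python's standard library rather than PyMongo. It assumes Python 3 (for urllib.parse), and the variable names are ours, chosen only for illustration; PyMongo does its own parsing when you hand it a URI.

```python
from urllib.parse import urlsplit

# The example URI from the text; every component is a placeholder value.
uri = "mongodb://user:password@staff.mongohq.com:10066/your_mongohq_db"

parts = urlsplit(uri)

scheme = parts.scheme              # the "mongodb" prefix
username = parts.username          # credentials before the "@"
password = parts.password
host = parts.hostname              # server name after the "@"
port = parts.port                  # parsed as an integer
database = parts.path.lstrip("/")  # database name after the final "/"

print(scheme, username, host, port, database)
```

Each piece corresponds to something you would otherwise pass separately: the host and port go to the connection, and the database name selects the database on that server.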

A MongoDB server can have any number of databases, and the Connection object lets you access any of the databases on the server you’ve connected to. You can get an object representing a particular database either with an attribute of the object, or by using the object like a dictionary. If the database doesn’t already exist, it will be automatically created.

>>> db = conn.example  # or: db = conn['example']

A database can have any number of collections. A collection is just a place to put related documents. Most of the operations that we perform with MongoDB (finding documents, saving documents, deleting documents) will be performed on a collection object. You can get a list of collections in a database by calling the collection_names method on the database object:

>>> db.collection_names()
[]

Of course, we haven’t added any collections to our database yet, so this list is empty. MongoDB will automatically create a collection when we insert our first document. You can get an object representing a collection by accessing an attribute with the name of the collection on the database object, then insert a document by calling the object’s insert method, specifying a Python dictionary. For example, in the following code, we insert a document into a collection called widgets. Because it didn’t already exist, it is created automatically when the document is added:

>>> widgets = db.widgets  # or: widgets = db['widgets'] (see below)
>>> widgets.insert({"foo": "bar"})
ObjectId('4eada0b5136fc4aa41000000')
>>> db.collection_names()
[u'widgets', u'system.indexes']

(The system.indexes collection is for MongoDB’s internal use. For the purposes of this chapter, you can ignore it.)

As an earlier example showed, you can access a collection either as an attribute of a database object or by treating the database object as though it were a dictionary and using the collection name as a key. For example, if db is a PyMongo database object, both db.widgets and db['widgets'] evaluate to the same collection.
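
If you're curious how one object can offer both styles of access, the following sketch shows the general Python technique. SketchDatabase is a hypothetical stand-in that we made up for illustration; it is not PyMongo's actual implementation, and its "collections" are just small dictionaries.

```python
class SketchDatabase(object):
    """Hypothetical stand-in for a database object; not PyMongo's code."""

    def __init__(self):
        self._collections = {}

    def __getitem__(self, name):
        # Dictionary-style access: sketch_db['widgets'].
        # Create a placeholder on first use and reuse it afterwards.
        return self._collections.setdefault(name, {"name": name})

    def __getattr__(self, name):
        # Attribute-style access: sketch_db.widgets.
        # Python calls this only when normal attribute lookup fails,
        # so it can delegate to the same lookup as __getitem__.
        if name.startswith("_"):
            raise AttributeError(name)
        return self[name]


sketch_db = SketchDatabase()
same = sketch_db.widgets is sketch_db["widgets"]

# The dictionary form also accepts names that are not valid identifiers.
dashed = sketch_db["my-widgets"]
print(same, dashed["name"])
```

The practical consequence is that the dictionary form is the one to reach for when a collection name contains characters, such as a hyphen, that Python doesn't allow in an attribute name.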

Dealing with Documents

MongoDB collections store data as documents, a term that indicates the relatively free structure of data. MongoDB is a “schemaless” database: documents in the same collection usually have the same structure, but no structure is enforced by MongoDB. Internally, MongoDB stores documents in a binary JSON-like format called BSON. PyMongo allows us to write and retrieve our documents as Python dictionaries.

To create a new document in a collection, call the insert method of the collection, with a dictionary as a parameter:

>>> widgets.insert({"name": "flibnip", "description": "grade-A industrial flibnip",
...     "quantity": 3})
ObjectId('4eada3a4136fc4aa41000001')

Now that the document is in the database, we can retrieve it using the collection object’s find_one method. You can tell find_one to find a particular document by passing it a dictionary that has a document field name as a key, and the expression you want to match in that field as the value. For example, to return the document whose name field is equal to flibnip (i.e., the document just created), call the find_one method like so:

>>> widgets.find_one({"name": "flibnip"})
{u'description': u'grade-A industrial flibnip',
 u'_id': ObjectId('4eada3a4136fc4aa41000001'),
 u'name': u'flibnip', u'quantity': 3}

Note the _id field. MongoDB automatically adds this field to any document you create. Its value is an ObjectId, a special kind of BSON object that is guaranteed to be unique to the document in question. This ObjectId value, you might have noticed, is also what the insert method returns when successfully creating a new document. (You can override the automatic creation of an ObjectId by putting an _id key in the document when you create it.)

The value returned from find_one is a simple Python dictionary. You can access individual items from it, iterate over its key/value pairs, and modify values in it just as you would any other Python dictionary:

>>> doc = db.widgets.find_one({"name": "flibnip"})
>>> type(doc)
<type 'dict'>
>>> print doc['name']
flibnip
>>> doc['quantity'] = 4

However, changes to the dictionary aren’t automatically saved back to the database. If you want to save changes to the dictionary, call the collection’s save method, passing in the modified dictionary as a parameter:

>>> doc['quantity'] = 4
>>> db.widgets.save(doc)
>>> db.widgets.find_one({"name": "flibnip"})
{u'_id': ObjectId('4eb12f37136fc4b59d000000'),
 u'description': u'grade-A industrial flibnip',
 u'quantity': 4, u'name': u'flibnip'}
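
The gap between the local dictionary and the stored document can be modeled without a database at all. The following is a minimal sketch under our own assumptions: stored, find_one_by_id, and save are hypothetical names for an in-memory stand-in, not PyMongo functions, and the integer _id is a simplification.

```python
import copy

# Hypothetical in-memory stand-in for a collection, keyed by _id.
stored = {1: {"_id": 1, "name": "flibnip", "quantity": 3}}


def find_one_by_id(doc_id):
    # Hand back an independent copy, much as a driver hands back
    # a freshly decoded dictionary for each query.
    return copy.deepcopy(stored[doc_id])


def save(doc):
    # Replace the stored document that has the same _id.
    stored[doc["_id"]] = copy.deepcopy(doc)


doc = find_one_by_id(1)
doc["quantity"] = 4
before_save = stored[1]["quantity"]  # only the local copy has changed

save(doc)
after_save = stored[1]["quantity"]   # the stored document now matches

print(before_save, after_save)
```

The _id key is what ties the modified dictionary back to the document it came from, which is why it matters that the dictionary you save still carries it.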

Let’s add a few more documents to our collection:

>>> widgets.insert({"name": "smorkeg", "description": "for external use only",
...     "quantity": 4})
ObjectId('4eadaa5c136fc4aa41000002')
>>> widgets.insert({"name": "clobbasker", "description":
...     "properties available on request", "quantity": 2})
ObjectId('4eadad79136fc4aa41000003')

We can get a list of all documents in a collection by calling the collection’s find method, then iterating over the results:

>>> for doc in widgets.find():
...     print doc
...
{u'_id': ObjectId('4eada0b5136fc4aa41000000'), u'foo': u'bar'}
{u'description': u'grade-A industrial flibnip',
 u'_id': ObjectId('4eada3a4136fc4aa41000001'),
 u'name': u'flibnip', u'quantity': 4}
{u'description': u'for external use only',
 u'_id': ObjectId('4eadaa5c136fc4aa41000002'),
 u'name': u'smorkeg', u'quantity': 4}
{u'description': u'properties available on request',
 u'_id': ObjectId('4eadad79136fc4aa41000003'),
 u'name': u'clobbasker',
 u'quantity': 2}

If we want only a subset of documents, we can pass a dictionary parameter to the find method, just as we did with the find_one method. For example, to find only those documents whose quantity key is equal to 4:

>>> for doc in widgets.find({"quantity": 4}):
...     print doc
...
{u'description': u'grade-A industrial flibnip',
 u'_id': ObjectId('4eada3a4136fc4aa41000001'),
 u'name': u'flibnip', u'quantity': 4}
{u'description': u'for external use only',
 u'_id': ObjectId('4eadaa5c136fc4aa41000002'),
 u'name': u'smorkeg',
 u'quantity': 4}
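
One way to think about these query dictionaries is as a set of key/value pairs that a document must contain in order to match. The sketch below models that idea in plain Python for simple equality queries only. The functions matches, find, and find_one are our own illustrative helpers operating on a list, not PyMongo's implementation, and MongoDB's richer query operators are not modeled.

```python
def matches(doc, query):
    """Return True if every key in query is in doc with an equal value."""
    return all(key in doc and doc[key] == value
               for key, value in query.items())


# The three named widgets from the text, as they stand at this point.
documents = [
    {"name": "flibnip", "description": "grade-A industrial flibnip",
     "quantity": 4},
    {"name": "smorkeg", "description": "for external use only",
     "quantity": 4},
    {"name": "clobbasker", "description": "properties available on request",
     "quantity": 2},
]


def find(query=None):
    # An empty or missing query matches every document.
    query = query or {}
    return [doc for doc in documents if matches(doc, query)]


def find_one(query=None):
    # First match, or None when nothing matches.
    results = find(query)
    return results[0] if results else None


names_with_four = [doc["name"] for doc in find({"quantity": 4})]
print(names_with_four)
```

Adding a second key to the query dictionary narrows the result further, since a document has to satisfy every pair to match.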

Finally, we can delete a document from a collection using the collection’s remove method. The remove method takes a dictionary argument just like find and find_one, specifying which documents to delete. For example, to remove all documents whose name key is equal to flibnip, enter:

>>> widgets.remove({"name": "flibnip"})

Listing all documents in the collection confirms that the document in question has been removed:

>>> for doc in widgets.find():
...     print doc
...
{u'_id': ObjectId('4eada0b5136fc4aa41000000'),
 u'foo': u'bar'}
{u'description': u'for external use only',
 u'_id': ObjectId('4eadaa5c136fc4aa41000002'),
 u'name': u'smorkeg', u'quantity': 4}
{u'description': u'properties available on request',
 u'_id': ObjectId('4eadad79136fc4aa41000003'),
 u'name': u'clobbasker',
 u'quantity': 2}

MongoDB Documents and JSON

When working with web applications, you’ll often want to take a Python dictionary and serialize it as a JSON object (as, for example, a response to an AJAX request). Since a document retrieved from MongoDB with PyMongo is simply a dictionary, you might assume that you could convert it to JSON simply by passing it to the json module’s dumps function. There’s a snag, though:

>>> doc = db.widgets.find_one({"name": "flibnip"})
>>> import json
>>> json.dumps(doc)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    [stack trace omitted]
TypeError: ObjectId('4eb12f37136fc4b59d000000') is not JSON serializable

The problem here is that Python’s json module doesn’t know how to convert MongoDB’s special ObjectId type to JSON. There are several ways of dealing with this. The simplest (and the one we’ll be adopting in this chapter) is to delete the _id key from the dictionary before we serialize it:

>>> del doc["_id"]
>>> json.dumps(doc)
'{"description": "grade-A industrial flibnip", "quantity": 4, "name": "flibnip"}'

A more sophisticated solution would be to use the json_util library included with PyMongo, which will also help you serialize other MongoDB-specific data types to JSON. Read more about the library here: http://api.mongodb.org/python/current/api/bson/json_util.html.
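
A middle path, if you want to keep the identifier in the output, is to tell json.dumps what to do with values it can't serialize. The sketch below uses only the standard library. SketchObjectId is a hypothetical stand-in class we define here so the example runs without PyMongo, and it assumes that calling str() on the identifier yields its hexadecimal form.

```python
import json


class SketchObjectId(object):
    """Hypothetical stand-in for an ObjectId; str() yields its hex form."""

    def __init__(self, hex_string):
        self._hex = hex_string

    def __str__(self):
        return self._hex


doc = {
    "_id": SketchObjectId("4eb12f37136fc4b59d000000"),
    "name": "flibnip",
    "quantity": 4,
}

# json.dumps calls `default` for any value it cannot serialize itself;
# passing str turns the identifier into its string form.
encoded = json.dumps(doc, default=str, sort_keys=True)
print(encoded)
```

Note that default=str applies to every value the json module doesn't recognize, not just the identifier, so a dedicated function is the safer choice when a document contains other special types.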
