Chapter 1. Getting Started

Introduction

First released in 2009, MongoDB is relatively new on the database scene compared to contemporary giants like Oracle, which trace their first releases to the 1970s. As a document-oriented database generally grouped into the NoSQL category, it stands out among distributed key-value stores, Amazon Dynamo clones, and Google BigTable reimplementations. With a focus on rich operator support and high-performance Online Transaction Processing (OLTP), MongoDB is in many ways closer to MySQL than to batch-oriented databases like HBase.

The key differences between MongoDB’s document-oriented approach and a traditional relational database are:

  1. MongoDB does not support joins.

  2. MongoDB does not support transactions. It does have some support for atomic operations, however.

  3. MongoDB schemas are flexible. Not all documents in a collection must adhere to the same schema.

Points 1 and 2 are a direct result of the huge difficulties in making these features scale across a large distributed system while maintaining acceptable performance. They are tradeoffs made in order to allow for horizontal scalability. Although MongoDB lacks joins, it does introduce some alternative capabilities, such as embedding, which can be used to solve many of the same data modeling problems as joins. Of course, even if embedding doesn’t quite work, you can always perform your join in application code by making multiple queries.
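The two alternatives to a join can be sketched with plain Python dictionaries, without a running server. The blog post and author records below, their field names, and the helper `join_posts_with_authors` are all hypothetical; in a real application the two lists would come from two separate collection queries.

```python
# Option 1: embedding. The comments live inside the post document, so a
# single read returns everything that a join would have assembled.
post_embedded = {
    "_id": 1,
    "title": "Hello MongoDB",
    "comments": [
        {"author": "ann", "text": "Nice post"},
        {"author": "bob", "text": "Thanks"},
    ],
}

# Option 2: referencing plus an application-side join. Posts store only an
# author id; a second lookup resolves the id to the author document.
authors = [
    {"_id": 10, "name": "Ann"},
    {"_id": 11, "name": "Bob"},
]
posts = [
    {"_id": 1, "title": "Hello MongoDB", "author_id": 10},
    {"_id": 2, "title": "Schema design", "author_id": 11},
    {"_id": 3, "title": "Indexes", "author_id": 10},
]


def join_posts_with_authors(posts, authors):
    """Attach each post's author document, mimicking a two-query join."""
    # Index the result of the "second query" by _id for constant-time lookups.
    authors_by_id = {author["_id"]: author for author in authors}
    joined = []
    for post in posts:
        merged = dict(post)  # shallow copy, so the input list is left untouched
        merged["author"] = authors_by_id.get(post["author_id"])
        joined.append(merged)
    return joined


joined_posts = join_posts_with_authors(posts, authors)
```

Embedding suits data that is always read together with its parent, while the application-side join keeps shared data, such as an author record, in one place at the cost of an extra query.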

The lack of transactions can be painful at times, but fortunately MongoDB supports a fairly decent set of atomic operations. These range from the basic atomic increment and decrement operators to the richer “findAndModify”, which is essentially an atomic read-modify-write operator.
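As a rough sketch of how these operators might be used from Python, the helpers below assume they are handed a PyMongo (3.0 or later) `Collection` object. `update_one` and `find_one_and_update` are PyMongo methods, the latter being the driver's interface to findAndModify. The function names, the `pages` and `jobs` collections, and the fields `views`, `status`, and `owner` are hypothetical.

```python
def record_page_view(pages, page_id):
    """Atomically add 1 to the `views` counter of one page document."""
    # $inc is applied by the server, so concurrent callers do not overwrite
    # each other's updates. A negative value would act as a decrement.
    return pages.update_one({"_id": page_id}, {"$inc": {"views": 1}})


def claim_next_job(jobs, worker_name):
    """Atomically find one pending job and mark it as running."""
    # The match and the modification happen as one atomic step on a single
    # document, so two workers cannot claim the same job. By default PyMongo
    # returns the document as it was before the update, or None if no
    # document matched.
    return jobs.find_one_and_update(
        {"status": "pending"},
        {"$set": {"status": "running", "owner": worker_name}},
    )
```

Note that both operations are atomic only with respect to the single document they modify.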

It turns out that a flexible schema can be very beneficial, especially when you expect to be iterating quickly. While up-front schema design, as used in the relational model, has its place, there is often a heavy cost in terms of maintenance. Handling schema updates in the relational world is of course doable, but comes with a price.

In MongoDB, you can add new properties at any time, dynamically, without having to worry about complicated data migration scripts or ALTER TABLE statements that can take hours to run. However, this approach does come with its own tradeoffs. For example, type enforcement must be carefully handled by the application code. Custom document versioning might be desirable to avoid large conditional blocks that handle heterogeneous documents in the same collection.
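One way to handle both concerns in application code is sketched below. It is only an illustration: the `schema_version` field, the two user layouts, and the helper names `upgrade_user` and `validate_user` are assumptions made for this example, not MongoDB conventions.

```python
CURRENT_SCHEMA_VERSION = 2


def upgrade_user(doc):
    """Return a copy of `doc` converted to the current (hypothetical) layout."""
    doc = dict(doc)
    # Documents written before versioning was introduced count as version 1.
    version = doc.get("schema_version", 1)
    if version < 2:
        # Version 1 stored one "name" string; version 2 splits it in two.
        first, _, last = doc.pop("name", "").partition(" ")
        doc["first_name"] = first
        doc["last_name"] = last
    doc["schema_version"] = CURRENT_SCHEMA_VERSION
    return doc


def validate_user(doc):
    """Raise TypeError if a known field holds a value of an unexpected type."""
    expected = {"first_name": str, "last_name": str, "age": int}
    for field, field_type in expected.items():
        if field in doc and not isinstance(doc[field], field_type):
            raise TypeError(f"{field} must be {field_type.__name__}")
    return doc


old_doc = {"_id": 1, "name": "Ada Lovelace", "age": 36}
new_doc = validate_user(upgrade_user(old_doc))
```

Upgrading each document as it is read keeps the rest of the code working against a single layout, instead of branching on every historical shape a document might have.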

The dynamic nature of MongoDB lends itself quite naturally to working with a dynamic language such as Python. The tradeoffs between a dynamically typed language such as Python and a statically typed language such as Java in many respects mirror the tradeoffs between the flexible, document-oriented model of MongoDB and the up-front and statically typed schema definition of SQL databases.

Python allows you to express MongoDB documents and queries natively, through the use of existing language features like nested dictionaries and lists. If you have worked with JSON in Python, you will immediately be comfortable with MongoDB documents and queries.
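For instance, a document and a query might look like the following. The `user` record and its field names are made up for illustration; `$gt` and the dotted `address.city` path are standard MongoDB query syntax. With the PyMongo driver, dictionaries like these would typically be passed to `insert_one` and `find`.

```python
import json

# A document is an ordinary dict; nested dicts and lists map to embedded
# documents and arrays.
user = {
    "username": "ada",
    "age": 36,
    "languages": ["python", "javascript"],
    "address": {"city": "London", "country": "UK"},
}

# A query is also a dict. This one would match users older than 30 whose
# embedded address has the city "London".
query = {"age": {"$gt": 30}, "address.city": "London"}

# Both serialize through the json module like any other nested structure.
print(json.dumps(user, indent=2))
print(json.dumps(query))
```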

For these reasons, MongoDB and Python make a powerful combination for rapid, iterative development of horizontally scalable backend applications. For the vast majority of modern Web and mobile applications, we believe MongoDB is likely a better fit than RDBMS technology.
