In a previous article, we covered what MongoDB is and how its architecture makes it scalable. One key component of that architecture is the data model, which is based on schemaless documents and collections. In this article, we explore this model to see how it can be used to store “real world” data. We also cover a rough translation scheme from SQL to MongoDB’s data model, both as an aid to understanding and as a guide for migrating data from a SQL database to MongoDB.

Understanding documents

In traditional SQL databases, data is stored in the form of rows organized into tables. Each table (and thus row) has a fixed number of columns, each with a predefined data type. This is a rigid structure defined by the database schema. In MongoDB, none of these constraints apply. A Mongo document is the basic unit of storage, which is roughly equivalent to a row in SQL. However, unlike the latter, documents do not have any predefined schema. Each document can have any number of fields and each of these fields can take any name and store any data type. It is similar to a JSON document familiar to JavaScript developers. These documents are grouped together into collections, much like rows are grouped into tables. A database is composed of a number of collections just like an SQL database is composed of multiple tables.
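
To make the absence of a fixed schema concrete, here is a small sketch in plain JavaScript. It models a collection as an array of objects, without any database or driver involved. The collection name `people` and every field name and value in it are assumptions made purely for illustration.

```javascript
// A collection modeled as a plain array of documents (hypothetical data).
// No server or driver is used; this only illustrates the shape of the data.
const people = [
  { name: "ali", age: 30 },
  { name: "sara", email: "sara@example.com", languages: ["Urdu", "English"] },
  { title: "untitled", tags: [] },
];

// Each document carries its own set of fields; nothing forces them to match.
const fieldSets = people.map((doc) => Object.keys(doc));
console.log(fieldSets);
```

In a SQL table, the second and third entries would be rejected or would need extra nullable columns. Here they simply sit next to each other in the same collection.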

As you can see, the building blocks of the SQL and MongoDB data models correspond roughly one-to-one: database to database, table to collection, and row to document. However, unlike SQL, MongoDB has no notion of “relations” between documents. Relations are a powerful concept in SQL, where a query can “join” multiple tables together to form one coherent result set. Joins have been deliberately left out of MongoDB to avoid the scalability issues that are inherent in the SQL model. Most real world data, however, requires some notion of “relation” to be present in the database. Next we discuss how relational modeling can be emulated in MongoDB.

Relations in MongoDB

There are three basic forms of relations: One-to-One, One-to-Many and Many-to-Many. We will discuss each of these in the context of MongoDB.

Before we go any further, we need to discuss one particular feature of MongoDB that makes relations easy to define: its support for nested documents. One document can be nested in another MongoDB document, which can itself be nested in another document and so on, enabling complex document relations to be stored. The examples that follow show this nesting structure in action. This powerful feature is the foundation of data modeling in MongoDB.

With this concept in mind, we can readily see how to map the various relations from SQL to Mongo. For one-to-one relations, we can nest (or embed) the related document within the main document. For example, a user who is also an administrator can be represented as:
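
A minimal sketch of such a document, written as a plain JavaScript object. Only the user “ali” and the field names ‘administrator’, ‘role’ and ‘privileges’ come from the text; the `name` and `email` fields and all values are assumptions made for illustration.

```javascript
// One-to-one relation: the administrator data is embedded in the user document.
// Field values are hypothetical.
const user = {
  name: "ali",
  email: "ali@example.com",
  // Nested document holding the administrator data for this user.
  administrator: {
    role: "moderator",
    privileges: ["edit_posts", "delete_comments"],
  },
};

// The nested fields are reached through the parent field.
console.log(user.administrator.role);
```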

The user “ali” has a field ‘administrator’, which contains a nested document for administrator data. The fields ‘role’ and ‘privileges’ belong to this nested document. The user document is “related” one-to-one with the administrator document.

Similarly, one-to-many relations can be expressed using an array of nested documents. For example, a blog post can have many authors:
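
A minimal sketch of such a blog post document as a plain JavaScript object. Only the field name ‘authors’ comes from the text; the `title` field, the fields inside each author document, and all values are assumptions made for illustration.

```javascript
// One-to-many relation: one blog post document embeds an array of author documents.
// Titles and names are hypothetical.
const blogPost = {
  title: "Data modeling in MongoDB",
  authors: [
    { name: "First Author", email: "first@example.com" },
    { name: "Second Author", email: "second@example.com" },
  ],
};

// Every element of the array is itself a document with its own fields.
const authorNames = blogPost.authors.map((author) => author.name);
console.log(authorNames.join(", "));
```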

The square brackets [] are used to create a JavaScript array (remember, documents are JSON documents) and each element of the array is a document. This array is stored in the field ‘authors’, and in this way many authors are related to a blog post.

The last relation type is many-to-many. This one is trickier: nesting multiple documents expresses the “many” part from one side only, but in a many-to-many relation both sides need to be related to multiple entities.

To emulate this type of relation, we need to create two collections. Within each document of one collection, we then store links to the related documents in the other collection. For example, a student may be taking many courses and each course can have many students. The documents in collection “student” will be:
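
A minimal sketch of two such student documents as plain JavaScript objects. Short strings such as "course-1" stand in for real Object IDs, which MongoDB would normally generate itself; these strings, the `name` and `courses` field names, and all values are assumptions made for illustration.

```javascript
// Documents of the student collection (hypothetical data).
// Each student stores an array of IDs pointing at course documents.
const students = [
  { _id: "student-1", name: "Student A", courses: ["course-1", "course-2"] },
  { _id: "student-2", name: "Student B", courses: ["course-2"] },
];

// A student document holds only the IDs of its courses, not the courses themselves.
console.log(students[0].courses);
```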

The documents in collection “courses” will be:
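
A minimal sketch of the matching course documents, again as plain JavaScript objects. As before, short strings such as "student-1" stand in for real Object IDs, and the `title` and `students` field names and all values are assumptions made for illustration.

```javascript
// Documents of the courses collection (hypothetical data).
// Each course stores an array of IDs pointing back at student documents.
const courses = [
  { _id: "course-1", title: "Databases", students: ["student-1"] },
  { _id: "course-2", title: "Algorithms", students: ["student-1", "student-2"] },
];

// The relation is recorded on this side as well, so it can be followed in both directions.
console.log(courses[1].students);
```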

The two students are linked with the courses using an array of the Object IDs of the courses’ documents. As discussed in the previous article, “Document-oriented databases with MongoDB”, each document has an Object ID that identifies it uniquely within the collection. This Object ID is stored in the “_id” field of each document. The Object IDs are used here to link the students’ documents with the courses’ documents (and vice versa).

Since MongoDB has no notion of a join, where does the linking take place? The linking has to be handled at the application level. The MongoDB database only stores this representation of the data. Any handling of the data, including fetching related documents from another collection, has to be manually coded in the application.
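
The following sketch shows what such application-level linking could look like. It works on in-memory arrays rather than a real database, so the lookup that an application would normally issue as a second query is replaced here by a search through an array. The helper name `coursesForStudent`, the variable names, and all data are hypothetical.

```javascript
// In-memory stand-ins for the two collections (hypothetical data).
const studentDocs = [
  { _id: "student-1", name: "Student A", courses: ["course-1", "course-2"] },
  { _id: "student-2", name: "Student B", courses: ["course-2"] },
];
const courseDocs = [
  { _id: "course-1", title: "Databases", students: ["student-1"] },
  { _id: "course-2", title: "Algorithms", students: ["student-1", "student-2"] },
];

// Application-level "join": resolve the IDs stored in a student document
// into the full course documents they point to.
function coursesForStudent(student, allCourses) {
  const byId = new Map(allCourses.map((course) => [course._id, course]));
  return student.courses
    .map((id) => byId.get(id))
    .filter((course) => course !== undefined); // skip IDs with no matching course
}

const titles = coursesForStudent(studentDocs[0], courseDocs).map((c) => c.title);
console.log(titles.join(", ")); // Databases, Algorithms
```

The same idea works in the other direction: starting from a course document, the application would resolve the IDs in its `students` array against the student collection.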

Conclusion

We have covered the basics of data modeling in MongoDB. The three basic types of relations were discussed and illustrated with the help of examples. The result of this discussion has been summarized in the following table.

| Database | One-to-one | One-to-many | Many-to-many |
|----------|------------|-------------|--------------|
| SQL | No. of tables required: 2 | No. of tables required: 2 | No. of tables required: 3 |
| MongoDB | No. of collections required: 1. Method: Nested document | No. of collections required: 1. Method: Array of nested documents | No. of collections required: 2. Method: Array of Object IDs stored in the document |

About the authors

Salman Ul Haq is a techpreneur, co-founder and CEO of TunaCode, Inc., a startup that delivers GPU-accelerated computing solutions to time-critical application domains. He holds a degree in Computer Systems Engineering. His current focus is on delivering the right solution for cloud security. He can be reached at salman@tunacode.com.
Shaneeb Kamran is a Computer Engineer from one of the leading universities of Pakistan. His programming journey started at the age of 12, and ever since he has dabbled in every new and shiny software technology he could get his hands on. He is currently involved in a startup that is working on cloud computing products.