The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing

Book Description

MongoDB, a cross-platform NoSQL database, is the fastest-growing new database in the world. MongoDB provides a rich document-orientated structure with dynamic queries that you'll recognize from RDBMS offerings such as MySQL. In other words, this is a book about a NoSQL database that does not require the SQL crowd to re-learn how the database world works!

MongoDB has reached 1.0 and already boasts 50,000+ users. The community is strong and vibrant and MongoDB is improving at a fast rate. With scalable and fast databases becoming critical for today's applications, this book shows you how to install, administer and program MongoDB *without* pretending SQL never existed.

Table of Contents

  1. Copyright
  2. About the Authors
  3. About the Technical Reviewer
  4. Acknowledgments
    1. A Special "Thanks" to MongoDB Beijing
  5. Introduction
    1. Our Approach
  6. I. Basics
    1. 1. Introduction to MongoDB
      1. 1.1. Reviewing the MongoDB Philosophy
        1. 1.1.1. Using the Right Tool for the Right Job
        2. 1.1.2. Lacking Innate Support for Transactions
        3. 1.1.3. Drilling Down on JSON and How It Relates to MongoDB
        4. 1.1.4. Adopting a Non-Relational Approach
        5. 1.1.5. Opting for Performance vs. Features
        6. 1.1.6. Running the Database Anywhere
      2. 1.2. Fitting Everything Together
        1. 1.2.1. Generating or Creating a Key
        2. 1.2.2. Using Keys and Values
        3. 1.2.3. Implementing Collections
        4. 1.2.4. Understanding Databases
      3. 1.3. Reviewing the Feature List
        1. 1.3.1. Using Document-Orientated Storage (BSON)
        2. 1.3.2. Supporting Dynamic Queries
        3. 1.3.3. Indexing Your Documents
        4. 1.3.4. Leveraging Geospatial Indexes
        5. 1.3.5. Profiling Queries
        6. 1.3.6. Updating Information In-Place
        7. 1.3.7. Storing Binary Data
        8. 1.3.8. Replicating Data
        9. 1.3.9. Implementing Auto Sharding
        10. 1.3.10. Using Map and Reduce Functions
      4. 1.4. Getting Help
        1. 1.4.1. Visiting the Website
        2. 1.4.2. Chatting with the MongoDB Developers
        3. 1.4.3. Cutting and Pasting MongoDB Code
        4. 1.4.4. Finding Solutions on Google Groups
        5. 1.4.5. Leveraging the JIRA Tracking System
      5. 1.5. Summary
    2. 2. Installing MongoDB
      1. 2.1. Choosing Your Version
        1. 2.1.1. Understanding the Version Numbers
      2. 2.2. Installing MongoDB on Your System
        1. 2.2.1. Installing MongoDB Under Linux
          1. 2.2.1.1. Installing MongoDB Through the Repositories
          2. 2.2.1.2. Installing MongoDB Manually
        2. 2.2.2. Installing MongoDB Under Windows
      3. 2.3. Running MongoDB
        1. 2.3.1. Prerequisites
        2. 2.3.2. Surveying the Installation Layout
        3. 2.3.3. Using the MongoDB Shell
      4. 2.4. Installing Additional Drivers
        1. 2.4.1. Installing the PHP driver
          1. 2.4.1.1. Getting MongoDB for PHP
          2. 2.4.1.2. Installing PHP on Unix-based Platforms Automatically
          3. 2.4.1.3. Installing PHP on Unix-Based Platforms Manually
          4. 2.4.1.4. Installing PHP on Windows
        2. 2.4.2. Confirming Your PHP Installation Works
          1. 2.4.2.1. Connecting to and Disconnecting from the PHP Driver
        3. 2.4.3. Installing the Python Driver
          1. 2.4.3.1. Installing PyMongo under Linux
          2. 2.4.3.2. Installing PyMongo Automatically
          3. 2.4.3.3. Installing PyMongo Manually
          4. 2.4.3.4. Installing PyMongo Under Windows
        4. 2.4.4. Confirming Your PyMongo Installation Works
      5. 2.5. Summary
    3. 3. The Data Model
      1. 3.1. Designing the Database
        1. 3.1.1. Drilling Down on Collections
        2. 3.1.2. Using Documents
          1. 3.1.2.1. Embedding vs. Referencing Information in Documents
        3. 3.1.3. Creating the _id Field
      2. 3.2. Building Indexes
        1. 3.2.1. Impacting Performance with Indexes
      3. 3.3. Implementing Geospatial Indexing
        1. 3.3.1. Querying Geospatial Information
      4. 3.4. Using MongoDB in the Real World
      5. 3.5. Summary
    4. 4. Working with Data
      1. 4.1. Navigating Your Databases
        1. 4.1.1. Viewing Available Databases and Collections
      2. 4.2. Inserting Data into Collections
      3. 4.3. Querying for Data
        1. 4.3.1. Using the Dot Notation
        2. 4.3.2. Using the Sort, Limit, and Skip Functions
        3. 4.3.3. Working with Capped Collections, Natural Order, and $natural
        4. 4.3.4. Retrieving a Single Document
        5. 4.3.5. Using the Aggregation Commands
          1. 4.3.5.1. Returning the Number of Documents with Count()
          2. 4.3.5.2. Retrieving Unique Values with Distinct()
          3. 4.3.5.3. Grouping Your Results
        6. 4.3.6. Working with Conditional Operators
          1. 4.3.6.1. Performing Greater and Less Than Comparisons
          2. 4.3.6.2. Retrieving All Documents but Those Specified
          3. 4.3.6.3. Specifying an Array of Matches
          4. 4.3.6.4. Finding a Value Not in an Array
          5. 4.3.6.5. Matching all Attributes in a Document
          6. 4.3.6.6. Searching for Multiple Expressions in a Document
          7. 4.3.6.7. Retrieving a Document with $slice
          8. 4.3.6.8. Searching for Odd/Even Integers
          9. 4.3.6.9. Filtering Results with $size
          10. 4.3.6.10. Returning a Specific Field Object
          11. 4.3.6.11. Matching Results Based on the BSON Type
          12. 4.3.6.12. Matching an Entire Array
          13. 4.3.6.13. $not (meta-operator)
          14. 4.3.6.14. Specifying Additional Query Expressions
        7. 4.3.7. Leveraging Regular Expressions
      4. 4.4. Updating Data
        1. 4.4.1. Updating with update()
        2. 4.4.2. Implementing an Upsert with the save() Command
        3. 4.4.3. Updating Information Automatically
          1. 4.4.3.1. Incrementing a Value with $inc
          2. 4.4.3.2. Setting a Field's Value
          3. 4.4.3.3. Deleting a Given Field
          4. 4.4.3.4. Appending a Value to a Specified Field
          5. 4.4.3.5. Specifying Multiple Values in an Array
          6. 4.4.3.6. Adding Data to an Array with $addToSet
          7. 4.4.3.7. Removing Elements from an Array
          8. 4.4.3.8. Removing Each Occurrence of a Specified Value
          9. 4.4.3.9. Removing Multiple Elements from an Array
        4. 4.4.4. Specifying the Position of a Matched Array
        5. 4.4.5. Atomic Operations
          1. 4.4.5.1. Using the Update if Current Method
        6. 4.4.6. Modifying and Returning a Document Atomically
      5. 4.5. Renaming a Collection
      6. 4.6. Removing Data
      7. 4.7. Referencing a Database
        1. 4.7.1. Referencing Data Manually
        2. 4.7.2. Referencing Data with DBRef
      8. 4.8. Implementing Index-Related Functions
        1. 4.8.1. Surveying Index-Related Commands
        2. 4.8.2. Forcing a Specified Index to Query Data
        3. 4.8.3. Constraining Query Matches
      9. 4.9. Summary
    5. 5. GridFS
      1. 5.1. Filling in Some Background
      2. 5.2. Working with GridFS
      3. 5.3. Getting Started with the Command-Line Tools
        1. 5.3.1. Using the _id Key
        2. 5.3.2. Working with Filenames
        3. 5.3.3. Determining a File's Length
        4. 5.3.4. Working with Chunk Sizes
        5. 5.3.5. Tracking the Upload Date
        6. 5.3.6. Hashing Your Files
      4. 5.4. Looking Under MongoDB's Hood
        1. 5.4.1. Using the Search Command
        2. 5.4.2. Deleting
        3. 5.4.3. Retrieving Files from MongoDB
        4. 5.4.4. Summing up mongofiles
      5. 5.5. Exploiting the Power of Python
        1. 5.5.1. Connecting to the Database
        2. 5.5.2. Accessing the Words
      6. 5.6. Putting Files into MongoDB
      7. 5.7. Retrieving Files from GridFS
      8. 5.8. Deleting Files
      9. 5.9. Summary
  7. II. Developing
    1. 6. PHP and MongoDB
      1. 6.1. Comparing Documents in MongoDB and PHP
      2. 6.2. MongoDB Classes
      3. 6.3. Connecting and Disconnecting
      4. 6.4. Inserting Data
      5. 6.5. Listing Your Data
        1. 6.5.1. Returning a Single Document
        2. 6.5.2. Listing All Documents
        3. 6.5.3. Using Query Operators
        4. 6.5.4. Querying for Specific Information
        5. 6.5.5. Sorting, Limiting, and Skipping Items
        6. 6.5.6. Counting the Number of Matching Results
        7. 6.5.7. Grouping Data with Map/Reduce
        8. 6.5.8. Specifying the Index with Hint
        9. 6.5.9. Refining Queries with Conditional Operators
          1. 6.5.9.1. Using the $lt, $gt, $lte, and $gte Operators
          2. 6.5.9.2. Finding Documents that Don't Match a Value
          3. 6.5.9.3. Matching Any of Multiple Values with $in
          4. 6.5.9.4. Matching All Criteria in a Query with $all
          5. 6.5.9.5. Searching for Multiple Expressions with $or
          6. 6.5.9.6. Retrieving a Specified Number of Items with $slice
          7. 6.5.9.7. Determining Whether a Field Has a Value
        10. 6.5.10. Regular Expressions
      6. 6.6. Modifying Data with PHP
        1. 6.6.1. Updating via update()
        2. 6.6.2. Saving Time with Modifier Operators
          1. 6.6.2.1. Increasing the Value of a Specific Key with $inc
          2. 6.6.2.2. Changing the Value of a Key with $set
          3. 6.6.2.3. Deleting a Field with $unset
          4. 6.6.2.4. Appending a Value to a Specified Field with $push
          5. 6.6.2.5. Adding Multiple Values to a Key with $pushAll
          6. 6.6.2.6. Adding Data to an Array with $addToSet
          7. 6.6.2.7. Removing an Element from an Array with $pop
          8. 6.6.2.8. Removing Each Occurrence of a Value with $pull
          9. 6.6.2.9. Removing Each Occurrence of Multiple Elements
        3. 6.6.3. Upserting Data with save()
        4. 6.6.4. Modifying a Document Atomically
      7. 6.7. Deleting Data
      8. 6.8. DBRef
        1. 6.8.1. Retrieving the Information
      9. 6.9. GridFS and the PHP Driver
        1. 6.9.1. Storing Files
        2. 6.9.2. Adding More Metadata to Stored Files
        3. 6.9.3. Retrieving Files
        4. 6.9.4. Deleting Data
      10. 6.10. Summary
    2. 7. Python and MongoDB
      1. 7.1. Working with Documents in Python
      2. 7.2. Using PyMongo Modules
      3. 7.3. Connecting and Disconnecting
      4. 7.4. Inserting Data
      5. 7.5. Finding Your Data
        1. 7.5.1. Finding a Single Document
        2. 7.5.2. Finding Multiple Documents
        3. 7.5.3. Using Dot Notation
        4. 7.5.4. Returning Fields
        5. 7.5.5. Simplifying Queries with Sort, Limit, and Skip
        6. 7.5.6. Aggregating Queries
          1. 7.5.6.1. Counting Items with Count()
          2. 7.5.6.2. Counting Unique Items with Distinct()
          3. 7.5.6.3. Grouping Data with map_reduce()
        7. 7.5.7. Specifying an Index with Hint()
        8. 7.5.8. Refining Queries with Conditional Operators
          1. 7.5.8.1. Using the $lt, $gt, $lte, and $gte Operators
          2. 7.5.8.2. Searching for Non-Matching Values with $ne
          3. 7.5.8.3. Specifying an Array of Matches with $in
          4. 7.5.8.4. Specifying Against an Array of Matches with $nin
          5. 7.5.8.5. Finding Documents that Match an Array's Values
          6. 7.5.8.6. Specifying Multiple Expressions to Match with $or
          7. 7.5.8.7. Retrieving Items from an Array with $slice
        9. 7.5.9. Conducting Searches with Regular Expression
      6. 7.6. Modifying the Data
        1. 7.6.1. Updating Your Data
        2. 7.6.2. Modifier Operators
          1. 7.6.2.1. Increasing an Integer Value with $inc
          2. 7.6.2.2. Changing an Existing Value with $set
          3. 7.6.2.3. Removing a Key/Value Field with $unset
          4. 7.6.2.4. Adding a Value to an Array with $push
          5. 7.6.2.5. Adding Multiple Values to an Array with $pushAll
          6. 7.6.2.6. Adding a Value to an Existing Array with $addToSet
          7. 7.6.2.7. Removing an Element from an Array with $pop
          8. 7.6.2.8. Removing a Specific Value with $pull
        3. 7.6.3. Saving Documents Quickly with Save()
        4. 7.6.4. Modifying a Document Atomically
        5. 7.6.5. Putting the Parameters to Work
      7. 7.7. Deleting Data
      8. 7.8. Creating a Link Between Two Documents
        1. 7.8.1. Retrieving the Information
      9. 7.9. Summary
    3. 8. Creating a Blog Application with the PHP Driver
      1. 8.1. Designing the Application
      2. 8.2. Listing the Posts
        1. 8.2.1. Paging with PHP and MongoDB
      3. 8.3. Looking at a Single Post
        1. 8.3.1. Specifying Additional Variables
        2. 8.3.2. Viewing and Adding Comments
      4. 8.4. Searching the Posts
      5. 8.5. Adding, Deleting, and Modifying Posts
        1. 8.5.1. Adding a New Post
        2. 8.5.2. Editing a Post
        3. 8.5.3. Deleting a Post
      6. 8.6. Creating the Index Pages
      7. 8.7. Recapping the blog Application
      8. 8.8. Summary
  8. III. Advanced
    1. 9. Database Administration
      1. 9.1. Using Administrative Tools
        1. 9.1.1. mongo, the MongoDB Console
        2. 9.1.2. Using Third-Party Administration Tools
      2. 9.2. Backing up the MongoDB Server
        1. 9.2.1. Creating a Backup 101
        2. 9.2.2. Backing up a Single Database
        3. 9.2.3. Backing up a Single Collection
      3. 9.3. Digging Deeper into Backups
      4. 9.4. Restoring Individual Databases or Collections
        1. 9.4.1. Restoring a Single Database
        2. 9.4.2. Restoring a Single Collection
      5. 9.5. Automating Backups
        1. 9.5.1. Using a Local Datastore
          1. 9.5.1.1. Installing the Script
        2. 9.5.2. Using a Remote (Cloud-Based) Datastore
      6. 9.6. Backing up Large Databases
        1. 9.6.1. Using a Slave Server for Backups
        2. 9.6.2. Creating Snapshots with a Journaling Filesystem
        3. 9.6.3. Disk Layout to Use with Volume Managers
      7. 9.7. Importing Data into MongoDB
      8. 9.8. Exporting Data from MongoDB
      9. 9.9. Securing Your Data
        1. 9.9.1. Restricting Access to a MongoDB Server
      10. 9.10. Protecting Your Server with Authentication
        1. 9.10.1. Adding an Admin User
        2. 9.10.2. Enabling Authentication
        3. 9.10.3. Authenticating in the mongo Console
        4. 9.10.4. Changing a User's Credentials
        5. 9.10.5. Adding a Read-Only User
        6. 9.10.6. Deleting a User
        7. 9.10.7. Using Authenticated Connections in a PHP Application
      11. 9.11. Managing Servers
        1. 9.11.1. Starting a Server
        2. 9.11.2. Reconfiguring a Server
        3. 9.11.3. Getting the Server's Version
        4. 9.11.4. Getting the Server's Status
        5. 9.11.5. Shutting Down a Server
      12. 9.12. Using MongoDB Logfiles
      13. 9.13. Validating and Repairing Your Data
        1. 9.13.1. Repairing a Server
        2. 9.13.2. Validating a Single Collection
        3. 9.13.3. Repairing Collection Validation Faults
          1. 9.13.3.1. Repairing a Collection's Indexes
        4. 9.13.4. Repairing a Collection's Datafiles
      14. 9.14. Upgrading MongoDB
      15. 9.15. Monitoring MongoDB
        1. 9.15.1. Rolling Your Own Stat Monitoring Tool
      16. 9.16. Using the mongod Web Interface
      17. 9.17. Summary
    2. 10. Optimization
      1. 10.1. Optimizing Your Server Hardware for Performance
        1. 10.1.1. Understanding How MongoDB Uses Memory
        2. 10.1.2. Choosing the Right Database Server Hardware
      2. 10.2. Evaluating Query Performance
      3. 10.3. MongoDB Profiler
        1. 10.3.1. Enabling and Disabling the DB Profiler
          1. 10.3.1.1. Finding Slow Queries
        2. 10.3.2. Analyzing a Specific Query with explain()
        3. 10.3.3. Using Profile and explain() to Optimize a Query
      4. 10.4. Managing Indexes
        1. 10.4.1. Listing Indexes
        2. 10.4.2. Creating a Simple Index
        3. 10.4.3. Creating a Compound Index
          1. 10.4.3.1. Creating Subdocument Compound Indexes
          2. 10.4.3.2. Constructing a Compound Index Manually
      5. 10.5. Specifying Index Options
        1. 10.5.1. Creating an Index in the Background with {background:true}
          1. 10.5.1.1. Killing the Indexing Process
        2. 10.5.2. Creating an Index with a Unique Key {unique:true}
        3. 10.5.3. Dropping Duplicates Automatically with {dropDups:true}
        4. 10.5.4. Dropping an Index
        5. 10.5.5. Re-Indexing a Collection
      6. 10.6. How MongoDB Selects Which Indexes It Will Use
      7. 10.7. Using Hint() to Force Using a Specific Index
      8. 10.8. Optimizing the Storage of Small Objects
      9. 10.9. Summary
    3. 11. Replication
      1. 11.1. Spelling Out MongoDB's Replication Goals
        1. 11.1.1. Improving Scalability
        2. 11.1.2. Improving Durability/Reliability
        3. 11.1.3. Providing Isolation
      2. 11.2. Drilling Down on the Oplog
      3. 11.3. Implementing Single Master/Single Slave Replication
        1. 11.3.1. Setting Up a Master/Slave Replication Configuration
          1. 11.3.1.1. Examining the Slave
      4. 11.4. Implementing Single Master/Multiple Slave Replication
      5. 11.5. Configuring a Master/Slave Replication System
      6. 11.6. Resynchronizing a Master/Slave Replication System
        1. 11.6.1. Issuing a Manual Resync Command to the Slave
        2. 11.6.2. Resyncing by Deleting the Slave's Datafiles
        3. 11.6.3. Resyncing a Slave with the --fastsync Option
      7. 11.7. Implementing Multiple Master/Single Slave Replication
        1. 11.7.1. Setting up a Multiple Master/Slave Replication Configuration
      8. 11.8. Exploring Various Replication Scenarios
        1. 11.8.1. Implementing Cascade Replication
        2. 11.8.2. Implementing Master/Master Replication
        3. 11.8.3. Implementing Interleaved Replication
      9. 11.9. Using Replica Pairs
        1. 11.9.1.
          1. 11.9.1.1. Setting up a Replica Pair
          2. 11.9.1.2. Coping with Failure
          3. 11.9.1.3. Connecting Your Application to a Replica Pair
        2. 11.9.2. Resolving Server Disputes with an Arbiter
      10. 11.10. Implementing Advanced Clustering with Replica Sets
        1. 11.10.1. Creating a Replica Set
        2. 11.10.2. Getting a Replica Set Member Up and Running
        3. 11.10.3. Adding a Server to a Replica Set
        4. 11.10.4. Managing Replica Sets
          1. 11.10.4.1. Inspecting an Instance's Status with rs.status()
          2. 11.10.4.2. Forcing a New Election with rs.stepDown()
          3. 11.10.4.3. Determining If a Member is the Primary Server
        5. 11.10.5. Configuring the Options for Replica Set Members
          1. 11.10.5.1. Organization of the Members Structure
          2. 11.10.5.2. Exploring the Options Available in the Settings Structure
        6. 11.10.6. Determining the Status of Replica Sets
        7. 11.10.7. Connecting to a Replica Set from Your Application
          1. 11.10.7.1. Viewing Replica Set Status with the Web Interface
      11. 11.11. Summary
    4. 12. Sharding
      1. 12.1. Exploring the Need for Sharding
      2. 12.2. Partitioning Horizontal and Vertical Data
        1. 12.2.1. Partitioning Data Vertically
        2. 12.2.2. Partitioning Data Horizontally
      3. 12.3. Analyzing a Simple Sharding Scenario
      4. 12.4. Implementing Sharding with MongoDB
      5. 12.5. Setting Up a Sharding Configuration
        1. 12.5.1. Adding a New Shard to the Cluster
      6. 12.6. Removing a Shard from the Cluster
      7. 12.7. Determining How You're Connected
      8. 12.8. Listing the Status of a Sharded Cluster
      9. 12.9. Using Replica Sets to Implement Shards
      10. 12.10. Sharding to Improve Performance
      11. 12.11. Summary