MongoDB: The Definitive Guide

Cover of MongoDB: The Definitive Guide by Michael Dirolf... Published by O'Reilly Media, Inc.

Chapter 10. Sharding

Sharding is MongoDB’s approach to scaling out. Sharding allows you to add more machines to handle increasing load and data size without affecting your application.

Introduction to Sharding

Sharding refers to the process of splitting data up and storing different portions of the data on different machines; the term partitioning is also sometimes used to describe this concept. By splitting data up across machines, it becomes possible to store more data and handle more load without requiring large or powerful machines.

Manual sharding can be done with almost any database software. In manual sharding, an application maintains connections to several different database servers, each of which is completely independent. The application code manages storing different data on different servers and querying against the appropriate server to get data back. This approach can work well but becomes difficult to maintain when adding or removing nodes from the cluster or in the face of changing data distributions or load patterns.
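
As a rough illustration of the manual approach described above, here is a minimal Python sketch. It is not taken from the book: the `ManualShardRouter` class, the server names, and the hash-modulo routing rule are all hypothetical, and plain dictionaries stand in for connections to independent database servers. It shows the application choosing a server for each key, and why adding a node is painful under this particular routing rule.

```python
import hashlib


class ManualShardRouter:
    """Hypothetical application-side router over independent servers."""

    def __init__(self, server_names):
        # Each dict stands in for a connection to one independent server.
        self.servers = {name: {} for name in server_names}
        self.names = sorted(server_names)

    def server_for(self, key):
        # Stable hash, so the same key always routes to the same server.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.names[int(digest, 16) % len(self.names)]

    def save(self, key, document):
        # The application, not the database, decides where data lives.
        self.servers[self.server_for(key)][key] = document

    def find(self, key):
        # Query only the server that should hold this key.
        return self.servers[self.server_for(key)].get(key)


keys = [f"user{i}" for i in range(100)]

router = ManualShardRouter(["server-a", "server-b", "server-c"])
for i, key in enumerate(keys):
    router.save(key, {"n": i})

# Adding a fourth server changes the routing rule, so many existing keys
# now map to a different server and would have to be migrated by hand.
bigger = ManualShardRouter(["server-a", "server-b", "server-c", "server-d"])
moved = sum(1 for k in keys if router.server_for(k) != bigger.server_for(k))
print(f"{moved} of {len(keys)} keys would need to move")
```

With modulo routing, a large share of keys changes servers whenever the server count changes, which is one concrete form of the maintenance burden mentioned above. Other routing rules (for example, range-based or consistent hashing) reduce the movement but still leave the migration work to the application.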

MongoDB supports autosharding, which eliminates some of the administrative headaches of manual sharding. The cluster handles splitting up data and rebalancing automatically. Throughout the rest of this book (and most MongoDB documentation in general), the terms sharding and autosharding are used interchangeably, but it’s important to note the difference between autosharding, which the cluster manages, and manual sharding, which the application manages.

Autosharding in MongoDB

The basic concept behind MongoDB’s ...