Chapter 6. Data Access

Like any web server, Node needs access to data stores for persistent storage; without persistence, all you have is a brochure website, which would make using Node pointless. In this chapter, we’ll run through the basic ways to connect to common open source database choices and to store and retrieve data.

NoSQL and Document Stores

The following NoSQL and document stores are increasingly popular for web-facing applications and are easy to use with Node.

CouchDB

CouchDB provides MVCC-based (multi-version concurrency control)[15] document storage in a JavaScript environment. When documents (records) are added or updated in CouchDB, the entire document is saved to storage as a new version and older versions of that data are marked obsolete. Older versions of the record can still be merged into the newest version, but in every case a whole new version is created and written to contiguous memory for faster read times. CouchDB is said to be “eventually consistent.” In a large, scalable deployment, multiple instances can sometimes serve older, unsynced versions of records to clients with the expectation that any changes to those records will eventually be merged into the master.
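The versioning behavior described above can be sketched with a small in-memory model. This is an illustration only and not how CouchDB is implemented: the createStore and saveDoc functions and the simplified "N-rev" revision strings are hypothetical names invented for this sketch.

```javascript
// Hypothetical in-memory model of MVCC-style updates (not CouchDB's real code).
// Every successful save creates a whole new version; a save that carries a
// stale revision is rejected instead of silently overwriting newer data.
function createStore() {
  return { docs: {}, history: {} };
}

function saveDoc(store, id, doc) {
  var current = store.docs[id];
  if (current && doc._rev !== current._rev) {
    return { ok: false, error: 'conflict' };
  }
  var generation = current ? parseInt(current._rev, 10) + 1 : 1;
  // Copy the document so that each version is stored whole.
  var next = JSON.parse(JSON.stringify(doc));
  next._id = id;
  next._rev = generation + '-rev';
  if (current) {
    // The older version is kept, but treated as obsolete.
    (store.history[id] = store.history[id] || []).push(current);
  }
  store.docs[id] = next;
  return { ok: true, id: id, rev: next._rev };
}

var store = createStore();
var first = saveDoc(store, 'jdoe', { name: 'John' });
var second = saveDoc(store, 'jdoe', { name: 'Johnny', _rev: first.rev });
var stale = saveDoc(store, 'jdoe', { name: 'Jon', _rev: first.rev });

console.log(first.rev, second.rev, stale.error);
```

The third save fails because it was prepared against a revision that is no longer current, which is the situation a client has to resolve by reading the newest version and trying again.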

Installation

Specific CouchDB libraries are not required to access the database, but they are useful for providing a high level of abstraction and making code easier to work with. A CouchDB server is needed to test any examples, but it does not require a lot of work to get it running.

Installing CouchDB

The most recent version of CouchDB can be installed from the Apache project page. Installation instructions for a wide array of platforms can be found on the wiki.

If you’re running Windows, you will find a number of binary installers as well as instructions for building from source. As with many of the NoSQL options, installation is easiest and best supported on a Linux-based system, but don’t be dissuaded.

Installing CouchDB’s Node module

Additional modules are not strictly necessary, because CouchDB exposes all of its services through REST, as described in more detail later.

Using CouchDB over HTTP

One of the nice things about CouchDB is that its API is actually all just HTTP. Because Node is great at interacting with HTTP, this means it is really easy to work with CouchDB. Exploiting this fact, it is possible to perform database operations directly without any additional client libraries.

Example 6-1 shows how to generate a list of databases in the current CouchDB installation. In this case, there is no authentication or administrative permission on the CouchDB server—a decidedly bad idea for a database connected to the Internet, but suitable for demonstration purposes.

Example 6-1. Retrieving a list of CouchDB stores via HTTP

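var http = require('http');

http.createServer(function (req, res) {
  // Ask CouchDB (default port 5984) for its list of databases
  var request = http.request({
    host: "127.0.0.1",
    port: 5984,
    method: "GET",
    path: "/_all_dbs"
  });
  request.end();

  request.on("response", function(response) {
    var responseBody = "";

    response.on("data", function(chunk) {
      responseBody += chunk;
    });

    response.on("end", function() {
      res.writeHead(200, {'Content-Type': 'text/plain'});
      res.write(responseBody);
      res.end();
    });
  });

  // Without an error handler, a refused connection would crash the server
  request.on("error", function(err) {
    res.writeHead(502, {'Content-Type': 'text/plain'});
    res.end("Could not reach CouchDB: " + err.message);
  });
}).listen(8080);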

A client connection is created with the http library. Nothing distinguishes this connection from any other http connection; because CouchDB is RESTful, no additional communication protocol is needed. Of special note is the request.end() line inside the createServer method. If this line is omitted, the request will hang.
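Because CouchDB answers in JSON, the raw body collected from the response is usually parsed before it is used. The following is a minimal sketch of that step; collectJson is a hypothetical helper name, and an EventEmitter stands in for the HTTP response so the sketch runs without a CouchDB server.

```javascript
var EventEmitter = require('events').EventEmitter;

// Collect every chunk of a response-like stream, then parse the body as JSON.
// The callback follows Node's (err, result) convention.
function collectJson(response, callback) {
  var body = '';
  response.on('data', function (chunk) {
    body += chunk;
  });
  response.on('end', function () {
    var parsed;
    try {
      parsed = JSON.parse(body);
    } catch (e) {
      return callback(e);
    }
    callback(null, parsed);
  });
}

// Stand-in for an HTTP response (an assumption made for this sketch).
var fakeResponse = new EventEmitter();
var databases = null;

collectJson(fakeResponse, function (err, list) {
  if (err) throw err;
  databases = list;
  console.log('Databases: ' + list.join(', '));
});

// The body arrives in two pieces, as it might over a real connection.
fakeResponse.emit('data', '["_users",');
fakeResponse.emit('data', '"users"]');
fakeResponse.emit('end');
```

Parsing only in the end handler matters because a single data event is not guaranteed to contain the whole body.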

As mentioned earlier, all CouchDB methods are exposed in HTTP calls. Creating and deleting databases, therefore, involves making the appropriate PUT and DELETE requests against the server, as demonstrated in Example 6-2.

Example 6-2. Creating a CouchDB database

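var http = require('http');

var request = http.request({ host: "127.0.0.1", port: 5984, method: "PUT", path: "/dbname" });
request.end();

request.on("response", function(response) {
  // Drain the (unused) body so that the "end" event fires
  response.resume();
  response.on("end", function() {
    if ( response.statusCode == 201 ) {
      console.log("Database successfully created.");
    } else {
      console.log("Could not create database.");
    }
  });
});

request.on("error", function(err) {
  console.log("Could not reach CouchDB: " + err.message);
});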

Here, /dbname refers to the resource being accessed. Combined with the PUT method, it instructs CouchDB to create a new database called dbname. An HTTP response code of 201 confirms that the database was created.

As shown in Example 6-3, deleting the resource is the reverse of a PUT: the DELETE command. An HTTP response code of 200 confirms the request was completed successfully.

Example 6-3. Deleting a CouchDB database

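var http = require('http');

var request = http.request({ host: "127.0.0.1", port: 5984, method: "DELETE", path: "/dbname" });
request.end();

request.on("response", function(response) {
  // Drain the (unused) body so that the "end" event fires
  response.resume();
  response.on("end", function() {
    if ( response.statusCode == 200 ) {
      console.log("Deleted database.");
    } else {
      console.log("Could not delete database.");
    }
  });
});

request.on("error", function(err) {
  console.log("Could not reach CouchDB: " + err.message);
});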
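The two examples above treat every status other than 201 or 200 as a generic failure. A slightly more informative version might translate the status code into a message, as in the sketch below. The describeDbResponse name is hypothetical, and the codes other than 201 and 200 are taken from CouchDB's documented HTTP API, so they should be checked against the version you are running.

```javascript
// Translate the status code of a database-level CouchDB request into a message.
// Only 201 (PUT) and 200 (DELETE) come from the examples above; the remaining
// codes are assumptions based on CouchDB's HTTP API documentation.
function describeDbResponse(method, statusCode) {
  if (statusCode === 401) return 'Administrator privileges required.';
  if (method === 'PUT') {
    if (statusCode === 201) return 'Database created.';
    if (statusCode === 400) return 'Invalid database name.';
    if (statusCode === 412) return 'Database already exists.';
  }
  if (method === 'DELETE') {
    if (statusCode === 200) return 'Database deleted.';
    if (statusCode === 404) return 'Database does not exist.';
  }
  return 'Unexpected response: ' + statusCode;
}

console.log(describeDbResponse('PUT', 201));
console.log(describeDbResponse('DELETE', 404));
```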

These elements aren’t very useful on their own, but they can be put together to form a very basic (if unfriendly) database manager using the methods shown in Example 6-4.

Example 6-4. A simple CouchDB database creation form

var http = require('http');
var qs = require('querystring');
var url = require('url');

var dbHost = "127.0.0.1";
var dbPort = 5984;

// Send one request to CouchDB; the callback receives (err, statusCode, body)
var couchRequest = function(method, path, callback) {
  var request = http.request({
    host: dbHost,
    port: dbPort,
    method: method,
    path: path
  });
  request.end();

  request.on("response", function(response) {
    var responseBody = "";

    response.on("data", function(chunk) {
      responseBody += chunk;
    });

    response.on("end", function() {
      callback(null, response.statusCode, responseBody);
    });
  });

  request.on("error", function(err) {
    callback(err);
  });
};

// Escape text before writing it into the HTML page
var escapeHtml = function(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/'/g, "&#39;")
    .replace(/"/g, "&quot;");
};

var deleteDb = function(res, dbpath) {
  couchRequest("DELETE", dbpath, function(err, statusCode) {
    if ( !err && statusCode == 200 ) {
      showDbs(res, "Deleted database.");
    } else {
      showDbs(res, "Could not delete database.");
    }
  });
};

var createDb = function(res, dbname) {
  couchRequest("PUT", "/" + encodeURIComponent(dbname), function(err, statusCode) {
    if ( !err && statusCode == 201 ) {
      showDbs(res, dbname + " created.");
    } else {
      showDbs(res, "Could not create " + dbname);
    }
  });
};

var showDbs = function(res, message) {
  couchRequest("GET", "/_all_dbs", function(err, statusCode, responseBody) {
    if ( err ) {
      res.writeHead(502, {'Content-Type': 'text/plain'});
      res.end("Could not reach CouchDB: " + err.message);
      return;
    }

    res.writeHead(200, {'Content-Type': 'text/html'});
    res.write("<form method='post'>");
    res.write("New Database Name: <input type='text' name='dbname' />");
    res.write("<input type='submit' />");
    res.write("</form>");
    if ( null != message ) res.write("<h1>" + escapeHtml(message) + "</h1>");

    res.write("<h1>Active databases:</h1>");
    res.write("<ul>");
    var dblist = JSON.parse(responseBody);
    for ( var i = 0; i < dblist.length; i++ ) {
      var dbname = dblist[i];
      res.write("<li><a href='/" + encodeURIComponent(dbname) + "'>" +
                escapeHtml(dbname) + "</a></li>");
    }
    res.write("</ul>");
    res.end();
  });
};

http.createServer(function (req, res) {
  if ( req.method == 'POST' ) {
    // Parse the request
    var body = '';
    req.on('data', function (data) {
      body += data;
    });
    req.on('end', function () {
      var POST = qs.parse(body);
      var dbname = POST['dbname'];
      if ( dbname ) {
        // Create the DB
        createDb(res,dbname);
      } else {
        showDbs(res, "Bad DB name, cannot create database.");
      }
    });
  } else {
    var path = url.parse(req.url).pathname;
    if ( path == "/favicon.ico" ) {
      // Browsers request this automatically; it is not a database
      res.writeHead(404);
      res.end();
    } else if ( path != "/" ) {
      // Following a database link deletes that database
      deleteDb(res,path);
    } else {
      showDbs(res);
    }
  }
}).listen(8080);
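The form above passes whatever the user types straight to CouchDB. CouchDB itself restricts database names to lowercase letters, digits, and a handful of punctuation characters, beginning with a letter, so a form handler could reject bad names before making the request. The sketch below shows one way to do that; isValidDbName and dbPath are hypothetical helper names, and the pattern reflects the naming rule as documented by CouchDB rather than anything defined in this chapter.

```javascript
// Naming rule assumed from CouchDB's documentation: a lowercase letter first,
// then lowercase letters, digits, or any of _ $ ( ) + - /
var DB_NAME_PATTERN = /^[a-z][a-z0-9_$()+\/-]*$/;

function isValidDbName(name) {
  return typeof name === 'string' && DB_NAME_PATTERN.test(name);
}

// Build the request path for a database, escaping characters such as "/".
function dbPath(name) {
  if (!isValidDbName(name)) {
    throw new Error('Invalid database name: ' + name);
  }
  return '/' + encodeURIComponent(name);
}

console.log(isValidDbName('users'));
console.log(isValidDbName('Users'));
console.log(dbPath('blog/posts'));
```

System databases such as _users begin with an underscore and are deliberately not accepted by this check, since a creation form has no reason to touch them.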

Using node-couchdb

Knowing how to work with CouchDB over HTTP is useful, but this approach is verbose. Although it has the advantage of not needing external libraries, most developers opt for higher-level abstraction layers, regardless of how simple their database’s native driver implementation is. In this section, we look at the node-couchdb package, which simplifies the interface between Node and CouchDB.

You can install the driver for CouchDB using npm, where the node-couchdb package is published as felix-couchdb:

npm install felix-couchdb

Working with databases

The module’s first obvious benefit is succinct program code, as demonstrated in Example 6-5.

Example 6-5. Creating a database in CouchDB

var dbHost = "127.0.0.1";
var dbPort = 5984;
var dbName = 'users';

var couchdb = require('felix-couchdb');
var client = couchdb.createClient(dbPort, dbHost);
var db = client.db(dbName);

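db.exists(function(err, exists) {
  if (err) {
    console.log(JSON.stringify(err));
  } else if (!exists) {
    // Report success only after CouchDB has confirmed the creation
    db.create(function(err) {
      if (err) {
        console.log(JSON.stringify(err));
      } else {
        console.log('Database ' + dbName + ' created.');
      }
    });
  } else {
    console.log('Database ' + dbName + ' exists.');
  }
});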

This example checks for a database called users, creating one if it doesn’t already exist. Notice that the client is configured with the same host and port used in the raw HTTP examples earlier. This is no accident; even though the module makes CouchDB’s interfaces easier to work with, in the end you are using HTTP to transmit data.

Creating documents

In Example 6-6, we’ll save a document into the CouchDB database created in the previous example.

Example 6-6. Creating a document in CouchDB

var dbHost = "127.0.0.1";
var dbPort = 5984;
var dbName = 'users';

var couchdb = require('felix-couchdb');
var client = couchdb.createClient(dbPort, dbHost);

var user = {
  name: {
    first: 'John',
    last: 'Doe'
  }
}

var db = client.db(dbName);

db.saveDoc('jdoe', user, function(err, doc) {
  if( err) {
    console.log(JSON.stringify(err));
  } else {
    console.log('Saved user.');
  }
});

This example creates a user named John Doe in the database with the username jdoe as its identity. Notice the user is created as a plain JavaScript object and passed directly into the client, which serializes it to JSON. No more work is needed to parse the information.

After running this example, the user can be accessed in the web browser at http://127.0.0.1:5984/users/jdoe.

Reading documents

Once documents are stored in CouchDB, they can be retrieved again as objects, as shown in Example 6-7.

Example 6-7. Retrieving a record from CouchDB

var dbHost = "127.0.0.1";
var dbPort = 5984;
var dbName = 'users';

var couchdb = require('felix-couchdb');
var client = couchdb.createClient(dbPort, dbHost);

var db = client.db(dbName);

db.getDoc('jdoe', function(err,doc) {
  console.log(doc);
});

The output from this query is:

{ _id: 'jdoe',
  _rev: '3-67a7414d073c9ebce3d4af0a0e49691d',
  name: { first: 'John', last: 'Doe' }
}

There are three steps happening here:

  1. Connect to the database server using createClient.

  2. Select the document store using the client’s db command.

  3. Get the document using the database’s getDoc command.

In this case, the record with ID jdoe—created in the previous example—is retrieved from the database. If the record did not exist (because it was deleted or not yet inserted), the callback’s error parameter would contain data about the error.

Updating documents

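Updating documents uses the same saveDoc command as creating documents. If the document being saved has the ID of an existing record and carries that record’s current revision (the _rev field), CouchDB replaces the old version with the new one; a save without a matching revision is rejected as a conflict.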

Example 6-8 demonstrates how to update a document after reading it from the data store.

Example 6-8. Updating a record in CouchDB

var dbHost = "127.0.0.1";
var dbPort = 5984;
var dbName = 'users';

var couchdb = require('felix-couchdb');
var client = couchdb.createClient(dbPort, dbHost);

var db = client.db(dbName);

db.getDoc('jdoe', function(err,doc) {
  doc.name.first = 'Johnny';
  doc.email = 'jdoe@johndoe.com';

  db.saveDoc('jdoe', doc );

  db.getDoc('jdoe', function(err,revisedUser) {
    console.log(revisedUser);
  });
});

The output from this operation is:

{ _id: 'jdoe',
  _rev: '7-1fb9a3bb6db27cbbbf1c74b2d601ccaa',
  name: { first: 'Johnny', last: 'Doe' },
  email: 'jdoe@johndoe.com'
}

This example reads information about the jdoe user from the data store, gives it an email address and a new first name, and saves it back into CouchDB.

Notice that the saveDoc call and the second getDoc call are issued one after the other, rather than placing getDoc inside saveDoc’s callback. The CouchDB drivers queue commands and execute them sequentially, so this example will not result in a race condition where the document read completes before the updates are saved.

Deleting documents

To delete a document from CouchDB, you need to supply both an ID and a revision number. Fortunately, this is easy after a read, as shown in Example 6-9.

Example 6-9. Deleting from CouchDB

var dbHost = "127.0.0.1";
var dbPort = 5984;
var dbName = 'users';

var couchdb = require('felix-couchdb');
var client = couchdb.createClient(dbPort, dbHost);

var db = client.db(dbName);

db.getDoc('jdoe', function(err,doc) {
  db.removeDoc(doc._id, doc._rev);
});

After connecting to the CouchDB datastore, a getDoc command is issued here to get the internal ID (the _id field) and revision number (_rev field) for that document. Once this information has been obtained, a removeDoc command is issued, which sends a DELETE request to the database.
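To make the role of the two fields concrete, the sketch below builds the kind of path that such a DELETE request targets when issued over plain HTTP: the document ID goes in the path and the revision in the rev query parameter. The deleteDocPath name is hypothetical, and the sketch does not reproduce the driver's internals.

```javascript
// Build the path for deleting one document over CouchDB's HTTP interface.
// Both the ID and the current revision are required.
function deleteDocPath(dbName, doc) {
  if (!doc || !doc._id || !doc._rev) {
    throw new Error('Both _id and _rev are required to delete a document');
  }
  return '/' + encodeURIComponent(dbName) +
         '/' + encodeURIComponent(doc._id) +
         '?rev=' + encodeURIComponent(doc._rev);
}

var deletePath = deleteDocPath('users', {
  _id: 'jdoe',
  _rev: '3-67a7414d073c9ebce3d4af0a0e49691d'
});
console.log('DELETE ' + deletePath);
```

Requiring the revision means a client cannot delete a document it has not seen in its latest form.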

Redis

Redis is a memory-centric key-value store with persistence that will feel very familiar if you have experience with key-value caches such as Memcache. Redis is used when performance and scaling are important; in many cases, developers choose to use it as a cache for data retrieved from a relational database such as MySQL, although it is capable of much more.

Beyond its key-value storage capabilities, Redis provides network-accessible shared memory, is a nonblocking event bus, and exposes subscription and publishing capabilities.

Installation

As with most of the other database engines, using Redis requires installing the database application as well as the Node drivers to communicate with it.

Installing Redis

Redis is available in source form. There isn’t anything to do in the way of configuration; just download and compile per the instructions on the website.

If you are using Windows, you are on your own at the time of this writing because Redis is not supported on Windows. Fortunately, there is a passionate community behind Redis development, and several ports have been made available for both Cygwin and native compilation. The port at https://github.com/dmajkic/redis compiles to a native Windows binary using MinGW.

Installing Redis’s Node module

The redis module is available from GitHub, but can be installed using npm:

npm install redis

Optionally, you may install the minimalist hiredis library along with Node’s redis module.

Basic usage

Example 6-10 demonstrates a basic set and get operation against Redis by Node.

Example 6-10. A basic get and set operation against Redis

var redis = require('redis'),
    client = redis.createClient();

client.on("error", function (err) {
    console.log("Error " + err);
});

console.log("Setting key1");
client.set("key1", "My string!", redis.print);
console.log("Getting key1");
client.get("key1", function (err, reply) {
    console.log("Results for key1:");
    console.log(reply);
    client.end();
});

This example begins by creating a connection to the Redis database and setting a callback to handle errors. If you are not running an instance of the Redis server, you will receive an error like this:

Error Error: Redis connection to 127.0.0.1:6379 failed - ECONNREFUSED, Connection refused

Tip

Note that the set call in this example is issued without waiting for its result before get is called. If you need to perform database reads immediately after writing, it is safer to issue the read from inside the write’s callback, to ensure your code is executed in the correct sequence.

After the connection is opened, the client sets a value for a string key and then reads that value back from the store. Library calls have the same names as the basic Redis commands they issue (set and get here, and hset in the hash examples that follow). Redis treats data coming through the set command as strings, and allows for values up to 512 MB in size.

Hashes

Hashes are objects that contain multiple keys (fields), each with its own value. Example 6-11 sets a single key at a time.

Example 6-11. Setting hash values one key at a time

var redis = require('redis'),
    client = redis.createClient();

client.on("error", function (err) {
    console.log("Error " + err);
});

console.log("Setting user hash");
client.hset("user", "username", "johndoe");
client.hset("user", "firstname", "john");
client.hset("user", "lastname", "doe");

client.hkeys("user", function(err,replies) {
    console.log("Results for user:");
    console.log(replies.length + " replies:");
    replies.forEach(function (reply, i) {
        console.log(i + ": " + reply );
    });
    client.end();
});

Example 6-12 shows how to set multiple keys at the same time.

Example 6-12. Setting multiple hash values simultaneously

var redis = require('redis'),
    client = redis.createClient();

client.on("error", function (err) {
    console.log("Error " + err);
});

console.log("Setting user hash");
client.hmset("user", "username", "johndoe", "firstname", "john", "lastname", "doe");

client.hkeys("user", function(err,replies) {
    console.log("Results for user:");
    console.log(replies.length + " replies:");
    replies.forEach(function (reply, i) {
        console.log(i + ": " + reply );
    });
    client.end();
});

We could accomplish the same thing by providing a more developer-friendly object, rather than breaking it out into a list, as shown in Example 6-13.

Example 6-13. Setting multiple hash values using an object

var redis = require('redis'),
    client = redis.createClient();

client.on("error", function (err) {
    console.log("Error " + err);
});

var user = {
   username: 'johndoe',
   firstname: 'John',
   lastname: 'Doe',
   email: 'john@johndoe.com',
   website: 'http://www.johndoe.com'
}

console.log("Setting user hash");
client.hmset("user", user);

client.hkeys("user", function(err,replies) {
    console.log("Results for user:");
    console.log(replies.length + " replies:");
    replies.forEach(function (reply, i) {
        console.log(i + ": " + reply );
    });
    client.end();
});

Instead of manually supplying each field to Redis, you can pass an entire object into hmset, which will parse the fields and send the correct information to Redis.
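Conceptually, passing an object is equivalent to flattening it into the alternating field and value list used in Example 6-12. The sketch below illustrates that mapping in plain JavaScript; toHashArgs is a hypothetical helper written for this illustration and is not the redis module's own code.

```javascript
// Flatten an object into the argument list used by the list form of hmset:
// the hash key first, then alternating field names and values.
function toHashArgs(key, obj) {
  var args = [key];
  Object.keys(obj).forEach(function (field) {
    // Redis stores hash values as strings, so convert each value.
    args.push(field, String(obj[field]));
  });
  return args;
}

var hashArgs = toHashArgs('user', {
  username: 'johndoe',
  firstname: 'John',
  lastname: 'Doe'
});
console.log(hashArgs.join(' '));
```

Nested objects would be flattened to the string "[object Object]" by this sketch, which is a reminder that a hash holds one level of fields only.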

Warning

Be careful to use hmset and not hset when saving an object with multiple fields. Forgetting that a single object contains multiple values is a common pitfall.

Lists

The list type can be thought of as multiple values inside one key (see Example 6-14). Because it’s possible to push content to the beginning or end of a list, these collections are ideal for showing ordered events, such as lists of users who have recently received an honor.

Example 6-14. Using a list in Redis

var redis = require('redis'),
    client = redis.createClient();

client.on("error", function (err) {
    console.log("Error " + err);
});

client.lpush("pendingusers", "user1" );
client.lpush("pendingusers", "user2" );
client.lpush("pendingusers", "user3" );
client.lpush("pendingusers", "user4" );

client.rpop("pendingusers", function(err,username) {
  if( !err ) {
    console.log("Processing " + username);
  }
  client.end();
});

The output from this example is:

Processing user1

This example demonstrates a first-in-first-out (FIFO) queue using Redis’s list commands. A real-world use for FIFO is in registration systems: the quantity of incoming registration requests is too great to handle in real time, so registration data is hived off to a queue for processing outside the main application. Registrations will be processed in the order they were received, but the primary application is not slowed down by handling the actual record creation and introductory tasks such as welcome emails.
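The ordering can be checked without a Redis server by modeling the list as a JavaScript array. In this sketch, createList is a hypothetical in-memory stand-in whose lpush and rpop methods mimic the head and tail behavior of the Redis commands of the same name.

```javascript
// In-memory stand-in for a Redis list (an assumption made for illustration):
// lpush adds at the head, rpop removes from the tail.
function createList() {
  var items = [];
  return {
    lpush: function (value) { items.unshift(value); return items.length; },
    rpop: function () { return items.length ? items.pop() : null; },
    length: function () { return items.length; }
  };
}

var pending = createList();
['user1', 'user2', 'user3', 'user4'].forEach(function (name) {
  pending.lpush(name);
});

// Drain the queue: the oldest entry always comes out first.
var processed = [];
while (pending.length() > 0) {
  processed.push(pending.rpop());
}
console.log(processed.join(', '));
```

Pushing and popping at opposite ends gives a queue; using the same end for both would turn the list into a stack.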

Sets

Sets are used in situations where it is desirable to have lists of nonrepeated items, as in Example 6-15.

Example 6-15. Using Redis’s set commands

var redis = require('redis'),
    client = redis.createClient();

client.on("error", function (err) {
    console.log("Error " + err);
});

client.sadd( "myteam", "Neil" );
client.sadd( "myteam", "Peter" );
client.sadd( "myteam", "Brian" );
client.sadd( "myteam", "Scott" );
client.sadd( "myteam", "Brian" );

client.smembers( "myteam", function(err, members) {
  console.log( members );
  client.end();
});

The output is:

[ 'Brian', 'Scott', 'Neil', 'Peter' ]

Even though “Brian” was added to the set twice, he was stored only once. In a real-world situation, it would be entirely possible to have two team members named Brian; this highlights the importance of ensuring that your values are unique when they need to be. Otherwise, the set can cause unintended behavior when you expect more elements than are actually present due to the removal of repeated items.
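One common way around this is to store a value that is unique per person, such as an ID combined with the name. The sketch below uses JavaScript's built-in Set to show the difference; the member IDs are hypothetical and no Redis server is involved.

```javascript
// Names alone: the two Brians collapse into a single member.
var byName = new Set(['Neil', 'Peter', 'Brian', 'Scott', 'Brian']);

// Hypothetical team records in which each person has a unique ID.
var team = [
  { id: 101, name: 'Neil' },
  { id: 102, name: 'Peter' },
  { id: 103, name: 'Brian' },
  { id: 104, name: 'Scott' },
  { id: 105, name: 'Brian' }
];

// ID plus name: every person stays distinct.
var byId = new Set(team.map(function (member) {
  return member.id + ':' + member.name;
}));

console.log(byName.size, byId.size);
```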

Sorted sets

Like regular sets, sorted sets do not allow duplicate members. Sorted sets add the concept of weighting, enabling score-based operations on data such as leaderboards, top scores, and content tables.

The producers of the American weight-loss reality show The Biggest Loser are real-world fans of sorted sets. In the 11th season of the series, the contestants were split into three groups based upon their age. On air, they had to perform a crude sorting operation by checking a number printed on everyone’s shirts and then line up in ascending order under the hot sun. If one of the contestants had brought her Node- and Redis-equipped laptop to the competition, she might have made a small program to do the work for them, such as the one in Example 6-16.

Example 6-16. Ranking a sorted set using Redis

var redis = require('redis'),
    client = redis.createClient();

client.on("error", function (err) {
    console.log("Error " + err);
});

client.zadd( "contestants", 60, "Deborah" );
client.zadd( "contestants", 65, "John" );
client.zadd( "contestants", 26, "Patrick" );
client.zadd( "contestants", 62, "Mike" );
client.zadd( "contestants", 24, "Courtney" );
client.zadd( "contestants", 39, "Jennifer" );
client.zadd( "contestants", 26, "Jessica" );
client.zadd( "contestants", 46, "Joe" );
client.zadd( "contestants", 63, "Bonnie" );
client.zadd( "contestants", 27, "Vinny" );
client.zadd( "contestants", 27, "Ramon" );
client.zadd( "contestants", 51, "Becky" );
client.zadd( "contestants", 41, "Sunny" );
client.zadd( "contestants", 47, "Antone" );
client.zadd( "contestants", 40, "John" );

client.zcard( "contestants", function( err, length ) {
  if( !err ) {
    var contestantCount = length;
    var membersPerTeam = Math.ceil( contestantCount / 3 );
    client.zrange( "contestants", membersPerTeam * 0, membersPerTeam * 1 - 1,
      function(err, values) {
        console.log('Young team: ' + values);
      });
    client.zrange( "contestants", membersPerTeam * 1, membersPerTeam * 2 - 1,
      function(err, values) {
        console.log('Middle team: ' + values);
      });
    client.zrange( "contestants", membersPerTeam * 2, contestantCount,
      function(err, values) {
        console.log('Elder team: ' + values);
        client.end();
      });
  }
});

The output is:

Young team: Courtney,Jessica,Patrick,Ramon,Vinny
Middle team: Jennifer,John,Sunny,Joe,Antone
Elder team: Becky,Deborah,Mike,Bonnie

Adding members to a sorted set follows a pattern similar to the one for adding members to a normal set, with the addition of a score that determines each member’s position. This allows for interesting slicing and dicing, as in this example. Knowing that each team consists of similarly aged individuals, getting three teams from a sorted set is a matter of pulling three groups straight out of the set in order. John appears twice in the zadd calls; because members are unique, the second call only updates his score to 40. The resulting number of contestants (14) is not perfectly divisible by 3, so the final group has only 4 members.
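The index arithmetic is easier to follow on a plain array. The sketch below applies the same Math.ceil calculation to the 14 names in score order; splitIntoTeams is a hypothetical helper, and it uses Array.slice, whose end index is exclusive, whereas the end index given to zrange is inclusive.

```javascript
// Split an already-sorted list of members into teamCount consecutive groups.
function splitIntoTeams(sortedMembers, teamCount) {
  var perTeam = Math.ceil(sortedMembers.length / teamCount);
  var teams = [];
  for (var i = 0; i < teamCount; i++) {
    // slice() excludes its end index, so no "- 1" is needed here.
    teams.push(sortedMembers.slice(i * perTeam, (i + 1) * perTeam));
  }
  return teams;
}

var contestants = [
  'Courtney', 'Jessica', 'Patrick', 'Ramon', 'Vinny',
  'Jennifer', 'John', 'Sunny', 'Joe', 'Antone',
  'Becky', 'Deborah', 'Mike', 'Bonnie'
];

var teams = splitIntoTeams(contestants, 3);
console.log(teams.map(function (team) { return team.length; }).join(', '));
```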

Subscriptions

Redis supports the publish-subscribe (or pub-sub) messaging pattern, allowing senders (publishers) to issue messages into channels for use by receivers (subscribers) whom they know nothing about (see Example 6-17). Subscribers register their areas of interest (channels), and Redis pushes all relevant messages to them. Publishers do not need to be registered to specific channels, nor do they need any subscriber to be listening when a message is sent; Redis delivers each message only to the clients subscribed at that moment and does not store it for later. Redis takes care of the brokering, which allows for a great deal of flexibility, as neither the publisher nor the subscriber needs to be aware of the other.

Example 6-17. Subscribing and publishing with Redis

var redis = require("redis"),
    talkativeClient = redis.createClient(),
    pensiveClient = redis.createClient();

pensiveClient.on("subscribe", function (channel, count) {
  talkativeClient.publish( channel, "Welcome to " + channel );
  talkativeClient.publish( channel, "You subscribed to " + count + " channels!" );
});

pensiveClient.on("unsubscribe", function(channel, count) {
  if (count === 0) {
    talkativeClient.end();
    pensiveClient.end();
  }
});

pensiveClient.on("message", function (channel, message) {
  console.log(channel + ': ' + message);
});

pensiveClient.on("ready", function() {
  pensiveClient.subscribe("quiet channel", "peaceful channel", "noisy channel" );
  setTimeout(function() {
    pensiveClient.unsubscribe("quiet channel", "peaceful channel", "noisy channel" );
  }, 1000);
});

The output is:

quiet channel: Welcome to quiet channel
quiet channel: You subscribed to 1 channels!
peaceful channel: Welcome to peaceful channel
peaceful channel: You subscribed to 2 channels!
noisy channel: Welcome to noisy channel
noisy channel: You subscribed to 3 channels!

This example tells the story of two clients. One is quiet and thoughtful, while the other broadcasts inane details about its surroundings to anyone who will listen. The pensive client subscribes to three channels: quiet, peaceful, and noisy. The talkative client responds to each subscription by welcoming the newcomer to the channel and counting the number of active subscriptions.

About one second after subscribing, the pensive client unsubscribes from all three channels. When the unsubscribe handler sees that no active subscriptions remain, both clients end their connection to Redis, and the program execution stops.
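The decoupling between publisher and subscriber can be modeled in a few lines with Node's EventEmitter. This is a sketch of the pattern only, with no Redis involved: the broker object and the publish and subscribe functions are hypothetical stand-ins.

```javascript
var EventEmitter = require('events').EventEmitter;

// In-memory stand-in for the broker: each channel is an event name.
var broker = new EventEmitter();
var received = [];

function publish(channel, message) {
  // emit() returns false when nobody is listening; the message is dropped.
  return broker.emit(channel, message);
}

function subscribe(channel) {
  broker.on(channel, function (message) {
    received.push(channel + ': ' + message);
  });
}

var early = publish('quiet channel', 'Is anyone there?'); // no subscribers yet
subscribe('quiet channel');
var late = publish('quiet channel', 'Welcome to quiet channel');

console.log(early, late, received);
```

The first message is lost because nobody was subscribed when it was sent, which is why Example 6-17 publishes only from inside the subscribe handler.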

Securing Redis

Redis supports password authentication. To add a password, edit Redis’s configuration file and include a line for requirepass, as shown in Example 6-18.

Example 6-18. Snippet from Redis password configuration

################################## SECURITY ###################################

# Require clients to issue AUTH <PASSWORD> before processing any other
# commands.  This might be useful in environments in which you do not trust
# others with access to the host running redis-server.
#
# This should stay commented out for backward compatibility and because most
# people do not need auth (e.g., they run their own servers).
#
requirepass hidengoseke

Once Redis is restarted, it will perform commands only for clients who authenticate using “hidengoseke” as their password (Example 6-19).

Example 6-19. Authenticating Redis

var redis = require('redis'),
    client = redis.createClient();

client.auth("hidengoseke");

The auth command must occur before any other queries are issued. The client will store the password and use it on reconnection attempts.

Notice the lack of usernames and multiple passwords. Redis does not include user management functionality, because of the overhead it would incur. Instead, system administrators are expected to secure their servers using other means, such as port-blocking Redis from the outside world so that only internal, trusted users may access it.

Some “dangerous” commands can be renamed or removed entirely. For example, you may never need to use the CONFIG command. In that case, you can update the configuration file either to change its name to something obscure or to disable it fully to protect against unwanted access; both options are shown in Example 6-20.

Example 6-20. Renaming Redis commands

# Change CONFIG command to something obscure
rename-command CONFIG 923jfiosflkja98rufadskjgfwefu89awtsga09nbhsdalkjf3p49

# Clear CONFIG command, so no one can use it
rename-command CONFIG ""

MongoDB

Because Mongo supplies a JavaScript environment with BSON object storage (a binary adaptation of JSON), reading and writing data from Node is extremely efficient. Mongo stores incoming records in memory, so it is ideal in high-write situations. Each new version adds improved clustering, replication, and sharding.

Because incoming records are stored in memory, inserting data into Mongo is nonblocking, making it ideal for logging operations and telemetry data. Mongo supports JavaScript functions inside queries, making it very powerful in read situations, including MapReduce queries.

Using MongoDB’s document-based storage allows you to store child records inside parent records. For example, a blog article and all of its associated comments can be stored inside a single record, allowing for incredibly fast retrieval.
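As a rough illustration, such a record might look like the following JavaScript object. The field names and values are hypothetical; the point is only that the comments live inside the article rather than in a separate table.

```javascript
// A hypothetical blog article with its comments embedded in the same record.
var article = {
  title: 'Data Access in Node',
  author: 'jdoe',
  body: 'Like any web server, Node needs access to data stores...',
  comments: [
    { author: 'ssmith', text: 'Great overview.', posted: new Date(2011, 9, 18) },
    { author: 'bilbo', text: 'What about Redis?', posted: new Date(2011, 9, 19) }
  ]
};

// Reading the parent returns every child with it; no join is needed.
var commenters = article.comments.map(function (comment) {
  return comment.author;
});
console.log(article.title + ' has ' + article.comments.length + ' comments');
```

The trade-off is that a comment cannot be fetched or updated without going through its parent article.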

MongoDB native driver

The native MongoDB driver by Christian Kvalheim provides nonblocking access to MongoDB. Previous versions of the module included a C/C++ BSON parser/serializer, which has been deprecated due to improvements in the JavaScript parser/serializer.

The native MongoDB driver is a good choice when you need precise control over your MongoDB connection.

Installation

To install the driver, run the following command:

npm install mongodb

Warning

The “mongodb” package is the native driver and is not to be confused with “mongoose,” discussed later in this chapter.

Data types

Node’s MongoDB driver supports the data types listed in Table 6-1.

Table 6-1. Data types supported for MongoDB

| Type | Description | Example |
| --- | --- | --- |
| Array | A list of items | cardsInHand: [9,4,3] |
| Boolean | A true/false condition | hasBeenRead: false |
| Code | Represents a block of JavaScript code that is runnable inside the database | new BSON.Code('function quotient( dividend, divisor ) { return divisor == 0 ? 0 : dividend / divisor; }'); |
| Date | Represents the current date and time | lastUpdated: new Date() |
| DBRef | Database reference[a] | bestFriendId: new BSON.DBRef('users', friendObjectId) |
| Integer | An integer (nondecimal) number | pageViews: 50 |
| Long | A long integer value | starsInUniverse = new BSON.Long("10000000000000000000000000"); |
| Hash | A key-value dictionary | userName: {'first': 'Sam', 'last': 'Smith'} |
| Null | A null value | bestFriend: null |
| Object ID | A 12-byte code used by MongoDB to index objects, represented as 24-digit hexadecimal strings | myRecordId: new BSON.ObjectId() |
| String | A JavaScript string | fullName: 'Sam Smith' |

[a] Because MongoDB is a nonrelational database, it does not support joins. The data type DBRef is used by client libraries to implement logical relational joins.
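The 24-digit hexadecimal form of an Object ID is not opaque: its first 4 bytes (8 hex digits) encode the creation time as seconds since the Unix epoch. The sketch below decodes that timestamp in plain JavaScript without using the driver; objectIdTimestamp is a hypothetical helper name, and the sample ID is the one printed in the output of Example 6-21.

```javascript
// Decode the creation time stored in the first 4 bytes of an Object ID.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('Expected a 24-digit hexadecimal string');
  }
  // 8 hex digits = 4 bytes = seconds since the Unix epoch.
  var seconds = parseInt(hex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

var created = objectIdTimestamp('4e9cd8204276d9f91a000001');
console.log(created.toISOString());
```

Because the timestamp leads the ID, sorting by _id roughly orders records by creation time.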

Writing records

As mentioned, writing records to a MongoDB collection involves creating a JavaScript object inside Node and inserting it directly into Mongo. Example 6-21 demonstrates building a user object and saving it into MongoDB.

Example 6-21. Connecting to a MongoDB database and writing a record

var mongo = require('mongodb');
var host = "localhost";
var port = mongo.Connection.DEFAULT_PORT;
var db = new mongo.Db('node-mongo-examples', new mongo.Server(host, port, {}), {});

db.open(function(err,db) {
  db.collection('users', function(err,collection) {
    collection.insert({username:'Bilbo',firstname:'Shilbo'}, function(err, docs) {
      console.log(docs);
      db.close();
    });
  });
});

The output is:

[ { username: 'Bilbo',
    firstname: 'Shilbo',
    _id: 4e9cd8204276d9f91a000001 } ]

Mongoose

Node has a tremendous base of support for Mongo through its Mongoose library. Compared to the native drivers, Mongoose is an expressive environment that makes models and schemas more intuitive.

Installation

The fastest way to get up and running with Mongoose is by installing it with npm:

npm install mongoose

Alternatively, you can download the most recent version from source and compile it yourself using instructions from the Mongoose project’s home page at http://mongoosejs.com.

Defining schemas

When you use MongoDB, you don’t need to define a data schema as you would with a relational database. Whenever requirements change or you need to store a new piece of information, you just save a new record containing the information you need, and you can query against it immediately. You can transform old data to include default or empty values for the new field, but MongoDB does not require that step.

Even though MongoDB doesn’t require schemas, they are useful because they help humans understand the contents of the database and the implicit rules for working with domain data. Mongoose is useful because it works with human-readable schemas, providing a clean interface for communicating with the database.

What is a schema? Many programmers tend to think in terms of models that define data structures, but don’t think much about the underlying databases those models represent. A table inside an SQL database needs to be created before you can write data to it, and the fields inside that table probably closely match the fields in your model. The schema—that is, the definition of the model inside the database—is created separately from your program; therefore, the schema predates your data.

MongoDB—as well as the other NoSQL datastores—is often said to be schemaless because it doesn’t require explicitly defined structure for stored data. In reality, MongoDB does have a schema, but it is defined by the data as it gets stored. You may add a new property to your model months after you begin work on your application, but you don’t have to redefine the schema of previously entered information in order to search against the new field.
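
The idea of a schema defined by the data can be sketched without a database at all. In the plain JavaScript below, an array stands in for a collection; the field names and the `findWithField` helper are hypothetical. Documents saved before the `twitter` property existed sit alongside one saved afterward, and a query on the new field works without touching the older records.

```javascript
// In-memory stand-in for a collection; field names are hypothetical.
var users = [
  { username: 'sam', fullName: 'Sam Smith' },                // saved early on
  { username: 'alex', fullName: 'Alex Jones' },              // saved early on
  { username: 'pat', fullName: 'Pat Lee', twitter: '@pat' }  // new field added later
];

// Query against the new field; documents without it are simply skipped.
function findWithField(collection, field) {
  return collection.filter(function(doc) {
    return doc.hasOwnProperty(field);
  });
}

var withTwitter = findWithField(users, 'twitter');
console.log(withTwitter.length);      // prints 1
console.log(withTwitter[0].username); // prints "pat"
```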

Example 6-22 illustrates how to define a sample schema for an article database and what information should be stored in each type of model. Once again, Mongo does not enforce schemas, but programmers need to define consistent access patterns in their own programs.

Example 6-22. Defining schemas with Mongoose

var mongoose = require('mongoose');

var Schema   = mongoose.Schema,
    ObjectId = Schema.ObjectId;

var AuthorSchema = new Schema({
    name: {
        first   : String,
        last    : String,
        full    : String
    },
    contact: {
        email   : String,
        twitter : String,
        google  : String
    },
    photo       : String
});

var CommentSchema = new Schema({
    commenter   : String,
    body        : String,
    posted      : Date
});

var ArticleSchema = new Schema({
    author      : ObjectId,
    title       : String,
    contents    : String,
    published   : Date,
    comments    : [CommentSchema]
});

var Author = mongoose.model('Author', AuthorSchema);
var Article = mongoose.model('Article', ArticleSchema);

Manipulating collections

Mongoose allows direct manipulation of object collections, as illustrated in Example 6-23.

Example 6-23. Reading and writing records using Mongoose

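// Assumes mongoose and the Author model from Example 6-22 are in scope
mongoose.connect('mongodb://localhost:27017/upandrunning', function(err){
  if (err) {
    console.log('Could not connect to mongo');
  }
});

// Create a new author document from the Author model
var newAuthor = new Author({
  name: { first: 'Sam', last: 'Smith', full: 'Sam Smith' }
});

newAuthor.save(function(err) {
  if (err) {
    console.log('Could not save author');
  } else {
    console.log('Author saved');
  }
});

Author.find(function(err,docs){
  if (err) {
    console.log('Could not retrieve authors');
  } else {
    console.log(docs);
  }
});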

This example saves an author into the database and logs all authors to the screen.

Performance

When you work with Mongoose, you don’t need to wait for a connection to MongoDB before defining schemas or issuing queries, because Mongoose buffers them until the connection is established. This is a big deal, and an important way Mongoose serves Node’s methodology. Because the buffered commands are issued against Mongo together once the connection is live, your code spends less time waiting and needs fewer nested callbacks, which can greatly increase the number of operations your application is able to perform.
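
The buffering idea can be illustrated with a small sketch. This is not Mongoose’s implementation; `BufferedClient` and its methods are hypothetical names, and the "operations" only record their names in a log. The sketch shows the general technique: commands issued before the connection exists are queued, then flushed in order once it is ready.

```javascript
// Hypothetical BufferedClient: queues operations until connect() is called.
function BufferedClient() {
  this.connected = false;
  this.queue = [];
  this.log = [];
}

// Run the operation immediately if connected; otherwise hold it in the queue.
BufferedClient.prototype.run = function(name, callback) {
  var self = this;
  var operation = function() {
    self.log.push(name);
    if (callback) { callback(null, name); }
  };
  if (this.connected) {
    operation();
  } else {
    this.queue.push(operation);
  }
};

// Mark the client connected and flush buffered operations in order.
BufferedClient.prototype.connect = function() {
  this.connected = true;
  while (this.queue.length > 0) {
    this.queue.shift()();
  }
};

var client = new BufferedClient();
client.run('save author');      // buffered: no connection yet
client.run('find authors');     // buffered
console.log(client.log.length); // prints 0
client.connect();
console.log(client.log);        // prints [ 'save author', 'find authors' ]
```

Calling code never has to check whether the connection is ready, which is what keeps application code free of an extra layer of connection callbacks.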



[15] MVCC stands for multi-version concurrency control.
