You are previewing CouchDB: The Definitive Guide.

CouchDB: The Definitive Guide

Cover of CouchDB: The Definitive Guide by J. Chris Anderson... Published by O'Reilly Media, Inc.
  1. CouchDB: The Definitive Guide
  2. Dedication
  3. SPECIAL OFFER: Upgrade this ebook with O’Reilly
  4. Foreword
  5. Preface
    1. Using Code Examples
    2. Conventions Used in This Book
    3. Safari® Books Online
    4. How to Contact Us
    5. Acknowledgments
      1. J. Chris
      2. Jan
      3. Noah
  6. I. Introduction
    1. 1. Why CouchDB?
      1. Relax
      2. A Different Way to Model Your Data
      3. A Better Fit for Common Applications
      4. Building Blocks for Larger Systems
      5. Local Data Is King
      6. Wrapping Up
    2. 2. Eventual Consistency
      1. Working with the Grain
      2. The CAP Theorem
      3. Local Consistency
      4. Distributed Consistency
      5. Wrapping Up
    3. 3. Getting Started
      1. All Systems Are Go!
      2. Welcome to Futon
      3. Your First Database and Document
      4. Running a Query Using MapReduce
      5. Triggering Replication
      6. Wrapping Up
    4. 4. The Core API
      1. Server
      2. Databases
      3. Documents
      4. Replication
      5. Wrapping Up
  7. II. Developing with CouchDB
    1. 5. Design Documents
      1. Document Modeling
      2. The Query Server
      3. Applications Are Documents
      4. A Basic Design Document
      5. Looking to the Future
    2. 6. Finding Your Data with Views
      1. What Is a View?
      2. Efficient Lookups
      3. The View to Get Comments for Posts
      4. Reduce/Rereduce
      5. Wrapping Up
    3. 7. Validation Functions
      1. Document Validation Functions
      2. Validation’s Context
      3. Writing One
      4. Wrapping Up
    4. 8. Show Functions
      1. The Show Function API
      2. Side Effect–Free
      3. Design Documents
      4. Querying Show Functions
      5. Etags
      6. Functions and Templates
      7. Learning Shows
      8. Using Templates
      9. Writing Templates
    5. 9. Transforming Views with List Functions
      1. Arguments to the List Function
      2. An Example List Function
      3. List Theory
      4. Querying Lists
      5. Lists, Etags, and Caching
  8. III. Example Application
    1. 10. Standalone Applications
      1. Use the Correct Version
      2. Portable JavaScript
      3. Applications Are Documents
      4. Standalone
      5. In the Wild
      6. Wrapping Up
    2. 11. Managing Design Documents
      1. Working with the Example Application
      2. Installing CouchApp
      3. Using CouchApp
      4. Download the Sofa Source Code
      5. Deploying Sofa
      6. Set Up Your Admin Account
      7. Configuring CouchApp with .couchapprc
    3. 12. Storing Documents
      1. JSON Document Format
      2. Beyond _id and _rev: Your Document Data
      3. The Edit Page
      4. Saving a Document
      5. Wrapping Up
    4. 13. Showing Documents in Custom Formats
      1. Rendering Documents with Show Functions
      2. Dynamic Dates
    5. 14. Viewing Lists of Blog Posts
      1. Map of Recent Blog Posts
      2. Rendering the View as HTML Using a List Function
  9. IV. Deploying CouchDB
    1. 15. Scaling Basics
      1. Scaling Read Requests
      2. Scaling Write Requests
      3. Scaling Data
      4. Basics First
    2. 16. Replication
      1. The Magic
      2. Simple Replication with the Admin Interface
      3. Replication in Detail
      4. Continuous Replication
      5. That’s It?
    3. 17. Conflict Management
      1. The Split Brain
      2. Conflict Resolution by Example
      3. Working with Conflicts
      4. Deterministic Revision IDs
      5. Wrapping Up
    4. 18. Load Balancing
      1. Having a Backup
    5. 19. Clustering
      1. Introducing CouchDB Lounge
      2. Consistent Hashing
      3. Growing the Cluster
  10. V. Reference
    1. 20. Change Notifications
      1. Polling for Changes
      2. Long Polling
      3. Continuous Changes
      4. Filters
      5. Wrapping Up
    2. 21. View Cookbook for SQL Jockeys
      1. Using Views
      2. Look Up by Key
      3. Look Up by Prefix
      4. Aggregate Functions
      5. Get Unique Values
      6. Enforcing Uniqueness
    3. 22. Security
      1. The Admin Party
      2. Basic Authentication
      3. Cookie Authentication
      4. Network Server Security
    4. 23. High Performance
      1. Good Benchmarks Are Non-Trivial
      2. High Performance CouchDB
      3. Bulk Inserts and Mostly Monotonic DocIDs
      4. Bulk Document Inserts
      5. Batch Mode
      6. Single Document Inserts
      7. Hovercraft
      8. Trade-Offs
    5. 24. Recipes
      1. Banking
      2. Ordering Lists
      3. Pagination
  11. VI. Appendixes
    1. A. Installing on Unix-like Systems
      1. Debian GNU/Linux
      2. Ubuntu
      3. Gentoo Linux
      4. Problems
    2. B. Installing on Mac OS X
      1. CouchDBX
      2. Homebrew
      3. MacPorts
    3. C. Installing on Windows
    4. D. Installing from Source
      1. Dependencies
      2. Installing
      3. Security Considerations
      4. Running Manually
      5. Running As a Daemon
      6. Troubleshooting
    5. E. JSON Primer
      1. Data Types
    6. F. The Power of B-trees
  12. Index
  13. About the Authors
  14. Colophon
  15. SPECIAL OFFER: Upgrade this ebook with O’Reilly
  16. Copyright

Chapter 5. Design Documents

Design documents are a special type of CouchDB document that contains application code. Because it runs inside a database, the application API is highly structured. We’ve seen JavaScript views and other functions in the previous chapters. In this section, we’ll take a look at the function APIs, and talk about how functions in a design document are related within applications.

This part (Part II, Chapters Chapter 5 through Chapter 9) lays the foundation for Part III, where we take what we’ve learned and build a small blog application to further develop an understanding of how CouchDB applications are built. The application is called Sofa, and on a few occasions we discuss it this part. If you are unclear on what we are referring to, do not worry, we’ll get to it in Part III.

Document Modeling

In our experience, there are two main kinds of documents. The first kind is like something a word processor would save, or a user profile. With that sort of data, you want to denormalize as much as you possibly can. Basically, you want to be able to load the document in one request and get something that makes sense enough to display.

A technique exists for creating “virtual” documents by using views to collate data together. You could use this to store each attribute of your user profiles in a different document, but I wouldn’t recommend it. Virtual documents are useful in cases where the presented view will be created by merging the work of different authors; for instance, the reference example, a blog post, and its comments in one query. A blog post titled “CouchDB Joins,” by Christopher Lenz, covers this in more detail.[2]

This virtual document idea takes us to the other kind of document—the event log. Use this in cases where you don’t trust user input or where you need to trigger an asynchronous job. This records the user action as an event, so only minimal validation needs to occur at save time. It’s when you load the document for further work that you’d check for complex relational-style constraints.

You can treat documents as state machines, with a combination of user input and background processing managing document state. You’d use a view by state to pull out the relevant document—changing its state would move it in the view.

This approach is also useful for logging—combined with the batch=ok performance hint, CouchDB should make a fine log store, and reduce views are ideal for finding things like average response time or highly active users.

The Query Server

CouchDB’s default query server (the software package that executes design document functions) is written in JavaScript, but there are views servers available for nearly any language you can imagine. Implementing a new language is a matter of handling a few JSON commands from a simple line-based program.

In this section, we’ll review existing functionality like MapReduce views, update validation functions, and show and list transforms. We’ll also briefly describe capabilities available on CouchDB’s roadmap, like replication filters, update handlers for parsing non-JSON input, and a rewrite handler for making application URLs more palatable. Since CouchDB is an open source project, we can’t really say when each planned feature will become available, but it’s our hope that everything described here is available by the time you read this. We’ll make it clear in the text when we’re talking about things that aren’t yet in the CouchDB trunk.

Applications Are Documents

CouchDB is designed to work best when there is a one-to-one correspondence between applications and design documents.

A design document is a CouchDB document with an id that begins with _design/. For instance, the example blog application, Sofa, is stored in a design document with the ID _design/sofa (see Figure 5-1). Design documents are just like any other CouchDB document—they replicate along with the other documents in their database and track edit conflicts with the rev parameter.

As we’ve seen, design documents are normal JSON documents, denoted by the fact that their DocID is prefixed with _design/.

CouchDB looks for views and other application functions here. The static HTML pages of our application are served as attachments to the design document. Views and validations, however, aren’t stored as attachments; rather, they are directly included in the design document’s JSON body.

Anatomy of our design document

Figure 5-1. Anatomy of our design document

CouchDB’s MapReduce queries are stored in the views field. This is how Futon displays and allows you to edit MapReduce queries. View indexes are stored on a per–design document basis, according to a fingerprint of the function’s text contents. This means that if you edit attachments, validations, or any other non-view (or language) fields on the design document, the views will not be regenerated. However, if you change a map or a reduce function, the view index will be deleted and a new index built for the new view functions.

CouchDB has the capability to render responses in formats other than raw JSON. The design doc fields show and list contain functions used to transform raw JSON into HTML, XML, or other Content-Types. This allows CouchDB to serve Atom feeds without any additional middleware. The show and list functions are a little like “actions” in traditional web frameworks—they run some code based on a request and render a response. However, they differ from actions in that they may not have side effects. This means that they are largely restricted to handling GET requests, but it also means they can be cached by HTTP proxies like Varnish.

Because application logic is contained in a single document, code upgrades can be accomplished with CouchDB replication. This also opens the possibility for a single database to host multiple applications. The interface a newspaper editor needs is vastly different from what a reader desires, although the data is largely the same. They can both be hosted by the same database, in different design documents.

A CouchDB database can contain many design documents. Example design DocIDs are:


In the full CouchDB URL structure, you’d be able to GET the design document JSON at URLs like:


We show this to note that design documents have a special case, as they are the only documents whose URLs can be used with a literal slash. We’ve done this because nobody likes to see %2F in their browser’s location bar. In all other cases, a slash in a DocID must be escaped when used in a URL. For instance, the DocID movies/jaws would appear in the URL like this:

We’ll build the first iteration of the example application without using show or list, because writing Ajax queries against the JSON API is a better way to teach CouchDB as a database. The APIs we explore in the first iteration are the same APIs you’d use to analyze log data, archive assets, or manage persistent queues.

In the second iteration, we’ll upgrade our example blog so that it can function with client-side JavaScript turned off. For now, sticking to Ajax queries gives more transparency into how CouchDB’s JSON/HTTP API works. JSON is a subset of JavaScript, so working with it in JavaScript keeps the impedance mismatch low, while the browser’s XMLHttpRequest (XHR) object handles the HTTP details for us.

CouchDB uses the validate_doc_update function to prevent invalid or unauthorized document updates from proceeding. We use it in the example application to ensure that blog posts can be authored only by logged-in users. CouchDB’s validation functions also can’t have any side effects, and they have the opportunity to block not only end user document saves, but also replicated documents from other nodes. We’ll talk about validation in depth in Part III.

The raw images, JavaScript, CSS, and HTML assets needed by Sofa are stored in the _attachments field, which is interesting in that by default it shows only the stubs, rather than the full content of the files. Attachments are available on all CouchDB documents, not just design documents, so asset management applications have as much flexibility as they could need. If a set of resources is required for your application to run, they should be attached to the design document. This means that a new user can easily bootstrap your application on an empty database.

The other fields in the design document shown in Figure 5-1 (and in the design documents we’ll be using) are used by CouchApp’s upload process (see Chapter 10 for more information on CouchApp). The signatures field allows us to avoid updating attachments that have not changed between the disk and the database. It does this by comparing file content hashes. The lib field is used to hold additional JavaScript code and JSON data to be inserted at deploy time into view, show, and validation functions. We’ll explain CouchApp in the next chapter.

A Basic Design Document

In the next section we’ll get into advanced techniques for working with design documents, but before we finish here, let’s look at a very basic design document. All we’ll do is define a single view, but it should be enough to show you how design documents fit into the larger system.

First, add the following text (or something like it) to a text file called mydesign.json using your editor:

  "_id" : "_design/example",
  "views" : {
    "foo" : {
      "map" : "function(doc){ emit(doc._id, doc._rev)}"

Now use curl to PUT the file to CouchDB (we’ll create a database first for good measure):

curl -X PUT
curl -X PUT -d @mydesign.json

From the second request, you should see a response like:


Now we can query the view we’ve defined, but before we do that, we should add a few documents to the database so we have something to view. Running the following command a few times will add empty documents:

curl -X POST -d '{}'

Now to query the view:


This should give you a list of all the documents in the database (except the design document). You’ve created and used your first design document!

Looking to the Future

There are other design document functions that are being introduced at the time of this writing, including _update and _filter that we aren’t covering in depth here. Filter functions are covered in Chapter 20. Imagine a web service that POSTs an XML blob at a URL of your choosing when particular events occur. PayPal’s instant payment notification is one of these. With an _update handler, you can POST these directly in CouchDB and it can parse the XML into a JSON document and save it. The same goes for CSV, multi-part form, or any other format.

The bigger picture we’re working on is like an app server, but different in one crucial regard: rather than let the developer do whatever he wants (loop a list of DocIDs and make queries, make queries based on the results of other queries, etc.), we’re defining “safe” transformations, such as view, show, list, and update. By safe, we mean that they have well-known performance characteristics and otherwise fit into CouchDB’s architecture in a streamlined way.

The goal here is to provide a way to build standalone apps that can also be easily indexed by search engines and used via screen readers. Hence, the push for plain old HTML. You can pretty much rely on JavaScript getting executed (except when you can’t). Having HTML resources means CouchDB is suitable for public-facing web apps.

On the horizon are a rewrite handler and a database event handler, as they seem to flesh out the application capabilities nicely. A rewrite handler would allow your application to present its own URL space, which would make integration into existing systems a bit easier. An event handler would allow you to run asynchronous processes when the database changes, so that, for instance, a document update can trigger a workflow, multi-document validation, or message queue.

The best content for your career. Discover unlimited learning on demand for around $1/day.