O'Reilly logo

Writing and Querying MapReduce Views in CouchDB by Bradley Holt

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Grouping

CouchDB allows you to group by exact keys or by parts of keys. With exact grouping, your keys can be arbitrary JSON values. To group by parts of keys, your keys must be JSON arrays.

Exact Grouping

Let’s get a list of authors whose names begin with the letter “j” again, this time reduced:

curl -X GET http://localhost:5984/books/_design/default/_view/authors -G \
--data-urlencode startkey=\
'"j"' \
--data-urlencode endkey=\
'"j\ufff0"'

The response:

{
   "rows":[
      {
         "key":null,
         "value":{
            "sum":544,
            "count":2,
            "min":272,
            "max":272,
            "sumsqr":147968
         }
      }
   ]
}

See Table 4-10 for the row in tabular format.

Table 4-10. Reduced row from the authors view with no grouping, filtered by start and end keys

keyvalue
null{"sum":544,"count":2,"min":272,"max":272,"sumsqr":147968}

Now let’s see this query grouped by key:

curl -X GET http://localhost:5984/books/_design/default/_view/authors -G \
-d group=true \
--data-urlencode startkey=\
'"j"' \
--data-urlencode endkey=\
'"j\ufff0"'

The response:

{
   "rows":[
      {
         "key":"J. Chris Anderson",
         "value":{
            "sum":272,
            "count":1,
            "min":272,
            "max":272,
            "sumsqr":73984
         }
      },
      {
         "key":"Jan Lehnardt",
         "value":{
            "sum":272,
            "count":1,
            "min":272,
            "max":272,
            "sumsqr":73984
         }
      }
   ]
}

See Table 4-11 for the rows in tabular format.

Table 4-11. Reduced rows from the authors view, grouped and filtered by start and end keys

keyvalue
"J. Chris Anderson"{"sum":272,"count":1,"min":272,"max":272,"sumsqr":73984}
"Jan Lehnardt"{"sum":272,"count":1,"min":272,"max":272,"sumsqr":73984}

Group Levels

Let’s create a new view that allows us to see when books were released by year, month, and day. We will call this the releases view and add it to our _design/default design document. In this new view, we will map book documents to key/value pairs of release dates and number of pages. Instead of a string, the release date will be a JSON array containing the date’s year, month, and day. We will do this by using the split method of JavaScript’s String object on each book document’s released field. If we split a string representation of a date, e.g., 2007-05-08, by the "-" character, we end up with a JSON array, e.g., ["2007","05","08"]. Group levels require JSON arrays to group on, and they work best when the array elements are ordered from least specific to most specific. Our date example meets these criteria because it is a JSON array with the first element representing the year (least specific), the second element representing the month (more specific), and the third element representing the day (most specific).

For this example, let’s add a fourth book document that was published in January 2010, the same year and month as CouchDB: The Definitive Guide. Add the following document using Futon:

{
   "_id":"978-0-596-80579-1",
   "title":"Building iPhone Apps with HTML, CSS, and JavaScript",
   "subtitle":"Making App Store Apps Without Objective-C or Cocoa",
   "authors":[
      "Jonathan Stark"
   ],
   "publisher":"O'Reilly Media",
   "formats":[
      "Print",
      "Ebook",
      "Safari Books Online"
   ],
   "released":"2010-01-08",
   "pages":192
}

Or add the document using cURL:

curl -X PUT http://localhost:5984/books/978-0-596-80579-1 -d \
"{
   \"_id\":\"978-0-596-80579-1\",
   \"title\":\"Building iPhone Apps with HTML, CSS, and JavaScript\",
   \"subtitle\":\"Making App Store Apps Without Objective-C or Cocoa\",
   \"authors\":[
      \"Jonathan Stark\"
   ],
   \"publisher\":\"O'Reilly Media\",
   \"formats\":[
      \"Print\",
      \"Ebook\",
      \"Safari Books Online\"
   ],
   \"released\":\"2010-01-08\",
   \"pages\":192
}"

The response:

{"ok":true,"id":"978-0-596-80579-1","rev":"1-09ce09fef75068834da99957c7b14cf2"}

Now let’s create a new view. In Futon, navigate to the books database, select “Temporary view…” from the “View” drop-down menu, and paste the following JavaScript function into the “Map Function” text box, replacing the existing function:

function(doc) {
  if (doc.released) {
      emit(doc.released.split("-"), doc.pages);
  }
}

Enter the name of the built-in _stats Reduce function in the “Reduce Function” text box:

_stats

Next test your Map and Reduce functions by clicking “Run”. When you have verified that the output is as you’d expect, click the “Save As…” button. Enter default as the “Design Document” name, enter releases as the “View Name”, and then click the “Save” button. See Figure 4-1.

Alternatively, you can update the default design document to add the releases view using cURL:

curl -X PUT http://localhost:5984/books/_design/default -d \
'{
   "_id": "_design/default",
   "_rev": "3-9f27b209caab2d19cf07aac6d406568c",
   "language": "javascript",
   "views": {
       "titles": {
           "map":
"function(doc) {
   if (doc.title) {
       emit(doc.title, doc.pages);
   }
}",
           "reduce": "_stats"
       },
       "formats": {
           "map":
"function(doc) {
   if (doc.formats) {
       for (var i in doc.formats) {
           emit(doc.formats[i], doc.pages);
       }
   }
}",
           "reduce": "_stats"
       },
       "authors": {
           "map":
"function(doc) {
   if (doc.authors) {
       for (var i in doc.authors) {
           emit(doc.authors[i], doc.pages);
       }
   }
}",
           "reduce": "_stats"
       },
       "releases": {
           "map":
"function(doc) {
   if (doc.released) {
       emit(doc.released.split(\"-\"), doc.pages);
   }
}",
           "reduce": "_stats"
       }
   }
}'
Saving the releases view in the default design document using Futon

Figure 4-1. Saving the releases view in the default design document using Futon

The response:

{"ok":true,"id":"_design/default","rev":"4-7591c2802e00785281be2b4e408de52c"}

In Futon, select “releases” from the “View” drop-down menu under “default” (if you’re not already there), check the “Reduce” checkbox (if it’s not already checked), and select “exact” from the “Grouping” drop-down menu. See Figure 4-2.

Querying the releases view in the default design document using Futon with exact grouping

Figure 4-2. Querying the releases view in the default design document using Futon with exact grouping

To run this same query using cURL:

curl -X GET http://localhost:5984/books/_design/default/_view/releases -G \
-d group=true

The response:

{
   "rows":[
      {
         "key":[
            "1999",
            "10",
            "28"
         ],
         "value":{
            "sum":648,
            "count":1,
            "min":648,
            "max":648,
            "sumsqr":419904
         }
      },
      {
         "key":[
            "2007",
            "05",
            "08"
         ],
         "value":{
            "sum":448,
            "count":1,
            "min":448,
            "max":448,
            "sumsqr":200704
         }
      },
      {
         "key":[
            "2010",
            "01",
            "08"
         ],
         "value":{
            "sum":192,
            "count":1,
            "min":192,
            "max":192,
            "sumsqr":36864
         }
      },
      {
         "key":[
            "2010",
            "01",
            "19"
         ],
         "value":{
            "sum":272,
            "count":1,
            "min":272,
            "max":272,
            "sumsqr":73984
         }
      }
   ]
}

See Table 4-12 for the rows in tabular format.

Table 4-12. Reduced rows from the releases view with grouping

keyvalue
["1999","10","28"]{"sum":648,"count":1,"min":648,"max":648,"sumsqr":419904}
["2007","05","08"]{"sum":448,"count":1,"min":448,"max":448,"sumsqr":200704}
["2010","01","08"]{"sum":192,"count":1,"min":192,"max":192,"sumsqr":36864}
["2010","01","19"]{"sum":272,"count":1,"min":272,"max":272,"sumsqr":73984}

So far this isn’t a whole lot different than what we’ve seen before; we’re still grouping by the exact keys. Next, let’s group by just the year that books were released.

In Futon, select “releases” from the “View” drop-down menu under “default” (if you’re not already there), check the “Reduce” checkbox (if it’s not already checked), and select “level 1” from the “Grouping” drop-down menu. See Figure 4-3.

Querying the releases view in the default design document using Futon with level one grouping

Figure 4-3. Querying the releases view in the default design document using Futon with level one grouping

To run this same query using cURL:

curl -X GET http://localhost:5984/books/_design/default/_view/releases -G \
-d group=true \
-d group_level=1

The response:

{
   "rows":[
      {
         "key":[
            "1999"
         ],
         "value":{
            "sum":648,
            "count":1,
            "min":648,
            "max":648,
            "sumsqr":2500
         }
      },
      {
         "key":[
            "2007"
         ],
         "value":{
            "sum":448,
            "count":1,
            "min":448,
            "max":448,
            "sumsqr":200704
         }
      },
      {
         "key":[
            "2010"
         ],
         "value":{
            "sum":464,
            "count":2,
            "min":192,
            "max":272,
            "sumsqr":110848
         }
      }
   ]
}

See Table 4-13 for the rows in tabular format.

Table 4-13. Reduced rows from the releases view with level one grouping

keyvalue
["1999"]{"sum":648,"count":1,"min":648,"max":648,"sumsqr":419904}
["2007"]{"sum":448,"count":1,"min":448,"max":448,"sumsqr":200704}
["2010"]{"sum":464,"count":2,"min":192,"max":272,"sumsqr":110848}

Based on these results, we can see that in 1999 there was a total of 648 pages inside books released, 1 book was released, and the minimum and maximum pages of any book released was 648. In 2007, there was a total of 448 pages inside books released, 1 book was released, and the minimum and maximum number of pages in any book released was 448. In 2010, there was a total of 464 pages inside books released, 2 books were released, the minimum number of pages in any book released was 192, and the maximum number of pages in any book released was 272. Finally, let’s group by the year and month that books were released.

In Futon, select “releases” from the “View” drop-down menu under “default” (if you’re not already there), check the “Reduce” checkbox (if it’s not already checked), and select “level 2” from the “Grouping” drop-down menu (see Figure 4-4).

Querying the releases view in the default design document using Futon with level two grouping

Figure 4-4. Querying the releases view in the default design document using Futon with level two grouping

To run this same query using cURL:

curl -X GET http://localhost:5984/books/_design/default/_view/releases -G \
-d group=true \
-d group_level=2

The response:

{
   "rows":[
      {
         "key":[
            "1999",
            "10"
         ],
         "value":{
            "sum":648,
            "count":1,
            "min":648,
            "max":648,
            "sumsqr":419904
         }
      },
      {
         "key":[
            "2007",
            "05"
         ],
         "value":{
            "sum":448,
            "count":1,
            "min":448,
            "max":448,
            "sumsqr":200704
         }
      },
      {
         "key":[
            "2010",
            "01"
         ],
         "value":{
            "sum":464,
            "count":2,
            "min":192,
            "max":272,
            "sumsqr":110848
         }
      }
   ]
}

See Table 4-14 for the rows in tabular format.

Table 4-14. Reduced rows from the releases view with level two grouping

keyvalue
["1999","10"]{"sum":648,"count":1,"min":648,"max":648,"sumsqr":419904}
["2007","05"]{"sum":448,"count":1,"min":448,"max":448,"sumsqr":200704}
["2010","01"]{"sum":464,"count":2,"min":192,"max":272,"sumsqr":110848}

We could also group by year, month, and day (a level three grouping) but—for this particular example—this would effectively be the same as an exact grouping.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required