Writing and Querying MapReduce Views in CouchDB by Bradley Holt

Chapter 4. Querying Views

In Chapter 3 we saw how to save views to a design document. Now that you have created views, you can query the data that is held in them. Here are several of the things you can do when querying views in CouchDB:

  • If a Reduce function is defined for your view, you can specify whether to reduce the results.

  • You can query for all rows within a view, a single contiguous range of rows within a view, or even a row or rows matching a specified key within a view. You cannot query for multiple ranges.

  • You can limit results to a specified number of rows, and you can specify a number of rows to be skipped.

  • You can return results in ascending or descending order.

  • You can group rows by keys or by parts of keys.

  • You can ask CouchDB to include the original document with each row from which that row was emitted.

  • You can tell CouchDB that you’re OK with stale results. This means that CouchDB may not refresh any of the view’s data, potentially giving you outdated results. This option can be used to improve performance.

When querying views from Futon, you can choose whether to run the Reduce step. If the Reduce step is run, you can choose whether you want grouping, and whether the grouping should be based on the “exact” key or only part of the key. Results are paginated, so Futon effectively lets you limit and skip rows in your results. Futon lets you reverse the order of results. Futon also lets you tell CouchDB that you’re OK with stale results.

The current version of Futon doesn’t let you specify a range for your query, nor does it allow you to ask CouchDB to include the original document in your results, although Futon does provide a hyperlink to a representation of that document. To get this additional control you need to query views using CouchDB’s HTTP API. You can do this using cURL, so most of the examples in this chapter will only be provided in cURL. Your view query options are controlled by query parameters added to your view’s URL. See Table 4-1 for a list of available query parameters. All parameters are optional.

Table 4-1. View query options

| Parameter | Description |
| --- | --- |
| reduce | If a Reduce function is defined, this parameter lets you choose whether or not to run the Reduce step. The default value is true. |
| startkey | A URL-encoded JSON value indicating the key at which to start the range. |
| startkey_docid | The ID of the document with which to start the range. This parameter is, for all intents and purposes, ignored if it is not used in conjunction with the startkey parameter. CouchDB will first look at the startkey parameter and then use the startkey_docid parameter to further refine the beginning of the range if multiple potential starting rows have the same key but different document IDs. |
| endkey | A URL-encoded JSON value indicating the key at which to end the range. |
| endkey_docid | The ID of the document with which to end the range. This parameter is, for all intents and purposes, ignored if it is not used in conjunction with the endkey parameter. CouchDB will first look at the endkey parameter and then use the endkey_docid parameter to further refine the end of the range if multiple potential ending rows have the same key but different document IDs. |
| inclusive_end | Indicates whether to include the endkey and endkey_docid (if set) in the range. The default value is true. |
| key | A URL-encoded JSON value indicating an exact key on which to match. |
| limit | The maximum number of rows to include in the output. |
| skip | The number of rows to skip. The default value is 0. |
| descending | Indicates whether to reverse the output to be in descending order. The default value is false. This option is applied before rows are filtered, so you will likely need to swap your startkey/startkey_docid and endkey/endkey_docid parameter values. |
| group | Indicates whether to group results by keys. This parameter can only be true if a Reduce function is defined for your view and the reduce parameter is set to true (its default). The default value of this parameter is false. |
| group_level | If your keys are JSON arrays, this parameter specifies how many elements in those arrays to use for grouping purposes. If your keys are not JSON arrays, this parameter's value will effectively be ignored. |
| include_docs | Indicates whether to fetch the original document from which the row was emitted. This parameter can only be true if a Reduce function is not defined for your view, or the reduce parameter is set to false. The default value of this parameter is false. |
| stale | Set the value of this parameter to ok if you are OK with possibly getting outdated results. Set the value to update_after if you are OK with possibly getting outdated results, but would like to trigger a view update after your results have been retrieved. |
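
The parameters in Table 4-1 fall into two groups: startkey, endkey, and key carry JSON values, while the rest are plain strings, numbers, or booleans. The following is a minimal Python sketch of how a client might assemble the query string under that assumption. The helper name build_view_query is hypothetical and not part of CouchDB or any library; only the standard library is used, and no request is actually sent.

```python
import json
from urllib.parse import urlencode

# Parameters whose values are JSON, according to Table 4-1.
JSON_PARAMS = {"startkey", "endkey", "key"}


def build_view_query(**options):
    """Return a URL-encoded query string for a view request.

    Hypothetical helper for illustration only.
    """
    pairs = []
    for name, value in options.items():
        if name in JSON_PARAMS:
            # Keys must be sent as JSON, so a string keeps its double quotes.
            encoded = json.dumps(value)
        elif isinstance(value, bool):
            # Booleans are written as lowercase true/false.
            encoded = "true" if value else "false"
        else:
            encoded = str(value)
        pairs.append((name, encoded))
    return urlencode(pairs)


print(build_view_query(reduce=False, key="Print"))
print(build_view_query(reduce=False, limit=10, skip=20, descending=True))
```

The resulting string would be appended after a question mark to the view's URL, which is what cURL's -G switch does with the values passed to -d and --data-urlencode.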

Range Queries

A range query allows you to control resultant rows by starting and/or ending key and, optionally, document ID. If no startkey or endkey is defined, all rows from the view will be included in your results. You can also specify an exact key on which to match.

Rows by Start and End Keys

Let’s get a list of authors whose names begin with the letter “j”:

Warning

The --data-urlencode switch was added to cURL in version 7.18.0. You can find out which version of cURL you are using by running curl -V. If your version is older than 7.18.0, you will need to replace the --data-urlencode switch with the -d switch and manually URL encode the data on the right side of the equals sign (Eric Meyer’s online URL Decoder/Encoder can help with this). To apply this to the example that follows, you could replace --data-urlencode startkey='"j"' with -d startkey='%22j%22', and replace --data-urlencode endkey='"j\ufff0"' with -d endkey='%22j%5Cufff0%22'.

curl -X GET http://localhost:5984/books/_design/default/_view/authors -G \
-d reduce=false \
--data-urlencode startkey='"j"' \
--data-urlencode endkey='"j\ufff0"'

Note

As you may remember, string comparison in CouchDB is implemented according to the Unicode Collation Algorithm. This has a couple of practical implications when you are searching for a range of strings. String comparison is case sensitive, and the lowercase version of a letter sorts before the uppercase version. This is why we used a lowercase “j” instead of an uppercase “J” as the startkey in the previous example. We could have used “jz” as the endkey, but that would exclude any key beginning with “j” that sorts after “jz”. The \ufff0 escape represents a high-value Unicode character, so using “j\ufff0” as the endkey ensures that we account for non-Latin characters.

The response:

{
   "total_rows":7,
   "offset":0,
   "rows":[
      {
         "id":"978-0-596-15589-6",
         "key":"J. Chris Anderson",
         "value":272
      },
      {
         "id":"978-0-596-15589-6",
         "key":"Jan Lehnardt",
         "value":272
      }
   ]
}

See Table 4-2 for the rows in tabular format.

Table 4-2. Rows from the authors view, filtered by start and end keys

| key | id | value |
| --- | --- | --- |
| "J. Chris Anderson" | "978-0-596-15589-6" | 272 |
| "Jan Lehnardt" | "978-0-596-15589-6" | 272 |

You can optionally use cURL’s -v switch to see the details of the request:

curl -v -X GET http://localhost:5984/books/_design/default/_view/authors -G \
-d reduce=false \
--data-urlencode startkey='"j"' \
--data-urlencode endkey='"j\ufff0"'

This will let you see that cURL is URL encoding the JSON values for you:

?reduce=false&startkey=%22j%22&endkey=%22j%5Cufff0%22
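
The prefix range and the encoding step used in this section can also be reproduced outside of cURL. The sketch below, in Python with only the standard library, builds the start and end keys for a prefix and encodes them the way the query string above is encoded. The function names prefix_range and prefix_query are hypothetical, chosen for illustration; nothing is sent to a server.

```python
import json
from urllib.parse import urlencode


def prefix_range(prefix):
    """Return (startkey, endkey) covering keys that begin with prefix.

    The end key appends U+FFF0, the high-value character used in this section.
    """
    return prefix, prefix + "\ufff0"


def prefix_query(prefix):
    """Return the URL-encoded query string for an unreduced prefix query."""
    startkey, endkey = prefix_range(prefix)
    return urlencode([
        ("reduce", "false"),
        # json.dumps escapes non-ASCII characters, producing "j\ufff0"
        # with a literal backslash, just like the cURL example.
        ("startkey", json.dumps(startkey)),
        ("endkey", json.dumps(endkey)),
    ])


print(prefix_query("j"))
```

Running this prints the same query string that cURL's verbose output shows, which is also the form you would pass by hand to the -d switch on older versions of cURL.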

Rows by Key

Let’s get a book format by key:

curl -X GET http://localhost:5984/books/_design/default/_view/formats -G \
-d reduce=false \
--data-urlencode key='"Print"'

The response:

{
   "total_rows":7,
   "offset":2,
   "rows":[
      {
         "id":"978-0-596-15589-6",
         "key":"Print",
         "value":272
      },
      {
         "id":"978-0-596-52926-0",
         "key":"Print",
         "value":448
      },
      {
         "id":"978-1-565-92580-9",
         "key":"Print",
         "value":648
      }
   ]
}

See Table 4-3 for the rows in tabular format.

Table 4-3. Rows from the formats view, filtered by key

| key | id | value |
| --- | --- | --- |
| "Print" | "978-0-596-15589-6" | 272 |
| "Print" | "978-0-596-52926-0" | 448 |
| "Print" | "978-1-565-92580-9" | 648 |

Let’s see the same query reduced:

curl -X GET http://localhost:5984/books/_design/default/_view/formats -G \
--data-urlencode key='"Print"'

The response:

{
   "rows":[
      {
         "key":null,
         "value":{
            "sum":1368,
            "count":3,
            "min":272,
            "max":648,
            "sumsqr":694592
         }
      }
   ]
}

See Table 4-4 for the row in tabular format.

Table 4-4. Reduced row from the formats view with no grouping, filtered by key

| key | value |
| --- | --- |
| null | {"sum":1368,"count":3,"min":272,"max":648,"sumsqr":694592} |
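
The reduced value has the shape of summary statistics over the values of the three unreduced "Print" rows (272, 448, and 648). Assuming that is what the view's Reduce function computes, the numbers can be checked by hand. The Python sketch below is only an illustration of the arithmetic; the function name stats is hypothetical and this is not CouchDB's implementation.

```python
def stats(values):
    """Compute sum, count, min, max, and sum of squares for a list of numbers."""
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sumsqr": sum(v * v for v in values),
    }


# Values of the three unreduced "Print" rows from the formats view.
print(stats([272, 448, 648]))
```

The key is null in the reduced response because no grouping was requested, so all matching rows collapse into a single row.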

Rows by Start and End Keys and Document IDs

Let’s get a list of book formats within a range of keys and document IDs. If you remember, the document IDs are the books’ ISBNs:

curl -X GET http://localhost:5984/books/_design/default/_view/formats -G \
-d reduce=false \
--data-urlencode startkey='"Ebook"' \
--data-urlencode startkey_docid='978-0-596-52926-0' \
--data-urlencode endkey='"Print"' \
--data-urlencode endkey_docid='978-0-596-52926-0'

The response:

{
   "total_rows":7,
   "offset":1,
   "rows":[
      {
         "id":"978-0-596-52926-0",
         "key":"Ebook",
         "value":448
      },
      {
         "id":"978-0-596-15589-6",
         "key":"Print",
         "value":272
      },
      {
         "id":"978-0-596-52926-0",
         "key":"Print",
         "value":448
      }
   ]
}

See Table 4-5 for the rows in tabular format.

Table 4-5. Rows from the formats view, filtered by start and end keys and document IDs

| key | id | value |
| --- | --- | --- |
| "Ebook" | "978-0-596-52926-0" | 448 |
| "Print" | "978-0-596-15589-6" | 272 |
| "Print" | "978-0-596-52926-0" | 448 |

Note

The actual key in CouchDB’s B-tree index is not just the key emitted from your Map function, but a combination of the key and the document’s ID. You may have multiple rows with the same key in a view, as is the case with the book formats view. The startkey_docid and endkey_docid parameters allow you to be more specific about the starting and ending rows of your range. Think of the startkey_docid parameter as a way to add specificity to the startkey parameter, and the endkey_docid parameter as a way to add specificity to the endkey parameter.
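
One way to picture this is to treat each row as a (key, document ID) pair and select everything between two such pairs. The Python sketch below does that in memory. It is a simplification under stated assumptions: Python's built-in tuple and string ordering stands in for CouchDB's collation, the function name select_range is hypothetical, and the first row in the sample data is assumed (it does not appear in the responses above, though an offset of 1 suggests a row precedes the range).

```python
# Sample rows as (key, document ID, value).
ROWS = [
    ("Ebook", "978-0-596-15589-6", 272),  # assumed row, not shown in the responses
    ("Ebook", "978-0-596-52926-0", 448),
    ("Print", "978-0-596-15589-6", 272),
    ("Print", "978-0-596-52926-0", 448),
    ("Print", "978-1-565-92580-9", 648),
]


def select_range(rows, startkey, startkey_docid, endkey, endkey_docid):
    """Return rows whose (key, id) pair lies within the inclusive range."""
    start = (startkey, startkey_docid)
    end = (endkey, endkey_docid)
    # Sorting mimics the ordered index; comparison is on the (key, id) pair.
    return [row for row in sorted(rows) if start <= (row[0], row[1]) <= end]


for row in select_range(ROWS, "Ebook", "978-0-596-52926-0",
                        "Print", "978-0-596-52926-0"):
    print(row)
```

With the sample data, the selection matches the three rows in Table 4-5: the document IDs trim the first "Ebook" row from the start of the range and the last "Print" row from the end.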
