Document databases

Document databases, such as Apache CouchDB and MongoDB, employ a document data model to store semi-structured and unstructured data. In this model, a document is used to encapsulate all the information pertaining to an object, usually in JavaScript Object Notation (JSON) format, meaning that a single document is self-describing. Since they are self-describing, different documents may have different schema. For example a document describing a movie item, as illustrated in the following JSON file, would have a different schema from a document describing a book item:

[   {        "title" : "The Imitation Game",        "year": 2014        "metadata" : {            "directors" : [ "Morten Tyldum"],            "release_date" : "2014-11-14T00:00:00Z",            "rating" : 8.0, "genres" ...

Get Machine Learning with Apache Spark Quick Start Guide now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.