
Professional NoSQL

By Shashank Tiwari. Published by Wrox.

Chapter 7

Modifying Data Stores and Managing Evolution

WHAT’S IN THIS CHAPTER?

  • Managing data schema in document databases, column-oriented stores, and key/value databases
  • Maintaining data stores as the attributes of a data set evolve
  • Importing and exporting data

Over time, data evolves and changes, sometimes drastically and at other times more slowly and in less radical ways. In addition, data often outlives a single application. Although it is usually designed and structured with a specific use case in mind, data often gets consumed in ways never originally anticipated.

The world of relational databases, however, doesn’t normally pay much heed to the evolution of data. It does provide ways to alter schema definitions and data types but presumes that, for the most part, the metadata remains static. It also assumes that most data sets share a uniform structure and favors getting the schema right up front. Relational databases focus on effective storage of structured and dense data sets where normalization of data records is important.

Although this chapter does not debate whether an RDBMS can adapt to change, it’s worth noting that modifying schemas and data types, and merging data from two versions of a schema, is generally complex in an RDBMS and involves many workarounds. For example, something as benign as adding a new column to an existing table that already holds data could pose serious issues, especially if the new column needs to have unique values. Workarounds ...
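
The unique-column problem mentioned above can be reproduced in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module, not an example taken from the book: the `users` table, the `email` column, the index name, and the backfill rule are all hypothetical, and other relational databases fail (or succeed) differently. SQLite rejects adding a UNIQUE column to an existing table outright, so the sketch falls back on one commonly used workaround, which is not necessarily the one the text goes on to describe: add a plain nullable column, backfill distinct values, and then enforce uniqueness with a separate index.

```python
import sqlite3


def add_unique_column_demo():
    """Try to add a unique column to a populated table, then work around it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table that already holds data.
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("ann",), ("bob",)])

    # Direct attempt: SQLite does not allow ALTER TABLE to add a UNIQUE column.
    try:
        conn.execute("ALTER TABLE users ADD COLUMN email TEXT UNIQUE")
        direct_ok = True
    except sqlite3.OperationalError:
        direct_ok = False

    # Workaround (an assumption for illustration):
    # 1. add the column without the constraint,
    conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
    # 2. backfill every existing row with a distinct value,
    conn.execute("UPDATE users SET email = name || '@example.com'")
    # 3. enforce uniqueness afterwards with a unique index.
    conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")

    # Confirm the constraint is now enforced for new rows.
    try:
        conn.execute(
            "INSERT INTO users (name, email) VALUES ('ann2', 'ann@example.com')"
        )
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
    conn.close()
    return direct_ok, duplicate_rejected, rows


if __name__ == "__main__":
    print(add_unique_column_demo())
```

Even in this toy case, the change takes three separate statements plus a backfill rule that someone has to invent, which hints at why such migrations become awkward on large, live tables.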
