Planning for Big Data

By Edd Dumbill. Published by O'Reilly Media, Inc.

Chapter 8. The NoSQL Movement

By Mike Loukides

In a conversation last year, Justin Sheehy, CTO of Basho, described NoSQL as a movement rather than a technology. This description immediately felt right; I’ve never been comfortable talking about NoSQL, which, when taken literally, extends from the minimalist Berkeley DB (commercialized by Sleepycat Software, now owned by Oracle) to the big-iron HBase, with detours into software as fundamentally different as Neo4j (a graph database) and FluidDB (which defies description).

But what does it mean to say that NoSQL is a movement rather than a technology? We certainly don’t see picketers outside Oracle’s headquarters. Justin said succinctly that NoSQL is a movement for choice in database architecture. There is no single overarching technical theme; a single technology would belie the principles of the movement.

Think of the last 15 years of software development. We’ve gotten very good at building large, database-backed applications. Many of them are web applications, but even more of them aren’t. “Software architect” is a valid job description; it’s a position to which many aspire. But what do software architects do? They specify the high-level design of applications: the front end, the APIs, the middleware, the business logic. And the back end? Well, maybe not.

Since the 1980s, the dominant back end of business systems has been a relational database, whether Oracle, SQL Server, or DB2. That’s not much of an architectural choice. Those are all great products, ...
