Big Data Analytics: Turning Big Data into Big Money

By Frank J. Ohlhorst. Published by John Wiley & Sons.
  1. Cover
  2. Contents
  3. Title
  4. Copyright
  5. Preface
  6. Acknowledgments
  7. Chapter 1: What is Big Data?
    1. The Arrival of Analytics
    2. Where is the Value?
    3. More to Big Data Than Meets the Eye
    4. Dealing with the Nuances of Big Data
    5. An Open Source Brings Forth Tools
    6. Caution: Obstacles Ahead
  8. Chapter 2: Why Big Data Matters
    1. Big Data Reaches Deep
    2. Obstacles Remain
    3. Data Continue to Evolve
    4. Data and Data Analysis are Getting More Complex
    5. The Future is Now
  9. Chapter 3: Big Data and the Business Case
    1. Realizing Value
    2. The Case for Big Data
    3. The Rise of Big Data Options
    4. Beyond Hadoop
    5. With Choice Come Decisions
  10. Chapter 4: Building the Big Data Team
    1. The Data Scientist
    2. The Team Challenge
    3. Different Teams, Different Goals
    4. Don’t Forget the Data
    5. Challenges Remain
    6. Teams versus Culture
    7. Gauging Success
  11. Chapter 5: Big Data Sources
    1. Hunting for Data
    2. Setting the Goal
    3. Big Data Sources Growing
    4. Diving Deeper into Big Data Sources
    5. A Wealth of Public Information
    6. Getting Started with Big Data Acquisition
    7. Ongoing Growth, No End in Sight
  12. Chapter 6: The Nuts and Bolts of Big Data
    1. The Storage Dilemma
    2. Building a Platform
    3. Bringing Structure to Unstructured Data
    4. Processing Power
    5. Choosing among In-house, Outsourced, or Hybrid Approaches
  13. Chapter 7: Security, Compliance, Auditing, and Protection
    1. Pragmatic Steps to Securing Big Data
    2. Classifying Data
    3. Protecting Big Data Analytics
    4. Big Data and Compliance
    5. The Intellectual Property Challenge
  14. Chapter 8: The Evolution of Big Data
    1. Big Data: The Modern Era
    2. Today, Tomorrow, and the Next Day
    3. Changing Algorithms
  15. Chapter 9: Best Practices for Big Data Analytics
    1. Start Small with Big Data
    2. Thinking Big
    3. Avoiding Worst Practices
    4. Baby Steps
    5. The Value of Anomalies
    6. Expediency versus Accuracy
    7. In-Memory Processing
  16. Chapter 10: Bringing it All Together
    1. The Path to Big Data
    2. The Realities of Thinking Big Data
    3. Hands-on Big Data
    4. The Big Data Pipeline in Depth
    5. Big Data Visualization
    6. Big Data Privacy
  17. Appendix: Supporting Data
    1. “The MapR Distribution for Apache Hadoop”
    2. “High Availability: No Single Points of Failure”
  18. About the Author
  19. Index

Chapter 1

What Is Big Data?

What exactly is Big Data? At first glance, the term seems rather vague, referring to something that is large and full of information. That description does indeed fit the bill, yet it says little about what Big Data really is.

Big Data is often described as data sets so large that they have grown beyond the ability of traditional data processing tools to manage and analyze them. Searching the Web for clues reveals an almost universal definition, shared by the majority of those promoting the ideology of Big Data, that can be condensed into something like this: Big Data describes a situation in which data sets have grown to such enormous sizes that conventional information technologies can no longer effectively handle either the size of the data set or its continued growth. In other words, the data set has grown so large that it is difficult to manage and even harder to extract value from. The primary difficulties are the acquisition, storage, searching, sharing, analysis, and visualization of data.

There is much more to be said about what Big Data actually is. The concept has evolved to include not only the size of the data set but also the processes involved in leveraging the data. Big Data has even become synonymous with other business concepts, such as business intelligence, analytics, and data mining.

Paradoxically, Big Data is not that new. Although massive data sets have been created in just the last two years, Big ...
