Introduction

An Anthropological Perspective

If you believe that communication advanced our evolution and position as a species, let us take a quick look at its progression from cave paintings, to scrolls, to the printing press, to the modern-day data storage industry.

Marked by the invention of disk drives in the 1950s, data storage advanced information sharing broadly. We could now record, copy, and share bits of information digitally. From there emerged superior CPUs, more powerful networks, the Internet, and a dizzying array of connected devices.

Today, every piece of digital technology is constantly sharing, processing, analyzing, discovering, and propagating an endless stream of zeros and ones. This web of devices tells us more about ourselves and each other than ever before.

Of course, to keep pace with these developments in information sharing, we need tools across the board: faster devices, faster networks, faster central processing, and software to help us discover and harness new opportunities.

Often, it will be fine to wait an hour, a day, or sometimes even a week for the information that enriches our digital lives. But increasingly, it is becoming imperative to operate in the now.

In late 2014, we saw emerging interest in and adoption of multiple in-memory, distributed architectures for building real-time data pipelines. In particular, the adoption of message queues like Kafka, transformation engines like Spark, and persistent databases like MemSQL opened up a new world of capabilities for fast-moving businesses to understand real-time data and adapt instantly.

This pattern led us to document the trend of real-time analytics in our first book, Building Real-Time Data Pipelines: Unifying Applications and Analytics with In-Memory Architectures (O’Reilly, 2015). There, we covered the emergence of in-memory architectures, the playbook for building real-time pipelines, and best practices for deployment.

Since then, the world’s fastest companies have pushed these architectures even further with machine learning and predictive analytics. In this book, we aim to share this next step of the real-time analytics journey.
