Introduction

Overview and Organization of This Book

Dealing with streaming data involves many moving parts and draws on many different aspects of software development and engineering. On the one hand, streaming data requires a resilient infrastructure that can move data quickly and easily. On the other, the need to have processing “keep up” with data collection and scale to accommodate larger and larger data streams imposes some restrictions that favor the use of certain types of exotic data structures. Finally, once the data has been collected and processed, what do you do with it? There are several immediate applications that most organizations have, and more are being considered all the time.

This book tries to bring together all of these aspects of streaming data in a way that can serve as an introduction for a broad audience while still providing some use to more advanced readers. The hope is that the reader of this book will feel confident taking a proof-of-concept streaming data project in their organization from start to finish with the intent to release it into a production environment. Since that requires the implementation of both infrastructure and algorithms, this book is divided into two distinct parts.

Part I, “Streaming Analytics Architecture,” focuses on the architecture of the streaming data system itself and on the operational aspects of the system. If the data is streaming but is still processed in batch mode, it is no longer streaming data. It is ...

