Chapter 3. Data Sources

To visualize streaming data, you first need to identify and connect to a data source. A few publicly available sources stream data continuously, such as the examples mentioned in Chapter 1, but these are unlikely to meet your goals or needs. There is a large gap between the data that could be streamed and the data that is actually being streamed for a purpose.

No one knows your data better than you do. This chapter will help you determine what data is worth streaming and how to access it. Many systems produce streaming data but do not make it easy to consume as a stream. After reading this chapter, you should be able to identify ways to tap into data that isn’t officially streaming yet.

Data Source Types

Any data can be streamed, but some types lend themselves better than others to downstream actions such as visualization. It’s a good idea to know what you are working with before diving in. The easiest data to work with as a stream is atomic and structured. Atomic means that each unit of data has a clear beginning and end. We will refer to atomic data as “messages.” Structured data is parsed into fields and values with a consistent schema. A schema defines the structure of the data: what fields exist, what data they contain, and whether there is any hierarchy or relationship among them. Structured data allows for the most ...
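
To make the terms "atomic" and "structured" concrete, here is a minimal sketch in Python. It assumes a stream in which each message is one JSON object on its own line, so each message has a clear beginning and end. The schema, the field names (timestamp, sensor_id, reading), and the parse_message helper are hypothetical and chosen only for illustration; they do not come from any particular system.

```python
import json

# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"timestamp": str, "sensor_id": str, "reading": float}


def parse_message(line):
    """Parse one atomic message (a single JSON object) and check it against SCHEMA."""
    record = json.loads(line)
    for field, expected_type in SCHEMA.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return record


# Two illustrative messages; the newline marks where each one ends.
stream = (
    '{"timestamp": "2024-01-01T00:00:00Z", "sensor_id": "a1", "reading": 21.5}\n'
    '{"timestamp": "2024-01-01T00:00:01Z", "sensor_id": "a1", "reading": 21.7}\n'
)

# Each non-empty line becomes one structured message with the same fields.
messages = [parse_message(line) for line in stream.splitlines() if line.strip()]
```

Because every message carries the same fields with the same types, downstream code can read a field such as reading without inspecting each message first. A message that does not match the assumed schema is rejected with a ValueError rather than passed along.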
