Live online training

Time series data

Architecture and use cases

Ted Malaska

The ongoing and steep increase in the number of internet-connected devices is inescapable, but traditional data processing pipelines are not well-equipped to deal with streaming data and other data whose defining dimension is time. This course will provide an overview of time series data. You will dive into real-world use cases and look at different patterns to get the most value from your datasets. This course is designed to help analysts, engineers, architects, and product managers get the most out of time series data.

What you'll learn and how you can apply it

By the end of this live, online course, you'll understand:

  • How to store time series data for different use cases
  • How to learn from time series data with Spark and Spark MLlib
  • How to set up time series data to be accessed in real time

And you'll be able to:

  • Gain insight from your time series data
  • Increase accessibility to your time series data

This training course is for you because...

  • You are an analyst looking for new ideas of what is possible with time series data.
  • You are a software engineer who wants to use big data toolkits to handle time series in a way that maximises value.
  • You are an architect or product manager who wants to discover new use cases to get real value from time series data.

Prerequisites

The following are required to get the most out of this class:

  • Use cases involving time series data
  • An understanding of basic time series data models

About your instructor

  • Ted works on the Battle.net team at Blizzard, helping support great titles like World of Warcraft, Overwatch, Hearthstone, and many more. Previously, he was a Principal Solutions Architect at Cloudera, helping clients succeed with Hadoop and the Hadoop ecosystem. He has also served as a Lead Architect at the Financial Industry Regulatory Authority (FINRA). He has contributed code to Apache Flume, Apache Avro, Apache YARN, Apache HDFS, Apache Spark, Apache Sqoop, and many more. Ted is also a coauthor of O’Reilly’s “Hadoop Application Architectures,” a frequent conference speaker, and a frequent blogger on data architectures.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

DAY ONE

  • Overview of time series (25 minutes)
  • Breakdown of common time series use cases (10 minutes)
  • Background on distributed execution (30 minutes)
  • Background on storage formats (30 minutes)
  • Summary of tricks from execution and storage (10 minutes)
  • The different types of implementation for each use case (15 minutes):
      • Batch
      • Streaming (near real time, NRT)
      • Time series database
      • Machine learning
  • Use case: Rolling averages, counts, and standard deviations (30 minutes)
  • Develop use cases for the class to work on during the break (25 minutes)

DAY TWO

  • Use case: Frequency (30 minutes)
  • Use case: Comparing curves (30 minutes)
  • Use case: Causation (40 minutes)
  • Use case: N-grams of events: finding patterns (20 minutes)
  • Use case: NRT alerting based on trends (30 minutes)
  • Review class use cases and code (20 minutes)
  • Review of execution and storage principles (10 minutes)