Chapter 6. Time Series Data in Practical Machine Learning

With the increasing availability of large-scale data, machine learning is becoming a common tool that businesses use to unlock the potential value in their data. Several factors are making machine learning more accessible, including the development of new technologies and practical approaches.

Many machine-learning approaches can be applied to time series data. We’ve already alluded to some in this book and in Practical Machine Learning: A New Look at Anomaly Detection, an earlier short book published by O’Reilly. In that book, we talked about how to address basic questions in anomaly detection, especially how to determine what normal looks like and how to detect deviations from normal.

Keep in mind that with anomaly detection, the machine-learning model is trained offline to learn what normal is and to set an adaptive threshold for anomaly alerts. New data, such as sensor data, can then be assessed to determine how closely it matches what the model expects. The degree of mismatch with the model's expectations can be used to trigger an alert that signals apparent faults or discrepancies as they occur. Sensor data is a natural fit for collection and storage in a time series database. Sensors on equipment or system logs for servers can generate an enormous amount of time-based data, and with new technologies such as the Apache Hadoop–based NoSQL systems described in this book, it is now feasible ...
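The train-offline, score-online pattern described above can be illustrated with a minimal Python sketch. It is not the method from either book: the "model" here is assumed to be nothing more than the mean of known-normal readings, the mismatch score is the absolute deviation from that mean, and the alert threshold is an empirical quantile of the training scores. The function names `train` and `is_anomaly`, the `quantile` parameter, and the sample readings are all hypothetical.

```python
import statistics


def train(normal_readings, quantile=0.99):
    """Offline step: learn what normal looks like and pick an alert threshold.

    Assumption: 'normal' is summarized by the mean of the training data,
    and the threshold is an empirical quantile of the training mismatch.
    """
    baseline = statistics.fmean(normal_readings)
    # Mismatch score for each training point: distance from the baseline.
    scores = sorted(abs(x - baseline) for x in normal_readings)
    # Index of the requested quantile, clamped to the last element.
    idx = min(len(scores) - 1, int(quantile * len(scores)))
    return baseline, scores[idx]


def is_anomaly(reading, baseline, threshold):
    """Online step: flag a reading whose mismatch exceeds the threshold."""
    return abs(reading - baseline) > threshold


# Hypothetical temperature readings recorded during normal operation.
training = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7, 20.0, 20.4]
baseline, threshold = train(training)

# New readings arrive as (time index, value); collect the ones that alert.
incoming = [20.2, 19.9, 27.5, 20.1]
alerts = [(t, x) for t, x in enumerate(incoming)
          if is_anomaly(x, baseline, threshold)]
print(alerts)  # [(2, 27.5)]
```

Deriving the threshold from the training scores, instead of hard-coding it, means it follows whatever spread the normal data shows. A production system would typically use a richer model and re-estimate the threshold as new data arrives.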

