Chapter 10

Accessing Data with Datasets and Iterators

IN THIS CHAPTER

check Creating and manipulating datasets

check Loading and storing TFRecord data

check Exploring four types of iterators

When you start out in machine learning, your fondest wish is to have your application converge to a solution. But as you progress in the field, you become more and more concerned with performance. Performance is especially important when your training data occupies gigabytes or terabytes of memory.

This chapter and the following two chapters focus on ways to improve TensorFlow’s performance — no more lengthy equations or geometric diagrams. Instead, I focus on capabilities that you can use to accelerate your applications. Two important capabilities are datasets and iterators, which make it easier to load and process input data.

Datasets

One effective method of improving an application’s performance involves creating threads. Modern processors have multiple cores, and developers can take advantage of them by splitting an application’s workload into threads. This multithreading becomes particularly helpful when an application needs to load a great deal of data.

In the past, TensorFlow developers created threads ...

Get TensorFlow For Dummies now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.