Preprocessing Atari screen image frames

The Atari Gym environment typically produces observations with a shape of 210x160x3, which represents an RGB (color) image with a height of 210 pixels, a width of 160 pixels, and three color channels. While the color image at the original resolution has more pixels and therefore more information, it turns out that better performance is often possible with reduced resolution. Lower resolution means less data for the agent to process at every step, which translates to faster training, especially on the consumer-grade computing hardware that you and I own.
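To make the data reduction concrete, here is a minimal sketch using NumPy. It is an illustration under stated assumptions, not the pipeline this chapter builds: the function name `preprocess_frame`, the grayscale conversion by channel averaging, the naive downsampling factor of 2, and the scaling to the range [0, 1] are all hypothetical choices, and a random array stands in for a real Atari observation.

```python
import numpy as np


def preprocess_frame(frame, factor=2):
    """Reduce an RGB frame of shape (H, W, 3) to a smaller grayscale frame.

    Hypothetical helper for illustration only; the operations and the
    default factor are assumptions, not taken from the original text.
    """
    # Average the three color channels into a single grayscale channel.
    gray = frame.astype(np.float32).mean(axis=2)
    # Naive downsampling: keep every `factor`-th pixel along each axis.
    small = gray[::factor, ::factor]
    # Scale pixel intensities from [0, 255] to [0, 1].
    return small / 255.0


# Stand-in for an Atari observation: height 210, width 160, 3 channels.
rng = np.random.default_rng(0)
obs = rng.integers(0, 256, size=(210, 160, 3), dtype=np.uint8)

processed = preprocess_frame(obs)

print(obs.shape, obs.size)              # (210, 160, 3) 100800
print(processed.shape, processed.size)  # (105, 80) 8400
```

In this sketch, the agent would handle 8,400 values per frame instead of 100,800, a twelvefold reduction. Slicing with a stride is the simplest way to downsample; an interpolating resize from an image library would usually preserve more detail.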

Let's create a preprocessing pipeline that takes the original observation image (of the Atari screen) and performs the following operations:
