Chapter 10. Switching Gears

Until now, this book has focused mainly on specific data compression algorithms and how they generally work. Even though it’s all highly informative, unless you’re trying to write your own breakthrough data compressor, it’s primarily useful as a foundation for understanding and compressing your data. So, we’d like to switch gears and talk about the pragmatic points of data compression, and how they relate to you, the projects you develop, and the world at hand.

There are two broad types of compression in use right now: media-specific and general-purpose. Let’s look at each of them.

Media-Specific Compression

Media-specific compressors are designed specifically for media data such as images, audio, video, and the like. Most likely, these types of files and compressors make up the majority of content your applications send, receive, manipulate, store, and display to users. The old saying, “A picture is worth a thousand words,” is quite literally true when it comes to data compression: an uncompressed 1024 × 1024 RGB image, at one byte per color channel, is 3 MB of data. If you assume ASCII-encoded letters at one byte each, you could store 3,145,728 letters in that same space. To put that into context, the famous book The Hobbit is made up of 95,022 words. If you assume an average word size of 5 letters, that’s roughly 475,110 characters. You could fit that book more than 6 times into a single 1024 × 1024 image.
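The arithmetic behind that comparison can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constant names are made up for illustration, and it assumes 8 bits per color channel with no alpha channel, one byte per ASCII letter, and no spaces or punctuation in the book's character count.

```python
# Back-of-the-envelope check of the image-versus-text comparison.
WIDTH = 1024
HEIGHT = 1024
BYTES_PER_PIXEL = 3  # assumption: one byte each for R, G, and B, no alpha

image_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL

HOBBIT_WORDS = 95_022   # word count quoted in the text
AVG_WORD_LETTERS = 5    # average word length assumed in the text
book_bytes = HOBBIT_WORDS * AVG_WORD_LETTERS  # one byte per ASCII letter

# How many copies of the book fit in the raw image's footprint.
copies = image_bytes / book_bytes

print(image_bytes)        # 3145728
print(book_bytes)         # 475110
print(round(copies, 2))   # 6.62
```

The ratio comes out a little above six and a half, which is where the "more than 6 times" figure comes from. Counting spaces and punctuation would raise the book's size and lower the ratio somewhat.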

This is why most media compressors employ lossy compression algorithms. Lossy compression algorithms are types ...
