Raster File Formats

A raster image is one that is described by a grid of colored pixels. Even though particular image formats may use compression algorithms to store the image data differently, or alpha channels to create the illusion of non-rectangular shapes, all raster graphics translate into a rectangular grid of pixels.

Two techniques have been developed to make raster images easier to transfer over the Internet. Each image format has its own compression system, where advanced mathematics or data structures are used to exploit regularity in the image data so that the full image can be reconstructed from a smaller number of bytes. Many raster image files are arranged so that they may be displayed while partially downloaded (a technique called interlacing).

Table 1-2 gives you an idea of the relative effectiveness of each compression mechanism and the interlacing styles used by each format.

Table 1-2. Raster graphic compression and interlacing comparison

Format   Compression ratio   Compression algorithm   Method     Progressive display   Interlacing style
PNG      4:1 to 10:1         Deflate                 lossless   Yes                   Adam7
JPEG     5:1 to 100:1        JPEG                    lossy      Yes                   PJPEG
GIF      3:1 to 5:1          LZW                     lossless   Yes                   Scan line

The rest of this section describes compression and interlacing styles. The three raster image formats are then described in greater detail.

Compression

Image transmission is always a tradeoff between two limiting factors: the time it takes to transfer the image over the network and the time it takes to decode the image. JPEG, for example, is a highly compressed format that allows for small files and quick transmission times but requires longer to decode and display. The format works very well because generally the network is the bottleneck, with the average desktop computer perfectly able to perform the necessary decoding operations in a reasonable amount of time. The ideal is to achieve a very small file that can be very easily decoded. In practice, it is always a tradeoff.

Files can be compressed using “lossy” or “lossless” compression. People generally interpret the term “lossy” compression to mean that information is lost in the translation from source image to compressed image, and that this information loss results in a degraded image. This is true to a point. However, you could also argue that information is lost in the process of creating a GIF (a so-called “lossless” storage format) from a 24-bit source image, since the number of colors in the image must first be reduced from millions to 256. A more accurate definition of lossy would be something like “a compression algorithm that loses information about the source image during the compression process, and repeated inflation and compression results in further degradation of the image.”
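To make the color-reduction point concrete, here is a minimal Python sketch. It assumes a fixed 3-3-2 bit palette (3 bits of red, 3 of green, 2 of blue), which is only one simple way of getting down to 256 colors; real GIF encoders usually build a palette adapted to the image instead. The function names are hypothetical.

```python
def to_332_index(r, g, b):
    """Map a 24-bit RGB color to an 8-bit palette index (3 bits R, 3 bits G, 2 bits B)."""
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def from_332_index(index):
    """Return the 24-bit color that a palette index stands for (the dropped low bits become 0)."""
    r = (index >> 5) & 0b111
    g = (index >> 2) & 0b111
    b = index & 0b11
    return (r << 5, g << 5, b << 6)


original = (200, 120, 77)
index = to_332_index(*original)
restored = from_332_index(index)

# The restored color is close to the original but not identical:
# the low-order bits were discarded before any compression took place.
print(original, "->", index, "->", restored)
```

The loss here happens in the palette step, not in the LZW step that follows it, which is why GIF can still be called a lossless storage format.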

JPEG is an example of a lossy compression format. PNG and GIF are both examples of “lossless” compression. A lossless compression algorithm is one that does not discard information about the source image during the compression process. Inflation of the compressed data exactly restores the source image data.
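The lossless property is easy to check with Python's standard zlib module, which implements Deflate. The sketch below is only an illustration: the 64 × 64 grayscale gradient is a made-up stand-in for real image data, and the PNG format adds filtering and chunk structure on top of this that are not shown.

```python
import zlib

# A hypothetical 64 x 64 grayscale image: every row is the same horizontal gradient.
width, height = 64, 64
pixels = bytes(x * 4 for x in range(width)) * height

compressed = zlib.compress(pixels, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)  # "inflation" of the compressed data

print(len(pixels), "bytes ->", len(compressed), "bytes")
print("exact match:", restored == pixels)
```

Because the rows repeat, the regularity that the compressor exploits is obvious in this case; a photograph would compress far less.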

The distinction between these two methods of compressing image data affects the way you do your everyday work. For example, assume you have created a number of images for a web site, to be served as JPEGs (a lossy format) because they contain nice gradients that would look terrible as GIFs (and you haven’t explored the possibility of PNG yet). You create all these images in Photoshop (or even better, the Gimp) and save them as JPEGs, but neglect to save the original source files. What if your client wants the images cropped slightly differently? You have to re-open those JPEGs, edit the images, and re-save them. This would be a very painful way to learn the meaning of lossy compression, because the resulting images would be less smooth and the artifacts would make them less pleasing than the originals.

Generally a JPEG can be decoded and re-encoded, and, as long as the quality setting is the same, the image is not visibly degraded.[1] If you change any part of the image, however, the changed part loses even more information when it is re-encoded (see Figure 1-4). If an image is cropped or scaled to a different size, the entire image loses more information.

Figure 1-4. Repeated decoding and encoding of JPEGs can result in information loss

The GIF format uses a type of compression called LZW (Lempel-Ziv-Welch), which is also used by the Tagged Image File Format (TIFF). The enforcement of the patent on LZW has caused a lot of controversy, as we shall see in the later section GIF Animation for Fun and Profit.

The PNG file format was developed as an alternative to GIF. The compression algorithm used by PNG is a version of the Deflate algorithm used by the PKZIP utility. Deflate, in turn, pairs Huffman coding with a member of the LZ77 class of compression algorithms (yes, that’s the same L and Z as in LZW compression). PNG’s compression method does not use any algorithms with legal restrictions, however. This is one of its major selling points.

The JPEG file format uses a custom compression system called JPEG compression. It works on completely different principles from GIF and PNG, and is explained in the JPEG section later in this chapter.

Interlacing

All three of the standard web graphics formats provide for the progressive display of an image as it is downloaded. The rationale for complicating an image file to support progressive display is that the download is perceived to be much faster. Partial information about the entire image can be shown and the display refined as the image downloads, rather than the final image being displayed one row at a time.

This capability is achieved by saving the pixels in a non-consecutive order. If the pixels are drawn in the order that they are decoded from the stream, the image is drawn as a grid of pixels that is progressively filled in with more information. Images with this sort of pixel ordering are said to be interlaced. Interlacing is implemented differently by different file formats.

Interlaced files tend to be slightly larger than non-interlaced files (except for progressive JPEGs, which tend to be slightly smaller). This is because most compression schemes make certain assumptions about the relationships of adjacent pixels in an image, and the interlacing process can disrupt the “natural” ordering of pixels that works well with compression algorithms. Interlacing can more than make up for the slight difference in file size with a perceptual download speedup, however.

Scanline (GIF) interlacing

The image data for a GIF file is stored by the row (or scanline), with one byte representing each pixel. A non-interlaced GIF simply stores each scanline consecutively in the image data field of the GIF file. An interlaced GIF still groups pixels into scanlines, but the scanlines are stored in a different order. When the GIF file is encoded, the rows are read and saved in four passes; the even-numbered rows (using a 0-based counting system) are saved in the first three passes, and the odd-numbered rows are saved in the final pass. The interlacing pattern looks like this, with each row labeled with the pass on which it is saved and rendered:

Row 0 11111111...
Row 1 44444444...
Row 2 33333333...
Row 3 44444444...
Row 4 22222222...
Row 5 44444444...
Row 6 33333333...
Row 7 44444444...

When the image is later reconstituted, the display client (e.g., web browser) usually temporarily fills in the intervening rows of pixels with the values of the nearest previously decoded rows, as you can see by looking at the progressive stages in an interlaced GIF display shown in Figure 1-5. The interlacing approach taken by the GIF format allows us to view a 1/8 vertical resolution version of the entire image after one pass of the display, 1/4 after two passes, 1/2 after three, and the complete image after the fourth. In many cases the user can interpret the image after only the first or second pass.

Figure 1-5. Interlacing provides a perceptual increase in download speeds by presenting a distribution of pixels as the image comes across the network
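The four-pass row ordering described in this section can be sketched in a few lines of Python. The function names are hypothetical; the starting rows and steps are the ones implied by the row diagram (rows 0, 8, 16, ... first, then 4, 12, ..., then 2, 6, 10, ..., then all the odd rows).

```python
# (first row, row step) for each of the four GIF interlace passes
GIF_PASSES = [(0, 8), (4, 8), (2, 4), (1, 2)]


def gif_interlace_order(height):
    """Return the row numbers in the order an interlaced GIF stores them."""
    order = []
    for start, step in GIF_PASSES:
        order.extend(range(start, height, step))
    return order


def gif_pass_for_row(row):
    """Return the pass (1 to 4) on which a given row is stored."""
    for number, (start, step) in enumerate(GIF_PASSES, start=1):
        if row % step == start:
            return number
    raise ValueError("row must be a non-negative integer")


print(gif_interlace_order(16))
print([gif_pass_for_row(row) for row in range(8)])
```

A decoder that knows the image height can run the same loop to decide where each incoming scanline belongs on the screen.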

Adam7 (PNG) interlacing

PNG uses a slightly different interlacing scheme than GIF does. GIF completes the interlacing in four passes, where the first three passes cover the even scanlines. PNG uses a seven-pass scheme called Adam7 (named after its creator, Adam M. Costello), where the first six passes contribute to the even rows of pixels, and the seventh fills in the odd rows. Because a PNG pass does not have to store all the pixels of a scanline together, each pass contains only certain pixels from certain scanlines.

Graphically, this looks like the grid below, where each pixel in an 8 × 8 block is labeled with the pass on which it appears on the screen:

1 6 4 6 2 6 4 6
7 7 7 7 7 7 7 7
5 6 5 6 5 6 5 6
7 7 7 7 7 7 7 7
3 6 4 6 3 6 4 6
7 7 7 7 7 7 7 7
5 6 5 6 5 6 5 6
7 7 7 7 7 7 7 7

This scheme leads to a perceptual speed increase over the scanline interlacing used by GIF. After the first pass, only 1/64 of the image has been downloaded, but the entire image can be drawn with 8 × 8 pixel resolution blocks. After the second pass, 1/32 of the file has been transferred, and the image can be drawn at a 4 × 8 pixel block resolution. Small text in an image is readable after PNG’s 5th pass (25% of the file downloaded), which compares favorably with GIF’s interlacing gains, where small text is typically readable after the 3rd pass (50% of the file downloaded).
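The fractions quoted above can be checked by rebuilding the 8 × 8 pattern and counting pixels per pass. In this sketch the starting offsets and steps for each pass are taken from the Adam7 pattern as laid out in the grid; the function names are hypothetical.

```python
from fractions import Fraction

# (first column, first row, column step, row step) for passes 1 through 7
ADAM7 = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def adam7_grid(size=8):
    """Label every pixel of a size x size block with the pass that delivers it."""
    grid = [[0] * size for _ in range(size)]
    for number, (x0, y0, dx, dy) in enumerate(ADAM7, start=1):
        for y in range(y0, size, dy):
            for x in range(x0, size, dx):
                grid[y][x] = number
    return grid


def cumulative_fractions(grid):
    """Return the fraction of all pixels received after each of the seven passes."""
    total = sum(len(row) for row in grid)
    received = 0
    result = []
    for number in range(1, 8):
        received += sum(row.count(number) for row in grid)
        result.append(Fraction(received, total))
    return result


grid = adam7_grid()
for row in grid:
    print(" ".join(str(p) for p in row))
print([str(f) for f in cumulative_fractions(grid)])
```

The fifth entry of the cumulative list is 1/4, which is the 25% figure given for readable small text.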

Progressive JPEGs

JPEG files may also be formatted for progressive display support. Progressive JPEG (PJPEG) is considered an extension of the JPEG standard, and the progressive display of PJPEGs is not fully implemented by all web clients.

The pixel-reordering interlacing techniques used by GIF and PNG are not applicable to JPEG files because a JPEG is a more abstract way of storing an image than a simple stream of pixels (it is more accurate to call a JPEG file a collection of DCT coefficients that describe a pixel stream, but that’s probably too much information). Essentially, a progressive JPEG that is displayed as it is transferred over the network first shows the entire image as if it had been saved at a very low quality setting. On successive passes the image resolves into the complete image, at the quality level at which it was saved.
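For readers who do want a little more information, the idea of refining an image from DCT coefficients can be sketched in one dimension. This is a simplified illustration, not the JPEG algorithm itself: real JPEG works on 8 × 8 two-dimensional blocks with quantized coefficients, and the sample values and function names below are hypothetical. The sketch transforms eight samples, then reconstructs them from only the first few coefficients, as an early progressive scan might.

```python
import math


def dct(samples):
    """Orthonormal one-dimensional DCT-II of a list of samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(s * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, s in enumerate(samples))
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(total)
    return samples


def progressive_errors(samples):
    """RMS error of the reconstruction when only the first 1, 2, ... coefficients are kept."""
    coeffs = dct(samples)
    errors = []
    for kept in range(1, len(coeffs) + 1):
        partial = coeffs[:kept] + [0.0] * (len(coeffs) - kept)
        approx = idct(partial)
        mean_square = sum((a - s) ** 2 for a, s in zip(approx, samples)) / len(samples)
        errors.append(math.sqrt(mean_square))
    return errors


row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel values
for kept, error in enumerate(progressive_errors(row), start=1):
    print(kept, "coefficients kept, RMS error", round(error, 3))
```

With one coefficient the reconstruction is just the average brightness; each additional coefficient adds finer detail, and the error shrinks to zero once all eight are present.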

Progressive JPEGs are not yet the most efficient means of progressive display, as the entire image must be decoded with each subsequent pass. The JPEG format can offer such high levels of compression, however, that progressive display is not as important as for other file formats.

PNG: An Open Standard for Web Graphics

In the “GNU’s Not Unix” tradition of self-referential acronyms, PNG may unofficially be taken to stand for “PNG’s Not GIF.” PNG was designed as an open standard alternative to GIF, and it plays that role very well. However, PNG will not completely replace GIF because PNG can store only one image per file,[2] and there are millions of web pages out there that are full of GIF images.

Because of patent issues, most Perl modules abandoned support for GIF in the late ’90s and retrofitted their code to support the PNG standard. For example, GIFgraph became PNGgraph (and eventually GD::Graph, documented in Chapter 4), and the GD library (see Chapter 2) started using PNG, amongst other formats. ImageMagick (see Chapter 3) had always supported PNG and GIF, but stopped writing LZW-compressed GIFs by default. All of this helped to contribute to PNG’s popularity.

PNG is a well-written and flexible format. It supports indexed color images as well as 24-bit color images. It can also save a full alpha channel. It is best used as a GIF replacement, for images with text or spot colors, or for saving photographs without losing information. In general, however, JPEG allows a better balance between compression ratio and image quality for photographs. 24-bit photographic images in PNG are much larger.

JPEG: The P Stands for Photographic

JPEG stands for the Joint Photographic Experts Group (http://www.jpeg.org), the committee set up under the International Organization for Standardization (ISO) that originally wrote the image format standard. The JPEG committee has the responsibility of determining the future of the JPEG format, but the actual JPEG software that makes up the toolkit used in most web applications is maintained by the Independent JPEG Group (http://www.ijg.org).

The JPEG standard actually defines only an encoding scheme for data streams, and not a specific file format. JPEG encoding is used in many different file formats (TIFF Version 6.0 and Macintosh PICT are two prominent examples), but the file format used on the Web is called JFIF. JFIF stands for JPEG File Interchange Format, which was developed by C-Cube Microsystems (http://www.c-cube.com) and placed in the public domain. JFIF became the de facto standard for web JPEGs because of its simplicity. When people talk about a JPEG web graphic, they are actually referring to a JPEG encoded data stream stored in the JFIF file format. In this book we refer to JFIF as JPEG to reduce confusion (or to further propagate it, depending on your point of view).
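The difference between a JPEG data stream and a JFIF file shows up in the first few bytes of a file. The following sketch checks the leading bytes against the well-known signatures of the three web formats; the function name and the returned labels are hypothetical, and it inspects only the header, so it says nothing about whether the rest of the file is valid.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SOI = b"\xff\xd8"   # "start of image" marker that opens every JPEG stream
JPEG_APP0 = b"\xff\xe0"  # application segment used by JFIF


def sniff_format(header):
    """Guess the image format from the first bytes of a file."""
    if header.startswith(PNG_SIGNATURE):
        return "PNG"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "GIF"
    if header.startswith(JPEG_SOI):
        # A JFIF file follows the SOI marker with an APP0 segment:
        # 2 marker bytes, 2 length bytes, then the identifier "JFIF" and a zero byte.
        if header[2:4] == JPEG_APP0 and header[6:11] == b"JFIF\x00":
            return "JPEG (JFIF)"
        return "JPEG (other container)"
    return "unknown"


print(sniff_format(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01"))
print(sniff_format(b"GIF89a\x10\x00\x10\x00"))
```

To use it on a real file, read the first dozen or so bytes in binary mode and pass them in.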

To create a JPEG you should start with a high-quality image sampled with a large bit depth (from 16 to 24 bits) for the best results. You should generally use JPEG encoding only on scanned photographs or continuous-tone images.

JPEG encoding takes advantage of the fuzzy way the human eye interprets light and colors in images by getting rid of certain information that is not perceived to create a much smaller image that is perceptually faithful to the original. The degree of information loss may vary so that the size of an encoded file may be adjusted at the expense of image quality to gain the optimum compression for your application. The quality of the resulting image is expressed in terms of a Q factor that may be set when the image is encoded. Most applications use an arbitrary scale of 1 to 100, where the lower numbers indicate small, lower quality files and the higher numbers larger, higher quality files. Note that a Q value of 100 does not mean that the encoding is completely lossless (although you won’t lose much). Also, the 1 to 100 scale is by no means standardized (the Gimp, for example, uses a 0 to 1.0 scale), but this is the scale used by the IJG software.
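One way to see why a Q value of 100 is still not lossless is to look at how a quality setting can be turned into quantization step sizes. The sketch below is modeled on the scaling commonly described for the IJG library; treat the exact formula as an assumption rather than a specification. The function names, the base step of 16, and the sample coefficient are all hypothetical.

```python
def quality_to_scale(quality):
    """Convert a 1 to 100 quality setting into a percentage scale factor (assumed IJG-style)."""
    quality = max(1, min(100, quality))
    if quality < 50:
        return 5000 // quality
    return 200 - 2 * quality


def scale_step(base_step, quality):
    """Scale one entry of a base quantization table, keeping it between 1 and 255."""
    step = (base_step * quality_to_scale(quality) + 50) // 100
    return max(1, min(255, step))


def quantize(coefficient, step):
    """Round a DCT coefficient to the nearest multiple of the step size."""
    return round(coefficient / step) * step


coefficient = 37.4  # hypothetical DCT coefficient
for quality in (10, 50, 100):
    step = scale_step(16, quality)
    print("Q", quality, "step", step, "stored value", quantize(coefficient, step))
```

Even at the top of the scale the step size never drops below 1, so the coefficient is still rounded to a whole number and a small amount of information is discarded.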

GIF Animation for Fun and Profit

The GIF file format is widely used, usually out of habit rather than for any technical virtues. In fact, the primary reason for not using GIFs is legal, not technical.

“What’s the deal with GIF—do I have to pay licensing fees?” is one of the more frequently asked questions about the GIF file format. In a nutshell, GIF is not free and unencumbered because CompuServe, the creators of GIF, used the LZW codec (algorithm) to implement its data compression. The Unisys Corporation owns the patent for the LZW algorithm (United States Patent No. 4,558,302) and requires a licensing fee for any software that uses the LZW codec.

The GIF file format does not allow the storage of uncompressed data or data compressed by different algorithms, so if you use GIF, you must use LZW. There is some confusion as to exactly what uses are covered by the patent, but Unisys has taken the matter to court a number of times, so most developers have moved to the JPEG or PNG formats.

Unfortunately, neither PNG nor JPEG allows you to store multiple images in a single file, a capability that allows you to create efficient little animations. GIF is still good for this kind of animation, patents or no patents.

GIF animation can be thought of as flip-book animation. Just as a flip book is not an appropriate format for Fantasia, GIF animation is not an appropriate format for long works, large images, or sequences that require many frames per second to be effective. Also, any sort of presentation that necessitates interaction with the user requires another solution based in SWF, SVG, or a language like JavaScript or Java. GIF animation does have its strengths—for example, all browsers can display animated GIFs. Users don’t have to reconfigure their browsers or install a new plug-in as they might have to for SWF or SVG.

A full discussion of creating GIF animations with ImageMagick is found in Chapter 3.



[1] Actually, there is a form of lossless JPEG, but it has not been widely implemented.

[2] Multiple-image Network Graphics (MNG), a PNG variant capable of storing multiple images, has been in the works for years.
