Appendix C

Data Compression Using ZIP

ZIP is a simple matching algorithm that uses two sliding windows, called the base window and the look-ahead window. The two windows are placed side by side on the data file, with the look-ahead window immediately ahead of the base window. ZIP scans the entire file by sliding the two windows forward and encoding data on the fly. In particular, ZIP finds the longest prefix $s$ of the string in the look-ahead window that also appears in the base window. If such a prefix is found, its occurrence in the look-ahead window is a copy of $s$ in the base window, so it can be uniquely identified by two attributes: (1) the distance between the location of the first character of $s$ in the base window and the location of the first character in the look-ahead window, and (2) the length of $s$. If the space needed to hold the values of these two attributes is smaller than the space needed to hold $s$ itself, we save space.
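The matching step described above can be sketched in Python. This is a minimal illustration, not the appendix's own implementation: the function name `longest_match`, its parameters, and the brute-force search are all assumptions made for the example. It also assumes the matched copy must lie entirely inside the base window, and it returns (0, 0) when no prefix matches.

```python
def longest_match(data, pos, base_size, lookahead_size):
    """Find the longest prefix of the look-ahead window that also
    appears in the base window.

    `pos` is the index where the look-ahead window starts; the base
    window is the `base_size` characters just before it. Returns
    (distance, length), or (0, 0) if no prefix matches.
    All names here are hypothetical, chosen for illustration.
    """
    base_start = max(0, pos - base_size)
    lookahead_end = min(len(data), pos + lookahead_size)
    best_distance, best_length = 0, 0

    # Try every starting position in the base window (brute force).
    for start in range(base_start, pos):
        length = 0
        # Extend the match while characters agree, the look-ahead
        # window is not exhausted, and the copy stays in the base window.
        while (pos + length < lookahead_end
               and start + length < pos
               and data[start + length] == data[pos + length]):
            length += 1
        if length > best_length:
            # Attribute (1): distance between the two first characters.
            # Attribute (2): length of the matched prefix.
            best_distance, best_length = pos - start, length

    return best_distance, best_length


if __name__ == "__main__":
    # Base window "abracad", look-ahead window "abra".
    print(longest_match("abracadabra", 7, 7, 4))
```

In this sketch the pair (distance, length) stands in for the matched prefix. Emitting the pair pays off only when it takes less space than the prefix itself, which depends on how the two values are encoded.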

To implement this idea, we will need to distinguish the binary values ...
