Engineers can be a paranoid sort (but you didn’t hear that from me). At least I am. It comes from decades of seeing things go terribly wrong, I suppose. When I create a CD backup of my hard drive, for instance, there’s still something a bit too magical about the process to trust the CD writer program to do the right thing. Maybe I should, but it’s tough to have a lot of faith in tools that occasionally trash files and seem to crash my Windows machine every third Tuesday of the month. When push comes to shove, it’s nice to be able to verify that data copied to a backup CD is the same as the original—or at least to spot deviations from the original—as soon as possible. If a backup is ever needed, it will be really needed.
Because data CDs are accessible as simple directory trees in the file system, we are once again in the realm of tree walkers: to verify a backup CD, we simply need to walk its top-level directory. If our script is general enough, we will be able to use it to verify other copy operations as well (e.g., downloaded tar files, hard-drive backups, and so on). In fact, the combination of the cpall script of the prior section and a general tree comparison would provide a portable and scriptable way to copy and verify data sets.
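To make the idea of a general tree comparison concrete, here is a minimal sketch of one way it might be coded. It is only an illustration under stated assumptions, not the script this section goes on to develop: the names compare_trees and same_file are hypothetical, differences are simply collected as strings in a list, and files are compared byte for byte in fixed-size blocks.

```python
import os

def same_file(path1, path2, blocksize=1024 * 1024):
    """Return True if two regular files hold identical bytes."""
    with open(path1, 'rb') as file1, open(path2, 'rb') as file2:
        while True:
            bytes1 = file1.read(blocksize)
            bytes2 = file2.read(blocksize)
            if bytes1 != bytes2:
                return False          # contents (or lengths) diverge
            if not bytes1:
                return True           # both files hit end-of-file together

def compare_trees(dir1, dir2, diffs=None):
    """Walk dir1 and dir2 in parallel; return a list of difference notes.

    Hypothetical helper for illustration only.
    """
    if diffs is None:
        diffs = []
    names1 = set(os.listdir(dir1))
    names2 = set(os.listdir(dir2))

    # Names present on one side only are reported, not descended into
    for name in sorted(names1 - names2):
        diffs.append('only in %s: %s' % (dir1, name))
    for name in sorted(names2 - names1):
        diffs.append('only in %s: %s' % (dir2, name))

    # Names present on both sides are compared by kind
    for name in sorted(names1 & names2):
        path1 = os.path.join(dir1, name)
        path2 = os.path.join(dir2, name)
        if os.path.isdir(path1) and os.path.isdir(path2):
            compare_trees(path1, path2, diffs)        # recur into subdirectories
        elif os.path.isfile(path1) and os.path.isfile(path2):
            if not same_file(path1, path2):
                diffs.append('files differ: %s, %s' % (path1, path2))
        else:
            diffs.append('kinds differ: %s, %s' % (path1, path2))
    return diffs
```

Given an original directory and its copy (the paths are whatever applies on your machine), calling compare_trees(original, copy) returns an empty list when the sketch finds no deviations, and one note per unique name, differing file, or file-versus-directory mismatch otherwise.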
We’ve already studied generic directory tree walkers, but they won’t help us here directly: we need to walk two directories in parallel and inspect common files along the way. Moreover, walking either one of the two directories ...