Step 2: Back Up Everything

This sounds like a given, right? It’s not. Many companies cut corners by omitting certain types of data from their backups, and other types are simply forgotten. For example, by excluding the operating system from your backups, you may save a little media. However, if you find yourself in need of the old /etc/fstab, you will be out of luck. You may save some money, but you also may be putting your company at risk. It’s easier and safer just to back up everything. The most common oversight is to back up the data on a system but not to capture a “picture” of what the system itself looks like in case you have to rebuild it.

Exclude Lists Good, Include Lists Bad

It is best to have a system that automatically backs up everything, except for a few explicit exceptions specified on an exclude list. If your backup system requires you to update an include list every time a new filesystem is added, you may forget or you may add it incorrectly; the result is that the filesystem does not get backed up. In a disaster, this means the data never comes back. This is why I prefer backup products that automatically back up all filesystems. (The concept of include and exclude lists is covered in Chapter 2.)
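The difference is easy to see in a few lines of code. The following is only a minimal sketch, not how any particular backup product works; the function names and the sample filesystem lists are hypothetical. It compares what each approach selects after a new filesystem, /data, is added and nobody updates the lists.

```python
def select_with_exclude_list(mounted, exclude):
    """Back up every mounted filesystem except the explicit exceptions."""
    excluded = set(exclude)
    return [fs for fs in mounted if fs not in excluded]


def select_with_include_list(mounted, include):
    """Back up only the filesystems someone remembered to list."""
    included = set(include)
    return [fs for fs in mounted if fs in included]


# Hypothetical system: /data was added after the lists were written.
mounted = ["/", "/usr", "/var", "/tmp", "/data"]
exclude_list = ["/tmp"]
include_list = ["/", "/usr", "/var"]

print(select_with_exclude_list(mounted, exclude_list))
# ['/', '/usr', '/var', '/data']
print(select_with_include_list(mounted, include_list))
# ['/', '/usr', '/var']
```

With the exclude list, the new filesystem is picked up with no action from anyone. With the include list, it is silently skipped until someone notices.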

Databases

Backing up a database requires more work than backing up a normal filesystem. (Actual database backup procedures are covered in Part V of this book.) Theoretically, if you are backing up everything in your filesystems and you are backing up your databases in some manner, you should be able to recover from disaster. Unfortunately, there are scenarios in which you might leave out an essential piece of the disaster recovery puzzle. The only way to ensure that you are prepared to recover your databases in case of a disaster is to test restoring them to another machine.

In fact, a previous version of my Oracle backup script (see Chapter 15) did not back up the online redologs during a hot backup. All my backup and recovery tests worked fine, until I attempted to restore the database to a different system. We were able to restore all the database files, but the database needed the redologs in order to complete the recovery. Since we had not backed up the redologs, we did not have them to restore. You see, when I was recovering the database to the same system, the redologs were always there. (Of course, I immediately changed the script to address this problem.)
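A cheap cross-check can catch this kind of omission earlier: compare the list of files the database needs with the list of files the backup actually contains. The sketch below is an illustration only and is not the Oracle script mentioned above. The function name and every path are hypothetical, and in practice you would have to obtain both lists from your database and your backup tool. It does not replace a test restore to a different system.

```python
def missing_from_backup(required_files, backup_manifest):
    """Return the required files that do not appear in the backup manifest."""
    backed_up = set(backup_manifest)
    return sorted(f for f in required_files if f not in backed_up)


# Hypothetical list of everything the database needs in order to open.
required = [
    "/u01/oradata/system01.dbf",
    "/u01/oradata/users01.dbf",
    "/u01/oradata/control01.ctl",
    "/u02/oradata/redo01.log",
    "/u02/oradata/redo02.log",
]

# Hypothetical manifest from a backup that skipped the online redo logs.
manifest = [
    "/u01/oradata/system01.dbf",
    "/u01/oradata/users01.dbf",
    "/u01/oradata/control01.ctl",
]

for path in missing_from_backup(required, manifest):
    print("NOT BACKED UP:", path)
```

Run against these sample lists, the check reports the two redo log files, which is exactly the gap that stayed hidden as long as recoveries were tested on the original system.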

Backups of Your Backups

Whether you are using a homegrown solution that creates flat file indexes of your volumes or a commercial backup product that has a btree index, you need to be able to recover that index easily. Think about it. Even if your commercial backup system makes volumes that can be read by native backup utilities, without the database that identifies what’s where, you have no idea what system is on what volume. That means that this database has now become the most important database in your company. You need to make sure that it is backed up, and its recovery should be the easiest and most tested recovery in your entire environment. Again, you need to test your recoveries on a different system.

One problem here is that many of the licenses for commercial backup products are node-locked. This means that you may have problems recovering the backups of one system to another system. Sometimes you can prepare for this in advance with a backup key, although that can really cost you. Some products enable recovery but disable backup to a server that is not licensed. This allows you to begin your disaster recovery on a new server, even if the product is not licensed for that particular server.

Another difficulty with a number of commercial products is that the backup of the database does not include any of the executables. In that case, you have two choices. The first choice is the normal backup method, in which case you will have to reinstall the software and any patches prior to restoring its database. The second choice is to run a special dump, tar, or cpio backup of all filesystems on which the backup software and database reside. (These utilities are discussed in Chapter 3.)
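To make the role of the backup index concrete, here is a minimal sketch of a homegrown flat file index and a lookup against it. The tab-separated format (volume, host, filesystem, date), the volume labels, and the hostnames are all assumptions made up for this example; commercial products keep this information in their own formats.

```python
import csv
import io

# Hypothetical flat file index: volume, host, filesystem, backup date.
INDEX_TEXT = (
    "VOL001\tapollo\t/\t1999-03-01\n"
    "VOL001\tapollo\t/usr\t1999-03-01\n"
    "VOL002\tzeus\t/\t1999-03-01\n"
    "VOL003\tapollo\t/\t1999-03-02\n"
)


def load_index(text):
    """Parse the tab-separated index into a list of dictionaries."""
    entries = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row:  # tolerate blank lines
            continue
        volume, host, filesystem, date = row
        entries.append(
            {"volume": volume, "host": host, "filesystem": filesystem, "date": date}
        )
    return entries


def volumes_for(index, host, filesystem):
    """Return the volumes that hold backups of one filesystem on one host."""
    return sorted(
        {e["volume"] for e in index
         if e["host"] == host and e["filesystem"] == filesystem}
    )


index = load_index(INDEX_TEXT)
print(volumes_for(index, "apollo", "/"))
# ['VOL001', 'VOL003']
```

Lose the index, and the only way to answer the same question is to mount and read every volume you own.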

Metadata

There are a number of types of metadata that may or may not be backed up by a normal backup system. You need to ensure that each of them is backed up in other ways. This data ranges from things that would be merely helpful in a disaster to those that will be essential. As you look over this list, you may begin to get the idea that a lot of this would be much easier if you standardize your system and disk layout. You would be right.

AIX’s LVM, Sun’s ODS, Veritas’s LVM

Each of these products is a logical volume manager that allows you to stripe disks together, perform software-based RAID (Redundant Array of Independent Disks) and mirroring, and do many other wonderful things. The problem is that each of these products needs to have its individual configuration stored somewhere. If you are concerned only with rebuilding filesystems, then the physical layout of the system itself may not be that important. You simply need to supply the system with similarly sized disks and recover your data. However, if you are running databases on raw partitions, you had better have a good backup of these configurations, so that you can re-create those raw partitions exactly the way they were before a disaster.

AIX’s mksysb, HP’s make_recovery

Some operating systems have special utilities that store all of the appropriate information for you. The only problem with all of these utilities is that you have to use them up front, and you have to do so every time the system configuration changes.
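One way to avoid forgetting is to let a script notice when the configuration has changed. The sketch below fingerprints a set of configuration files and compares the result with the fingerprint recorded when the last recovery image was made. It is only an illustration: the function names are hypothetical, the choice of files to watch is an assumption that depends on your system, and the script does not run mksysb or make_recovery itself.

```python
import hashlib
import tempfile
from pathlib import Path


def config_fingerprint(paths):
    """Return one SHA-256 digest covering the names and contents of the files."""
    digest = hashlib.sha256()
    for name in sorted(str(p) for p in paths):
        digest.update(name.encode())
        digest.update(b"\0")
        digest.update(Path(name).read_bytes())
    return digest.hexdigest()


def needs_new_image(paths, last_fingerprint):
    """True if any watched file has changed since the last recovery image."""
    return config_fingerprint(paths) != last_fingerprint


if __name__ == "__main__":
    # Demonstration with a temporary stand-in for a real configuration file.
    with tempfile.TemporaryDirectory() as tmp:
        fstab = Path(tmp) / "fstab"
        fstab.write_text("/dev/dsk/c0t0d0s0 / ufs\n")
        recorded = config_fingerprint([fstab])
        print(needs_new_image([fstab], recorded))  # unchanged

        fstab.write_text("/dev/dsk/c0t0d0s0 / ufs\n/dev/dsk/c0t1d0s0 /data ufs\n")
        print(needs_new_image([fstab], recorded))  # changed
```

A job like this could run nightly and send a reminder, or trigger the utility directly, whenever the fingerprint differs.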

The root slice

If you are really backing up the root slice, then disaster recovery of a single system is simple. You can recover this data to a properly partitioned drive without installing the operating system. You could then easily accomplish a normal restore of the rest of the filesystems. (Bare-metal recovery is covered in detail in Part IV of this book.)

Partition tables

Whether or not you are using a logical volume manager, maintaining a printout of the physical layout of all of your disks is a big help. If you are not running a logical volume manager, it is essential.
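Producing that printout can be automated. As a rough sketch, the code below turns a partition listing into a short report that you could print or store off the system. It assumes input in the format of Linux's /proc/partitions (major, minor, size in 1 KiB blocks, name). Other Unix systems report their layout with different commands and formats, so treat the parser, the sample data, and the hostname as hypothetical.

```python
# Sample text in the /proc/partitions format (an assumption; see above).
SAMPLE = (
    "major minor  #blocks  name\n"
    "\n"
    "   8        0  488386584 sda\n"
    "   8        1     524288 sda1\n"
    "   8        2  487861248 sda2\n"
)


def parse_partitions(text):
    """Return one record per partition, skipping the header and blank lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly four fields and start with a number.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        major, minor, blocks, name = fields
        rows.append(
            {"name": name, "major": int(major), "minor": int(minor), "kib": int(blocks)}
        )
    return rows


def render_report(hostname, rows):
    """Format the records as plain text suitable for printing."""
    lines = [f"Disk layout for {hostname}"]
    for row in rows:
        lines.append(f"{row['name']:<8} {row['kib']:>12} KiB")
    return "\n".join(lines)


print(render_report("apollo", parse_partitions(SAMPLE)))
```

Whatever form the record takes, keep a copy somewhere other than the disks it describes.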

System layout—SysAudit or SysInfo

A lot of the preceding information is recorded for you if you use the SysAudit or SysInfo programs.
