You can take several approaches to protect your system against the ill effects of system crashes, including the following:
Providing component redundancy
Using Real Application Clusters (formerly named Oracle Parallel Server)
Using Transparent Application Failover software services
As basic protection, the hardware components that make up the database server itself must be fault-tolerant. Fault tolerance, as the name implies, allows the overall hardware system to continue operating even if one of its components fails. This, in turn, requires redundant components and the ability to detect a component failure and seamlessly integrate the failed component’s replacement. The major system components that should be fault-tolerant include the following:
Disk failure is the largest area of exposure for hardware failure, since disks have the shortest mean time between failures of any component in a computer system. Disks also offer the greatest variety of redundancy solutions, so examining disk failure in detail provides the best example of how high availability can be implemented with hardware.
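To get a rough feel for the size of this exposure, the following Python sketch estimates how often some disk in a system can be expected to fail. It is an illustration only, not a vendor formula. It assumes that drives fail independently and at a constant rate (an exponential lifetime model). The 1,000,000-hour drive rating and the function names are hypothetical values chosen for the example.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored


def array_mttf_hours(disk_mttf_hours: float, disk_count: int) -> float:
    """Expected hours until the FIRST of disk_count drives fails.

    Assumes independent drives with a constant failure rate, in which
    case the failure rates add and the expected time to the first
    failure is the single-drive MTTF divided by the number of drives.
    """
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return disk_mttf_hours / disk_count


def prob_any_failure(disk_mttf_hours: float, disk_count: int,
                     period_hours: float) -> float:
    """Probability that at least one drive fails within period_hours,
    under the same independence and constant-rate assumptions."""
    return 1.0 - math.exp(-disk_count * period_hours / disk_mttf_hours)


if __name__ == "__main__":
    rated_mttf = 1_000_000.0  # hypothetical per-drive rating, in hours
    for count in (1, 10, 100, 1000):
        first_failure = array_mttf_hours(rated_mttf, count)
        yearly_risk = prob_any_failure(rated_mttf, count, HOURS_PER_YEAR)
        print(f"{count:>5} disks: first failure expected after "
              f"{first_failure:>11,.0f} h; "
              f"chance of a failure within a year = {yearly_risk:.1%}")
```

Under these assumptions the expected time to the first failure shrinks in direct proportion to the number of drives, which is why redundancy matters more as the disk count grows. Real drives do not fail at a perfectly constant rate, so treat the output as an order-of-magnitude guide only.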
Disk failure is the most common cause of system failure. Although the mean time to failure of an individual disk drive is very high, the ever-increasing number of disks used for today’s very large ...