Recovering an OPS Database

In a parallel server environment, the following types of failures may occur:

  • Node failure

  • Instance failure

  • Crash failure

  • Integrated Distributed Lock Manager (IDLM) failure

  • GMS failure

  • Media failure

These types are described in the following sections.

Some of these failures, such as an instance failure or a media failure, can also occur in a standalone instance environment. Others, such as an IDLM failure or a GMS failure, are specific to an OPS environment. Any one of these failures may require you to perform a database recovery.

Node Failure and Recovery

A node may fail because of a power outage, an operating system crash, or any other event that makes the node nonfunctional. When a node fails, the instance, the IDLM processes, and the GMS process running on that node fail with it. Recovery from a node failure therefore consists of instance recovery, IDLM recovery, and GMS recovery. A surviving instance performs the instance recovery and the IDLM recovery. After you have diagnosed and corrected the cause of the node failure, you must restart GMS and the instance manually.
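The division of work after a node failure can be summarized in a small sketch. This is only an illustrative model of the paragraph above, not an Oracle API: the names RecoveryStep, node_failure_recovery_plan, and manual_steps are hypothetical, and the order of the steps simply follows the order in which the text mentions them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryStep:
    component: str     # what was lost when the node failed
    action: str        # what recovery consists of for that component
    performed_by: str  # "surviving instance" or "administrator"


def node_failure_recovery_plan():
    """Return the recovery steps for a node failure, as described in the text.

    Hypothetical model: a surviving instance handles instance and IDLM
    recovery; the administrator restarts GMS and the failed instance
    after correcting the cause of the node failure.
    """
    return [
        RecoveryStep("instance", "instance recovery", "surviving instance"),
        RecoveryStep("IDLM processes", "IDLM recovery", "surviving instance"),
        RecoveryStep("GMS process", "restart GMS", "administrator"),
        RecoveryStep("instance", "restart the instance", "administrator"),
    ]


def manual_steps(plan):
    """Return only the actions that the administrator must carry out."""
    return [step.action for step in plan if step.performed_by == "administrator"]


if __name__ == "__main__":
    for step in node_failure_recovery_plan():
        print(f"{step.component}: {step.action} ({step.performed_by})")
```

Filtering the plan with manual_steps leaves only the two restarts, which is the part of node recovery that is not handled for you by a surviving instance.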

Instance Failure and Recovery

When one or more of the Oracle background processes for an instance fail, or when the SGA for an instance is lost, the instance stops running. This type of failure is called an instance failure. Issuing a SHUTDOWN ABORT command also causes an instance failure.

The process of recovering from instance failure is called instance recovery ...
