Oracle provides several methods for accessing data that is distributed among two or more database instances. All of these methods provide location transparency , which means that users and applications can manipulate data as though it were all in one single database instance. These various methods are summarized here and are described in detail throughout this book.
The Oracle export and import utilities (illustrated in Figure 1.4) are the most primitive
method of sharing data among databases and are also used as part of
a backup and recovery strategy. Export (
creates a file that is essentially a set of SQL statements that
invoke the DDL (Data Description Language) and DML (Data
Manipulation Language) required to create objects and insert data.
imp) is the utility that reads this
file and executes the SQL statements to re-create the objects and
populate tables. A full database export creates a file that you can
use to re-create the entire database.
Unlike any of the other options, export and import are static. An export file contains the data from the time of the export and cannot be updated. In fact, an export file could easily be out of date before the export job is finished. In addition, you must specify the export option CONSISTENT=Y in order for all of the data in the export file to be consistent as of a single point in time. Exports are only one part of a comprehensive backup strategy.
Database links are the invisible glue that makes location transparency possible. In more technical terms, a database link defines a connection from one database instance to another, and this definition is stored in the Oracle data dictionary. Since database link connections log in to a normal account in the remote database instance, you have complete control over its privileges and quotas.
Used in conjunction with synonyms, database links (shown in Figure 1.5) can make remote objects appear to be local as far as applications and users are concerned.
If your inventory application at a manufacturing site needs to reference the VENDORS table at headquarters, you could provide location transparency with the following three SQL statements:
CREATE PUBLIC DATABASE LINK D8CA.BIGWHEEL.COM USING 'hqaccounting.bigwheel.com' CREATE PUBLIC SYNONYM vendors FOR vendors@D8CA.BIGWHEEL.COM GRANT SELECT ON vendors TO inventory_reader
Since the CREATE DATABASE LINK statement in this example creates a PUBLIC link without specifying an account to connect to in the D8CA.BIGWHEEL.COM database, this particular implementation assumes that every application user in the inventory database has an account in the remote database with the same password and with privileges to see the VENDORS table. If the remote database is unavailable, the VENDORS table also will be unavailable.
Of course, there are several ways to provide location transparency; these are described in greater detail later in this book.
If you have an application that cannot risk a dependency on the availability of a remote database, you could use a read-only snapshot (shown in Figure 1.6). A read-only snapshot is essentially a local table whose data is refreshed at specified intervals by performing a query against one or more remote tables. The inventory application could create the same functionality as the database link described in the previous section by following these steps:
CREATE PUBLIC DATABASE LINK D8CA.BIGWHEEL.COM USING 'hqaccounting.bigwheel.com' CREATE SNAPSHOT vendors REFRESH COMPLETE START WITH SYSDATE NEXT TRUNC(sysdate + 1) + 10/1440 AS SELECT vendor_id, company_name FROM vendors@D8CA.BIGWHEEL.COM CREATE PUBLIC SYNONYM vendors FOR vendors GRANT SELECT ON vendors TO inventory_reader
This snapshot is populated when the CREATE SNAPSHOT statement
executes, and is then refreshed every day from that point on at 10
minutes after midnight. Again, this is just one example of how the
technique could be implemented; the details come later. Snapshots
use the Oracle built-in package DBMS_JOB to schedule refreshes and
JOB_QUEUE_PROCESSES to be greater than zero.
The benefit of read-only snapshots over database links and public synonyms is that the snapshot is available even when the remote site is not. The disadvantages are that the data is neither real time nor updateable.
Oracle introduced read-only snapshots with Oracle Version 7.0. The infrastructure this feature required has been expanded with each subsequent release, with additional functionality such as updateable snapshots and advanced replication. The base components include the job queue and triggers. The feature set is continuing to expand.
If your application needs to change data in a snapshot and send the changes back to the master site, you can use updateable snapshots, shown in Figure 1.7. A trigger on the snapshot table logs updates that are applied at the master site when the snapshot refreshes. Updateable snapshots require the advanced replication facilities. A common use of updateable snapshots is an application that consolidates data from various sites into a single master site. For example, a bicycle company might collect sales transactions from its distributors every night, or travelling salespeople might enter customer leads on their laptops and upload this information to the headquarters database when they return to the office.
Two important characteristics of updateable snapshots, which distinguish them from multi-master replicated tables, are:
They update only the master site.
They can be disconnected from the master site for extended periods.
You also can configure an updateable snapshot such that the updates are not sent back to the master. You can use this configuration to perform “What if " analyses against the local data without fear of overwriting the definitive values at the master site.
Advanced (or multi-master) replication (shown in Figure 1.8) is the most powerful of the replication options. You can use it to maintain a table at numerous sites, with updates at any one location being applied at all the other locations. There is no single “master” table, although there is a master definition site, from which schema maintenance must be performed. Unlike the situation with snapshots, you can configure a multi-master environment to provide real-time data; this technique is known as synchronous replication. If you use asynchronous replication (by far the more common implementation), updates to a table are placed in the deferred queue and pushed to other participating sites at user-defined intervals.
Since updates can occur at several locations, these updates can conflict with one another. Oracle provides a number of built-in methods to assist in resolving these conflicts, such as Latest Timestamp and Site Priority, but these techniques must be selected carefully to guarantee that data always converges. Conflict resolution, described in detail in Chapter 15, is usually the biggest challenge to creating and maintaining a successful implementation.
No support for sequences
No support for LONG or LONG RAW or HHCODE data, although Oracle8 supports replication of binary large objects (BLOBs) and character large objects (CLOBs)
Not recommended for applications performing massive updates (i.e., updates to tens of thousands of rows per hour)
Procedural replication (shown in Figure 1.9) is the preferred way to perform the massive updates that are not recommended with advanced replication. Instead of queuing up row-level changes and sending them to the other database instances, procedural replication queues calls to procedures and sends them to the other participants. If, for example, you wanted to mark up the prices of all your products by five percent, you could replicate the procedure call UPDATE_PRICES(pct_increase => 5). The procedure will execute at every site with the same parameters.
Oracle does not provide any conflict handlers that work in conjunction with procedural replication, so any routines that you want to use in this way must account for conflicts. In the price increase example, suppose that a price for one item had been changed at a remote site, and the change had not yet propagated to the site initiating the UPDATE_PRICES call. The data would not converge to the same values at both sites. Table 1.2 summarizes the kinds of conflicts that may occur with procedural replication.
Table 1-2. Potential Conflicts with Procedural Replication
CA calls UPDATE_PRICES(pct_increase => 5)
NY site updates price to $120 before procedure replicates
Procedure call replicates to NY site
Update from NY at 12:10 arrives at CA site
It is safest to perform procedural replication during periods of low or no activity.