TECHNOLOGY: Oracle Streams
Making Data Flow
By Sanjay Mishra
Use Oracle Streams to make data move itself.
One of the challenges of today’s distributed business environments is the sharing of information among a multitude of applications and databases. As the number of applications grows, you end up with several databases, which may even be from different vendors. Sharing information in such a complex and disparate environment can be demanding.
Oracle Streams provides a flexible infrastructure that businesses can use to simplify information sharing tasks.
What Is Streams?
Oracle Streams captures database changes at a source database, stages the changes, propagates the changes to one or more destination databases, and then applies the changes at the destination database(s). Using Oracle Streams, enterprise systems can capture, propagate, and apply information as follows:
- Within an Oracle database
- Between two Oracle databases
- Among multiple Oracle databases
- Between an Oracle database and a non-Oracle database
How Streams Works
Oracle Streams begins by capturing changes. The changes (to data, tables, schemas, and so on) that take place in a database are recorded in the database redo log files. The Streams capture process extracts these changes from the redo log files and formats each change into a logical change record (LCR). The LCRs are then stored in a queue (staged). Next, Streams propagates LCRs from one queue (the producer queue) to another (the consumer queue) and can then apply (or consume) the LCRs from the consumer queue to the destination database.
Oracle Streams can be used in performing tasks such as
- Data replication
- Data warehouse extraction and loading
- Event notification
- Message queuing
- Database platform migration
- Database and application upgrading
This article illustrates how you can set up Streams data replication from one Oracle database to another.
Setting up Streams Between Two Oracle Databases
The example in this article illustrates the replication of a source database (TEST10G1) to a destination database (TEST10G2). (The two databases reside on two separate machines in this example, but you can use the same set of steps if the two databases reside on the same machine.) You can take the following step-by-step approach to set up a Streams replication environment:
- Set up ARCHIVELOG mode.
- Set up the Streams administrator.
- Set initialization parameters.
- Create a database link.
- Set up source and destination queues.
- Set up supplemental logging at the source database.
- Configure the capture process at the source database.
- Configure the propagation process.
- Create the destination table.
- Grant object privileges.
- Set the instantiation system change number (SCN).
- Configure the apply process at the destination database.
- Start the capture and apply processes.
The following sections describe the process for each step and include sample code where applicable.
- Set up ARCHIVELOG mode. The Streams capture process reads information from the redo log files of the database. To ensure that the information in the redo log files is available for capture, you must run the source database in ARCHIVELOG mode. In this example, you must set the source database to run in ARCHIVELOG mode. Refer to Oracle Database Administrator’s Guide for instructions on how to set up ARCHIVELOG mode.
- Set up the Streams administrator. The Streams environment is managed by an administrative database user. To manage the Streams environment, the Streams administrator needs some specific privileges and has to create some tables to store Streams-related information.
- Set initialization parameters. Set specific initialization parameters at the databases participating in the Streams data sharing. Table 1 describes these initialization parameters and the values they must be set to.
Table 1: Initialization parameters and their values
For Streams to work, the COMPATIBLE parameter must be 9.2.0 or higher in both the source and the destination databases. For this example, use the value 10.1.0 for this parameter, because you are setting up Streams replication in an Oracle 10g database.
The JOB_QUEUE_PROCESSES parameter must be set to 2 or higher in the source database(s). In this example, you must set JOB_QUEUE_PROCESSES at the source database (TEST10G1) to 2 or higher.
Streams uses system global area (SGA) memory for storing information required for capture and apply. To allocate memory in the SGA for use by Streams, set up a Streams pool in the SGA by specifying the STREAMS_POOL_SIZE parameter. The default value for this parameter is 0, which means that Streams instead uses memory from the shared pool. If you leave STREAMS_POOL_SIZE at its default value (0), Streams will plan to use up to 10 percent of the SHARED_POOL_SIZE, so Oracle recommends setting theSTREAMS_POOL_SIZE parameter to at least 200MB. Both the source and the destination databases must account for this SGA memory.
- Create a database link. You need to create a database link from the source database to the destination database. In this example, you will create a database link from TEST10G1 toTEST10G2.
CONNECT strmadmin/strmadmin@TEST10G1 CREATE DATABASE LINK TEST10G2 CONNECT TO strmadmin IDENTIFIED BY strmadmin USING 'TEST10G2';
- Set up source and destination queues. The data moves from the source to the destination database through queues. Use the SET_UP_QUEUE procedure in theDBMS_STREAMS_ADM package to set up the queues. By default, this procedure creates a queue table named streams_queue_table and a queue named streams_queue. You can override these names by specifying the queue_table and queue_name parameters of the procedure SET_UP_QUEUE. The default names work well, unless you want to create multiple queues and multiple queue tables. For the purpose of the replication example in this article, you will accept the default names, by executing the procedure as follows:
Execute this procedure on both the source and the destination databases. When you execute it on the source database (TEST10G1), it creates a source queue, and when you execute it on the destination database (TEST10G2), it creates a destination queue. Because you are choosing to accept the default queue name, the names of the source and destination queues will be the same. In a later step, you will associate these two queues with the propagation process.
- Set up supplemental logging at the source database. Before you start capturing changes at the source database, you need to add supplemental logging on the tables being changed. Supplemental logging puts additional information in the redo logs that helps during the apply process. For more information on supplemental logging, refer to the Oracle Streams Replication Administrator’s Guide.
To add supplemental logging, connect to the source database as the schema owner (SCOTT, in this example) and run the following statement:
ALTER TABLE emp ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY, UNIQUE) COLUMNS;
- Configure the capture process at the source database. The capture process initiates replication, by capturing the changes in the source database. It then formats each change into an LCR and enqueues the LCRs.
Create a capture process to extract the changes from the redo logs. You can configure a capture process to run on the source database—called a local capture—or remotely on another database—called a downstream capture.
While creating a capture process, you add rules to specify which changes to capture and which ones to discard. The rules are combined into rule sets. The capture process can have a positive rule set and a negative rule set. For a replication capture process to work, you need to add a positive rule set that specifies that the data manipulation language (DML) changes as well as the data definition language (DDL) changes be captured. Do this by using the ADD_TABLE_RULES procedure of the DBMS_STREAMS_ADM package:
BEGIN DBMS_STREAMS_ADM.ADD_TABLE_RULES( table_name => 'scott.emp', streams_type => 'capture', streams_name => 'capture_stream', queue_name => 'strmadmin.streams_queue', include_dml => true, include_ddl => true, inclusion_rule => true); END; /
This procedure call creates the local Streams capture process, named capture_stream. Note that you can use the optional parameter source_database to specify another database for a downstream capture. If you omit this parameter (as in this example) or set the parameter value to NULL, the procedure will create a local capture process.
The streams_type parameter indicates that this procedure will create a capture process. (Note that the same procedure will later be used to create an apply process.) Theinclusion_rule parameter with the value set to true means that this procedure will create a positive rule set for the capture process. The true value for the include_dmlparameter means that a rule will be created for DML changes, and a true value for the include_ddl parameter means that a rule will be created for DDL changes. Thetable_name parameter indicates that these rules are associated with the EMP table in the SCOTT schema. You can also set up rules for a schema or for the entire database, using the ADD_SCHEMA_RULES and ADD_GLOBAL_RULES procedures, respectively. See the PL/SQL Packages and Types Reference manual for more information on these packages.
The queue, streams_queue, specified in this procedure call was created by the SET_UP_QUEUE procedure call in Step 5.
- Configure the propagation process. Once the changes are captured and enqueued, propagate the changes to the destination database. To do so, you create a propagation process and associate the source queue with the destination queue. You create the source and destination queues (each named the default streams_queue) in the source and destination databases, respectively, in Step 5 by calling the SET_UP_QUEUE procedure.
The procedure in Listing 1 creates a propagation process and adds rules to the positive rule set of that process.
This procedure call creates a propagation process, named TEST10G1_to_TEST10G2, for which the source queue is the streams_queue in the TEST10G1 database and the destination queue is the streams_queue in the TEST10G2 database. This procedure also adds DML and DDL rules to the positive rule set.
- Create the destination table. Before you can start replicating the DDL and DML changes on a source table to the destination database, the table must exist in the destination database. If the table doesn’t already exist in the destination database, you have several options for creating the object in the destination database, such as using Data Pump, export/import, RMAN, transportable tablespaces, and so on. See “Moving Data Faster” for more information on your destination table setup options.
- Grant object privileges. On the destination database, the Streams administrator applies the changes captured from the source database. To be able to apply the changes to the destination tables, the Streams administrator must have the required privileges on the relevant objects. For example, if the changes need to be applied to the EMP table of the SCOTTschema, you need to execute the following after connecting to the destination (TEST10G2) database:
GRANT ALL ON scott.emp TO strmadmin;
Because you set up the Streams administrator with the DBA role in this example (in Step 2), you can choose to skip this step. But it is a good idea to explicitly grant the required object privileges to the Streams administrator.
- Set the instantiation System Change Number (SCN). Now you need to set the instantiation SCN for the table from the source database you want to replicate. This ensures that the changes in the source table captured before the instantiation SCN is set will not be applied at the destination database. The procedure in Listing 2 sets the instantiation SCN for theEMP table in the destination database, by fetching the current SCN from the source database. Execute the procedure in Listing 2 by connecting to the source database (TEST10G1) as the Streams administrator.
Listing 2 shows how the GET_SYSTEM_CHANGE_NUMBER procedure of the package DBMS_FLASHBACK returns the current SCN of the source database. This SCN is used to set the instantiation SCN for the EMP table of the SCOTT schema in the destination database. Note that the SET_TABLE_INSTANTIATION_SCN procedure of the DBMS_APPLY_ADMpackage is called via the TEST10G2 database link, which means that although this procedure is being called while you are connected to TEST10G1, it will be executed in theTEST10G2 database.
If the table didn’t previously exist in the destination database and you use export/import, Data Pump, or transportable tablespaces to copy it, the instantiation SCN will be set automatically for the table.
- Configure the apply process at the destination database. Now you create an apply process and associate the destination queue with it. You also add rules for the apply process. You achieve this by calling the ADD_TABLE_RULES procedure of the DBMS_STREAMS_ADM package:
BEGIN DBMS_STREAMS_ADM.ADD_TABLE_RULES( table_name => 'scott.emp', streams_type => 'apply', streams_name => 'apply_stream', queue_name => 'strmadmin.streams_queue', include_dml => true, include_ddl => true, source_database => 'TEST10G1', inclusion_rule => true); END; /
This procedure creates an apply process (as indicated by the streams_type parameter) for the EMP table in the SCOTT schema. The apply process, named apply_stream, is associated with the streams_queue. The apply process also adds DML and DDL rules to the positive rule set (as indicated by the parameter inclusion_rule).
Oracle Streams Concepts and Administration
Oracle Streams Replication Administrator’s Guide
LEARN about Oracle Streams
Oracle by Example Series
Note that earlier you used DBMS_STREAMS_ADM.ADD_TABLE_RULES to create (and add rules for) a capture process (Step 7).
- Start the capture and apply processes. Now that you have configured all necessary objects and processes, all you need to do is start the capture and apply processes. To start the capture process, connect to the source database and execute the START_CAPTURE procedure of the DBMS_CAPTURE_ADM package:
BEGIN DBMS_CAPTURE_ADM.START_CAPTURE( capture_name => 'capture_stream'); END; /
Similarly, to start the apply process, connect to the destination database and execute the START_APPLY process of the DBMS_APPLY_ADM package. However, before you do that, best practices recommend that you set the disable_on_error parameter of the apply process to n, so that the apply process will continue even if it encounters some errors.
BEGIN DBMS_APPLY_ADM.SET_PARAMETER( apply_name => 'apply_stream', parameter => 'disable_on_error', value => 'n'); END; / BEGIN DBMS_APPLY_ADM.START_APPLY( apply_name => 'apply_stream'); END; /
Now the Streams replication environment is ready, and DML and DDL changes in the SCOTT.EMP table in the source database will be replicated in the corresponding table in the destination database.
For each database participating in the Streams data sharing environment, you need to create a user account and designate this account as the Streams administrator. Don’t use this account for any other purpose, and don’t use the SYS or the SYSTEM user as the Streams administrator.
The Streams administrator creates some tables in its default tablespace. You should specify a tablespace other than SYSTEM for this purpose. You can use the USERS tablespace or another tablespace for storing the tables owned by the Streams administrator, but for ease of management, best practices recommend that you use a separate tablespace for Streams. Use the following command to create the tablespace for the Streams administrator:
CREATE TABLESPACE streams_tbs DATAFILE '/u01/app/oracle/oradata/ TEST10G1/streams_tbs.dbf' SIZE 25M;
Now create the Streams administrator user in the database, as follows:
CREATE USER strmadmin IDENTIFIED BY strmadmin DEFAULT TABLESPACE streams_tbs TEMPORARY TABLESPACE temp QUOTA UNLIMITED ON streams_tbs;
Now grant the CONNECT, RESOURCE, and DBA roles to the Streams administrator:
GRANT CONNECT, RESOURCE, DBA TO strmadmin;
Use the GRANT_ADMIN_PRIVILEGE procedure in the DBMS_STREAMS_AUTH package to grant the required privileges to the Streams administrator:
BEGIN DBMS_STREAMS_AUTH.GRANT_ADMIN_ PRIVILEGE( grantee => 'strmadmin', grant_privileges => true); END; /
Note that Streams does not require you to use the following roles, but granting these roles can assist with administration:
GRANT SELECT_CATALOG_ROLE TO strmadmin; GRANT SELECT ANY DICTIONARY TO strmadmin;
Complete the preceding steps to set up a Streams administrator on each of the databases participating in the Streams data sharing environment. In this example, these steps need to be executed on both the TESTG1 and the TEST10G2 databases.
Oracle Streams can capture, propagate, and apply changes in the database automatically, including both DML and DDL changes. Applications requiring replication, data warehouses, database migrations, and database upgrades can all benefit from Oracle Streams.