Thursday, October 28, 2010

Kill Two Birds with One Stone - Create Physical Standby Database in Oracle10g with Data Guard


Oracle Data Guard, included with Oracle Database Enterprise Edition, is really a cool product. Implement Oracle Data Guard and you will most likely kill two birds with one stone. Oracle Data Guard will help you achieve high availability and disaster recovery. One tool and two benefits. With Oracle Data Guard, you have a standby database that is in synch with the primary database. God forbids and you lose your primary database, you can very quickly switch to the standby database with minimal or no data loss. Your standby database is just a command away from being a primary database.

Most importantly, Oracle manages the Data Guard environment. You don't need to write any custom scripts. You don't need to setup any backup jobs. You don't need to write a job to copy archive logs. You don't need to manage archive logs and RMAN hot backups. Implementation is quite simple, and it works very well.

Based on your Recovery Point Objective(RPO) and Recovery Time Objective(RTO), you can configure Oracle Data Guard in three different modes:
  • Maximum Performance
  • Maximum Availability
  • Maximum Protection
Oracle Data Guard operates on a simple principle - ship redo stream to the standby database and apply it to standby database to keep it in synch with the primary database. A simple but very effective solution. If you have licensed Oracle Enterprise Edition and haven't implemented Data Guard, then you should seriously consider implementing one.

In this blog, I will illustrate the steps that you need to perform to implement physical standby database in maximum performance mode.

Environment

My data guard test environment is as follows:

  • Primary Database Server: HRPRDSVR
  • Oracle Version: 10gR2 10.2.0.4
  • Operating System: Windows
  • Standby Type: Physical
  • Data Guard Mode: Maximum Performance
  • Redo Apply: Real-time
  • Archive Log Mode: Enabled on primary
  • Primary DB Unique Name: HRPRD
  • Primary DB TNS Service Name: HRPRD
  • Data and Log Files Location: E:\Oradata\HRPRD
  • Standby Database Server: HRSTBYSVR
  • Standby DB Unique Name: HRPRDS
  • Standby DB TNS Service Name: HRPRDS
  • Data and Log Files Location: E:\Oradata\HRPRDS
  •  
Note that Primary and Standby Database servers have identical disk configuration.

Primary Database Configuration

Step 1 - Update init.ora parameters on primary database

There are only few init.ora parameters that govern the behavior and implementation of Data Guard based physical standby databases. In order to make it easy for you to understand data guard related parameters, I have created a table that lists data guard related parameters and their corresponding values. This approach would help you understand these parameters and allow you to compare the values between primary and standby databases. I have also added my comments to further explain or clarify these parameters.

Please use this table as a guide while setting up init.ora parameters in your environment. The parameter values listed below are representative values, and should be applicable to most implementations. Update values as appropriate to reflect environment specific characteristics.

Parameter NamePrimary DB ParametersPhysical Standby DB
Maximum Performance Mode with real-time redo apply
Comments
db_nameHRPRDHRPRDDb_name must be the same for primary and physical standby
instance_nameHRPRDHRPRD  
db_unique_nameHRPRDHRPRDSDb_unique_name must be different for primary and physical standby
service_namesHRPRDHRPRDS  
remote_login_passwordfileexclusiveexclusive  
log_archive_config'dg_config=
(HRPRD,HRPRDS)'
'dg_config=
(HRPRD,HRPRDS)'
  
log_archive_format HRPRD%r%s.%tHRPRD%r%s.%t  
log_archive_trace11  
log_archive_dest_1"location=
f:\orarch\HRPRD "
"location=
f:\orarch\HRPRD"
Replace with your Local Archive Destination
log_archive_dest_2'service=
HRPRDS
async
lgwr db_unique_name=
HRPRDS
valid_for=
(primary_role,
online_logfile)'
'service=
HRPRD
async
lgwr
db_unique_name=
HRPRD
valid_for=
(primary_role,
online_logfile)'
For maximum performance mode, make sure that you specify async value as part of this parameter.
You also need to use lgwr value to enable real-time redo apply.
log_archive_dest_state_1enableenable  
log_archive_dest_state_2deferenableStart with deferred configuration on primary. We will enable it later.
standby_file_managementautoautoLet Oracle add/resize data and log files on standby
fal_serverHRPRDSHRPRDIn the event primary assumes the role of standby.
fal_clientHRPRDHRPRDSIn the event primary assumes the role of standby.
log_file_name_convert('HRPRDS',
'HRPRD')
('HRPRD',
'HRPRDS')
Let Oracle manage logfiles on the standby node through this parameter. Specify on both nodes to manage role reversal.
db_file_name_convert'HRPRDS',
'HRPRD')
('HRPRD',
'HRPRDS')
Let Oracle manage datafiles on the standby node through this parameter.
Specify on both nodes to manage role reversal.

Step 2 - Create a password file on primary

In an Oracle Data Guard environment, the primary and standby databases authenticates via password file. Use the following command to create a password file on the primary database server.

orapwd file=PWDHRPRD.ora password=verysecretpassword (On Windows platform)
orapwd file=orapwHRPRD.ora password=verysecretpassword (On Unix platform)

The default name and location of the password file on Windows platform is %ORACLE_HOME%\DATABASE\PWD<ORACLE_SID>.ora and $ORACLE_HOME\dbs\orapw<ORACLE_SID> on Unix platform. You need to restart the instance after creating the password file.

A sidebar on the password file is in order as many Oracle DBAs are not very clear on password file concepts. When you want to startup an Oracle database instance, you need an authentication mechanism outside the database. During startup, you can't use database authentication as Oracle database instance is not up at that time.

Oracle operating system group membership or password file provide authentication outside the database. If you are a member of ORA_DBA group on Windows or dba group on Unix, then you can login "SYS AS SYSDBA" without any password and startup the instance.

But if you are not a member of these groups, then you need a password file to authenticate your "SYS AS SYSDBA" connections. In addition, if you want to startup and/or shutdown databases from remote servers on different domains, then you also need a password file to authenticate "SYS AS SYSDBA" connections. Databases participating in an Oracle Data Guard environment also need password file for authentication.

Step 3 - Enable FORCE LOGGING

Oracle Data Guard operates the standby database with the primary database redo stream. Primary database ships redo stream to the standby database. Redo stream shipment to the standby database can be synchronous or asynchronous depending on the data guard protection mode that you choose. Standby database's managed recovery process applies the redo stream to keep both the databases in synch. Any issues or interruption with the redo stream would put your standby database in jeopardy. For instance, NOLOGGING operations on primary database will definitely jeopardize your standby database. To prevent NOLOGGING operations, you must enable FORCE LOGGING on the primary database.

SQL> SHUTDOWN IMMEDIATE ;
SQL> STARTUP MOUNT ;
SQL> ALTER DATABASE FORCE LOGGING
SQL> ALTER DATABSE OPEN;

Run the following query just to make sure:

SELECT FORCE_LOGGING
FROM V$DATABASE ;

You can also query V$DATAFILE to see if there are any datafiles with NOLOGGED operations, as shown below.

SELECT FILE#, FIRST_NONLOGGED_SCN, FIRST_NONLOGGED_TIME
FROM V$DATAFILE ;

Step 4 - Last Night's RMAN Backup on primary

You need to use your last RMAN backup to seed the standby database. If you don't have RMAN backups, then you need to take a fresh RMAN backup. But I am sure you do perform RMAN backups. If you don't need to take RMAN backups, then I don't think you would be needing a standby database!

If this is a test implementation and you don't have RMAN backups, then take RMAN backup through BACKUP DATABASE command.

BACKUP FORMAT 'C:\OraBack\FullBackup%U.rman' DATABASE PLUS ARCHIVELOG;

Your database must be in ARCHIVELOG mode. If not, enable ARCHIVELOG mode.

Step 5 - Create Standby Control File on primary

A physical standby database needs a control file that is created with "FOR STANDBY" additional clause. You can use RMAN or SQL*Plus to create a standby control file.

SQL> alter database create standby controlfile as 'c:\oraback\stbyctr.ctl';


RMAN> BACKUP FORMAT 'c:\oraback\stbyctr.ctl' CURRENT CONTROLFILE FOR STANDBY;

Step 6 - Add TNSNAMES.ORA Service on primary server

Add the secondary database's TNSNAMES service entry to the primary server's TNSNAMES.ORA file:

HRPRDS =
  (DESCRIPTION =
     (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = HRSTBYSVR)(PORT = 1521))
     )
    (CONNECT_DATA =
        (SERVICE_NAME = HRPRDS)
    )

)

Step 7 - Copy Backups

Copy the following backups to the standby server on the same location as primary database:

  • RMAN Hot Backup with database and archive logs
  • Standby Control File  
You may need to copy additional archive log backups that were generated since you took the RMAN backup and executed RMAN DUPLICATE command to create a standby database.

Standby Database Configuration

Step 1 - Add TNSNAMES.ORA Service on Secondary

Add the primary database's TNSNAMES service entry on the standby server to TNSNAMES.ORA file :

HRPRD =
  (DESCRIPTION =
     (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = HRPRDSVR)(PORT = 1521))
     )
     (CONNECT_DATA =
        (SERVICE_NAME = HRPRD)
     )

  )

Test connectivity to the primary database server as follows:

C:\> SQLPLUS sys/verysecretpassword@HRPRD as sysdba

Step 2 - Configure Listener.ora on standby server

Update listener.ora as appropriate, and reload or restart listener.

(SID_DESC =
    (ORACLE_HOME = <ORACLE_HOME>)
    (SID_NAME = HRPRDS)
    (SDU=8192)
)

Step 3 - Create init.ora file on standby server

Create init.ora file on the secondary server. Refer to the table above for Oracle Data Guard related parameters that you need to configure on the standby server.

If you are on Windows platform, then you need to create the Windows service as follows:

C:>oradim -new -sid HRPRDS -startmode auto -pfile initHRPRDS.ora -syspwd verysecretpassword
Note that the sys password on standby database must be the same as primary database.

Now start the database instance on standby server and logout from your SQL*Plus session.


Step 4 - Create password file on standby

If you are on Windos platform, then you don't need to create a password file. Otherwise, create the password file on the secondary server as follows. Note that the password on standby database must be the same as primary database.

orapwd file=PWDHRPRDS.ora password=verysecretpassword (On Windows platform)

Use SQL*Plus to test connectivity from the primary database server to the standby server.

C:\> SQLPLUS sys/verysecretpassword@HRPRDS as sysdbaDon't proceed to the next step if you are having connectivity issues. Verify passwords and existance of password files

Step 5 - Create Standby Database

RAMN is the tool to create the standby database as follows:

C:\> set ORACLE_SID=HRPRD

C:\>rman nocatalog

Recovery Manager: Release 10.2.0.4.0 - Production on Wed Oct 13 17:04:40 2010

Copyright (c) 1982, 2007, Oracle. All rights reserved.

RMAN> connect auxiliary /

connected to auxiliary database: HRPRD (not mounted)

RMAN> connect target sys@HRPRD

target database Password:
connected to target database: HRPRD (DBID=358909779)
using target database control file instead of recovery catalog

RMAN> duplicate target database for standby dorecover ;

-----
-----
-----

media recovery complete, elapsed time: 00:07:50
Finished recover at 13-OCT-10
Finished Duplicate Db at 13-OCT-10

RMAN>

Good News! Physical Standby Database has been created.

Common Errors

If you haven't specified db_name = HRPRD and db_unique_name=HRPRDS, then you will get the following error:

RMAN-03002: failure of Duplicate Db command
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of sql command on clone_default channel
RMAN-11003: failure during parse/execution of SQL statement: alter database mount standby database
ORA-01103: database name 'HRPRD' in control file is not 'HRPRDS'


If you haven't copied the primary database backup to the standby database server in the same location as primary database, then you would get the following errro.

executing command: SET until clause
Starting restore at 16-DEC-09
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backupset restore
channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: reading from backup piece .......
ORA-19870: error reading backup piece .........
ORA-19505: failed to identify file ........
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 3) The system cannot find the path specified. failover to previous backup

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at ........
RMAN-03015: error occurred in stored script Memory Script
RMAN-06026: some targets not found - aborting restore
RMAN-06024: no backup or copy of the control file found to restore


If you haven't copied all the archive logs that were generated since you took the RMAN backup, you would get the following error:

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: ...........

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 12/16/2010 11:15:38
RMAN-03015: error occurred in stored script Memory Script
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of log thread 1 seq 10 lowscn 113033639 found to restore
RMAN-06025: no backup of log thread 1 seq 9 lowscn 113029915 found to restore
RMAN-06025: no backup of log thread 1 seq 8 lowscn 113014889 found to restore


Note that the standby database must be in NOMOUNT state when you initiate the RMAN duplicate command.

RMAN> duplicate target database for standby dorecover ;
Starting Duplicate Db at 16-DEC-10
using channel ORA_AUX_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 12/16/2009 11:26:19
RMAN-05500: the auxiliary database must be not mounted when issuing a DUPLICATE
command

RMAN> quit

Step 6 - Add Standby Redo Log Files on primary and standby databases

You must configure standby redo logs on standby database to activate real-time redo transport. Standby redo log files must be of the same size as primary database redo log files. You need to create at least the same number of standby redo log files as primary database.

ALTER DATABASE
ADD STANDBY LOGFILE 'E:\oradata\HRPRDS\StbyRedo1.log'
SIZE 100M;

Also perform the same task on primary database in the event primary assumes the role of standby database.

Step 7 - Activate standby database managed recovery on Standby

SQL> Alter database recover managed standby database using current logfile disconnect ;

Step 8 - Enable Physical Standby on Primary


Enable log_archive_dest_state_2 parameter and then perform a log switch on the primary database server:

SQL> ALTER SYSTEM SET log_archive_dest_state_2 = enable ;

SQL> ALTER SYSTEM SWITCH LOGFILE;
SQL> ALTER SYSTEM SWITCH LOGFILE;

And you should see the latest archive log on the standby server! Oracle Data Guard Physical Standby Database is running!!

If you don't see what you were expecting, then review alert logs and trace files. You may also query V$ARCHIVE_LOG table to degug this issue further.

Test It!

You can also run a simple test case to demonstrate that redo is being shipped in real-time and application data is available on standby database.

Run the following on primary database.

SQL> drop table test purge ;

Table dropped.

SQL> create table test
2 ( test date ) ;

 Table created.

SQL> insert into test values(sysdate) ;

1 row created.

SQL> /

1 row created.

SQL> commit ;

Commit complete.

SQL> alter session set nls_date_format='MM/DD/YYYY:HH24:MI:SS' ;

Session altered.

SQL> SELECT * FROM TEST ;

TEST
-------------------
10/17/2010:00:01:57
10/17/2010:00:01:58

Run the following on standby database server:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL ;

Database altered.

SQL> ALTER DATABASE OPEN READ ONLY ;

Database altered.

SQL> SELECT * FROM JMEHTA.TEST ;

TEST
-------------------
10/17/2010:00:01:57
10/17/2010:00:01:58

SQL> ALTER DATABASE CLOSE ;

Database altered.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
USING CURRENT LOGFILE DISCONNECT ;

Database altered.

Monday, October 25, 2010

Verify Data Guard Implementation


Few Verification Checks you should perform on Physical Standby Database running in Maximum Performance Mode.

For physical standby database to be relevant to your high availability and disaster/recovery requirements, you need to make sure that it's keeping up with the primary database and is in sync with the primary database. Once implemented, you need to make sure that it is up and running at all time. Below is a list of few important tasks that you should perform to verify physical standby database's operation.

Test 1 - Archive Logs on Primary and Standby

When you perform a log switch on the primary database, you will see the corresponding archive log on the standby database server. Archive log location on primary and standby is defined by log_archive_dest_1 and its format by log_archive_format parameter.

In addition, V$ARCHIVE_LOG view on primary displays archive logs generated by both primary and standby databases. Run the following query to verify:

SQL> SELECT STANDBY_DEST,DEST_ID, ARCHIVED, APPLIED, MAX(SEQUENCE#)
2 FROM V$ARCHIVED_LOG
3 GROUP BY STANDBY_DEST, DEST_ID, ARCHIVED, APPLIED;

STA DEST_ID    ARC APP MAX(SEQUENCE#)
--- ---------- --- --- --------------
NO   1         YES  NO 164607
YES  2         YES YES 164607

Test 2 - Archive Log Gap

Query V$ARCHIVE_GAP to determine if you have gaps in archive logs on standby database. Since you should have configured fal_server and fal_client parameters, Oracle Data Guard should automatically resolve the archive log gaps by requesting to the primary database to send the missing archive logs.

But if you don't have the archive logs that the physical standby database needs, then you would have archive log gaps and your standby database would be out of sync. Remedies to fix this issue may vary, but your standby database is out of sync.

Test 3 - Real-time Redo Propagation

You want to make sure that your physical standby database is keeping up with the primary database.

Method 1

On standby, Run the following query to evaluate "apply lag" and "transport lag" parameters. The values for these parameters will vary, but you will get an idea whether your standby database is keeping up with the primary database or not.

SQL> SELECT *
2 FROM V$DATAGUARD_STATS ;

NAME
--------------------------------
VALUE
----------------------------------------------------------------
UNIT TIME_COMPUTED
------------------------------ ------------------------------
apply finish time
+00 00:00:00.0
day(2) to second(1) interval 16-OCT-2010 23:35:11

apply lag
+00 00:00:05
day(2) to second(0) interval 16-OCT-2010 23:35:11

estimated startup time
14
second 16-OCT-2010 23:35:11

standby has been open
N
16-OCT-2010 23:35:11

transport lag
+00 00:00:00
day(2) to second(0) interval 16-OCT-2010 23:35:11

Method 2

On standby, Run the following query to compute redo lag. You may see negative value for RedoAsOf, but at least you would get an idea about synchronization lag.

SQL> SELECT SYSDATE CurrentTime, MAX(LAST_TIME) RedoAsOf,
2 SYSDATE - MAX(LAST_TIME) RedoLag
3 FROM V$STANDBY_LOG ;

CURRENTTIME          REDOASOF           REDOLAG
------------------- ------------------- ----------
10/16/2010:23:41:02 10/16/2010:23:41:03 -.00001157

Method 3

On primary database, run the following query to compute the current SCN

SQL> SELECT CURRENT_SCN FROM V$DATABASE ;

CURRENT_SCN
--------------------
9,084,966,569,214

On secondary database, run the following query to compute the current SCN, and compare it against the value from the previous query.

SQL> SELECT MAX(LAST_CHANGE#) CURRENT_SCN FROM V$STANDBY_LOG ;

CURRENT_SCN
--------------------
9,084,966,569,212