How to recover Physical Standby database if it goes out of sync J
(Missing log Archive log flies)
When you are using Data guard, there are several scenarios when physical standby can go out of sync with the primary database.Before doing anything to correct the problem, we need to verify that why standby is not in sync with the primary. In this particular article, we are covering the scenario where a log is missing from the standby but apart from the missing log, all logs are available.
Verify from v$archived_log that there is a gap in the sequence number. All the logs up to that gap should have APPLIED=YES and all the sequence# after the missing log sequence# are APPLIED=NO. This means that due to the missing log, MRP is not applying the logs on standby but the logs are still being transmitted to the standby and are available.
SQL> SELECT SEQUENCE#, APPLIED FROM V$ARCHIVED_LOG;
So for example, if the missing log sequence# is 400, then the above query should show that up to sequence#399, all have APPLIED=YES and starting from 401, all are APPLIED=NO.
There are few steps to be performed when the standby is not in sync with the primary because there is a gap of logs on standby.
These steps are:
STEP #1: Take an incremental backup of primary from the SCN where standby is lagging behind and apply on the standby server
STEP #2: If step#1 is not able to sync up, then re-create the controlfile of standby database from the primary….
STEP #3: If after step#2, you still find that logs are not being applied on the standby, check the alert log and you may need to re-register the logs with the standby .
STEP#1
1. On STANDBY database query the v$database view and record the current SCN of the standby database:
SQL> SELECT CURRENT_SCN FROM V$DATABASE;
CURRENT_SCN
-----------
1.3945E+10
-----------
1.3945E+10
SQL> SELECT TO_CHAR (CURRENT_SCN) FROM V$DATABASE;
----------------------------------------
13945141914
13945141914
2. Stop Redo Apply on the standby database:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
*
ERROR at line 1:
ORA-16136: Managed Standby Recovery not active
If you see the above error, it means Managed Recovery is already off
You can also confirm from the view v$managed_standby to see if the MRP is running or not
SQL> SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;
3. Connect to the primary database as the RMAN target and create an incremental backup from the current SCN of the standby database that was recorded in step 1:
For example,
BACKUP INCREMENTAL FROM SCN 13945141914 DATABASE FORMAT '/tmp/ForStandby_%U' tag 'FOR STANDBY'
You can choose a location other than /tmp also.
4. Do a recovery of the standby database using the incremental backup of primary taken above:
On the Standby server, without connecting to recovery catalog, catalog the backupset of the incremental backup taken above. Before this, of course you need to copy the backup piece of the incremental backup taken above to a location accessible to standby serve
$ rman nocatalog target /
RMAN> CATALOG BACKUPPIECE '/dump/proddb/inc_bkup/ForStandby_1qjm8jn2_1_1';
Now in the same session, start the recovery
RMAN> RECOVER DATABASE NOREDO;
You should see something like:
Starting recover at 2008-09-17 04:59:57
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=309 devtype=DISK
channel ORA_DISK_1: starting incremental datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
....
..
..
.
channel ORA_DISK_1: reading from backup piece /dump/proddb/inc_bkup/ForStandby_1qjm8jn2_1_1
channel ORA_DISK_1: restored backup piece 1
piece handle=/dump/proddb/inc_bkup/ForStandby_1qjm8jn2_1_1 tag=FOR STANDBY
channel ORA_DISK_1: restore complete, elapsed time: 01:53:08
Finished recover at 2008-07-25 05:20:3
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=309 devtype=DISK
channel ORA_DISK_1: starting incremental datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
....
..
..
.
channel ORA_DISK_1: reading from backup piece /dump/proddb/inc_bkup/ForStandby_1qjm8jn2_1_1
channel ORA_DISK_1: restored backup piece 1
piece handle=/dump/proddb/inc_bkup/ForStandby_1qjm8jn2_1_1 tag=FOR STANDBY
channel ORA_DISK_1: restore complete, elapsed time: 01:53:08
Finished recover at 2008-07-25 05:20:3
Delete the backup set from standby:
RMAN> DELETE BACKUP TAG 'FOR STANDBY';
using channel ORA_DISK_1
List of Backup Pieces
BP Key BS Key Pc# Cp# Status Device Type Piece Name
------- ------- --- --- ----------- ----------- ----------
17713 17713 1 1 AVAILABLE DISK /dump/proddb/inc_bkup/ForStandby_1qjm8jn2_1_1
Do you really want to delete the above objects (enter YES or NO)? YE
deleted backup piece
backup piece handle=/dump/proddb/inc_bkup/ForStandby_1qjm8jn2_1_1 recid=17713 stamp=660972421
Deleted 1 objects
5. Try to start the managed recovery.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
If you get an error here, you need to go to STEP#2 for bringing standby in sync.
If no error, then using the view v$managed_standby, verify that MRP process is started and has the status APPLYING_LOGS.
6. After this, check whether the logs are being applied on the standby or not:
SQL> SELECT SEQUENCE#, APPLIED FROM V$ARCHIVED_LOG;
After doing a recovery using the incremental backup, you will not see the sequence#'s which were visible earlier with APPLIED=NO because they have been absorbed as part of the incremental backup and applied on standby during recovery.
The APPLIED column starts showing YES for the logs which are being transmitted now, this means logs are being applied.
Check the status of MRP process in the view v$managed_standby. The status should be APPLYING_LOGS for the duration that available logs are being applied and once all available logs have been applied, the status should be WAITING_FOR_LOGS
7. Another check to verify that primary and standby are in sync. Run the following query on both standby and primary:
SQL> select max (sequence#) from v$log_history.
Output should be same on both databases.
STEP #2: Steps to recreate the standby controlfile and start the managed
recover on standby
Since Managed recovery failed after applying the incremental backup, we need to recreate the controlfile of standby. The reason for recreating the controlfile is that the state of the database was same because the database_scn was not updated in the control file after applying the incremental backup while the scn for datafiles were updated. Consequently, the standby database was still looking for the old file to apply.
A good MOSC note for re-creating the controlfile in such a scenario is 734862.1.
1. Take the backup of controlfile from primary
rman target sys/oracle@proddb catalog rman/cat@emrep
rman> backup current controlfile for standby;
2. Copy the controlfile backup to the standby system (or if it is on the common NFS mount, no need to transfer or copy) and restore the controlfile onto the standby database
Shutdown all instances (If standby is RAC) of the standby.
sqlplus / as sysdba
shutdown immediate
exit
shutdown immediate
exit
Startup nomount, one instance.
sqlplus / as sysdba
startup nomount
exit
Restore the standby control file.
rman nocatalog target /
restore standby controlfile from '/tmp/o1_mf_TAG20070220T151030_.bkp';
exit
3. Startup the standby with the new control file.
sqlplus / as sysdba
shutdown immediate
startup mount
exit
4. Restart managed recovery in one instance (if standby is RAC) of the standby database:
sqlplus / as sysdba
SQL > ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
The above statement may succeed without errors but the MRP process will still not start. The reason is that since the controlfile has been restored from the primary, it is looking for datafiles at the same location as are in primary instead of standby. For example, if the primary datafiles are located at '+DATA/proddb_1/DATAFILE' and standby datafiles are at '+DATA/proddb_2/DATAFILE', the new controlfile will show the datafile’s location as '+DATA/proddb_1/DATAFILE'. This can be verified from the query "select name from v$datafile" on the standby instance. We need to rename all the datafiles to reflect the correct location.
There are two ways to rename the datafiles:
1. Without using RMAN
Change the parameter standby_file_management=manual in standby’s parameter file.
ALTER DATABASE RENAME FILE '+DATA/proddb_1/datafile/users.310.620229743' TO '+DATA/proddb_2/datafile/USERS.1216.648429765';
2. Using RMAN
rman nocatalog target /
Catalog the files, the string specified should refer to the diskgroup/filesystem destination of the standby data files.
RMAN> catalog start with '+diskgroup//datafile/';
e.g.:
RMAN> catalog start with '+DATA/proddb_2/datafile/';
This will give the user a list of files and ask if they should all be cataloged. The user should review and say YES if all the datafiles are properly listed.
Once that is done, then commit the changes to the controlfile
RMAN> switch database to copy;
Now start the managed recovery as:
SQL > ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT
and check for processes in the view v$managed_standby. MRP process should be there. It will also start applying all the archived logs that were missing since last applied log. This process might take hours.
5. Another check to verify that primary and standby are in sync:
Run the following query on both standby and primary after all logs in v$archived_log show APPLIED=YES:
SQL> select max(sequence#) from v$log_history.
Output should be same on both databases.
STEP #3 Register all the redo logs to standby database.
After recreating the controlfile, you still find that logs are being transmitted but not being applied on the standby. Check the alert log of standby. For example, see if you find something similar to below snippet:
Fetching gap sequence in thread 1, gap sequence 74069-74095
Wed Sep 17 06:45:47 2008
RFS[1]: Archived Log: '+DATA/ipwp_sac1/archivelog/2008_09_17/thread_1_seq_74093.259.665649929'
Wed Sep 17 06:45:55 2008
Fetching gap sequence in thread 1, gap sequence 74069-74092
Wed Sep 17 06:45:57 2008
RFS[1]: Archived Log: '+DATA/proddb_2/archivelog/2008_09_17/thread_1_seq_74094.258.665649947'
Wed Sep 17 06:46:16 2008
RFS[1]: Archived Log: '+DATA/proddb_2/archivelog/2008_09_17/thread_1_seq_74095.256.665649957'
Wed Sep 17 06:46:26 2008
FAL[client]: Failed to request gap sequence
GAP - thread 1 sequence 74069-74092
The contents of alert log shows that logs sequence# from 74069 to 74092 may have been transmitted but not applied. The view v$archived_log shows the sequence# starting from 74093 and APPLIED=NO.
So this situation means that logs up to 74068 were applied as part of the incremental backup and from 74069 to 74093 have been transferred to standby server but they must have failed to register with standby database. Try the following steps:
- Locate the log sequence# shown in alert log (for example 74069 to 74092). For example,
+DATA/proddb_2/archivelog/2008_09_17/thread_1_seq_74069.995.665630861
- Register all these archived logs with the standby database.
alter database register logfile '+DATA/proddb_2/archivelog/2008_09_17/thread_1_seq_74069.995.665630861';
alter database register logfile '+DATA/proddb_2/archivelog/2008_09_17/thread_1_seq_74070.998.665631405';
alter database register logfile '+DATA/proddb_2/archivelog/2008_09_17/thread_1_seq_74071.792.665633755';
alter database register logfile '+DATA/proddb_2/archivelog/2008_09_17/thread_1_seq_74072.263.665633713';
……..
….and so on till the last one.
- Now check the view v$archived_log and finally should see the logs being applied. The status of MRP should change from ARCHIVE_LOG_GAP to APPLYING_LOGS and eventually WAITING_FOR_LOGS.
Oracle 10g Dataguard – Sync Standby database
One of the major problems being faced by DBAs managing Physical Standby Databases is to keep the standby database in sync with Primary Database.At times, standby database will go out of sync and there will be hundreds of archive logs to be applied.
It takes huge amount of data transfer and also large amount of archive log apply activity in the standby database.
Also, there is a potential chance of archive log non-availability, which may lead to rebuild of standby database.
The above situation can be tacked, without rebuild in Oracle 10g environment.
Steps to bring the standby database in sync with Primary Database are listed below.
1. Find the current SCN of standby database.
select current_scn from v$database;
CURRENT_SCN
———–
4793543
2. On the primary database – create the needed incremental backup from the above SCN
rman target /
RMAN> BACKUP DEVICE TYPE DISK INCREMENTAL FROM SCN 4793543 DATABASE FORMAT ‘C:\temp\bkup_%U’;
3. create a new standby controlfile from production
sqlplus> alter database create standby crontrolfile ‘blah’;
4. Cancel managed recovery on standby
sqlplus> recover managed standby database cancel;
5. Move your incremental backup from (2) to the standby server (empty folder) and catalog it
rman target /
rman> catalog start with ‘c:\temp\backup\’;
6. recover your standby from the incremental backup
rman> recover database noredo;
7. shutdown the standby and replace the controlfile with the one you backup in (3)
8. startup the standby and put it back into managed recovery mode
sqlplus> startup mount
sqlplus> recover managed standby database disconnect;
J Steps to failover to physical standby database if your Production Database Stops J
In this document we will see the steps to failover to a physical standby database.
1. Stop Redo Apply.
Issue the following SQL statement on the target standby database:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
2. Finish applying all received redo data.
By this we are giving indication to the standby database that primary database is no more.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
Once the FINISH command has completed, the protection mode of the primary database
is lowered to Maximum Performance, regardless of its original protection mode.
This is done because the new primary can be activated without an standby
SQL > select protection mode, protection level from v$database
Protection Mode Protection Level
——————– ——————–
MAXIMUM PERFORMANCE UNPROTECTED
3. Verify that the target standby database is ready to become a
Primary database.
Query the SWITCHOVER_STATUS column of the V$DATABASE view on the target
Standby database.
SQL > select SWITCHOVER_STATUS from v$database
4. Switch the physical standby database to the primary role.
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
5. Open the new primary database
SQL> ALTER DATABASE OPEN;
At this stage the protection level is changed to Max Performance from ‘Unprotected’
6. Backup the new primary database.
J Manual Switchover to Standby and Primary J
On the PRIMARY Database: (Open stage)
1. Query V$DATABASE to check the role of Primary Database.
SQL> Select Database_role from v$Database;
It will return “PRIMARY”
2. Now check the Switchover Status of the Primary Database.
SQL> Select switchover_status from v$Database;
It will return “SESSIONS ACTIVE”.
3. Now you are ready to perform a manual switchover. Execute the following command using
WITH SESSION SHUTDOWN” option.
SQL> Alter Database Commit to Switchover to Physical Standby with session Shutdown;
Database Altered.
4 Now your PRIMARY Database has become Physical Standby. To verify this change,
SQL> Alter Database Commit to Switchover to Physical Standby with session Shutdown;
Database Altered.
4 Now your PRIMARY Database has become Physical Standby. To verify this change,
again query the Database_role column of V$DATABASE. Now it will return
“PHYSICAL STANDBY”.
SQL > Select database_role from v$database;
5. Shutdown the database and again Start it to mount stage.
SQL >Shutdown immediate
SQL > Startup mount;
On the PHYSICAL STANDBY Database: (Mount stage)
6. Query V$DATABASE to check the role of Standby Database.
SQL> Select Database_role from v$Database;
It will return “PHYSICAL STANDBY”,
SQL> Select Database_role from v$Database;
It will return “PHYSICAL STANDBY”,
8. Now check the Switchover Status of the Standby Database.
. SQL> Select switchover_status from v$Database;
It will return “SESSIONS ACTIVE”.
9. Now cancel the MRP which is running in the background of the Standby Database.
. SQL> Select switchover_status from v$Database;
It will return “SESSIONS ACTIVE”.
9. Now cancel the MRP which is running in the background of the Standby Database.
Execute the following command:
SQL> Alter database Recover Managed Standby Database Cancel;
Database Altered.
SQL> Alter database Recover Managed Standby Database Cancel;
Database Altered.
10. Now you are ready to perform a manual switchover from Physical Standby to Primary.
Execute the following command using “WITH SESSION SHUTDOWN” option.
SQL> Alter Database Commit to Switchover to PRIMARY with session Shutdown;
Database Altered.
Now your PHYSICAL STANDBY Database has become PRIMARY.
SQL> Alter Database Commit to Switchover to PRIMARY with session Shutdown;
Database Altered.
Now your PHYSICAL STANDBY Database has become PRIMARY.
Verify this change, again query the Database role column of V$DATABASE.
Now it will return “PRIMARY”.
11. Shutdown the database and again Start it to Open stage.
11. Shutdown the database and again Start it to Open stage.
SQL > Shutdown immediate;
SQL > Startup Mount;
SQL > Alter database open;
J Manual Switchover without Data guard in Oracle J
Switchover can be done without dataguard. Using switchover operation we can convert physical standby database to primary and primary database to physical standby database in Oracle. Switchover operation can perform without dataguard and with dataguard. Here we can see how to perform switchover operation without dataguard configuration. In following example we assume that our primary database and physical standby database are working properly. Standby database is up with managed recovery mode and there is no archive log gap found in physical standby database. In short we can say that our standby database is synchronize with primary database. Find out step by step switchover operation. This switchover calls as manual switchover.
Primary Database: First we should need to check our job queue processes and aq processes parameters. Check the value of both parameter and save in your log book. These value should need to restore while standby switchover finished. After get value of both parameters disable those parameter with value 0. Execute log switch forcefully for all recent changes also reflect to standby database. After finishing both tasks, execute switchover command in primary database. Once successfully execution of switchover command bring database down. Detail commands have been given below with exact steps by steps.
SQL> show parameter aq_tm_processes
SQL> show parameter job_queue_processes
SQL> alter system set aq_tm_processes=0;
SQL> alter system set job_queue_processes=0;
SQL>alter system switch logfile;
SQL>select switchover_status from v$database;
SQL>alter database commit to switchover to physical standby with session shutdown;
SQL>shutdown immediate;
Physical Standby Database:
First check archive sequence applied in standby database. Because Physical Standby database needs to synchronize with primary database. Disable log archive destination which contains service of Primary database. Execute switchover command and wait for finishing. After successfully completion of switchover command, shutdown database.
SQL>startup nomount
SQL>alter database mount standby database;
SQL>alter system set log_archive_dest_state_2=defer scope=both;
SQL>select switchover_status from v$database;
SQL>alter database commit to switchover to primary;
SQL>shutdown immediate
Now switchover operation is almost finished. We should need to start both Primary (switched from physical standby) database and Physical standby (switched from primary) database carefully.
Now start NEW PRIMARY database (it was previous physical standby database).
SQL>startup
SQL> select switchover_status from v$database;
Now start NEW PHYSICAL STANDBY database (It was previous primary database).
SQL>startup nomount
SQL>alter database mount standby database;
SQL> select switchover_status from v$database;
SQL>recover managed standby database disconnect from session;
In NEW Primary Database:
Enable archive log destination which contains service of standby database. Add tempfile in new primary database because before it was physical standby. Restore all processes parameters which we got in first step. Switch log forcefully to monitor archive shipping to physical standby database.
SQL>alter system set log_archive_dest_state_2=enable scope=both;
SQL> alter tablespace TEMP add tempfile '/u01/orcl/oradata/temp/temp01.dbf' size 1000M;
SQL> alter system set job_queue_Processes=10;
SQL> alter system set aq_tm_processes=5;
SQL>alter system switch logfile;
Check connectivity from new physical standby to new primary database. Also monitor archive log shipping from primary database to physical standby database. Post check for switchover_status of v$database from both new databases. Manual switchover without dataguard is finished.
No comments:
Post a Comment