Data Guard
Oracle Data Guard
ensures high availability, data protection, and disaster recovery for enterprise
data. Data Guard provides a comprehensive set of services that create, maintain,
manage, and monitor one or more standby databases to enable production Oracle
databases to survive disasters and data corruptions. Data Guard maintains these
standby databases as copies of the production database. Then, if the production
database becomes unavailable because of a planned or an unplanned outage, Data Guard
can switch any standby database to the production role, minimizing the downtime
associated with the outage.
You can manage primary and standby
databases using the SQL command-line interfaces or the Data Guard broker
interfaces.
Primary Database:
The running production database.
Standby Database:
A replica
of the primary database. Using a primary database, you can create up to thirty
standby databases. Once created, Data Guard automatically
maintains each standby database by transmitting archive logs from the primary
database and then applying the archives to the standby database. Similar to a
primary database, a standby database can be either a single-instance Oracle
database or an Oracle RAC database.
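A physical standby is most commonly built with RMAN active duplication; the sketch below assumes 11g's FROM ACTIVE DATABASE feature, and the connect strings are illustrative, not from the original text:

```sql
-- Run from RMAN connected to the primary (target) and to the auxiliary
-- standby instance, e.g.: rman target sys/oracle@prod1 auxiliary sys/oracle@prod1dr
-- The standby instance must be started NOMOUNT before duplication.
DUPLICATE TARGET DATABASE
  FOR STANDBY
  FROM ACTIVE DATABASE
  DORECOVER
  NOFILENAMECHECK;
```

DORECOVER applies the redo generated during the copy so the standby is ready for managed recovery immediately afterwards.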
The types of standby databases are as follows:
- Physical standby database - provides a physically identical copy of the primary database, with the same database structures, identical on a block-for-block basis, including the full database schema and indexes. A physical standby database is kept synchronized with the primary database.
- Logical standby database - kept synchronized with the primary database through SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executes those SQL statements on the standby database.
- Snapshot standby database - like a physical or logical standby database, a snapshot standby database receives and archives redo data from a primary database. Unlike a physical or logical standby database, it does not apply the redo it receives; the redo is applied only when the snapshot standby is converted back into a physical standby database.
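The snapshot standby conversion described above comes down to two SQL commands, sketched here; both are issued on the standby, which must be mounted, and flashback logging is used internally:

```sql
-- On the standby: open it as an updatable snapshot standby for testing;
-- redo from the primary continues to arrive and is archived, not applied
ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;

-- Convert back: all local changes are discarded and the accumulated
-- redo is then applied to resynchronize with the primary
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
```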
Data Guard
Protection Modes:
- Maximum availability - its first priority is availability; its second priority is zero data loss protection. If the standby server is unavailable, the primary waits the time specified by the NET_TIMEOUT parameter before giving up on the standby server and allowing the primary to continue to process. Once the connection has been re-established, the primary automatically resynchronizes the standby database.
- Maximum performance - the default protection mode. It provides the highest level of data protection possible without affecting the performance of the primary database. This is accomplished by allowing transactions to commit as soon as all redo data generated by those transactions has been written to the online redo log on the primary.
- Maximum protection - ensures that no data loss will occur if the primary database fails. Transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log and to the standby redo log of at least one synchronized standby database.
Set the data
protection mode.
Execute the following SQL statement on the primary database:
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE {AVAILABILITY |
PERFORMANCE | PROTECTION};
Perform the following query on the primary database to confirm
that it is operating in the new protection mode:
SQL> SELECT PROTECTION_MODE FROM V$DATABASE;
Failover
A standby database becomes the primary database when the primary database fails.
Switchover
The primary database will switch roles
with one of its standby databases. There is no data loss during a switchover.
After a switchover, each database continues to participate in the Data Guard
configuration with its new role.
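Through the broker, the role transitions above reduce to single commands. A sketch, using the prod1/prod1dr database names that appear later in this article:

```sql
-- Planned role reversal with no data loss
DGMGRL> switchover to prod1dr;

-- Unplanned: make the standby the primary after the primary is lost
DGMGRL> failover to prod1dr;

-- After a failover, bring the old primary back as a standby
-- (requires flashback database to be enabled)
DGMGRL> reinstate database prod1;
```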
Dataguard Services
The
services required on the primary
database are:
- Log Writer Process (LGWR) - Collects redo information and updates the online redo logs. It can also create local archived redo logs and transmit online redo to standby databases.
- Archiver Process (ARCn) - One or more archiver processes make copies of online redo logs either locally or remotely for standby databases.
- Fetch Archive Log (FAL) Server – Pushes archived redo log files to the standby site. Services requests for archive redo logs from FAL clients running on multiple standby databases. Multiple FAL servers can be run on a primary database, one for each FAL request.
The
services required on the standby
database are:
- Fetch Archive Log (FAL) Client - Pulls archived redo log files from the primary site. Initiates transfer of archived redo logs when it detects a gap sequence.
- Remote File Server (RFS) - Receives archived and/or standby redo logs from the primary database.
- Archiver (ARCn) Processes - Archives the standby redo logs applied by the managed recovery process (MRP).
- Managed Recovery Process (MRP) - Applies archive redo log information to the standby database.
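Most of these standby-side processes can be observed by querying v$managed_standby; a sketch (exact columns vary slightly by version):

```sql
-- On the standby: shows MRP0, RFS and ARCH processes along with the
-- redo thread, sequence and block they are currently working on
select process, status, thread#, sequence#, block#
from   v$managed_standby;
```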
Synchronous
transport (SYNC):
It is also referred to as the "zero data loss" method. The sequence of events is as follows:
The user commits a transaction, creating a redo record in the SGA; the LGWR reads the redo record from the log buffer, writes it to the online redo log file, and waits for confirmation from the LNS (Log Network Server).
The LNS reads the same redo record from the buffer and transmits it to the standby database using Oracle Net Services; the RFS receives the redo at the standby database and writes it to the standby redo logs.
When the RFS receives a write complete from the disk, it transmits an acknowledgment back to the LNS process on the primary database, which in turn notifies the LGWR that the transmission is complete; the LGWR then sends a commit acknowledgment to the user.
Asynchronous
transport (ASYNC) differs from SYNC in that it eliminates the requirement that the LGWR wait for an acknowledgment from the LNS. A log file gap occurs whenever the primary database continues to commit transactions while the LNS process has stopped transmitting redo to the standby database (for example, because of network issues); the primary database simply continues writing to the current log file.
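A gap like the one described above can be detected from the standby with a query such as the following (a sketch; v$archive_gap reports the lowest and highest missing sequence per redo thread):

```sql
-- On the standby: any rows returned indicate missing archive log sequences
select thread#, low_sequence#, high_sequence#
from   v$archive_gap;
```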
Data Guard Broker
- Configuration
- Monitoring
- Alerting
- Performance Analysis
- Manual switchovers and failovers
- Automatic failover (Fast-Start Failover)
The
broker can make changes to a Data Guard setup (physical or logical standby database): the configuration, transport modes, apply setup, role change services, and the protection mode. You can also monitor the health of any configuration. The broker is also responsible for implementing and managing the automatic failover capability known as Fast-Start Failover (FSFO).
The
broker has three parts:
- A set of background processes on each database.
- A set of configuration files.
- Command line interface (CLI) called DGMGRL.
Background Processes:
- Data Guard Monitor (DMON) - this is the main process and is responsible for coordinating all broker actions as well as maintaining the configuration files, this process is enabled or disabled with the DG_BROKER_START parameter.
- Broker Resource Manager (RSM) - is responsible for handling any SQL commands used by the broker that need to be executed on one of the databases in the configuration. These SQL commands are made as a result of a change to the configuration made through DGMGRL or are the configuration commands executed by the broker during the database startup.
- Data Guard Net Server (NSVn) - from 1 to n can exist, they are responsible for making contact with the remote databases and sending across any work items to the remote database.
- Data Guard Net Receiver (DRCn) - these network receiver processes establish the connection from the source database's NSVn process; this is similar to the LogWriter Network Service (LNS) to Remote File Server (RFS) connection used for redo transport. When the broker needs to send something (data or SQL) between databases it uses this NSV-to-DRC connection; these connections are started as needed.
Each
database has two copies of the configuration files, stored in the locations given by the following parameters:
- DG_BROKER_CONFIG_FILE1
- DG_BROKER_CONFIG_FILE2
By
default they are stored in the $ORACLE_HOME/dbs directory with filenames of the form dr1<db_unique_name>.dat and dr2<db_unique_name>.dat. If using RAC, these files should be kept in a shared location, as only one set of copies can exist for the entire RAC.
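To check where a given database currently keeps its copies, something like the following can be run (a sketch):

```sql
SQL> show parameter dg_broker_config

-- or equivalently
select name, value
from   v$parameter
where  name like 'dg_broker_config_file%';
```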
The
primary has the master copies of the configuration files, and all changes to these files are made by the primary. Even if you make a change on a standby server, it is passed to the primary via the NSV-DRC processes, and the DMON process makes the change, which is then propagated back to the standby databases. The reason for the multiple copies of the configuration file is that if the primary has a problem, each standby database has a copy; when a failover occurs, the standby that becomes the primary becomes the master of those configuration files.
Command line interface
You have two choices for interacting with Data Guard: Enterprise Manager (EM) or the broker Command Line Interface (DGMGRL); you can swap between the two with a few simple configuration changes. To gain full functionality of Data Guard through Grid Control you must use the CLI.
The broker CLI DGMGRL comes with Enterprise Manager; you can run the broker on any platform you wish, it does not need to be the same platform as the primary or standby databases.
You
can access the DGMGRL command line using the following:
DGMGRL CLI
[primarydg01] > dgmgrl sys/oracle
[primarydg01] > dgmgrl sys/oracle@PROD1
When
you start the DGMGRL command line it does not connect you to a database, only to the Data Guard configuration. Before you start to use the broker there are a number of things to do first:
- Configure the Broker parameters
- The broker and the listener
- RAC and the broker
- Connecting to the broker
Even
if you use the CLI, as mentioned above there are two configuration files, and both system parameters must be set on the primary and all standby databases; you can put these files anywhere you like.
Data Guard configuration files location

# Directory
alter system set DG_BROKER_CONFIG_FILE1 = '<directory>/dr1<db_unique_name>.dat';
alter system set DG_BROKER_CONFIG_FILE2 = '<directory>/dr2<db_unique_name>.dat';

# ASM
alter system set DG_BROKER_CONFIG_FILE1 = '+DATA/PROD1/Broker/dr1PROD1.dat';
alter system set DG_BROKER_CONFIG_FILE2 = '+FLASH/PROD1/Broker/dr2PROD1.dat';

Note: if you are using ASM then the directories must already have been created; use the asmcmd command to do this. The broker will create aliases to the actual config files, but the directories must be there.
In a
RAC environment they should be in a shared area and all nodes should point to
the same files, only one set of files must exist in a RAC.
Once
the directory has been created you are ready to start up the broker; do this on all databases (primary and standby).
Start the Broker

alter system set DG_BROKER_START=TRUE SCOPE=BOTH;
When
you start the broker it does not configure any files yet, because you have to use the DGMGRL commands; all it does is start the necessary processes mentioned earlier. If you need to change the location of the files, the steps are below:
- stop the broker using the DG_BROKER_START parameter by setting this to false (on all nodes in the cluster)
- change the DG file destination parameters
- copy the files to the new location
- then re-start the broker, again set the DG_BROKER_START parameter to true (on all nodes in the cluster)
The
broker uses the listener.ora to make connections to the databases, set up both redo transport and archive gap resolution, and perform role transitions. We need to create a special static entry in the listener.ora file for each database in the broker configuration; this entry makes it possible for the broker to connect to an idle instance using a remote SYSDBA connection and perform the necessary startup. Here is an example:
Broker listener.ora static entry

## Primary and all standby databases
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = PROD1DR_DGMGRL)
      (ORACLE_HOME = /scratch/OracleHomes/OraHome111)
      (SID_NAME = PROD1DR)
    )
  )

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = PROD1_DGMGRL)
      (ORACLE_HOME = /scratch/OracleHomes/OraHome111)
      (SID_NAME = PROD1)
    )
  )
Once
the above has been done we are ready to create a broker configuration; there is no difference between a non-RAC and a RAC environment. To connect to the configuration you use the dgmgrl command, but one word of advice: make sure that you use a specific user and password and not "/" as the user, as this has been known to cause problems at a later date.
DGMGRL CLI
[primarydg01] > dgmgrl sys/oracle
[primarydg01] > dgmgrl sys/oracle@PROD1
There
are four main areas to DGMGRL CLI
- Connection and Help - connect, help and exit
- Creation and Editing - create, add, enable, edit and convert
- Monitoring - show
- Role Transition - switchover, failover and reinstate
The
first step is to create the base configuration
Create
base configuration
# Primary Database server
DGMGRL> connect sys/password
DGMGRL> create configuration prod1 as
> primary database is prod1
> connect identifier is prod1;
Configuration "prod1" created with primary database "prod1"
The
above command creates a base configuration; you can display the configuration using the command below:
Display
configuration
# Primary Database server
DGMGRL> connect sys/password
DGMGRL> show configuration

Configuration - prod1
  Protection Mode: MaxPerformance
  Databases:
    prod1   - Primary database
    prod1dr - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

## Here is the same output after adding a logical standby
DGMGRL> show configuration;

Configuration - prod1
  Protection Mode: MaxPerformance
  Databases:
    prod1   - Primary database
    prod1dr - Physical standby database
    prod1lr - Logical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS
Next
we add the standby database to the configuration. If you receive an error stating that the properties could not be imported from the database, you need to check DB_UNIQUE_NAME, the TNS name, and the transport parameter (LOG_ARCHIVE_DEST_n). The proper setup is to have your redo transport defined using the DB_UNIQUE_NAME method, meaning that each redo transport parameter must contain the DB_UNIQUE_NAME= attribute. The broker searches all of your LOG_ARCHIVE_DEST_n parameters looking for a database with a unique name that matches the database name you entered in the command. If you have not done this, you need to use the full set of arguments to add the database.
Add
the standby database
# Primary Database server - the full command set
DGMGRL> connect sys/password
DGMGRL> add database prod1dr
> as connect identifier is prod1dr
> maintained as physical;
Database "prod1dr" added

# If you want to add a logical standby you can use the following
DGMGRL> add database prod1lr
> as connect identifier is prod1lr
> maintained as logical;
Database "prod1lr" added
When
you display the configuration you should see that the standby database has been
added
Display
configuration
# Primary Database server
DGMGRL> connect sys/password
DGMGRL> show configuration

Configuration - prod1
  Protection Mode: MaxPerformance
  Databases:
    prod1   - Primary database
    prod1dr - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
DISABLED

## Because we have not set up any parameters for the broker, all parameter
## defaults will be presumed; some parameters are set by examining the
## database, others are broker default values.
DGMGRL> show database verbose prod1

Database - prod1

  Role:            PRIMARY
  Intended State:  OFFLINE
  Instance(s):
    PROD1

  Properties:
    DGConnectIdentifier       = 'prod1'
    ObserverConnectIdentifier = ''
    LogXptMode                = 'ASYNC'
    DelayMins                 = '0'
    Binding                   = 'optional'
    MaxFailure                = '0'
    MaxConnections            = '1'
    ReopenSecs                = '300'
    NetTimeout                = '30'
    RedoCompression           = 'DISABLE'
    LogShipping               = 'ON'
    PreferredApplyInstance    = ''
    ApplyInstanceTimeout      = '0'
    ApplyParallel             = 'AUTO'
    StandbyFileManagement     = 'AUTO'
    ArchiveLagTarget          = '0'
    LogArchiveMaxProcesses    = '4'
    LogArchiveMinSucceedDest  = '1'
    DbFileNameConvert         = ''
    LogFileNameConvert        = ''
    FastStartFailoverTarget   = ''
    StatusReport              = '(monitor)'
    InconsistentProperties    = '(monitor)'
    InconsistentLogXptProps   = '(monitor)'
    SendQEntries              = '(monitor)'
    LogXptStatus              = '(monitor)'
    RecvQEntries              = '(monitor)'
    HostName                  = 'primarydg01'
    SidName                   = 'PROD1'
    StaticConnectIdentifier   = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=primarydg01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=PROD1_DGMGRL)(INSTANCE_NAME=PROD1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation    = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation         = ''
    LogArchiveTrace           = '0'
    LogArchiveFormat          = '%t_%s_%r.dbf'
    TopWaitEvents             = '(monitor)'

Database Status:
DISABLED

DGMGRL> show database verbose prod1dr

Database - prod1dr

  Role:            PHYSICAL STANDBY
  Intended State:  OFFLINE
  Transport Lag:   (unknown)
  Apply Lag:       (unknown)
  Real Time Query: OFF
  Instance(s):
    PROD1DR

  Properties:
    DGConnectIdentifier       = 'prod1dr'
    ObserverConnectIdentifier = ''
    LogXptMode                = 'ASYNC'
    DelayMins                 = '0'
    Binding                   = 'OPTIONAL'
    MaxFailure                = '0'
    MaxConnections            = '1'
    ReopenSecs                = '300'
    NetTimeout                = '30'
    RedoCompression           = 'DISABLE'
    LogShipping               = 'ON'
    PreferredApplyInstance    = ''
    ApplyInstanceTimeout      = '0'
    ApplyParallel             = 'AUTO'
    StandbyFileManagement     = 'AUTO'
    ArchiveLagTarget          = '0'
    LogArchiveMaxProcesses    = '4'
    LogArchiveMinSucceedDest  = '1'
    DbFileNameConvert         = ''
    LogFileNameConvert        = ''
    FastStartFailoverTarget   = ''
    StatusReport              = '(monitor)'
    InconsistentProperties    = '(monitor)'
    InconsistentLogXptProps   = '(monitor)'
    SendQEntries              = '(monitor)'
    LogXptStatus              = '(monitor)'
    RecvQEntries              = '(monitor)'
    HostName                  = 'physicaldg01'
    SidName                   = 'PROD1DR'
    StaticConnectIdentifier   = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=physicaldg01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=PROD1DR_DGMGRL)(INSTANCE_NAME=PROD1DR)(SERVER=DEDICATED)))'
    StandbyArchiveLocation    = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation         = ''
    LogArchiveTrace           = '0'
    LogArchiveFormat          = '%t_%s_%r.dbf'
    TopWaitEvents             = '(monitor)'

Database Status:
DISABLED
When
you add any additional standby databases, always double-check that the parameters are set correctly for the type of configuration that you want; any changes should be made before enabling the additional standby database.
Once
we are happy that all standby databases have been added and their parameters have been checked, we are ready to enable the configuration. The command will issue alter system commands on both the primary and standby databases, start the redo transport to the standby databases, and start the apply services. Before examining the configuration, give it some time to complete as it does take a while; otherwise you may see some ORA-16610 errors.
Enabling
the configuration
# Primary Database server
DGMGRL> connect sys/password
DGMGRL> enable configuration
Enabled.
You
can tail the alert log file and see what action is taking place; after a few minutes you should hopefully see a successful configuration running. You can also check the redo transport and apply services by viewing the v$managed_standby view.
Display
configuration
# Primary Database server
DGMGRL> connect sys/password
DGMGRL> show configuration;

Configuration - prod1
  Protection Mode: MaxPerformance
  Databases:
    prod1   - Primary database
    prod1dr - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

# Standby database - check the LGWR - RFS connection process and
# that MRP0 is in the applying state
sql> select client_process, process, thread#, sequence#, status from v$managed_standby;
As I mentioned
above, the broker has a number of properties which can be changed; to change a property we use the EDIT command in the DGMGRL CLI:
Edit
configuration
EDIT CONFIGURATION SET PROPERTY <property> = <value>;
EDIT DATABASE <database name> SET PROPERTY <property> = <value>;
EDIT INSTANCE <instance name> SET PROPERTY <property> = <value>;
Configuration Properties

- BystandersFollowRoleChange
- FastStartFailoverAutoReinstate
- FastStartFailoverLagLimit
- FastStartFailoverPmyShutdown
- FastStartFailoverThreshold
- CommunicationTimeout - the amount of time the broker will wait for a response before giving up (default 180 seconds); you can remove any timeout by setting this to 0

Database Properties

- FastStartFailoverTarget
- ObserverConnectIdentifier
- ApplyInstanceTimeout - defines how long the broker should wait before moving the apply process to another instance in a standby RAC; the default is 0, which means immediately
- PreferredApplyInstance - allows you to tell the broker where you would like the apply to run when you have a RAC; by default it is empty, which means it can choose any instance
- ArchiveLagTarget - alters database parameter: ARCHIVE_LAG_TARGET
- DbFileNameConvert - alters database parameter: DB_FILE_NAME_CONVERT
- LogArchiveMaxProcesses - alters database parameter: LOG_ARCHIVE_MAX_PROCESSES
- LogArchiveMinSucceedDest - alters database parameter: LOG_ARCHIVE_MIN_SUCCEED_DEST
- LogFileNameConvert - alters database parameter: LOG_FILE_NAME_CONVERT
- LogShipping (standby role only) - enables or defers redo transport to that standby database; alters database parameter: LOG_ARCHIVE_DEST_STATE_n
- StandbyFileManagement (standby role only) - alters database parameter: STANDBY_FILE_MANAGEMENT

Instance Properties

- HostName - no explanation needed here
- SidName - no explanation needed here
- LogArchiveTrace - alters database parameter: LOG_ARCHIVE_TRACE
- LogArchiveFormat - alters database parameter: LOG_ARCHIVE_FORMAT
- StandbyArchiveLocation - alters database parameter: LOG_ARCHIVE_DEST_n
- AlternateLocation - alters database parameter: LOG_ARCHIVE_DEST_n
- LsbyMaxSga - alters the SQL Apply setting: MAX_SGA
- LsbyMaxServers - alters the SQL Apply setting: MAX_SERVERS
There
are a number of commands that you can use to change the state of the database.

Turn off/on the redo transport service for all standby databases (Primary):
DGMGRL> edit database prod1 set state=transport-off;
DGMGRL> edit database prod1 set state=transport-on;
Turn off/on the apply state (Standby):
DGMGRL> edit database prod1dr set state=apply-off;
DGMGRL> edit database prod1dr set state=apply-on;
Put a database into real-time query mode (Standby):
DGMGRL> edit database prod1dr set state=apply-off;
sql> alter database open read only;
DGMGRL> edit database prod1dr set state=apply-on;
Change the protection mode (Primary):
# Choose what level of protection you require
sql> alter database set standby database to maximize performance;
sql> alter database set standby database to maximize availability;
sql> alter database set standby database to maximize protection;

# display the configuration
DGMGRL> show configuration
There
are a number of useful monitoring commands and log files that can help with diagnosing problems.

Configuration:
DGMGRL> show configuration;

Database:
DGMGRL> show database prod1;
DGMGRL> show database prod1dr;

# There are a number of specific information commands; here are the most used
DGMGRL> show database prod1 statusreport;
DGMGRL> show database prod1 inconsistentProperties;
DGMGRL> show database prod1 inconsistentlogxptProps;
DGMGRL> show database prod1 logxptstatus;
DGMGRL> show database prod1 latestlog;
Logfiles:
# change the instance name to reflect the one you have chosen
prod1 (alert log):   /u01/app/oracle/diag/rdbms/prod1/PROD1/trace/alert_PROD1.log
prod1 (DG log):      /u01/app/oracle/diag/rdbms/prod1/PROD1/trace/drcPROD1.log
prod1dr (alert log): /u01/app/oracle/diag/rdbms/prod1dr/PROD1DR/trace/alert_PROD1DR.log
prod1dr (DG log):    /u01/app/oracle/diag/rdbms/prod1dr/PROD1DR/trace/drcPROD1DR.log
DG Menu Utility

Data Guard Check List - primarydg01
  0. Review database information and status
  1. Check for password file
  2. Check for forced logging and unrecoverable activities
  3. Check for archive log mode
  4. Check for standby redo logs
  5. Check current SCN on primary and standby databases
  6. Check archive log destinations
  7. Check Data Guard Status View for errors and fatal messages
  8. Check Managed Recovery Process Status
  9. Check for missing archive logs
  10. Check archive log gaps on the standby database
  11. Check average apply rate / active apply rate
  12. Check transport / apply lag
  13. How far behind is my Data Guard in terms of time?
  20. Launch the Logical Standby Data Guard Submenu
  21. Generate init.ora entries for primary database
  22. Generate init.ora entries for standby database
  23. Generate tnsnames.ora entries for primary and standby databases
  24. Generate SQL syntax to create standby redo logs
  30. Generate syntax to duplicate standby database from active database
  x. Exit

Logical Standby Data Guard Check List - primarydg01
  1. Check Logical Progress - View Overall Progress Of SQL Apply
  2. Check Logical Events - History on Logical Standby Apply Activity
  3. Check Logical Events - Detailed View
  4. Check Logical Stats - Logical Standby Stats
  5. Check Logical Parameters - Logical Standby Parameters
  6. Look At What The Logical Standby Processes Are Doing
     (Coordinator, Reader, Builder, Preparer, Analyzer, Applier ...)
  7. Look At The Status Codes For The Logical Standby Processes
  8. Look At Events The Applier Process Is Stuck On
  10. Check the LCR - Look At Bytes Paged Out
  11. Generate Syntax To Skip Transactions
      (Based On MAX(EVENT_TIME) FROM DBA_LOGSTDBY_EVENTS - DO NOT SKIP DML STATEMENTS)
  12. Diagnostic Script Per Metalink Note ID: 241512.1
      (Look for output in logical_diag_[ORACLE_SID_MONDD_HHMM.out] format)
  20. Review What Is NOT Supported In Your Logical Standby Database
  21. Review Tables That Do NOT have Unique Identifiers
  22. Check Primary Database For Supplemental Logging
  30. Start Logical Standby Database
  40. Stop Logical Standby Database - PLEASE BE CAREFUL !!!!!
      (this will stop the logical standby apply process)
  x. Exit
Remove the broker
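A sketch of the usual DGMGRL sequence for removing a broker configuration (DISABLE stops broker management of the configuration; REMOVE deletes only the broker metadata, not the databases themselves):

```sql
DGMGRL> disable configuration;
DGMGRL> remove configuration;
```

After this, also set DG_BROKER_START back to FALSE on each database if you no longer want the broker processes running.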
Data
Guard implements something called Fast-Start Failover (FSFO), which uses the broker to perform the failover actions when there is a problem. This architecture uses a third-member quorum that ensures that the failover occurs only when everything meets the rules that you have defined. When the failover has happened, the old primary will never be allowed to open, to avoid any split-brain scenario; this would be a bit of a nightmare should both databases be open and processing transactions.
The
third member is called the Observer, and its job is to maintain a connection with the primary and target standby databases, monitoring their health and performing any failovers necessary; the Observer will also reinstate the old primary when it comes back up, if it can. The Observer pings the primary database, and at the first sign of trouble it starts a countdown (which you configure). If it re-establishes the connection, it makes all the necessary checks before going back to watch mode again; if the timer expires, it checks that the standby can take over and initiates a failover. This failover happens entirely automatically and in the background using the broker. If and when the primary comes back, the Observer will reinstate the old primary as a standby database, again using the broker to achieve this.
It is
important where the Observer is placed in the network; only one Observer per Data Guard configuration can be installed, so this server must have access to both the primary and standby databases with as much network redundancy as possible. Your next question is how much the Observer is going to cost: not much, as it can run on most platforms and requires only the Oracle client kit for the version of Oracle that you are running; you must set up the TNSNAMES on the Observer host to allow it to ping the databases. If the Observer were to crash it would have no impact on the current Oracle environment; the only impact is that FSFO will not be available until the Observer is up and running again. The Observer can monitor the following:
- Database crash
- System crash
- The loss of the network
- Complete site outage
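The Observer itself is just a DGMGRL session started on the third host; a sketch (the connect string and logfile name are illustrative):

```sql
-- On the observer host, using the client kit's dgmgrl
DGMGRL> connect sys/oracle@prod1
DGMGRL> start observer;

-- The session blocks while the observer runs; to run it unattended,
-- something like the following can be used from the shell:
-- dgmgrl -logfile observer.log sys/oracle@prod1 "start observer" &
```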
You
can also get FSFO to perform a shutdown abort on the primary when other issues arise, such as:
- Datafile Offline
- Corrupted Controlfile
- Corrupted Dictionary
- Inaccessible Logfile
- Stuck Archiver
The
conditions above must be entered exactly as they appear, otherwise the broker will not understand them.
Monitor
a specific condition via the Broker
DGMGRL> enable fast_start failover condition "Corrupted Controlfile";
DGMGRL> enable fast_start failover condition "Datafile Offline";
To
display what you are monitoring use
Display conditions that are being monitored
DGMGRL> show fast_start failover;

Fast-Start Failover: DISABLED

  Threshold:        30 seconds
  Target:           (none)
  Observer:         (none)
  Lag Limit:        30 seconds
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile   YES
    Corrupted Dictionary    YES
    Inaccessible Logfile    NO
    Stuck Archiver          NO
    Datafile Offline        YES

  Oracle Error Conditions:
    (none)
Now
that you have an overview of FSFO, it's time to set it up and test it. A quick check before we progress: make sure that the following has been set up or configured:
- Use the Broker with all its prerequisites
- Enable flashback database on both primary and standby
- Set up the configuration correctly for the protection mode (standby redo log files on both sides, redo transport set up the same in both directions)
- Install the Observer system and configure its TNSNAMES
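Flashback database, one of the prerequisites listed above, can be enabled roughly as follows (a sketch: it assumes a flash recovery area is already configured, and on 11g the database must be mounted when the command is issued; a standby can stay mounted rather than open):

```sql
-- On both primary and standby
shutdown immediate;
startup mount;
alter database flashback on;
alter database open;   -- omit on a standby that should remain mounted
```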
If
you are using more than one standby you must let the broker know which one you want to become the primary; if you only have one, the broker will already know.
Select
the standby to become the primary
DGMGRL> edit database prod1 set property FastStartFailoverTarget = 'prod1dr';
DGMGRL> edit database prod1dr set property FastStartFailoverTarget = 'prod1';
Now
it's time to discuss how long you should wait before you want to fail over; you don't want it too short, in case of network blips. By default it is set to 30 seconds, but you can go down to 6 seconds if you wish.
change
threshold
DGMGRL> edit configuration set property FastStartFailoverThreshold = 45;
You
can control the amount of data loss if using one of the lesser protection modes; the greater the lag limit set, the greater the potential data loss. Again, the time is in seconds.
lag
limit
DGMGRL> edit configuration set property FastStartFailoverLagLimit = 60;
If
the data loss is less than the limit, the failover will proceed; if more redo would be lost than the lag limit allows, the failover will not occur and nothing happens until the primary database either comes back (and processing continues) or you choose to fail over manually, suffering the additional data loss. If you are using maximum protection mode then this property is ignored.
Here
are two more properties that you can set regarding the primary: one shuts it down if it becomes hung, and the other reinstates it if a failover does occur.
Abort primary if in a hung state
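No command survives in the original at this point; based on the FastStartFailoverPmyShutdown property listed earlier, the command is presumably:

```sql
DGMGRL> edit configuration set property FastStartFailoverPmyShutdown = TRUE;
```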
Reinstate primary after a failover
DGMGRL> edit configuration set property FastStartFailoverAutoReinstate = TRUE;
Once
you are happy with everything you can now enable FSFO.
Enable
FSFO
DGMGRL> enable fast_start failover;

## Display the configuration
DGMGRL> show fast_start failover;