
Hitachi-Oracle BCM Platform Solution
Verification Report on Oracle Active Data Guard

Date: March 2008
Version: 1.0

Copyright 2008 Hitachi, Ltd. All Rights Reserved.
Copyright 2008 Oracle Corporation Japan. All Rights Reserved.

1. Introduction
Oracle Database 11g Release 1 was released in October 2007 as the latest major version of Oracle Database.
In this version, Oracle Data Guard offers a number of new and innovative features to help ensure business
continuity by protecting important corporate data, including a feature that initiates a failover to a remote
standby system in the event the production system fails due to a disaster or emergency.
Oracle Corporation Japan and Hitachi Ltd. performed verification tests of Oracle Data Guard at the Oracle
GRID Center, building a large-scale transaction environment for a simulated production system combining
Hitachi BladeSymphony high-reliability blade servers and Oracle Database 11g Release 1.
This white paper introduces the BCM (Business Continuity Management) platform solution realized by
combining Hitachi's hardware and Oracle Database 11g Release 1, and presents the results of verifying
the effectiveness of features provided by Oracle Active Data Guard, a new option in Oracle Database
11g Release 1.

Acknowledgements
Oracle Corporation Japan established a partnership with Hitachi Ltd. and other grid strategy partner
companies in November 2006, opening the Oracle GRID Center
(http://www.oracle.co.jp/solutions/grid_center/index.html), a facility that incorporates the most advanced
technologies, with the goal of constructing next-generation business solutions capable of optimizing
enterprise system infrastructures. Publication of this white paper was made possible by hardware and
software provided to the Oracle GRID Center by Intel Corporation and Cisco Systems G.K., which support
the purpose of the Oracle GRID Center, as well as support and aid provided by engineers from these
companies. We wish to express our sincere gratitude to the companies and engineers for their support.
*All rights reserved.
Disclaimer
This document is provided for informational purposes only. The contents hereof are subject to change
without prior notice. Neither Oracle Corporation Japan nor Hitachi, Ltd. warrants that this document is
error-free, nor do they provide any other warranties or conditions, whether expressed or implied, including
implied warranties and conditions of merchantability or fitness for a particular purpose. Oracle Corporation
Japan and Hitachi Ltd. specifically disclaim any liability with respect to this document. No contractual
obligations are formed by this document, either directly or indirectly. This document may not be
reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without
prior written permission from Oracle Corporation Japan and Hitachi Ltd.
Trademarks
BladeSymphony is a registered trademark of Hitachi Ltd.
ORACLE is a registered trademark of Oracle Corporation.
Intel and Xeon are trademarks of Intel Corporation in the United States and other countries.
Red Hat is a trademark or a registered trademark of Red Hat Inc. in the United States and other countries.
Linux is a registered trademark of Linus Torvalds.
Cisco is a registered trademark of Cisco Systems, Inc. in the United States and other countries.
Other names of companies and products used herein are trademarks or registered trademarks of their
respective owners.


2. Contents
1. Introduction
2. Contents
3. Criticality of Business Continuity Management (BCM)
4. Oracle Data Guard
5. Examples of BCM Platform Solutions Realized by Hitachi and Oracle
6. Verifying Oracle Active Data Guard
6-1 Purpose and specifics of verification tests
6-2 Verification environment
6-2-1 System configuration
6-2-2 Hardware used
6-2-3 Software used
6-2-4 About workloads
7. Verification Results
7-1 Creating a standby database using RMAN network duplicate
7-2 Effective use of standby site via Oracle Active Data Guard and reductions in system downtime based on effective use of standby site
7-3 Measuring REDO apply performance for standby database
7-4 Fast-Start Failover
7-5 Failover under high-load transaction condition
8. Summary


Figures
Figure 4-1: Schematics of Oracle Data Guard operation
Figure 4-2: Effective use of standby database via Real-time Query
Figure 4-3: Effective use of standby database with Snapshot Standby
Figure 4-4: Fast-Start Failover operation
Figure 5-1: Online system maintenance based on Hitachi hardware and Oracle Data Guard
Figure 5-2: Data protection with rapid application of server resources at reduced standby cost
Figure 6-1: Configuration of the system used in verification tests
Figure 7-1: Conventional standby database production method
Figure 7-2: Creating a standby database using RMAN network duplicate
Figure 7-3: Previous drawbacks: Relationship between standby site use time and system downtimes
Figure 7-4: Effective use of standby site via Oracle Active Data Guard
Figure 7-5: Simulated business scenario used in verification tests
Figure 7-6: Process of failover to physical standby database
Figure 7-7: Low REDO apply performance
Figure 7-8: Adequate REDO apply performance
Figure 7-9: Fast-Start Failover operation
Figure 7-10: Verifying failover under high-load transaction conditions
Tables
Table 7-1: Apply performance comparison patterns
Table 7-2: Verification configuration patterns
Table 7-3: Verified failure patterns
Table 7-4: Verified failure patterns and verification results
Graphs
Graph 6-1: CPU usage of primary database servers during load generation
Graph 7-1: Comparison of standby data production times (via conventional method and using RMAN network duplicate)
Graph 7-2: CPU usage and network transfer volume in creation of standby database via conventional method (top: primary database server, bottom: standby database server)
Graph 7-3: CPU usage and network transfer volume in production of standby database using RMAN network duplicate (top: primary database server, bottom: standby database server)
Graph 7-4: Business transaction throughput, CPU usage of primary database server, and network transfer volumes during creation of standby database using RMAN network duplicate
Graph 7-5: Effective use of CPU resources of standby site with Oracle Active Data Guard
Graph 7-6: Reductions in system downtime via Oracle Active Data Guard during use of physical standby site
Graph 7-7: Comparison of volume of generated REDO against REDO apply performance
Graph 7-8: Apply performance comparison
Graph 7-9: Transactions during failure of all instances for the primary database and patterns in CPU usage for individual database servers


3. Criticality of Business Continuity Management (BCM)
IT systems have grown increasingly important for corporations. Even in the event of an
earthquake-induced site failure or system failure caused by hardware malfunction, corporations must
continue to safeguard critical business data such as customer information and rapidly restore system
functionality to ensure continuing services. In particular, corporations must meet the following
requirements:
Business continuity
Interruptions or outages affecting important services pose serious threats to the entire business,
in certain cases resulting not just in lost income, but serious damage to the confidence of
customers and associated companies.
Data protection
Data remains a critical asset for any company. Corporate data (for example, payroll or
employee information, client records, valuable research results, financial records, or history
information) can require significant sums and effort to reconstruct or regenerate once lost,
if this is even possible, and in some cases such data loss may impair a company's capacity to
continue operating.
System flexibility to adapt to changes
IT systems must ensure business continuity even in the event of unplanned system downtimes,
including system failure. These systems must also minimize the duration of planned downtimes,
including downtimes for software updates and hardware maintenance, to reduce any negative
effects on business operations. Particularly in the case of open systems, the rapid pace of
software development requires that procedures for updating software and applying software
patches be kept as short as possible in order to keep systems up to date and maintain systems in
a robust condition. With respect to hardware, rapid developments in multi-core CPU technology
in recent years now make it possible in certain cases to improve performance and reduce TCO
simply by replacing existing equipment with the latest hardware. In general, agility and
flexibility have become enterprise system requirements.
Cost efficiency: effective use of standby sites
Also important for ensuring high cost efficiency is effective use of the server resources at
standby sites set aside for disasters and other emergency situations. High cost
efficiency makes it easier to secure countermeasures against system failure. Low resource
efficiency at established standby sites during ordinary operations, on the other hand, will
generally make it more difficult to acquire adequate funding for such systems.
Combining Hitachi BladeSymphony or Hitachi Storage hardware with Oracle Real Application Clusters
(Oracle RAC) and Oracle Data Guard makes it possible to deliver a solution that resolves such issues.

4. Oracle Data Guard
Oracle Data Guard creates a standby database as a copy of the production database (called the primary
database) and provides a comprehensive set of services for that database, including
maintenance, management, and monitoring. A standby database is created as a copy that maintains
transactional consistency with the primary database. After the standby database is created, REDO data
sent from the primary database is applied to reflect changes made in the primary database. If the primary
database becomes unavailable due to planned or unplanned downtime, the standby database takes over the
primary database role to minimize downtime. Oracle Data Guard is provided as part of Oracle
Database Enterprise Edition.
Figure 4-1: Schematics of Oracle Data Guard operation
(In normal operation, clients connect to the primary database, which is copied to the standby database. In emergencies, the connection switches to the standby database.)
Standby databases generally come in one of two configurations. One, a physical standby database, is
identical to the primary database at the physical block level. The other, a logical standby database, is
identical to the primary database at the logical row data level.
The version of Oracle Data Guard in Oracle Database 11g Release 1 features various enhancements.
Introduced below are some of the new features examined in our verification testing.
Oracle Active Data Guard
In previous release versions, application of REDO had to be suspended when accessing data in a
physical standby database. The Oracle Active Data Guard option in Oracle Database 11g
Release 1 enables access to data in a physical standby database without suspending the
application of REDO. This feature is called Real-time Query. This enhancement allows
routine use of a physical standby database for reporting and other tasks.

Figure 4-2: Effective use of standby database via Real-time Query
(The primary database handles normal operation while reporting, patch processing, and backup acquisition are off-loaded to the physical standby database through Oracle Data Guard.)
Oracle Active Data Guard also enables fast incremental backups based on a change-tracking file
when backups are taken from a standby database, thereby offering both high availability and convenient
data protection in the event of planned downtime or unplanned outages at the production
site.
Snapshot Standby
The Snapshot Standby feature enables temporary use of a physical standby database as an
easy-to-use read-write test database. Even while being used as a test database, the physical
standby database can receive REDO from the primary database, allowing it to continue
providing the data protection feature. A snapshot standby database is also easily returned to
physical standby database status.
Figure 4-3: Effective use of standby database with Snapshot Standby
(The snapshot standby is opened as a temporary read-write test database for a test client. REDO transfers from the primary database continue while it is open.)

Creating a standby database using RMAN network duplicate
Previous release versions required the acquisition of a full backup of the primary database at the
local site, transfer of the backup to the standby site, and restoration of the backup to create a standby
database. With Oracle Database 11g Release 1, the enhanced Recovery Manager (RMAN)
network duplicate feature, used for database duplication, backs up the primary database while at the
same time restoring it over the network to the standby. Network duplicate saves time and storage.
Fast-Start Failover
Fast-Start Failover automatically detects failures in the primary
database and initiates failover after failure detection. Detection of failure and initiation of
failover are performed by the observer, which is set up separately from the primary database and standby
database. The observer is a component of Data Guard Broker. Fast-Start Failover enables
automatic failover in the event of a primary database failure without administrator intervention.
Figure 4-4: Fast-Start Failover operation
(The observer monitors both the primary database and the standby database. REDO is transferred from the primary to the standby, and failover to the standby is automatic.)
In previous release versions, Fast-Start Failover could be used only in Maximum Availability
mode, which required synchronous transfers of REDO. Oracle Database 11g Release 1 now
supports Maximum Performance mode to allow asynchronous REDO transfer settings, allowing
use in a wider range of operating environments. The new version also provides greater flexibility
in determining whether or not to initiate a failover at the time of failure detection, thereby
meeting various failover requirements.
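
The decision the observer has to make can be pictured with a small Python sketch. This is only an illustrative model under stated assumptions, not Data Guard Broker's actual logic: the `ObserverView` fields, the `should_fail_over` function, and the two default limits are hypothetical names and values chosen for the example. It assumes that failover is considered only after the primary has been unreachable for a threshold period, and that with asynchronous REDO transfer the standby must not lag too far behind.

```python
from dataclasses import dataclass


@dataclass
class ObserverView:
    """What a hypothetical observer knows at decision time."""
    seconds_since_primary_contact: float
    standby_lag_seconds: float
    synchronous_redo: bool  # True ~ Maximum Availability, False ~ Maximum Performance


def should_fail_over(view, threshold_seconds=30.0, lag_limit_seconds=30.0):
    """Return True if this illustrative model would initiate a failover.

    Both limits are assumed example values, not documented defaults.
    """
    # The primary is still considered reachable: do nothing.
    if view.seconds_since_primary_contact < threshold_seconds:
        return False
    # With synchronous REDO transfer the standby is assumed to be current.
    if view.synchronous_redo:
        return True
    # With asynchronous transfer, accept failover only if the lag is tolerable.
    return view.standby_lag_seconds <= lag_limit_seconds
```

For example, `should_fail_over(ObserverView(45, 120, False))` returns `False` in this model because the standby is too far behind, whereas the same outage with a 5-second lag returns `True`.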

5. Examples of BCM Platform Solutions Realized by Hitachi and Oracle
Described below are some examples of the BCM solution realized through the combination of Hitachi
hardware and Oracle Database 11g Release 1.
Online system maintenance
Figure 5-1 shows an example of a Data Guard system configuration consisting of a production
business environment and a test environment. The test environment is used for reporting tasks using
Oracle Active Data Guard features or as a development environment using the Snapshot Standby
feature. This sample configuration permits not only the application of patch sets to Oracle
software and version updates, but also BladeSymphony server blade replacements and additions
in combination with the Oracle Data Guard switchover feature, and seamless online disk
addition to production environments via Hitachi Storage virtualization. The combination of
Hitachi hardware and Oracle Database 11g Release 1 enables online maintenance of both
software and hardware with minimal impact on production operations.
Figure 5-1: Online system maintenance based on Hitachi hardware and Oracle Data Guard
(The production environment and test environment form an Oracle Data Guard configuration. Steps shown: (1) switchover to the test environment, (2) replacement with a new blade server. Callouts: Oracle rolling upgrades; online blade server replacement; online hard disk addition to the storage pool, with no need to set LVM, ASM, or other OS and no need to reboot for disk recognition; switchover of the production environment to minimize impact on business operations.)
Data protection at reduced standby costs and rapid addition of server resources
Figure 5-2 shows an example of a configuration with minimum allocation of standby database
server resources. It provides data protection using Oracle Data Guard while minimizing standby
database costs. If the primary database fails due to a disaster or other reason, a failover to the
standby database is initiated to enable continuing business operations. However, restoring the
service levels of the primary database generally requires the allocation of additional resources to
ensure the same level of processing capacity as the primary database, a requirement that
generally costs a great deal of time and money. Combining the provisioning features of
BladeSymphony and Oracle Real Application Clusters can significantly reduce the cost of
adding server resources while enabling immediate response.
Figure 5-2: Data protection with rapid application of server resources at reduced standby cost
(Normal operations: a 4-node RAC primary database and a 1-node RAC standby database in a Data Guard configuration, maintaining data protection at low initial cost by allocating minimum server resources to the standby database. Primary database failure due to disaster: additional server resources are required if the standby database is used to continue business operations, and provisioning adds 3 nodes to the standby to form a 4-node RAC. Combining BladeSymphony's and Oracle's provisioning functions significantly simplifies the additional tasks and enables immediate response.)

6. Verifying Oracle Active Data Guard

6-1 Purpose and specifics of verification tests
We performed verification testing at the Oracle GRID Center with the following three main goals:
Confirming the effectiveness of new Oracle Data Guard features
We performed verification tests to confirm the effectiveness and usability of the new Oracle
Data Guard features and to check for any important considerations when using the features. In
the verification testing, we focused mainly on the following features:
Creating a standby database using RMAN network duplicate
Benefits of creating a standby using the RMAN network duplicate feature
Benefits of effectively using the standby database with Real-time Query feature of Oracle
Active Data Guard and reductions in system downtimes based on effective use of the
standby database
Snapshot Standby
Fast-Start Failover
Performance and failover under large-scale, high-volume transactions
We performed the verification tests to check for fast, effective failover to the standby database in
the event of a failure while the primary database was under heavy loads and with the CPU and
network resources at maximum capacity. Another goal was to identify any potential issues
associated with use in large-scale, high-volume transaction environments.
These represent critical performance aspects, since the primary purpose of introducing Oracle
Data Guard is to achieve switchover to the standby site in the event of a primary site failure.
Establishing best practices
We performed verification testing to establish procedures for creating a standby database and
managing an Oracle Data Guard environment.
* For a list of the procedures that proved effective in our verification tests, please refer to the
separate document titled "Oracle Database 11g Release 1 Physical Standby Setting
Guide" (Japanese only).

6-2 Verification environment

6-2-1 System configuration
Figure 6-1 shows the configuration of the system used in our verification tests. The same public network
was used to connect client machines to the database server and to transmit REDO from the primary site to
the standby site. The network bandwidth was 1 Gbps.
Figure 6-1: Configuration of the system used in verification tests
(Components shown: client machines; Cisco Catalyst 6504 and Cisco Catalyst 3750 switches; Hitachi BladeSymphony BS320 database servers forming a 2-node RAC at the primary site and a 2-node RAC at the standby site; Hitachi Adaptable Modular Storage.)
6-2-2 Hardware used
Database server
  Model                     Hitachi BladeSymphony BS320 x 4 blades
  CPU                       Dual-Core Intel Xeon processor 3 GHz, 2 sockets/blade
  Memory                    8 GB
Client machine
  Model                     Intel White Box, 4 units
  CPU                       Quad-Core Intel Xeon processor 2.66 GHz, 1 socket/server
  Memory                    4 GB
Storage
  Model                     Hitachi Adaptable Modular Storage (AMS)
  Hard disk                 144 GB x 28 HDD (+2 HDD as spare)
  RAID group configuration  2D+1P x 8 (for Oracle database)

6-2-3 Software used
Database server
  OS      Red Hat Enterprise Linux 4.5
  Oracle  Oracle Database 11g Release 1 (11.1.0.6) Enterprise Edition
          Oracle Real Application Clusters
          Oracle Partitioning
Client machine
  OS      Red Hat Enterprise Linux 4 Update 3
  Oracle  Oracle Client 10g Release 2 (10.2)
6-2-4 About workloads
In our verification tests, we used an online transaction processing (OLTP) system for a simulated online
Web shopping site as a workload model. SQL statements generated by JPetStore, which is provided as a sample
application for Spring Framework (http://www.springframework.org), an open-source J2EE framework,
were executed concurrently by a custom application. The process flow is described below.
(1) User sign-on
A user ID was randomly selected and a search performed for user information.
select ... from account, profile, signon
where account.userid = ? and signon.password = ? and ...;

(2) Product search
A keyword for product search was randomly generated and a search performed for the product.
Adjustments were made so that the search results totaled approximately 100 on average.
select ... from category where catid = ?;
select ... from product where lower(name) like ?;

(3) Product selection
One item was selected from the search results (hits).
select ... from item, product
where i.itemid = ? and ...;

(4) Stock quantity check
The quantity of the selected item in stock was checked.
select ... from inventory where itemid = ?;

(5) Order placement
Order data for the specified product was issued.
insert into orders ...;
insert into orderstatus ...;
insert into lineitem ...;

The quantity of ordered products was subtracted from the inventory quantity in the stock
management list.
update inventory set qty = qty - 1 where itemid = ?;

(6) Order finalization
commit

The above-mentioned processes were executed concurrently by the client machines. As shown in Graph 6-1, the
workload generated a heavy load on the primary database servers.
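
As a rough illustration of steps (4) through (6) of the process flow, the following Python sketch runs a stock check, an order insert, an inventory decrement, and a commit as one transaction against an in-memory SQLite database. SQLite is only a stand-in for the Oracle database used in the tests. The simplified two-table schema, the `place_order` and `make_demo_db` helpers, and the sample item ID are assumptions made for illustration; they are not the JPetStore schema or the custom load application.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (itemid TEXT PRIMARY KEY, qty INTEGER)")
    conn.execute(
        "CREATE TABLE orders (orderid INTEGER PRIMARY KEY, userid TEXT, itemid TEXT)"
    )
    conn.execute("INSERT INTO inventory VALUES ('EST-1', 2)")  # sample item, 2 in stock
    conn.commit()
    return conn


def place_order(conn, userid, itemid):
    """Steps (4)-(6): check stock, record the order, decrement stock, commit."""
    cur = conn.cursor()
    # (4) Stock quantity check
    cur.execute("SELECT qty FROM inventory WHERE itemid = ?", (itemid,))
    row = cur.fetchone()
    if row is None or row[0] < 1:
        conn.rollback()
        return False
    # (5) Order placement and inventory update
    cur.execute("INSERT INTO orders (userid, itemid) VALUES (?, ?)", (userid, itemid))
    cur.execute("UPDATE inventory SET qty = qty - 1 WHERE itemid = ?", (itemid,))
    # (6) Order finalization
    conn.commit()
    return True
```

Calling `place_order` repeatedly from many client threads or processes, each with its own connection, would approximate the concurrent execution described above.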
Graph 6-1: CPU usage of primary database servers during load generation
(Two panels, one each for primary database servers 1 and 2: CPU usage (%) broken down into user, system, and iowait, plotted against time from 0 to 1200 sec.)

7. Verification Results

7-1 Creating a standby database using RMAN network duplicate
Creating a standby database requires the copying of database files from the primary database to the standby
site. With versions up to Oracle Database 10g, this was generally achieved by obtaining a backup of the
primary database and transferring backup files to the standby site via network using ftp or scp, or by
copying the backup file to a tape and sending the tape to the standby site.
Oracle Database 11g Release 1 enhances the RMAN duplicate command to allow copying of database files
from the primary database currently online directly to the standby site. This eliminates the need to obtain a
backup at the primary site and to produce a duplicate from the backup at the standby site. It also eliminates
the need to allocate disk space to store the backup file at both the primary and standby sites.
Comparison and verification of standby database creation by the conventional method and
using RMAN network duplicate
We created standby databases by the conventional method and from the active database,
measuring the time required to create a standby database and CPU usage during that process. We
then compared and examined the results. The total size of the primary database used in this
verification test was approximately 170 GB.
Conventional method (Figure 7-1)
(1) A backup file was created by online backup using RMAN.
(2) The backup file was sent from the primary site to the standby site across a network using
scp.
(3) The database was restored from the backup file by RMAN.

Creating a standby database from an active database (Figure 7-2)
(1) The online primary database file was copied directly to the standby database.

Figure 7-1: Conventional standby database production method
(Steps shown: (1) a backup file is created at the primary site by RMAN online backup, (2) the backup file is transferred to the standby site by scp, (3) the standby database is restored from the backup file using RMAN.)

Figure 7-2: Creating a standby database using RMAN network duplicate
(Step shown: (1) the online database files are copied directly from the primary database at the primary site to the standby database at the standby site.)
Graph 7-1 compares the time required to create a standby database by the conventional method
and directly from the active database. Creating a standby database from the active database does
not require the creation of a backup at the primary site and the restoration of the database at the
standby site, enabling creation of the standby database in about 1/3 the time required by the
conventional method.

Graph 7-1: Comparison of standby data production times (via conventional method and using RMAN network duplicate)
(Bar chart: time in seconds, on an axis from 0 to 12000, for the conventional method and for creating a standby database using RMAN network duplicate.)
Graph 7-2 shows the CPU usage of the primary database server and standby database server and
network transfer volumes during the creation of the standby database by the conventional
method. Approximately 30% of the CPU resources were used to create a backup file at the
primary site and to restore the database at the standby site.
Graph 7-3 shows the CPU usage of the primary database server and standby database server and
network transfer volumes during the creation of the standby database from the active database.
Compared to the conventional method, creating a standby database from the active database kept
CPU usage at low levels and achieved efficient network transfer and copying of online data files.
Network transfer volumes per unit time were also higher, resulting in faster copying than with
scp.
Graph 7-2: CPU usage and network transfer volume in creation of standby database via conventional method (top: primary database server, bottom: standby database server)
(Panels for each server: CPU usage (%) broken down into user, system, and iowait, and network transfer volume (KB/s) split into receiving and transmitting volume, plotted against time (sec). Annotated phases: online backup by RMAN, backup file transfer by scp, backup file reception by scp, and database restoration by RMAN.)


Graph 7-3: CPU usage and network transfer volume in production of standby database using RMAN network duplicate (top: primary database server, bottom: standby database server)
(Panels for each server: CPU usage (%) broken down into user, system, and iowait, and network transfer volume (KB/s) split into receiving and transmitting volume, plotted against time from 0 to 3000 sec. Annotated phase: direct copying of online database file.)
Effect on business transactions during creation of standby database using RMAN network
duplicate
To examine the effects on business transactions of creating a standby database while transactions
are being processed, we created a standby database from the active database while generating a
business transaction load on the primary database. Graph 7-4 shows results for measurements of
business transaction throughput, CPU usage of the primary database server, and network transfer
volumes. In this case, contention between business transaction processing and database file
transfer processing reduced business transaction throughput by approximately 20%. Transfer
volumes of nearly 80 MB/s were recorded while the database files were being transferred. Since
business transactions under ordinary operating conditions used approximately 20 MB/s, the volume
of database files transferred to the standby site is estimated at about 60 MB/s. Because this
transfer volume is lower than under conditions with no load, creating the standby database took
longer in this test case.
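The estimate above is a simple subtraction, and it can be extended to a rough duration estimate. The following is a minimal sketch: the 80 MB/s and 20 MB/s figures come from the measurements above, while the function names and the 100 GB database size are hypothetical and used only for illustration.

```python
def estimate_file_transfer_rate(total_mb_s: float, business_mb_s: float) -> float:
    """Database file transfer rate = total observed rate minus business traffic."""
    return total_mb_s - business_mb_s


def estimate_copy_seconds(db_size_gb: float, rate_mb_s: float) -> float:
    """Rough time to copy db_size_gb at rate_mb_s (1 GB taken as 1024 MB)."""
    if rate_mb_s <= 0:
        raise ValueError("transfer rate must be positive")
    return db_size_gb * 1024 / rate_mb_s


# Figures from the verification test: about 80 MB/s total, about 20 MB/s business traffic.
rate = estimate_file_transfer_rate(80.0, 20.0)
print(f"Estimated database file transfer rate: {rate:.0f} MB/s")

# Hypothetical 100 GB database, for illustration only.
print(f"Estimated copy time: {estimate_copy_seconds(100.0, rate) / 60:.1f} minutes")
```

A lower effective rate under load translates directly into a longer creation time, which is why scheduling matters.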
The effect on business transaction performance is expected to vary depending on the processing
characteristics of the transactions involved. In actual use, we recommend that users
consider creating the standby database during periods of low business load to minimize effects on
business operations, and configuring a separate network for REDO transfer.
In a high-latency network environment such as a WAN, the throughput of network duplicate may be
improved by tuning the network I/O buffer size. Please refer to '14.2 Configuring I/O buffer space'
in the Oracle Database Net Services Administrator's Guide 11g Release 1 (11.1).
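A common starting point for sizing network I/O buffers is the bandwidth-delay product (BDP) of the link. The sketch below is an illustration under stated assumptions, not a procedure taken from this report: the link speed, the round-trip time, and the multiplier of 3 are hypothetical inputs, and the resulting value would be applied through Oracle Net buffer parameters such as SEND_BUF_SIZE and RECV_BUF_SIZE as described in the guide.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """BDP in bytes: bandwidth (bits/s) x round-trip time (ms), converted to bytes."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def suggested_buffer_bytes(bdp_bytes: int, multiplier: int = 3) -> int:
    """Buffer size as a multiple of the BDP (the multiplier is an assumed rule of thumb)."""
    return bdp_bytes * multiplier


# Hypothetical WAN link: 1 Gbit/s with a 25 ms round-trip time.
bdp = bandwidth_delay_product_bytes(1_000_000_000, 25)
print(f"BDP: {bdp} bytes")
print(f"Candidate buffer size: {suggested_buffer_bytes(bdp)} bytes")
```

Any value derived this way should be validated by measurement in the target environment.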

[Graph 7-4 panels: network transfer volume (KB/s) of the primary database server (series: receiving volume, transmitting volume), CPU usage (%) of the primary database server (series: user, system, iowait), and transaction throughput, each plotted against time (sec), with the period of standby database creation marked.
Annotation (transaction throughput): the effect of creating the standby database using RMAN network duplicate on transaction throughput was about 20% in our verification tests.
Annotation (network transfer volume): total transfer volume was about 80 MB/s. Subtracting the approximately 20 MB/s used by business transactions from this figure, the database file transfer volume is estimated to be about 60 MB/s.]

Graph 7-4: Business transaction throughput, CPU usage of primary database server, and network
transfer volumes during creation of standby database using RMAN network duplicate
7-2 Effective use of standby site via Oracle Active Data Guard and
reductions in system downtime based on effective use of standby
site
Oracle Data Guard versions up to Oracle Database 10g had the following issues related to effective use of
the standby site.
Application of REDO had to be stopped while the standby site was used on a read-only basis
through the physical standby feature.
A periodic data synchronization process was required to reduce downtime caused by
primary site failure. This meant the standby site had to be returned to managed recovery
mode at regular intervals, making operations more complicated.
A logical standby is accessible during application of REDO, but limitations apply
to data types and other factors.
These restrictions meant that using the standby site previously required complex procedures. Longer standby
site use times meant longer times required to recover the database in case of failure, impairing availability
(Figure 7-3).

[Figure 7-3 diagram: standby site use time (log data application downtime) plotted against the volume of log data required in case of primary site failure, which is proportional to the system downtime caused by the failure.]

Figure 7-3: Previous drawbacks: relationship between standby site use time
and system downtime
Real-time Query of Oracle Active Data Guard, a new feature provided with Oracle Database 11g Release 1,
resolves these issues and enables effective use of the standby site while ensuring system availability. The
following two points were verified to confirm the effectiveness of Oracle Active Data Guard.
(1) Effective use of standby site with Oracle Active Data Guard
We confirmed that the standby site could be used for read-only access at all times while the
physical standby continued to apply REDO.
(2) Reducing system downtimes during effective use of physical standby site
We confirmed that, as a result of (1), periodic synchronization is no longer needed, which keeps
the downtime attributable to a primary site failure within a bounded duration.
Effective use of standby site with Oracle Active Data Guard
In the simulated situation shown in Figure 7-4, we confirmed the behavior that results from
applying additional loads to the standby site, such as daily processing and report batch applications,
while the primary site was under the online transaction load associated with online shopping
operations. The Real-time Query feature of Oracle Active Data Guard enabled the transfer and
application of REDO to continue while the additional tasks were performed at the standby site.

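[Figure 7-4 diagram: the primary database handles OLTP transactions for the online shopping business; REDO is transferred to and applied on the standby database, which at the same time serves a SELECT/query load through Real-time Query for additional operations (date/time processing, report batch).]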

Figure 7-4: Effective use of standby site via Oracle Active Data Guard
Graph 7-5 compares CPU usage of the standby database server when a SELECT load is applied to
the standby site through Real-time Query with CPU usage when no load is applied. With no
SELECT load, the standby database server performs only the
REDO apply process, and CPU usage is less than 20%. Applying an additional load through
Real-time Query raises CPU usage above 90%, confirming full use of CPU
resources that previously went underutilized.
[Graph 7-5 panel: CPU usage (%) of the standby database server plotted against time (sec), with and without SELECT load.
Annotation (without SELECT load): only REDO log apply is performed; CPU use is low.
Annotation (with SELECT load): a SELECT load was applied even as REDO log data was being applied, resulting in effective resource use.]

Graph 7-5: Effective use of CPU resources of standby site with Oracle Active Data Guard
Reduced system downtimes during effective use of physical standby site
The primary site was assumed to run a 24-hour online shopping business as shown in Figure 7-5,
and the standby site was assumed to operate in Read Only mode for report batch applications
and daily processing in the period from nighttime to daytime.

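[Figure 7-5 timeline (6:00, 12:00, 18:00, 24:00): the primary site runs the online shopping service; the standby site runs report batch and daily processing. A primary site failure is generated during use of the standby site, followed by failover, after which the standby site runs the online shopping service. The interval in between is the online shopping service downtime.]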

Figure 7-5: Simulated business scenario used in verification tests
If a failure occurs in the primary site while the physical standby database is in use, the
online shopping service fails over to the standby site, but all REDO
transferred from the primary database must first be applied. Under the conventional method, a
large volume of transferred REDO may remain unapplied, because REDO cannot be applied while
the physical standby is open for read-only use. If the Real-time Query feature of Oracle Active Data Guard is used,
REDO is applied continuously while the standby site is in use, thereby reducing
failover time. Graph 7-6 gives the results of the verification test performed based on this
assumption. The graph shows that transaction throughput remained at 0 from the time of failure
until load was regenerated on the new primary database, after the standby database
changed its role to primary and services resumed. This duration is defined as the
failover time. We compared one case based on the conventional method against another based on
Oracle Active Data Guard. The failover time with Oracle Active Data Guard was greatly reduced
compared to the failover time with the conventional method. With the conventional method, the
volume of REDO not applied at the time of the failover was approximately 20 GB. Volumes of
unapplied REDO exceeding this amount will lengthen failover times accordingly.
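The relationship between unapplied REDO and failover time can be approximated by dividing the backlog by the apply rate. The following is a minimal sketch: the 20 GB figure is from the test above, but the apply rate of 100 MB/s is a hypothetical value chosen for illustration, since this report presents apply performance only as a ratio.

```python
def redo_catchup_seconds(unapplied_redo_gb: float, apply_rate_mb_s: float) -> float:
    """Time needed to apply the remaining REDO before the role change can complete."""
    if apply_rate_mb_s <= 0:
        raise ValueError("apply rate must be positive")
    return unapplied_redo_gb * 1024 / apply_rate_mb_s


# 20 GB of unapplied REDO (from the test); 100 MB/s apply rate is an assumed value.
print(f"{redo_catchup_seconds(20.0, 100.0):.1f} s to apply 20 GB")

# With continuous apply the backlog stays small, so catch-up time is short (assumed 0.1 GB).
print(f"{redo_catchup_seconds(0.1, 100.0):.1f} s to apply 0.1 GB")
```

Catch-up time scales linearly with the backlog, which is consistent with the observation that larger volumes of unapplied REDO lengthen failover times.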
[Graph 7-6 panels: transaction throughput plotted against time for two cases.
Use of standby site with conventional method: extended downtime for online shopping functions.
Use of standby site with Oracle Active Data Guard: short failover time resulting from continuous application of log data even during the use of the physical standby database.]

Graph 7-6: Reductions in system downtime via Oracle Active Data Guard during use of physical
standby site
7-3 Measuring REDO apply performance for standby database
The following two objectives generally need to be considered when examining system availability:
Recovery Point Objective (RPO) and Recovery Time Objective (RTO). In Oracle Data Guard, the RPO is
related to the settings made for REDO transfer from the primary database to the standby database and to
transfer performance. This is because REDO not yet transferred to the standby database at the time of failover
is lost. REDO apply performance for the standby database affects the RTO because failover time in
Oracle Data Guard includes the time required to process unapplied REDO.(*) Figure 7-6 illustrates the
general process of failover to a physical standby database.
[Figure 7-6 timeline: generation of failure; time until failure is detected; start of failover operation; failover operation of Data Guard (application of unapplied REDO, role change, opening of instance); completion of failover operation. Marked interval: downtime from an application perspective.]

Figure 7-6: Process of failover to physical standby database
(*) Although Oracle Data Guard can resume service immediately after a failure, without application of
unapplied REDO, we recommend processing all applicable REDO before resuming services for
maximum data security.
One way to assess the adequacy of REDO apply performance is to compare the REDO apply performance
for the standby database against the volume of REDO generated by the primary database. If the REDO
apply performance falls short of the volume of generated REDO, the standby database falls
progressively behind the most recent data in the primary database, and the volume of unapplied REDO grows.
This can extend failover times in the event of a failure.
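The comparison described above can be expressed as a simple backlog model: unapplied REDO grows at the difference between the generation rate and the apply rate, and never drops below zero. The sketch below uses hypothetical rates, and the function name is illustrative only.

```python
def unapplied_redo_mb(generation_mb_s: float, apply_mb_s: float,
                      elapsed_s: float, initial_backlog_mb: float = 0.0) -> float:
    """Backlog of received-but-unapplied REDO after elapsed_s seconds."""
    growth = (generation_mb_s - apply_mb_s) * elapsed_s
    # The backlog cannot be negative: apply can only consume REDO that exists.
    return max(0.0, initial_backlog_mb + growth)


# Low apply performance (assumed 30 MB/s generated, 20 MB/s applied) for one hour.
print(unapplied_redo_mb(30.0, 20.0, 3600.0))

# Adequate apply performance (assumed 10 MB/s generated, 50 MB/s applied) for one hour.
print(unapplied_redo_mb(10.0, 50.0, 3600.0))
```

The first case corresponds to the situation in which the gap keeps widening over time; the second corresponds to apply keeping pace with generation.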

[Figure 7-7 diagram: REDO is transferred from the primary database to the standby database. After time n, low apply performance expands the difference between transferred/received REDO and applied REDO.]
Figure 7-7: Low REDO apply performance
REDO apply performance that exceeds the volume of generated REDO minimizes the volume of unapplied
REDO and reduces failover times.

[Figure 7-8 diagram: REDO is transferred from the primary database to the standby database. After time n, adequate apply performance minimizes the difference between transferred/received REDO and applied REDO.]

Figure 7-8: Adequate REDO apply performance

We compared the volume of REDO generated when the primary database is under large transaction loads
against the REDO apply performance of the standby database to assess REDO apply performance.
Oracle statistical information was obtained before and after load generation for the primary database, and
the difference between the two values was used to calculate the volume of REDO generated per second. We
measured REDO apply performance by applying a group of archived REDO log files totaling about 3 GB.
Oracle instances for the standby database were restarted before the start of measurement, and the
V$RECOVERY_PROGRESS view was used to confirm the REDO apply size per second as the measure of
apply performance. Since Oracle Active Data Guard was used during measurement, Oracle instances for
the standby database were read-only.
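The per-second generation figure described above is the difference between two cumulative readings divided by the elapsed time. The following sketch assumes the readings are cumulative byte counters, such as the "redo size" statistic; which statistic was actually used is not stated in this report, and the sample values are hypothetical.

```python
def redo_generated_per_second(redo_size_before: int, redo_size_after: int,
                              elapsed_seconds: float) -> float:
    """Average REDO generation rate (bytes/s) between two cumulative readings."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    if redo_size_after < redo_size_before:
        raise ValueError("cumulative counter decreased (instance restarted?)")
    return (redo_size_after - redo_size_before) / elapsed_seconds


# Hypothetical readings taken 600 seconds apart.
rate = redo_generated_per_second(1_000_000, 601_000_000, 600.0)
print(f"{rate / 1_000_000:.1f} MB/s")
```

In a RAC configuration the readings from all primary instances would need to be summed before comparison with the standby apply rate.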
Graph 7-7 shows the results of a comparison of the volume of generated REDO against REDO apply
performance.
[Graph 7-7 bars: volume of generated REDO and REDO apply performance. Axis: ratio of amount of generation to apply performance (0 to 10).]

Graph 7-7: Comparison of volume of generated REDO against REDO apply performance
The graph indicates that the REDO apply performance far surpassed the total volume of REDO generated
by primary database instances. In Oracle Database 11g Release 1, one instance handles REDO apply
for a standby database in an Oracle RAC configuration. Although the configuration of the disks on which
online REDO log files and archived REDO log files are located affects REDO apply performance, the
measurements indicate that performance in the verification test environment is sufficient to apply the REDO
generated by multiple nodes without delays.
We then compared REDO apply performance in a case in which the physical standby database was set to
READ ONLY OPEN against performance in a case in which the physical standby database was set to
MOUNT status. The comparison sought to determine whether Oracle Active Data Guard affects REDO
apply performance. The measurement method was the same as the method previously described. We used
the following three patterns to compare measurements.
Pattern No.   Standby instance 1   Standby instance 2
1             MOUNT                MOUNT
2             READ ONLY OPEN       MOUNT
3             READ ONLY OPEN       READ ONLY OPEN
Table 7-1: Apply performance comparison patterns

Graph 7-8 shows the results of the performance comparison (value of 1 assigned to the apply
performance for pattern 1).
[Graph 7-8 bars: apply performance ratio (0 to 1.2) for pattern numbers 1, 2, and 3.]

Graph 7-8: Apply performance comparison
The apply performance was consistent regardless of whether the instances of the physical standby database were
in the MOUNT or READ ONLY OPEN status. This indicates that Oracle Active Data Guard has no impact on
REDO apply performance.

7-4 Fast-Start Failover
The Fast-Start Failover feature automatically detects failures in the primary database and starts failover
after failure detection. In Oracle Database 10g Release 2, the protection mode had to be set to Maximum Availability
to use the Fast-Start Failover feature, which required synchronous REDO transfer. Synchronous
transmission of REDO guarantees commit-level protection of updates to the primary database, but its
effects on performance, including slower response times for the primary database due to network
performance limitations, must be considered when business functions require high response performance.
In Oracle Database 11g Release 1, Fast-Start Failover can also be used in Maximum Performance protection
mode, which permits asynchronous REDO transfer and makes the feature applicable to a wider range
of cases.
When asynchronous REDO transfer is set, a lag may arise between the most recent data for the primary
database and the standby database, which would result in data loss in a failover. The Fast-Start Failover
feature in Oracle Database 11g Release 1 allows the administrator to preset the allowed time lag for
failover and determines whether or not to start failover based on that value in the event of failure. In our
verification testing, we set the time lag value to 60 seconds, then halted all instances of the primary
database using the abort option to check Fast-Start Failover operations. Figure 7-9 shows the behavior
after failure generation.
[Figure 7-9 diagram: an observer monitors the primary database (1) and checks the standby database (2).]

Figure 7-9: Fast-Start Failover operation
(1) When the primary database connection remains unavailable for a certain duration, the observer
concludes a failure has occurred. Any value can be set for the time period used to determine a
failure.
(2) The observer checks the time lag in the latest update information for the primary database and
standby database. If the value of the time lag is less than the preset value, a failover is initiated.
The value of the time lag can be checked with v$dataguard_stats view on the standby database.
In our verification testing, the time lag was 0 seconds, as shown below. Thus, a failover was
executed.

SQL> select name, value from v$dataguard_stats where
name='transport lag';

NAME                 VALUE
-------------------- --------------------
transport lag        +00 00:00:00
If the lag exceeds the preset threshold value, a failover will not be initiated. This is because a time lag value
greater than the threshold value means the volume of lost data is unacceptable. In this case, the Fast-Start
Failover status is shown to be TARGET OVER LAG LIMIT when checked in v$database view of the
standby database.
SQL> select fs_failover_status from v$database;

FS_FAILOVER_STATUS
----------------------
TARGET OVER LAG LIMIT
As above, we confirmed that the Fast-Start Failover feature of Oracle Database 11g Release 1 was capable
of achieving automatic failover that meets the data protection requirements of each system, even with
asynchronous REDO transfer in Maximum Performance mode.
Oracle Database 11g Release 1 allows the setting of various conditions in addition to the time lag value to
allow detailed control of automatic failover behavior. These extended features should reduce the time and
work required for failover management.
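The lag check in step (2) can be sketched as a comparison between the reported lag and the preset limit. This is an illustration of the decision rule only, not the observer's actual implementation: the function names are hypothetical, the interval format is taken from the query output shown in this section, and treating a lag exactly equal to the limit as acceptable is an assumption.

```python
import re


def lag_to_seconds(value: str) -> int:
    """Convert a lag string of the form '+DD HH:MM:SS' to seconds."""
    m = re.fullmatch(r"\+(\d+) (\d{2}):(\d{2}):(\d{2})", value.strip())
    if m is None:
        raise ValueError(f"unexpected lag format: {value!r}")
    days, hours, minutes, seconds = (int(g) for g in m.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def failover_allowed(lag_value: str, lag_limit_seconds: int = 60) -> bool:
    """True when the lag is within the preset limit (equality assumed acceptable)."""
    return lag_to_seconds(lag_value) <= lag_limit_seconds


# Lag observed in the verification test, with the 60-second limit used there.
print(failover_allowed("+00 00:00:00"))

# Hypothetical two-minute lag: over the limit, so failover would not be initiated.
print(failover_allowed("+00 00:02:00"))
```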

7-5 Failover under high-load transaction condition
While Oracle RAC provides features to ensure business continuity in the event of local failures within
sites (for example, single-node failures in the primary database), Oracle Data Guard helps ensure business
continuity even against site failures on a scale involving all nodes of the primary database. In our
verification testing, we simulated a number of possible failure types while generating high loads on the
primary database, executing failovers to the standby database when necessary to confirm transaction
processing continuity. Figure 7-10 shows the failure cases used in the verification tests. In each of the three
Oracle Data Guard configurations (A, B, and C shown in Table 7-2), failures 1 through 5 (Table 7-3) were
simulated.

[Figure 7-10 diagram: failure verification patterns across the primary site (primary database) and the standby site (standby database):
(1) Failure of all instances of the primary database
(2) Total primary database server failure
(3) Network communication failure between primary and standby databases
(4) Failure of all instances of the standby database
(5) Listener failure of the standby database]

Figure 7-10: Verifying failover under high-load transaction conditions
Configuration   Oracle Data Guard protection mode   Status of standby site
A               Maximum Performance mode            Oracle Active Data Guard
B               Maximum Availability mode           Oracle Active Data Guard
C               Maximum Performance mode            Snapshot Standby
Table 7-2: Verification configuration patterns
#  Simulated failure                          Failure-reproducing method
1  Failure of all Oracle instances for the    Execution of srvctl stop database -o abort
   primary database                           command for primary node 1
2  Failure of all primary database servers    Execution of halt -n -f command for
                                              primary node 1 and node 2
3  Network communication failure between      Network cable disconnection
   primary and standby databases
4  Failure of all Oracle instances for the    Execution of srvctl stop database -o abort
   standby database                           command for standby node 1
5  Listener failure for the standby database  Simultaneous kill of listener process for
                                              standby node 1 and node 2
Table 7-3: Verified failure patterns

We used the following verification procedure:
(1) Began generating load to primary database.
(2) Simulated primary database failure.
(3) Stopped load generation.
(4) Initiated failover to standby database.
(5) Resumed load generation.
In all configurations, the verification results showed the expected behavior (Table 7-4). We confirmed that
failover to the standby database enables continuous processing of transactions in cases involving the
failure of all Oracle instances for the primary database and the failure of all primary database servers.
#  Simulated failure                          Behavior after failure
1  Failure of all Oracle instances for the    For each configuration, we confirmed
   primary database                           continuous processing of transactions
                                              following the execution of failover to the
                                              standby database.
2  Failure of all primary database servers    For each configuration, we confirmed
                                              continuous processing of transactions
                                              following the execution of failover to the
                                              standby database.
3  Network communication failure between      For each configuration, we confirmed that
   primary and standby databases              continuous processing of transactions was
                                              possible using the primary database.
                                              For configuration B, transaction
                                              processing halted for the duration set with
                                              the NET_TIMEOUT attribute (30 seconds in
                                              the verification test), after which
                                              continuous processing was possible.
4  Failure of all Oracle instances for        For each configuration, we confirmed that [...]
   standby database                           For configuration B, we also confirmed [...]
                                              possible.
5  Listener failure for standby database      For each configuration, we confirmed that [...]
Table 7-4: Verified failure patterns and verification results
The following introduces one of the characteristic behaviors exhibited by the failover operation occurring
under high-load transaction conditions.
Graph 7-9 shows transaction throughput during the all-instances failure of the primary database in
configuration A and patterns of CPU usage in the individual primary and standby servers. After the failure in
(1), the failover was completed and transactions resumed in (2). Transaction throughput declined before (3)
due to contention between the disk I/O resulting from the clearing of standby REDO log files, performed by the
database server following the failover, and the disk I/O associated with online REDO log files generated by the
resumed transactions. The time required to clear standby REDO log files depends on total file size and disk
I/O performance. This behavior can be circumvented by providing enough I/O bandwidth to handle the normal
workload plus the additional I/O caused by clearing of the standby REDO log files, or by placing online
REDO log files and standby REDO log files on separate disks to avoid disk I/O contention.
[Graph 7-9 panels: CPU usage (%) of primary instance 1, primary instance 2, standby instance 1, and standby instance 2, and transaction throughput, plotted against time with markers (1), (2), and (3).
(1): generation of failure of all instances for the primary database
(1) to (2): failover to standby database
(2) to (3): clear REDO processing]

Graph 7-9: Transactions during failure of all instances for the primary database
and patterns in CPU usage for individual database servers

8. Summary
Verification tests at the Oracle GRID Center confirmed the effectiveness of Oracle Data Guard in Oracle
Database 11g Release 1 with a Hitachi platform. Specifically, we confirmed the capabilities of
Oracle Active Data Guard, a new option introduced in Oracle Database 11g Release 1, to make effective use of
resources at the standby site and to reduce failover times in the event of failures while the
standby database is in use. We believe that Oracle Database 11g Release 1 with its new features can dramatically
improve the cost efficiency of disaster recovery systems over previous versions.
We also examined behavior resulting from failures in a large-scale transaction load environment,
confirming transaction continuity. We are confident that a disaster recovery solution based on a
combination of Hitachi hardware and Oracle Database 11g Release 1/Oracle Data Guard will provide the
support needed to ensure high levels of BCM for corporate infrastructures.
Precautions concerning use of this document
The contents of this white paper are based on the results of verification tests performed at the Oracle GRID
Center. We make no guarantees that the same results will be achieved under all conditions. Actual results
will depend on various factors, including the specific conditions in the client's environment.
