
HP StorageWorks XP Cluster Extension Software Administrator Guide

This guide contains detailed instructions for configuring and troubleshooting HP StorageWorks XP Cluster Extension Software in AIX, Windows, Solaris, and Linux environments. The intended audience has independent knowledge of related software and of the HP StorageWorks XP disk array and its software.

Part Number: T1656-96035
Fifteenth edition: April 2010

Legal and notice information

Copyright 2003-2010 Hewlett-Packard Development Company, L.P.

Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. UNIX is a registered trademark of The Open Group.

Contents
1 XP Cluster Extension features ..... 11
  Integration into cluster software ..... 11
  Enhanced disaster tolerance ..... 11
  Automated monitoring and redirecting of XP Continuous Access Software pairs ..... 11
  Rolling disaster protection ..... 12
  Command-line interface (CLI) ..... 12
  Fast Failback using XP Continuous Access Software ..... 12
  XP Cluster Extension configurations ..... 12
    One-to-one configurations ..... 12
    Consolidated-site configuration ..... 13
    Supported XP Continuous Access Software configurations and fence levels ..... 14
    XP Cluster Extension server configurations ..... 14
  Planning for XP Cluster Extension ..... 15
    Before configuring XP Cluster Extension resources ..... 15
    Cluster setup considerations ..... 15
      MNS quorum clusters (MSCS) ..... 15
      SLE HA cluster setup considerations ..... 15
      RHCS cluster setup considerations ..... 18
    Setting up XP RAID Manager ..... 20
      Creating XP RAID Manager command devices ..... 20
      Creating XP RAID Manager instances ..... 20
      Creating XP RAID Manager device groups ..... 21
      Network considerations ..... 21
      Starting and stopping the XP RAID Manager instances ..... 22
      Test takeover function ..... 22

2 Configuring XP Cluster Extension for AIX ..... 25
  Configuring resources ..... 25
    Configuring the pair/resync monitor ..... 25
      Updating the remote access hosts file ..... 25
      Configuring the pair/resync monitor port ..... 26
  Integrating XP Cluster Extension into HACMP ..... 26
  User configuration file for HACMP ..... 28
  Bringing a resource group online ..... 30
  Taking a resource group offline ..... 31
  Pair/resync monitor integration ..... 32
    Adding a Custom Cluster Event for pair/resync monitor integration ..... 32
    Redefining the Custom Cluster Event as a pre-event of the standard HACMP event ..... 33
  Timing considerations ..... 34
  Failure behavior ..... 35
  Restrictions for IBM HACMP with XP Cluster Extension ..... 35

3 Configuring XP Cluster Extension for Windows ..... 37
  Integrating XP Cluster Extension with MSCS ..... 37
  Configuring XP Cluster Extension ..... 37
    Starting the XP Cluster Extension configuration tool ..... 37
    Defining XP Cluster Extension configuration information using the GUI ..... 38
    Defining XP Cluster Extension configuration information using the CLI ..... 40
    Importing and exporting configuration information ..... 41
      Exporting configuration settings using the GUI ..... 41
      Exporting configuration settings using the CLI ..... 42
      Importing configuration settings using the GUI ..... 42
      Importing configuration settings using the CLI ..... 42
  Adding an XP Cluster Extension resource ..... 42
    Adding an XP Cluster Extension resource using the Cluster Administrator GUI (Windows Server 2003) ..... 43
    Adding an XP Cluster Extension resource using the Failover Cluster Management GUI (Windows Server 2008/2008 R2) ..... 44
    Adding an XP Cluster Extension resource using the Microsoft CLI cluster commands ..... 44
  Changing an XP Cluster Extension resource name ..... 44
    Changing an XP Cluster Extension resource name (Windows Server 2003) ..... 44
    Changing an XP Cluster Extension resource name (Windows Server 2008/2008 R2) ..... 45
  Configuring XP Cluster Extension resources ..... 45
    Setting Microsoft cluster-specific resource and service or application properties ..... 46
    Setting XP Cluster Extension-specific resource properties ..... 49
      Setting XP Cluster Extension resource properties using the Cluster Administrator GUI (Windows Server 2003) ..... 49
      Setting XP Cluster Extension resource properties using the GUI (Windows Server 2008/2008 R2, Server Core, and Hyper-V Server) ..... 54
      Setting XP Cluster Extension resource properties using the MMC ..... 62
      Setting XP Cluster Extension resource properties using the CLI ..... 63
      Setting XP Cluster Extension properties using a UCF ..... 64
  Adding dependencies on an XP Cluster Extension resource ..... 64
    Adding dependencies using Cluster Administrator (Windows Server 2003) ..... 65
    Adding dependencies using Failover Cluster Management (Windows Server 2008/2008 R2) ..... 65
    Adding dependencies using the CLI ..... 66
  Disaster-tolerant configuration example using a file share ..... 66
  Managing XP Cluster Extension resources ..... 70
    Bringing a resource online ..... 70
    Taking a resource offline ..... 70
    Deleting a resource ..... 70
  Using Hyper-V Live Migration with XP Cluster Extension ..... 71
  Timing considerations for MSCS ..... 72
  Bouncing service or application ..... 72
  Administration ..... 73
    Remote management of XP Cluster Extension resources in a cluster (Windows Server 2008/2008 R2) ..... 73
    Remote management of XP Cluster Extension resources in a cluster (Windows Server 2003) ..... 73
    System resources ..... 74
    Logs ..... 74
      Hyper-V Live Migration log entries ..... 74

4 Configuring XP Cluster Extension for Solaris ..... 75
  Configuration of the XP Cluster Extension agent ..... 75
    Disaster tolerant configuration example using a web server ..... 75
  Configuring the pair/resync monitor ..... 78
    Updating the remote access hosts file ..... 78
    Configuring the pair/resync monitor port ..... 78
  Including the XP Cluster Extension resource type ..... 79
  Configuring the XP Cluster Extension resource ..... 79
    XP Cluster Extension resource types ..... 79
    Resource type definition ..... 79
  Adding an XP Cluster Extension resource ..... 80
    Adding an XP Cluster Extension resource using the VCS CLI ..... 80
    Adding an XP Cluster Extension resource using the VCS Cluster Manager GUI ..... 80
  Changing XP Cluster Extension attributes ..... 83
    Changing an attribute value using the VCS CLI ..... 83
    Changing an attribute value using the VCS Cluster Manager GUI ..... 83
  Linking an XP Cluster Extension resource ..... 84
    Linking other resources to the XP Cluster Extension resource ..... 84
    Linking other resources using the VCS Cluster Manager GUI ..... 85
  Bringing an XP Cluster Extension resource online ..... 85
    Enabling and bringing an XP Cluster Extension resource online using the CLI ..... 85
    Enabling and bringing an XP Cluster Extension resource online using the VCS Cluster Manager GUI ..... 85
  Taking an XP Cluster Extension resource offline ..... 86
    Taking an XP Cluster Extension resource offline using the VCS Cluster Manager GUI ..... 86
  Deleting an XP Cluster Extension resource ..... 87
    Deleting a resource using the VCS CLI ..... 87
    Deleting a resource using the VCS Cluster Manager GUI ..... 87
  Disabling the XP Cluster Extension agent ..... 87
  Pair/resync monitor integration ..... 88
    Log-level reporting ..... 88
  Timing considerations for VCS ..... 88
  Enabling/disabling service groups ..... 89
  Restrictions for VCS with XP Cluster Extension ..... 90
  Unexpected offline conditions ..... 90
    Bringing the XP Cluster Extension resources online ..... 91

5 Configuring XP Cluster Extension for Linux ..... 93
  XP Cluster Extension for Linux: Sample configuration ..... 93
  Configuring XP Cluster Extension with RHCS ..... 95
    Configuration overview ..... 95
    Creating an RHCS XP Cluster Extension shared resource ..... 95
      Using Conga to create a shared resource ..... 95
      Using system-config-cluster to create a shared resource ..... 96
    Creating an RHCS service using the XP Cluster Extension shared resource ..... 97
      Using Conga to create a service ..... 97
      Using system-config-cluster to create a service ..... 98
      Creating the XP Cluster Extension resource configuration file ..... 99
      Testing the service configuration ..... 100
    Managing XP Cluster Extension services (RHCS) ..... 101
      Starting an RHCS service ..... 101
      Stopping or disabling an RHCS service ..... 101
  Configuring XP Cluster Extension with SLE HA ..... 102
    Configuration overview ..... 102
    Creating and configuring an XP Cluster Extension resource ..... 102
      Creating the XP Cluster Extension resource configuration file ..... 102
      Creating an XP Cluster Extension resource for Pacemaker ..... 103
      Creating an XP Cluster Extension resource for Heartbeat ..... 105
      Testing the configuration ..... 106
    Managing XP Cluster Extension services (SLE HA) ..... 106
  Rescanning multipath devices ..... 106
    Configuring the rescan script ..... 106
    Finding the user-friendly name of a multipath device ..... 107
  Configuring the pair/resync monitor ..... 108
    Updating the remote access hosts file ..... 108
    Configuring the pair/resync monitor port ..... 108
  Activating the pair/resync monitor ..... 109
  Timing considerations ..... 109

6 XP Cluster Extension and CLI ..... 111
  Configuring the CLI ..... 111
    Creating the Continuous Access environment and configuring XP RAID Manager ..... 111
    Timing considerations ..... 111
    Restrictions for customized XP Cluster Extension implementations ..... 112
    Creating and configuring the user configuration file ..... 112
  CLI commands ..... 113
    clxrun ..... 113
    clxchkmon ..... 114
      Displaying resources ..... 115
      Removing resources ..... 115
      Stopping the pair/resync monitor ..... 115

7 XP Cluster Extension recovery procedures ..... 119
  XP disk pair states ..... 119
  Recovery sequence ..... 120

8 User configuration file and XP Cluster Extension objects ..... 123
  User configuration file location ..... 123
    File structure ..... 124
    Specifying object values ..... 124
  COMMON objects ..... 126
  APPLICATION objects ..... 128
    APPLICATION objects ..... 128
  Basic configuration example ..... 139

9 Advanced XP Cluster Extension configuration ..... 141
  Implementing rolling disaster protection ..... 141
    Using XP RAID Manager with rolling disaster protection ..... 141
    Setting XP Cluster Extension objects to enable rolling disaster protection ..... 141
    Setting automatic recovery for rolling disaster protection ..... 142
    Using the pair/resync monitor with rolling disaster protection ..... 142
    Restoring server operation for rolling disaster protection ..... 142
  Monitoring and resynchronizing device groups ..... 143
  Enabling write access regardless of disk pair state ..... 144
  Executing programs before and after an XP Cluster Extension takeover ..... 145
    Arguments ..... 145
    Pre-executable return codes ..... 146
    Post-executable return codes ..... 147

10 Troubleshooting ..... 149
  XP Cluster Extension log facility ..... 149
  Start errors ..... 150
  Failover error handling ..... 150
  HACMP-specific error handling ..... 150
    Start errors ..... 151
    Failover errors ..... 151
  MSCS-specific error handling ..... 153
    Resource start errors ..... 153
    Failover errors ..... 153
    Using the Domain user account (Windows Server 2008/2008 R2 only) ..... 154
  VCS-specific error handling ..... 154
    Start errors ..... 155
    Failover errors ..... 155
  Linux-specific error handling ..... 156
    Failover errors ..... 157
    The FC link is down (RHCS) ..... 157
    A storage replication link is down (RHCS) ..... 158
    A data center is down (SLE HA and RHCS) ..... 158
  Pair/resync monitor messages in syslog/errorlog/messages/event log ..... 159

11 Support and other resources ..... 161
  Contacting HP ..... 161
    Subscription service ..... 161
  New and changed information in this edition ..... 161
  Related information ..... 161
    White papers ..... 162
    HP websites ..... 162
  Typographic conventions ..... 162
  HP product documentation survey ..... 163

Glossary ..... 165
Index ..... 169


Figures
1 One-to-one (1:1) configuration ..... 13
2 Consolidated-site configuration ..... 14
3 HACMP configuration example ..... 29
4 Service or application example (quorum service control disks not shown) ..... 67
5 CLX_FILESHARE resource sample ..... 67
6 XP Cluster Extension resource tree for CLX_SHARE ..... 68
7 VERITAS Cluster Service configuration example ..... 76
8 Sample resource graph of the CLX_WEB_SERVER service group ..... 77
9 Sample configuration ..... 93
10 Disaster-tolerant configuration with rolling disaster protection ..... 143
11 Incompatible XP disk pair state ..... 154
12 Incompatible XP disk pair state (VCS Cluster Manager Log Desk window) ..... 156
13 Detailed information of the XP disk pair state (VCS Log Desk) ..... 156

Tables
1 Setting resource properties and values in the GUI ..... 47
2 Service or application properties and values ..... 48
3 XP disk pair states ..... 119
4 Cluster software supported objects ..... 125
5 Document conventions ..... 162


1 XP Cluster Extension features


HP StorageWorks XP Cluster Extension Software monitors HP StorageWorks XP Continuous Access Software disk pairs and enables automatic access to remote data copies when clustered applications become unavailable locally. XP Cluster Extension integrates with popular cluster software to ensure that consistent and concurrent data copies on HP disk arrays can be accessed when needed.

Integration into cluster software


XP Cluster Extension integrates with the following cluster software products:
• High Availability Cluster Multi-Processing (HACMP)
• Microsoft Cluster Service (MSCS)
• VERITAS Cluster Server (VCS)
• SUSE Linux Enterprise High Availability Extension (SLE HA)
• Red Hat Cluster Suite (RHCS)

Integrating XP Cluster Extension with cluster software allows you to manage a disk array as if it were a disk or volume group of the clustered application.

For supported cluster software versions, see the HP SPOCK website: http://www.hp.com/storage/spock.

Enhanced disaster tolerance


HP XP Continuous Access Software copies valuable data to a remote data center so that you can restore application service after a local server, storage, or data center failure.

Disk arrays with XP Continuous Access Software can change mirroring direction, swapping the primary/secondary relationship of disk pairs almost instantaneously if the application must access the secondary disk. This feature ensures that the failback process is as fast as the failover process. If the links between your primary and secondary disk arrays are broken, each array maintains a bitmap table to synchronize the changes when the links become available again.

Because cluster software requires the application service to have read/write access to data disks and because the secondary volume of an XP Continuous Access Software disk pair is normally read-only, the failover process using cluster software alone typically involves manual intervention. With XP Cluster Extension Software, manual intervention is required only if the current disk array states and user settings conflict with the rules stored in the XP Cluster Extension database.
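
The bitmap-based resynchronization mentioned above can be pictured with a small Python model. This is a conceptual sketch only, not the disk array's actual logic: the Volume class, its fields, and the resynchronize function are hypothetical names invented for this illustration, and the block granularity is an assumption.

```python
class Volume:
    """Toy model of one side of a disk pair (hypothetical, for illustration)."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        # One flag per block: True means "changed while the link was down".
        self.bitmap = [False] * num_blocks

    def write(self, index, data, peer=None):
        """Write a block; mirror it if a peer is reachable, else flag it."""
        self.blocks[index] = data
        if peer is not None:
            peer.blocks[index] = data   # link up: normal mirroring
        else:
            self.bitmap[index] = True   # link down: remember the change


def resynchronize(source, target):
    """Copy only blocks flagged on either side; return the number copied."""
    changed = [
        i for i, (s, t) in enumerate(zip(source.bitmap, target.bitmap))
        if s or t
    ]
    for i in changed:
        target.blocks[i] = source.blocks[i]
        source.bitmap[i] = False
        target.bitmap[i] = False
    return len(changed)
```

In this model only the flagged blocks cross the link after an outage, which is why a resynchronization can be much cheaper than a full copy. Blocks flagged on the target side are overwritten with the source's content, so the choice of source determines whose changes survive.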

Automated monitoring and redirecting of XP Continuous Access Software pairs


XP Cluster Extension monitors the health of the XP Continuous Access Software links between your arrays. When it detects a lost and later re-established link, it automatically resynchronizes the suspended disk pairs, ensuring that the most current data is available on either site. For information on configuring resynchronization parameters, see Monitoring and resynchronizing device groups on page 143.

XP Cluster Extension Software Administrator Guide


Rolling disaster protection


A rolling disaster is a catastrophic event that affects the remote site after an outage at the local site. In a rolling disaster, data stored on remote disks can be entirely lost during a recovery attempt. To ensure the survival of critical data during a resynchronization/restore operation, HP StorageWorks XP Business Copy Software pairs can be associated with the local data disks. XP Cluster Extension recovers automatically, provided that a local XP Business Copy Software mirror can be suspended. Although the local copy can be out of date, it represents the best starting point for the recovery. XP Cluster Extension also resumes local XP Business Copy Software mirrors automatically, if specified, to allow the local site to keep an up-to-date image of the primary data. To implement rolling disaster protection, see Implementing rolling disaster protection on page 141.

Command-line interface (CLI)


XP Cluster Extension Software provides a CLI to enable disaster tolerance without cluster software. The CLI is convenient if you use in-house software to migrate application services from one system to another or if you want XP Cluster Extension to check disk states to make sure you can automatically start an application service on the local disk array.

Fast Failback using XP Continuous Access Software


XP Cluster Extension supports the XP RAID Manager Fast Failback feature. This feature allows XP Continuous Access Asynchronous Software to automatically redirect the mirroring direction of a disk pair even if the remote XP RAID Manager instance is not available. This ensures the fastest possible recovery to the original site in case of an application service failover at the alternate site.

XP Cluster Extension configurations


XP Cluster Extension is an array-based solution. It requires at least two HP XP disk arrays with XP Continuous Access Software providing remote mirroring. XP Cluster Extension connects XP software with cluster software and uses the ability of cluster software to react to system hardware and application failures. Servers are members of the same cluster dispersed over two or more sites. XP Cluster Extension supports one-to-one and consolidated-site configurations.

One-to-one configurations
In one-to-one (1:1) configurations, cluster host nodes are split between two geographically separate data centers and use redundant, diversely routed network connections for intra-cluster communications. (See Figure 1 on page 13.) These links must be as reliable as possible to prevent false failover operations or split-brain situations.


XP Cluster Extension features

Figure 1 One-to-one (1:1) configuration



Each cluster host node needs redundant FC or SCSI I/O paths to the XP disk array. Individual hosts cannot be connected to both the primary (P-VOL) and the secondary (S-VOL) copy of the application disk set. HP recommends a minimum of two cluster host nodes per site. This allows for a preferred local failover in case of a system failure. Local failover operations are faster than a remote failover between XP disk arrays because the mirroring direction of the XP disks does not need to be changed.

XP Cluster Extension can be deployed in environments where several clusters use the same pair of XP disk arrays. Although XP disk arrays can be mirrored in various configurations, XP Cluster Extension does not support multiple disk arrays as both primary and secondary disk arrays. XP Cluster Extension supports configurations where two or more disk arrays use one remote disk array in a logical one-to-one configuration.

CAUTION: XP Cluster Extension can operate with only one system at each site, with a single I/O path between the server system and the disk array and a single link in each direction between disk arrays. However, those configurations are not considered highly available, nor are they disaster tolerant. XP Cluster Extension configurations with single points of failure are not supported by HP.

Consolidated-site configuration
In consolidated-site configurations, a single XP disk array in the secondary (remote) data center is connected to up to four other primary XP disk arrays (see Figure 2 on page 14). The restrictions outlined in One-to-one configurations on page 12 also apply to consolidated configurations. XP Cluster Extension does not support configurations in which the application service's data disk set is spread over two or more XP disk arrays.

Figure 2 Consolidated-site configuration



Supported XP Continuous Access Software configurations and fence levels


XP Continuous Access Software offers three modes of replication:
• Synchronous replication
• Asynchronous replication
• Journal replication

For the replication modes supported by specific versions of XP Cluster Extension, see the HP SPOCK website: http://www.hp.com/storage/spock. For information about synchronous and asynchronous replication modes, see the HP StorageWorks XP Continuous Access Software User Guide. For information about journal replication, see the HP StorageWorks XP Continuous Access Journal Software User Guide.

The XP Continuous Access Software fence level is used to configure the remote replication feature of an XP disk array based on needs for application service availability, data concurrency, and replication performance. XP Cluster Extension supports all XP Continuous Access Software fence levels: NEVER, DATA, and ASYNC (includes JOURNAL).

XP Cluster Extension is supported with XP Continuous Access Software in the configurations described in the HP StorageWorks SAN Design Reference Guide, available at http://www.hp.com/go/sdgmanuals.

XP Cluster Extension server configurations


The ideal cluster configuration for XP Cluster Extension consists of at least four servers (two at each site) and separate redundant communications links for cluster heartbeats, client access, and XP Continuous Access Software. Installing communications interfaces in pairs allows failover and prevents single points of failure (SPOFs). Using four servers provides faster recovery from a system failure by allowing local application services to fail over to a local cluster system instead of the remote system. On the remote site, HP recommends that two systems be available in case one system experiences a hardware or power failure.

In addition to at least four servers, the MNS configuration requires an additional node per cluster, located at a third site so that whenever a disaster affects either the local or remote site, the other site together with the added node has a majority. In an MNS with File Share Witness configuration, the file share should be located at the third site.

TIP: To upgrade XP firmware while the application service is running, use host load balancing and multipathing software, such as Auto Path, HP MPIO Full Featured Device Specific Module (DSM) for XP family of Disk Arrays (HP MPIO XP DSM), or VERITAS for Sun Solaris.

XP Cluster Extension allows you to configure the failover behavior so that the application service startup is stopped if no remote cluster members can be reached. The default configuration of XP Cluster Extension expects the cluster software to deal with the split-brain syndrome.

Planning for XP Cluster Extension


Before configuring XP Cluster Extension resources
Before configuring XP Cluster Extension resources for the Windows CLI implementation or Unix environments, review the XP Cluster Extension objects in the UCF.cfg file. For more information about XP Cluster Extension objects, see Chapter 8 on page 123.

Cluster setup considerations


For cluster setup considerations that apply to Windows and Linux, see:
• MNS quorum clusters (MSCS), page 15
• SLE HA cluster setup considerations, page 15
• RHCS cluster setup considerations, page 18

MNS quorum clusters (MSCS)


In an MNS cluster, the cluster service is allowed to start or run only if it has access to the majority of the configured nodes. This means that losing half the nodes in a 2-, 4-, 6-, or 8-node cluster or losing the communication links with 50% of the nodes on each site forces every node to terminate the cluster services because none of them have access to a majority of the configured nodes. Therefore, a geographically dispersed MNS-based cluster requires an additional node per cluster located at a third site so that whenever a disaster affects either the local or remote site, the other site together with the added node has a majority.
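The majority rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of XP Cluster Extension or MSCS; the function name has_majority is hypothetical.

```python
def has_majority(reachable_nodes: int, configured_nodes: int) -> bool:
    """Return True if the reachable nodes are a strict majority of the configured nodes."""
    return 2 * reachable_nodes > configured_nodes


# Four-node cluster, two nodes per site: a site failure leaves 2 of 4 nodes.
print(has_majority(2, 4))  # False
# Same cluster plus one node at a third site: the surviving site and the
# third-site node together hold 3 of 5 nodes.
print(has_majority(3, 5))  # True
```

The second call shows why the extra node at a third site matters: without it, neither site can reach a majority on its own after the other site is lost.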

SLE HA cluster setup considerations


Follow the guidelines in this section when you configure clusters for use with XP Cluster Extension.

Quorum
In an SLE HA cluster, quorum is defined as a strict majority of the defined cluster (more than 50%). With certain failures, a cluster might be divided into two subclusters. In an SLE HA cluster, a subcluster with more than 50% of the nodes wins the quorum. The subcluster that wins the quorum re-forms the cluster and fences the subcluster that lost the quorum. The behavior of the subcluster that lost the quorum depends on the defined no-quorum policy. This behavior is in effect until the cluster is fenced. When the cluster is fenced, the resources owned by the fenced nodes fail over to active cluster nodes.

STONITH
STONITH is an SLE HA cluster fencing method. SLE HA cluster provides STONITH plug-ins for devices such as UPS, PDU, Blade power control devices, and lights out devices. Some plug-ins can STONITH more than one node (for example, Split Brain Detector STONITH) and some can STONITH only one node (for example, HP iLO STONITH). HP iLO STONITH uses the power control functions of an HP iLO device to STONITH a node that has lost quorum and needs to be fenced.

IMPORTANT: If all of the iLO devices in a cluster are connected using a single network, a single switch failure might disable iLO, preventing nodes from being fenced. This failure might be difficult to detect, especially before a node failure where iLO features would be required.

The STONITH action can be set to power off or reset, depending on the environment requirements.
• Power off: The STONITH agent powers off the nodes in the errant subcluster.
• Reset: The STONITH agent resets the nodes in the errant subcluster, and the nodes try to automatically rejoin the cluster.

NOTE: IPMI fencing can be used for Integrity servers that do not support RIBCL scripting.

Networking in an SLE HA cluster
Configuring redundant and independent cluster communication paths is a good way to avoid Split Brain conditions. With redundant communication paths, the loss of a single interface or switch does not break the communication between nodes, which prevents Split Brain conditions. Administrators can configure multiple independent communication paths. HP recommends using bonded Ethernet channels.

Resource constraints
Resource constraints allow administrators to specify which cluster nodes resources can run on, the order in which resources are loaded, and the other resources that a specific resource depends on. There are three types of resource constraints:
• Resource location: Defines the nodes on which a resource can run, cannot run, or is preferred to be run.
• Resource colocation: Defines which resources can or cannot run together on a node.
• Resource order: Defines the sequence of actions for resources running on a node.

Resource operation attribute
SLE HA does not monitor resource health by default. To enable this feature, add the monitor operation to the resource definition. You can specify the interval attribute and the timeout attribute for a monitor operation. The interval attribute defines the time interval in which the monitor operation is executed. The timeout attribute determines how long to wait before considering the resource as failed. Define start, stop, and monitor operations for the XP Cluster Extension resource.

XP Cluster Extension resource dependency
A Group resource in an SLE HA cluster ensures that the member resource agents are started and stopped in the required order. An XP Cluster Extension resource must be added as the first member of the group. This way, all primitive resources added after the XP Cluster Extension resource are dependent on XP Cluster Extension. Because the primitive resources within a resource group can be failed over independently, set a colocation constraint for each resource group ID with the last resource in the group to achieve the failover of the entire group when any primitive resource fails.

Failover order
Use location constraints to define the failover order for a resource group. For each node, define a location constraint with the appropriate score to prioritize the resource group on that particular node. During failover, the cluster calculates the score of the resource group on the available nodes, and the node with the highest score is considered the next preferred owner. For more information, see the SLE HA documentation.

Failback option
HP does not recommend auto failback in configurations with XP Cluster Extension because resource failovers caused by a storage failure can leave resources in an unstable state (failover/failback might toggle the resource between the nodes). SLE HA provides the meta-attribute resource-stickiness to determine how much a resource agent prefers to stay where it is. To disable auto failback, set resource-stickiness to the lowest value compared to the other resource location constraints.

Migration-threshold
A resource is automatically restarted if it fails. If a restart cannot be achieved on the current node, or the resource fails to start a certain number of times on the current node, the resource tries to fail over to another node. You can define the number of failures for resources (a migration-threshold) after which they migrate to a new node. If you have more than two nodes in your cluster, the high availability software chooses the node a particular resource fails over to. When an XP Cluster Extension resource fails, HP recommends configuring your cluster to fail over the resource without restarting on the local node. To set this preference, set the migration-threshold to 1.

Disk monitoring
For the situations in which disk access is lost or read/write protection is in effect due to storage fencing, application monitoring agents, file system agents, or LVM resource agents detect the I/O failure. XP Cluster Extension does not monitor the disk access status.
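As an illustration of the failover-order rule in this section (the available node with the highest location score becomes the next preferred owner), the following Python sketch picks a node from a table of scores. It is a simplification under stated assumptions: the node names and scores are hypothetical, and the real cluster score calculation also takes other settings, such as resource-stickiness, into account.

```python
from typing import Dict, Iterable, Optional


def next_preferred_node(location_scores: Dict[str, int],
                        available_nodes: Iterable[str]) -> Optional[str]:
    """Return the available node with the highest location score, or None."""
    available = set(available_nodes)
    # Keep only nodes that are currently available for failover.
    candidates = {node: score for node, score in location_scores.items()
                  if node in available}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical scores: local-site nodes are preferred over remote-site nodes.
scores = {"nodeA1": 200, "nodeA2": 150, "nodeB1": 100, "nodeB2": 50}
print(next_preferred_node(scores, ["nodeA2", "nodeB1", "nodeB2"]))  # nodeA2
print(next_preferred_node(scores, ["nodeB1", "nodeB2"]))            # nodeB1
```

Giving the local-site nodes higher scores than the remote-site nodes mirrors the preference for a local failover before a remote one.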


RHCS cluster setup considerations


Quorum
In RHCS, the quorum is based on a simple voting majority of the defined nodes in a cluster. To re-form successfully, a majority of all possible votes is required. Each cluster node is assigned a number of votes, and they contribute to the cluster while they are members. If the cluster has a majority of all possible votes, it has quorum (also called quorate); otherwise, it does not have quorum.

Fencing
Cluster software adjusts the node membership based on various failure scenarios. The concept of quorum defines which set of nodes continue to define the cluster. To protect data, nodes that do not have quorum are removed from the cluster. The non-quorate nodes that are removed must be prevented from accessing the shared resources. This process is called fencing. HP iLO fencing is one method that can be used with RHCS to restrict cluster node access to shared resources.

Observe the following guidelines when using HP iLO network configurations with RHCS clusters:
• HP iLO can be connected to the client access network or to a different network, but the network must be routable.
• HP iLO should not be on the network that is used for cluster communication.
• The HP iLO of each cluster system must be accessible over the network from every other cluster system.
• To handle infrequent failures of the HP iLO fencing (such as a switch failure), you can set up a backup fence method for redundancy.
• HP iLO fencing can be used on HP ProLiant systems with built-in iLO hardware. For third-party systems, other power control fencing methods can be used.

NOTE: IPMI fencing can be used for Integrity servers that do not support RIBCL scripting.
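The vote-based quorum rule described above can be illustrated with a short Python sketch. It is not RHCS code; the function name, node names, and vote counts are hypothetical and only show how a majority of all possible votes is evaluated.

```python
from typing import Dict, Iterable


def is_quorate(votes: Dict[str, int], members: Iterable[str]) -> bool:
    """Return True if the current members hold a majority of all possible votes."""
    total_votes = sum(votes.values())
    member_votes = sum(votes[node] for node in members)
    return 2 * member_votes > total_votes


# Hypothetical four-node cluster with one vote per node.
votes = {"node1": 1, "node2": 1, "node3": 1, "node4": 1}
print(is_quorate(votes, ["node1", "node2"]))           # False
print(is_quorate(votes, ["node1", "node2", "node3"]))  # True
```

An even split leaves both halves without quorum, which is why the nodes that lose quorum must then be fenced from the shared resources.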

Qdisk configuration
Red Hat recommends the use of a Qdisk configuration to bolster quorum to handle failures such as half (or more) of the members failing, a tie-breaker in an equal-split partition, and a SAN failure. In an XP Cluster Extension configuration with multiple storage arrays, a Qdisk configuration is not supported.

Failover domains
A cluster service is associated with a failover domain, which is a subset of cluster nodes that are eligible to run a particular cluster service. To maintain data integrity, each cluster service can run on only one cluster node at a time. By assigning a cluster service to a restricted failover domain, you can limit the nodes that are eligible to run a cluster service in the event of a failover, and you can order the nodes by preference to ensure that a particular node runs the cluster service (as long as that node is active).

A failover domain can have the following characteristics:
• Unrestricted: Specifies that the subset of members is preferred, but the cluster service assigned to this domain can run on any available member.
• Restricted: The cluster service is allowed to run only on a subset of failover domain members.
• Unordered: The member on which the cluster service runs is chosen from the available list of failover domain members with no preference order.
• Ordered: The failover domain member on which the cluster service runs is selected based on preference order. The member at the top of the list (as specified in /etc/cluster/cluster.conf) is the most preferred, followed by the second member, and so on.

For an orderly failover, HP recommends using the Ordered and Restricted options for your failover domains.

Failback policy
HP does not recommend auto failback in configurations with XP Cluster Extension because resource failovers caused by a storage failure can leave resources in an unstable state (failover/failback might toggle the resource between the nodes). In this situation, HP recommends correcting the failure and then manually failing back to the intended data center or server. To disable the auto failback, set the nofailback flag for the failover domain. Enabling this option for an ordered failover domain prevents automated failback after a more-preferred node rejoins the cluster.

Recovery policy
When a resource inside the service fails, the default action is to restart the service on the local node before the failover. In an XP Cluster Extension environment, the service is always expected to be relocated instead of being restarted locally. To enable this functionality, set the service recovery policy to relocate.

Service hierarchical structure and resource dependency
In RHCS, a service is a collection of cluster resources configured into a single entity that is managed (started, stopped, or relocated) for high availability. A service is represented as a resource tree that specifies each resource, its attributes, and its relationships to other resources in the resource tree. The relationships can be parent, child, or sibling. Even though a service is seen as a single entity, the hierarchy of the resources determines the order in which each resource within the service is started and stopped.

In the case of a child-parent relationship, the startup or shutdown is simple. All parents are started before children, and children must all stop cleanly before a parent can be stopped. For a resource to be considered in good health, all of its children must be in good health. A service is considered failed if any of its resources fail. In this case, the expected course of action is to restart the entire service, including the failed resource and the other resources that did not fail.

In an XP Cluster Extension environment, configure the XP Cluster Extension resource as the parent resource in the service so that XP Cluster Extension can control the service behavior based on the user configuration and storage device status. This means that the XP Cluster Extension resource must be configured at the highest level in the dependency hierarchy.
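The failover domain characteristics described in this section can be sketched as a small selection function. This is only an illustration under stated assumptions: the function and node names are hypothetical, and RHCS itself makes this decision from the failover domain defined in cluster.conf.

```python
from typing import Iterable, List


def eligible_nodes(domain_members: List[str], available: Iterable[str],
                   ordered: bool, restricted: bool) -> List[str]:
    """List the nodes that may run the service; if ordered, most preferred first."""
    available_set = set(available)
    # Available domain members, kept in the order given in the domain definition.
    candidates = [node for node in domain_members if node in available_set]
    if not restricted:
        # Unrestricted: domain members are preferred, but any other
        # available node may also run the service.
        candidates += sorted(available_set - set(domain_members))
    if not ordered:
        # Unordered: no preference among domain members; the sort only
        # makes the result deterministic.
        in_domain = sorted(n for n in candidates if n in domain_members)
        others = [n for n in candidates if n not in domain_members]
        candidates = in_domain + others
    return candidates


domain = ["nodeA1", "nodeA2", "nodeB1"]
up = ["nodeA2", "nodeB1", "nodeC1"]
print(eligible_nodes(domain, up, ordered=True, restricted=True))
# ['nodeA2', 'nodeB1']
print(eligible_nodes(domain, up, ordered=True, restricted=False))
# ['nodeA2', 'nodeB1', 'nodeC1']
```

With the Ordered and Restricted options that HP recommends, the service can only move to listed members, in the listed order, so the failover target is predictable.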


Disk monitoring
For the situations in which disk access is lost or read/write protection is in effect due to storage fencing, application monitoring agents, file system agents, or LVM resource agents detect the I/O failure. XP Cluster Extension does not monitor the disk access status.

Setting up XP RAID Manager


In addition to the cluster software it integrates with, XP Cluster Extension depends on HP StorageWorks XP RAID Manager. Before configuring XP Cluster Extension, verify that XP RAID Manager is installed and configured, and that the host and disk array systems are properly configured as described in the following topics.

Creating XP RAID Manager command devices


To control XP Continuous Access Software mirrored disks from a clustered server, install XP RAID Manager on the server and configure a special disk, called a command device. The command device must not be an MSCS resource, cannot be paired, and is assigned to a 36 MB or greater CVS volume. The command device is identified by CM appended to the emulation type. XP RAID Manager command devices can be accessed by redundant paths. HP recommends redundant paths to prevent XP Cluster Extension from aborting if one path to the command device is missing. See the HP StorageWorks XP RAID Manager User's Guide for more information. NOTE: If you use Auto Path to enable alternate pathing on IBM AIX together with the XP disk array, XP RAID Manager does not support Auto Path virtual paths for command devices.

Creating XP RAID Manager instances


XP Cluster Extension requires at least one instance of XP RAID Manager. XP Cluster Extension starts the configured XP RAID Manager instance if it is not running. However, if the XP RAID Manager instance cannot be started or returns an error, XP Cluster Extension can switch to an alternate XP RAID Manager instance. Ensure that the path to the XP RAID Manager binary files is included in the PATH environment variable. Create an XP RAID Manager instance to control pair operations and to gather disk array status information. Because XP Cluster Extension switches to the next available instance when a current instance becomes unavailable, HP recommends that you create several XP RAID Manager instances to provide redundancy. Bear in mind, however, that the XP RAID Manager instance numbers used for the RaidManagerInstances object must be the same among all servers using XP Cluster Extension. HP recommends that the XP RAID Manager instances be running at all times to provide the fastest failover capability. XP Cluster Extension provides scripts to include the XP RAID Manager startup procedure in the system startup file (for example, /etc/inittab for non-Windows systems). See Starting and stopping the XP RAID Manager instances on page 22 for more information. XP Cluster Extension starts the configured XP RAID Manager instances if it cannot find any running instance.


Creating XP RAID Manager device groups


A device group is the unit in which the failover/failback operation is performed. A device group can contain several volume groups. Configure a single device group for a service group (VCS), resource group (HACMP), cluster group (MSCS), or resource (SLE HA, RHCS). This device group must include all disks being used for the application service. The XP RAID Manager configuration file (horcmX.conf) is used to map device groups to the internal disk array disks. A device group is the common unit for failover operations initiated from the server side.

Network considerations
Because XP RAID Manager is essential to XP Cluster Extension, HP recommends that you use the heartbeat network (a private network) for XP RAID Manager communications. Alternative network paths are highly recommended. Configure the networks XP RAID Manager uses for each device group in the HORCM_INST part of the XP RAID Manager configuration file.


Starting and stopping the XP RAID Manager instances


Start the XP RAID Manager instances for XP Cluster Extension at system boot time to provide the fastest access to disk status information. XP Cluster Extension provides scripts (Linux/UNIX) or a service (Windows) to integrate the XP RAID Manager instance startup into the system startup process. This feature reduces resource group failover times because the XP Cluster Extension resource does not need to start the XP RAID Manager instances.

If the system cannot automatically start and monitor XP RAID Manager instances, you can start and stop XP RAID Manager with the following commands:

Linux/UNIX
horcmstart.sh instance_numbers
horcmshutdown.sh instance_numbers

Windows
horcmstart instance_numbers
horcmshutdown instance_numbers

Starting XP RAID Manager without specifying an instance number starts instance 0 with the associated horcm.conf file. For this reason, zero (0) is not recommended as an instance number for an XP Cluster Extension RAID Manager instance.
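If you start or stop the instances from in-house tooling, a small helper can assemble the command line and guard against instance 0. The following Python sketch only builds the argument list and does not run anything. The helper name is hypothetical, and rejecting instance 0 outright is an assumption made here to reflect the recommendation above; XP RAID Manager itself does not enforce it.

```python
from typing import List, Sequence


def horcm_command(action: str, instances: Sequence[int],
                  windows: bool = False) -> List[str]:
    """Build the argument list for starting or shutting down RAID Manager instances."""
    if action not in ("start", "shutdown"):
        raise ValueError("action must be 'start' or 'shutdown'")
    if not instances:
        # Without an instance number, XP RAID Manager would start instance 0.
        raise ValueError("specify at least one instance number")
    if 0 in instances:
        raise ValueError("instance 0 is not recommended for XP Cluster Extension")
    name = "horcmstart" if action == "start" else "horcmshutdown"
    if not windows:
        name += ".sh"  # Linux/UNIX uses the shell script variants.
    return [name] + [str(number) for number in instances]


print(horcm_command("start", [11, 12]))
# ['horcmstart.sh', '11', '12']
print(horcm_command("shutdown", [11], windows=True))
# ['horcmshutdown', '11']
```

The returned list can be passed to subprocess.run on a system where the XP RAID Manager binaries are in the PATH.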

Test takeover function


After configuring XP RAID Manager for the device groups used by XP Cluster Extension, verify that each device group fails over correctly from each server in the cluster. The device group must be in PAIR state.

CAUTION: XP RAID Manager keeps configuration data of the XP disk array in system memory. Therefore, you must stop and restart XP RAID Manager instances on all servers if a configuration change is applied to any XP disk array.

To test the correct failover and failback behavior, log in to each server used with XP Cluster Extension and invoke the following commands if the local disk is the secondary (S-VOL) disk:

Linux/UNIX
export HORCMINST=instance_number
pairdisplay -g device_group_name -fx -CLI
horctakeover -g device_group_name -t timeout

Windows
set HORCMINST=instance_number
pairdisplay -g device_group_name -fx -CLI
horctakeover -g device_group_name -t timeout


The output of the pairdisplay command indicates whether the local disk is the secondary (S-VOL) disk. If it is, the horctakeover command shows a SWAP-takeover as a result. If pairdisplay shows the local disk as the primary (P-VOL) disk, log in to a system connected to the secondary (S-VOL) disk and invoke the horctakeover command there. If the horctakeover command does not result in a SWAP-takeover, see Recovery sequence on page 120 to resolve the issue. The -t option of the horctakeover command is used only for fence level ASYNC (both Async and Journal).
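For scripted tests, the two commands can be assembled as argument lists so that the timeout option is added only for fence level ASYNC. The following Python sketch builds the lists without running them. The helper names are hypothetical, the options are written with a leading hyphen (-g, -fx, -CLI, -t) as in standard XP RAID Manager command syntax, and the device group name oracle is only an example.

```python
from typing import List, Optional


def pairdisplay_argv(device_group: str) -> List[str]:
    """Arguments for displaying the pair status of a device group."""
    return ["pairdisplay", "-g", device_group, "-fx", "-CLI"]


def horctakeover_argv(device_group: str, fence_level: str,
                      timeout: Optional[int] = None) -> List[str]:
    """Arguments for a takeover; the timeout is added only for fence level ASYNC."""
    argv = ["horctakeover", "-g", device_group]
    if fence_level.upper() == "ASYNC" and timeout is not None:
        argv += ["-t", str(timeout)]
    return argv


print(pairdisplay_argv("oracle"))
# ['pairdisplay', '-g', 'oracle', '-fx', '-CLI']
print(horctakeover_argv("oracle", "DATA", timeout=300))
# ['horctakeover', '-g', 'oracle']
print(horctakeover_argv("oracle", "ASYNC", timeout=300))
# ['horctakeover', '-g', 'oracle', '-t', '300']
```

Set the HORCMINST environment variable to the instance number before running the resulting commands, as shown in the command listing for this test.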


2 Configuring XP Cluster Extension for AIX


XP Cluster Extension is integrated with the HACMP cluster software using HACMP customization functions. Cluster administrators configure disk array failover as a pre-event of the standard HACMP event get_disk_vg_fs. For information about how to install XP Cluster Extension in an IBM HACMP environment, see the HP StorageWorks XP Cluster Extension Software installation guide. For supported operating system versions, see the HP SPOCK website: http://www.hp.com/storage/spock.

Configuring resources
The XP Cluster Extension resource gathers all necessary information about the disk arrays and configured device groups whenever a resource group is brought online. If configured, a pair/resync monitor is started to monitor the XP Cluster Extension resource. To use this monitor, set the standard HACMP event release_vg_fs to call a pre-event.

Call the XP Cluster Extension binary clxhacmp as a pre-event of the standard HACMP event get_disk_vg_fs to check the status of the XP RAID Manager device group and, if necessary, to allow access to the device group before HACMP tries to access the disks of the resource group.

Configure XP Cluster Extension parameters with the user configuration file: /etc/opt/hpclx/conf/UCF.cfg. See Chapter 8 on page 123 for more information about the user configuration file.

Configuring the pair/resync monitor


The pair/resync monitor is a service that verifies that disks are in the pair state, and resyncs them when necessary. The pair/resync monitor determines whether the requesting server is allowed access to the pair/resync monitor. To access the pair/resync monitor, you must update the remote access hosts file and configure the pair/resync monitor port.

Updating the remote access hosts file


Enter the names of the remote systems in a remote access hosts file.
1. Open the /etc/opt/hpclx/conf/clxhosts file.
2. Enter each host name on a separate line. You can leave blank lines, but do not enter comments. For example:

# cat /etc/opt/hpclx/conf/clxhosts
dcBserver
dcAserver
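Before distributing the hosts file, you might check it against the rules above (one host name per line, blank lines allowed, no comments). The following Python sketch is a hypothetical helper, not an XP Cluster Extension tool, and it treats any # character as a comment.

```python
from typing import List


def parse_clxhosts(text: str) -> List[str]:
    """Return the host names in a clxhosts file, rejecting comments and extra fields."""
    hosts = []
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line:
            continue  # Blank lines are allowed.
        if "#" in line:
            raise ValueError(f"line {line_number}: comments are not allowed")
        if len(line.split()) != 1:
            raise ValueError(f"line {line_number}: expected one host name per line")
        hosts.append(line)
    return hosts


print(parse_clxhosts("dcBserver\n\ndcAserver\n"))
# ['dcBserver', 'dcAserver']
```

To check the real file, read /etc/opt/hpclx/conf/clxhosts and pass its contents to the function.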


Configuring the pair/resync monitor port


Enter the port that the pair/resync monitor will use.
1. Open the /etc/services file.
2. Choose the port that the pair/resync monitor will use, and then add the following line to the services file:
clxmonitor nnnnn/tcp
where nnnnn is the port number. For example:

clxmonitor 22222/udp # CLX Pair/Resync Monitor
clxmonitor 22222/tcp # CLX Pair/Resync Monitor
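To confirm that the entry was added as intended, the services lines can be parsed and the clxmonitor port looked up. The following Python sketch is a hypothetical helper that works on lines of text. It assumes the usual services layout of a service name followed by port/protocol, with an optional trailing comment.

```python
from typing import Iterable, Optional, Tuple


def parse_services_entry(line: str) -> Optional[Tuple[str, int, str]]:
    """Parse 'name port/protocol # comment' into (name, port, protocol), or None."""
    fields = line.split("#", 1)[0].split()  # Drop any trailing comment.
    if len(fields) < 2:
        return None
    port, separator, protocol = fields[1].partition("/")
    if not separator or not port.isdigit() or not protocol:
        return None
    return fields[0], int(port), protocol


def monitor_port(lines: Iterable[str], service: str = "clxmonitor",
                 protocol: str = "tcp") -> Optional[int]:
    """Return the port configured for the service and protocol, or None."""
    for line in lines:
        entry = parse_services_entry(line)
        if entry and entry[0] == service and entry[2] == protocol:
            return entry[1]
    return None


sample = [
    "clxmonitor 22222/udp # CLX Pair/Resync Monitor",
    "clxmonitor 22222/tcp # CLX Pair/Resync Monitor",
]
print(monitor_port(sample))  # 22222
```

A result of None means that no matching clxmonitor line was found in the lines that were checked.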

Integrating XP Cluster Extension into HACMP


1. Create a new Custom Cluster Event:
   # smitty hacmp
2. Select the following options:
   Extended Configuration
   Extended Event Configuration
   Configure Pre/Post-Event Commands
   Add a Custom Cluster Event
3. Enter the following values:
   Cluster Event Name: get_disk_vg_fs_pre
   Cluster Event Description: Cluster Extension XP
   Cluster Event Script File: /opt/hpclx/bin/clxhacmp
4. Configure the newly created Custom Cluster Event as a pre-event of get_disk_vg_fs:
   # smitty hacmp
5. Select the following options:
   Extended Configuration
   Extended Event Configuration
   Change/Show Pre-Defined HACMP Events
6. Select the event get_disk_vg_fs.
7. Define the previously defined custom event get_disk_vg_fs_pre as a pre-event of get_disk_vg_fs.
   NOTE: With HACMP 5.2 and later, to have the get_disk_vg_fs event called, you must specify Serial Acquisition Order. If Serial Acquisition Order is not specified, AIX will use the default Parallel Acquisition Order.
8. Use SMIT and select the following options:
   Extended Configuration
   Extended Resource Configuration
   Configure Resource Group Run-Time Policies
   Configure Resource Group Processing Ordering
9. Specify the resource groups configured to run with XP Cluster Extension.
10. Once the resource groups have been specified, press Enter to complete the configuration process.

XP Cluster Extension controls the disk pairs based on XP RAID Manager device groups. The volume group definition of the HACMP resource group is used to determine the corresponding XP RAID Manager device group. The mapping of the HACMP volume group configuration and the corresponding XP RAID Manager device group is done by the XP Cluster Extension user configuration file /etc/opt/hpclx/conf/UCF.cfg. Because of this mapping mechanism, you must specify the volume groups owned by the HACMP resource groups in the user configuration file.

User configuration file for HACMP


Before configuring the objects in the user configuration file, review the XP Cluster Extension objects described in Chapter 8 on page 123.

Set the ApplicationStartup object to RESYNCWAIT, because HACMP does not offer the feature for disabling resource groups on a particular system to move the resource group back to the most current copy of your data. If the ApplicationStartup object is set to FASTFAILBACK (default), the resource group will fail to be brought online in cases where the most current copy of your data resides in the disk array on the remote site. If you set the ApplicationStartup object to FASTFAILBACK, you must stop the resource group online process and either resynchronize your disk from the remote site or manually bring your resource group online at the remote site.

The Vgs object in the user configuration file is used to map the volume groups of the HACMP resource group to the corresponding APPLICATION object within the user configuration file. The DeviceGroup object within each APPLICATION section determines the XP RAID Manager device group needed to control all the shared disks of the HACMP resource group.


Figure 3 on page 29 and the examples that follow show two possible mappings.

Figure 3 HACMP configuration example



Example 1
The application OracleRG corresponds to an HACMP resource group OracleRG, which consists of the volume groups ora1vg and ora2vg. The corresponding XP RAID Manager device group oracle controls all disks which form the volume groups of the HACMP resource group. The resource group is configured to wait for a pair resynchronization in case you have not done any disk pair recovery after the resource group has been moved to an alternate system. The resource group will be brought online on the local system again (ApplicationStartup object is set to RESYNCWAIT). The AutoRecover object is set to NO, which means that you will not utilize XP Cluster Extension capabilities to automatically recover suspended disk pair states. The DataLoseMirror object and DataLoseDataCenter object are set to NO, which means XP Cluster Extension will not allow you to bring the resource group online if the disk pair is suspended or a takeover operation leads to a suspended disk pair.

Example 2
The application SapRG uses the device group sap to control all the disks of the corresponding HACMP resource group SapRG, which uses the volume groups sap1vg and sap2vg. The resource group is configured to fail back to the remote system rather than waiting for a pair resynchronization, in case you have not done any disk pair recovery after the resource group has been moved to an alternate system. The resource group will be brought online on the local system again (ApplicationStartup object is set to FASTFAILBACK by default). This setup will lead to an error loop, as HACMP does not provide the feature to automatically fail back after an error has been reported. The AutoRecover object is set to NO by default, which means that you will not utilize XP Cluster Extension capabilities to automatically recover suspended disk pair states.

The following example shows the configuration file for examples 1 and 2:


COMMON
LogDir                /var/opt/hpclx/log/   #default (optional)
LogLevel              error                 # error|info default: error (optional)

APPLICATION           OracleRG              # package/service group test_application
Vgs                   ora1vg ora2vg         # HACMP specific, to map vg to OracleRG
ApplicationDir        /etc/opt/hpclx
XPSerialNumbers       30368 30380
RaidManagerInstances  11
DeviceGroup           oracle                # raid manager device group
FenceLevel            data                  # values: data | never | async
ApplicationStartup    resyncwait            # values: fastfailback | resyncwait
AutoRecover           no                    # possible values: yes | no
DataLoseMirror        no                    # possible values: yes | no
DataLoseDataCenter    no                    # possible values: yes | no
PreExecScript         /etc/opt/hpclx/ora_pre.sh
PostExecScript        /etc/opt/hpclx/ora_post.sh

APPLICATION           SapRG                 # package/service group test_application
Vgs                   sap1vg sap2vg         # HACMP specific, to map vg to SapRG
XPSerialNumbers       30368 30380
RaidManagerInstances  11
DeviceGroup           sap                   # raid manager device group
FenceLevel            never                 # possible values: data | never | async
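The mapping from HACMP volume groups to XP RAID Manager device groups can be illustrated with a short Python sketch. It is not part of XP Cluster Extension: the function names parse_ucf and device_group_for_vg are hypothetical, and the parser assumes one "Tag value ... # comment" entry per line, which may not cover every form the real user configuration file accepts.

```python
def parse_ucf(text):
    """Parse UCF-style text into {'COMMON': {...}, 'APPLICATION': {name: {...}}}.

    Hypothetical helper for illustration; assumes one tag per line.
    """
    config = {"COMMON": {}, "APPLICATION": {}}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split()
        tag, values = parts[0], parts[1:]
        if tag == "COMMON":
            current = config["COMMON"]
        elif tag == "APPLICATION":
            current = config["APPLICATION"].setdefault(values[0], {})
        elif current is not None:
            current[tag] = values
    return config


def device_group_for_vg(config, volume_group):
    """Return (application, device group) owning the volume group, or None."""
    for app, objects in config["APPLICATION"].items():
        if volume_group in objects.get("Vgs", []):
            return app, objects["DeviceGroup"][0]
    return None


SAMPLE = """\
COMMON
LogLevel error # error|info
APPLICATION OracleRG
Vgs ora1vg ora2vg
DeviceGroup oracle
APPLICATION SapRG
Vgs sap1vg sap2vg
DeviceGroup sap
"""

cfg = parse_ucf(SAMPLE)
print(device_group_for_vg(cfg, "sap2vg"))  # ('SapRG', 'sap')
```

A volume group that is not listed under any Vgs object yields None, which corresponds to the case where the resource group's volume groups were not specified in the user configuration file.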

Bringing a resource group online


Resource groups are usually brought online automatically when the cluster is started on a particular server. If necessary, a resource group can be brought online manually:
1. Run SMIT (HACMP section):
   #smitty hacmp
2. Select the following:
   System Management (C-SPOC)
   HACMP Resource Group and Application Management
   Bring a Resource Group Online
3. Select a resource group from the list of available groups.

Taking a resource group offline


Resource groups are usually taken offline automatically when the cluster is stopped on a particular system. If necessary, a resource group can be taken offline manually:
1. Run SMIT (HACMP section):
   #smitty hacmp
2. Select the following:
   System Management (C-SPOC)
   HACMP Resource Group and Application Management
   Bring a Resource Group Offline
3. Select a resource group from the list of available groups.

Pair/resync monitor integration


The pair/resync monitor is used to detect and react to suspended XP Continuous Access links. It is activated by setting the ResyncMonitor object to YES. Additionally, the automatic disk pair resynchronization feature is activated if the ResyncMonitorAutoRecover object value is YES.

When the HACMP resource group is taken offline, disable the monitor for the XP RAID Manager device group used for this resource group.

CAUTION: If the resource group cannot be taken offline, disable monitoring of the device group for this HACMP resource group. To avoid data corruption, this must be part of the recovery procedure when XP Cluster Extension is deployed in the HACMP environment. See Stopping the pair/resync monitor on page 115. Ensure that the pair/resync monitor does not monitor and resynchronize the disk pair (device group) from both disk array sites.

To use the pair/resync monitor you must create a pre-event for the release_vg_fs event. If the resource group is taken offline on a cluster system, the corresponding application/device group is taken from the list of monitored device groups and the monitoring is disabled.

To adjust the event handling:
1. Create an additional Custom Cluster Event. See Adding a Custom Cluster Event for pair/resync monitor integration on page 32.
2. Configure the new Custom Cluster Event as a pre-event for the standard event release_vg_fs. See Redefining the Custom Cluster Event as a pre-event of the standard HACMP event on page 33.

Adding a Custom Cluster Event for pair/resync monitor integration


1. Run SMIT (HACMP section):
   #smitty hacmp
2. Select the following:
   Extended Configuration
   Extended Event Configuration
   Configure Pre/Post-Event Commands
   Add a Custom Cluster Event
3. Enter the following values:
   Cluster Event Name: release_vg_fs_pre
   Cluster Event Description: XP Cluster Extension Pre-Event
   Cluster Event Script Filename: /opt/hpclx/bin/clxstopmonhacmp

Redefining the Custom Cluster Event as a pre-event of the standard HACMP event
1. Run SMIT (HACMP section):
   #smitty hacmp
2. Select the following:
   Extended Configuration
   Extended Event Configuration
   Change/Show Pre-Defined HACMP Events
3. Select event release_vg_fs.
4. Define the previously defined Custom Cluster Event release_vg_fs_pre as a pre-event of release_vg_fs.

Timing considerations
XP Cluster Extension is designed to prefer XP disk array operations over cluster software operations. If XP Cluster Extension invokes disk pair resynchronization operations or gathers information about the remote XP disk array, XP Cluster Extension will wait until the requested status information is reported. This assumption has been made to clearly prioritize data integrity over the cluster software's failover behavior.

In some cases, however, this behavior could lead to an HACMP error event (config_too_long). The default timeout value is 6 minutes. Use the chssys command to increase the timeout. For example:
   #chssys -s clstrmgr -a "-u 60000"
NOTE: You must stop the cluster to run the chssys command.

The described timeouts can occur in the following situations:
• XP Cluster Extension uses XP RAID Manager instances to communicate with the remote XP disk array. Depending on the settings of the XP RAID Manager instance timeout parameter and the number of remote instances, the online operation could time out. This can happen if the local XP RAID Manager instance cannot reach the remote XP RAID Manager instance. See Setting up XP RAID Manager on page 20 for more details.
• XP Cluster Extension tries to resynchronize disk pairs and waits until the XP RAID Manager device group is in PAIR state if the ApplicationStartup attribute is set to RESYNCWAIT. Depending on the XP RAID Manager version and the XP firmware version, this could be a full resynchronization, which can take much longer than the online timeout interval. Even if the XP RAID Manager version and the XP firmware version allow a delta resynchronization, the delta between the primary and the secondary could be big enough for the copy process to exceed the online timeout value.
• If running in fence level ASYNC, the default value of the AsyncTakeoverTimeout can cause the resource group online process to fail because its value is set very high. This is because the takeover process for fence level ASYNC can take longer when slow communication links are in place. To prevent takeover commands from being terminated prematurely by the takeover timeout, measure the time to copy the installed XP disk array cache. To measure the copy time, use only the slowest XP Continuous Access Software link. This ensures that the XP disk array cache can be transferred from the remote XP disk array, even in the event of a single surviving replication link between the XP disk arrays.

Because the failover environment is dispersed into two (or more) data centers, the failover time cannot be expected to be the same as it would be in a single data center with a single shared disk device.
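Before measuring the cache copy time, a rough lower bound can be worked out from the cache size and the bandwidth of the slowest link. The Python sketch below is only a planning aid under stated assumptions: the function name cache_copy_seconds, the efficiency parameter, and the 64 GiB and 1000 Mbit/s figures are hypothetical and do not come from XP Cluster Extension. A measured value should always replace the estimate.

```python
def cache_copy_seconds(cache_gib, link_mbit_per_s, efficiency=1.0):
    """Rough lower bound, in seconds, for copying the array cache over one link.

    cache_gib        installed cache size in GiB (assumed input)
    link_mbit_per_s  nominal bandwidth of the slowest replication link
    efficiency       fraction of nominal bandwidth actually usable (assumption)
    """
    if cache_gib <= 0 or link_mbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("inputs must be positive and efficiency within (0, 1]")
    cache_bits = cache_gib * 1024**3 * 8
    usable_bits_per_s = link_mbit_per_s * 1_000_000 * efficiency
    return cache_bits / usable_bits_per_s


# Hypothetical figures: 64 GiB of cache over a single 1000 Mbit/s link.
estimate = cache_copy_seconds(cache_gib=64, link_mbit_per_s=1000)
print(f"{estimate:.0f} s")  # 550 s
```

Any takeover timeout chosen below such an estimate could not be met even under ideal conditions, so the estimate is useful mainly as a sanity check on the measured copy time.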

Failure behavior
XP Cluster Extension will run in an endless loop if either of the following is discovered:
• A configuration error
• An XP disk pair state that does not allow automated actions

This event is logged in the log files /var/opt/hpclx/log/clxhacmp.log and /tmp/hacmp.out.

To return control to the HACMP cluster software, you must remove the lock file application_dir/application_name.LOCK. For example:
   /etc/opt/hpclx/OracleRG.LOCK

This process has been adopted from HACMP's behavior. In the case of a failure, HACMP will also run in an endless loop until you recover all errors and manually start the application. After all errors have been recovered, invoke the command clruncmd to return control to the cluster software.
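The lock file name follows directly from the application directory and the application name. A minimal Python sketch of that naming rule is shown here; the helper names lock_file_path and remove_lock are hypothetical and are not shipped with XP Cluster Extension. The sketch only handles the lock file, so recovering the underlying errors and running clruncmd remain manual steps.

```python
from pathlib import Path, PurePosixPath


def lock_file_path(application_dir, application_name):
    """Build application_dir/application_name.LOCK as a POSIX path."""
    return PurePosixPath(application_dir) / f"{application_name}.LOCK"


def remove_lock(application_dir, application_name):
    """Remove the lock file if it exists; return True if a file was removed."""
    path = Path(str(lock_file_path(application_dir, application_name)))
    if path.is_file():
        path.unlink()
        return True
    return False


print(lock_file_path("/etc/opt/hpclx", "OracleRG"))  # /etc/opt/hpclx/OracleRG.LOCK
```

Returning False when no lock file is present lets a recovery script tell "nothing to remove" apart from a successful removal.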

Restrictions for IBM HACMP with XP Cluster Extension


The following is a summary of restrictions that apply for HACMP configurations when XP Cluster Extension is used to enable failover between two XP disk arrays:
• The FastFailbackEnabled object is not used by the XP Cluster Extension integration with HACMP.
• XP Cluster Extension must not be used with concurrent resource group configurations (for example, parallel databases).
• XP Cluster Extension must not be used with raw devices without volume groups.
• Target mode SCSI: Serial networks within HACMP based on target mode SCSI (TMSCSI) are not supported.


3 Configuring XP Cluster Extension for Windows


After installing XP Cluster Extension, use the configuration tool to define the XP Cluster Extension setup configuration. After you configure XP Cluster Extension, use Cluster Administrator (Windows Server 2003), Failover Cluster Management (Windows Server 2008/2008 R2), or cluster commands in the CLI to add and configure resources. For information about how to install XP Cluster Extension, see the HP StorageWorks XP Cluster Extension installation guide.

Integrating XP Cluster Extension with MSCS


To integrate XP Cluster Extension with MSCS:
1. Configure the XP Cluster Extension application. For instructions, see Configuring XP Cluster Extension on page 37.
2. Add an XP Cluster Extension resource. For instructions, see Adding an XP Cluster Extension resource on page 42.
3. Configure the XP Cluster Extension resource. For more information, see Configuring XP Cluster Extension resources on page 45.
4. Add dependencies on the XP Cluster Extension resource. For instructions, see Adding dependencies on an XP Cluster Extension resource on page 64.

Configuring XP Cluster Extension


After installation, you must define the setup configuration using the XP Cluster Extension configuration tool, and then copy the configuration information to all of the cluster nodes that use XP Cluster Extension. You can configure XP Cluster Extension with the GUI or the CLI. Use the following instructions for the GUI. For instructions on performing XP Cluster Extension configuration tasks with the CLI, see Defining XP Cluster Extension configuration information using the CLI on page 40.

Starting the XP Cluster Extension configuration tool


To start the XP Cluster Extension configuration tool:
• For Windows Server 2003 or Windows Server 2008/2008 R2: Double-click the HP StorageWorks XP CLX Configuration Tool icon on the desktop, or select Start > Programs > Hewlett-Packard > HP StorageWorks XP CLX Configuration Tool.
• For Server Core or Hyper-V Server: Open a command window and enter CLXXPCONFIG -I.

The XP CLX Configuration Tool window appears.


NOTE: The service name clxmonitor is appended with the text (not configured) unless the port number is configured in the configuration tool.

Defining XP Cluster Extension configuration information using the GUI


1. Open the configuration tool. For instructions, see Starting the XP Cluster Extension configuration tool on page 37.
2. The pair/resync monitor monitors the disk pair status if the ResyncMonitor attribute is set to YES, and resynchronizes disk pairs if the ResyncMonitorAutoRecover attribute is set to YES. By default, the pair/resync monitor uses port 5307. To use a port other than 5307, enter the desired port number in the Port box. The range of available ports is 1024 through 65535. The pair/resync monitor port value must be the same on all cluster nodes.
3. Specify the XP RAID Manager instances that define the device groups you want to manage with XP Cluster Extension. For more information about XP Cluster Extension and XP RAID Manager, see Setting up XP RAID Manager on page 20.
   a. Click Add in the XP RAID Manager Instance Configuration section to open the Add XP RAID Manager instances window.
   b. Select the XP RAID Manager instances to use, and then click OK.
4. Specify the servers that are possible owners for the XP Cluster Extension-managed disks. A server is a possible owner of a disk if it is capable of managing the disk when failover occurs.
   a. Click Add in the Server Configuration section to display the Add Servers window.
   b. Select the servers that are possible owners of the XP Cluster Extension-managed disks, and then click OK.
   NOTE: See the Microsoft Cluster Administrator (Windows Server 2003) or Failover Cluster Management (Windows Server 2008/2008 R2) documentation for more information about possible owners.
5. Click OK to save the information and close the configuration tool. The configuration information is saved to the ClxXpCfg file.
   NOTE: XP Cluster Extension provides an XP RAID Manager service, which automatically starts XP RAID Manager instances at system boot time. This feature reduces resource group and service and application failover times because the XP Cluster Extension resource does not need to start the XP RAID Manager instances. When you click Apply or OK in the configuration tool, the XP RAID Manager service is started.
6. Use the procedures in Importing and exporting configuration information on page 41 to copy the ClxXpCfg file to the other cluster nodes.

Defining XP Cluster Extension configuration information using the CLI


You can configure XP Cluster Extension with the CLI command CLXXPCONFIG. Enter CLXXPCONFIG -?, CLXXPCONFIG /?, or CLXXPCONFIG /help to view usage information.
1. Enter CLXXPCONFIG -I to open the configuration tool.
2. The pair/resync monitor monitors the disk pair status if the ResyncMonitor attribute is set to YES, and resynchronizes disk pairs if the ResyncMonitorAutoRecover attribute is set to YES. By default, the pair/resync monitor uses port 5307. To change the pair/resync monitor port, enter CLXXPCONFIG PRM /PORT=value, where value is the port number you want to use.
   NOTE: To view the pair/resync monitor port, enter CLXXPCONFIG PRM.
3. Specify the XP RAID Manager instances that define the device groups you want to manage with XP Cluster Extension. For more information about XP Cluster Extension and XP RAID Manager, see Setting up XP RAID Manager on page 20.
   • To view the available XP RAID Manager instances, enter CLXXPCONFIG RM.
   • To add an XP RAID Manager instance, enter CLXXPCONFIG RM /ADDVAL=value, where value is the XP RAID Manager instance you want to add. For example: Enter CLXXPCONFIG RM /ADDVAL=101 to add XP RAID Manager instance number 101.
   • To remove an XP RAID Manager instance, enter CLXXPCONFIG RM /REMOVEVAL=value, where value is the XP RAID Manager instance you want to remove. For example: Enter CLXXPCONFIG RM /REMOVEVAL=101 to remove XP RAID Manager instance number 101.
   NOTE: XP Cluster Extension provides an XP RAID Manager service, which automatically starts XP RAID Manager instances at system boot time. This feature reduces resource group and service or application failover times because the XP Cluster Extension resource does not need to start the XP RAID Manager instances. Adding or removing XP RAID Manager instances will start or restart the XP RAID Manager service.
4. Specify the servers that are possible owners for the XP Cluster Extension-managed disks. A server is a possible owner of a disk if it is capable of managing the disk when failover occurs.
   • To determine whether cluster nodes have been configured for XP Cluster Extension, enter CLXXPCONFIG SERVER.
   • To add a server, enter CLXXPCONFIG SERVER /ADD /NAME=servername, where servername is the server to add.
   • To remove a server, enter CLXXPCONFIG SERVER /REMOVE /NAME=servername, where servername is the server to remove.
5. Use the procedures in Importing and exporting configuration information on page 41 to copy the configuration information to the other cluster nodes.
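When several nodes need the same settings, the CLI commands above can be generated from a short list of inputs. The Python sketch below only assembles command strings in the forms documented in this section; it does not run them. The function name clxxpconfig_commands and the server names nodeA and nodeB are hypothetical, and the port check assumes the 1024 through 65535 range stated for the GUI also applies to the CLI.

```python
def clxxpconfig_commands(port=None, rm_instances=(), servers=()):
    """Return CLXXPCONFIG command lines for the given settings (strings only)."""
    commands = []
    if port is not None:
        # Range taken from the GUI description; assumed to apply to the CLI too.
        if not 1024 <= port <= 65535:
            raise ValueError(f"port {port} is outside 1024 through 65535")
        commands.append(f"CLXXPCONFIG PRM /PORT={port}")
    for instance in rm_instances:
        commands.append(f"CLXXPCONFIG RM /ADDVAL={instance}")
    for name in servers:
        commands.append(f"CLXXPCONFIG SERVER /ADD /NAME={name}")
    return commands


# Hypothetical inputs: a non-default port, one instance, two cluster nodes.
for line in clxxpconfig_commands(port=5308, rm_instances=[101],
                                 servers=["nodeA", "nodeB"]):
    print(line)
```

Generating the same list for every node is one way to keep the pair/resync monitor port identical across the cluster, which the configuration requires.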

Importing and exporting configuration information


The import feature allows you to define the setup configuration using an existing configuration file. The export feature allows you to save a copy of an existing configuration file. Use the import and export features to copy the XP Cluster Extension configuration file (ClxXpCfg) from one cluster node to another.

Exporting configuration settings using the GUI


1. Open the configuration tool. For instructions, see Starting the XP Cluster Extension configuration tool on page 37.
2. Click Export.
3. When prompted, choose a save location, enter a file name, and then click Save.
4. Click OK to save and close the configuration tool.

Exporting configuration settings using the CLI


1. Open a command window.
2. Enter CLXXPCONFIG EXPORT /FILE=filepath, where filepath specifies the save location and file name.

Importing configuration settings using the GUI


1. Open the configuration tool. For instructions, see Starting the XP Cluster Extension configuration tool on page 37.
2. Click Import.
3. When prompted, choose the configuration file, and then click Open.
4. Click OK to save and close the configuration tool.

Importing configuration settings using the CLI


1. Open a command window.
2. Enter CLXXPCONFIG IMPORT /FILE=filepath, where filepath specifies the file location and name.

Adding an XP Cluster Extension resource


IMPORTANT: In Cluster Administrator (Windows Server 2003), resources are added to resource groups. In Failover Cluster Management (Windows Server 2008/2008 R2), the term resource groups changed to services and applications. In this guide, the term services and applications refers to resource groups for Windows Server 2003 and services and applications for Windows Server 2008/2008 R2.

To use XP Cluster Extension, you must add an XP Cluster Extension resource. Use one of the following procedures to add an XP Cluster Extension resource:
• For Windows Server 2003, use the Cluster Administrator GUI or cluster commands in the CLI. For instructions, see Adding an XP Cluster Extension resource using the Cluster Administrator GUI (Windows Server 2003) on page 43 or Adding an XP Cluster Extension resource using the Microsoft CLI cluster commands on page 44.
• For Windows Server 2008/2008 R2, use the Failover Cluster Management GUI or cluster commands in the CLI. For instructions, see Adding an XP Cluster Extension resource using the Failover Cluster Management GUI (Windows Server 2008/2008 R2) on page 44 or Adding an XP Cluster Extension resource using the Microsoft CLI cluster commands on page 44.
• For Server Core or Hyper-V Server, use the Failover Cluster Management GUI on the remote management station or cluster commands in the CLI. For instructions, see Adding an XP Cluster Extension resource using the Failover Cluster Management GUI (Windows Server 2008/2008 R2) on page 44 or Adding an XP Cluster Extension resource using the Microsoft CLI cluster commands on page 44.

XP Cluster Extension resource names and service and application names must consist of only one word. The pair/resync monitor cannot interface with a resource or a service and application that has space characters in its name.


CAUTION: Do not use the following characters in XP Cluster Extension resource names: \ / : * ? " < > |. Using these characters might affect the creation of the resourcename.online file, which is used for the XP Cluster Extension resource health check mechanism. If the file creation fails and the pair/resync monitor is not configured, the cluster will report a failed state for the XP Cluster Extension resource.
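The naming rules can be checked before a resource is created. The following Python sketch applies the two rules stated here (a single word, and none of the listed characters); the function name resource_name_problems is hypothetical, and the check is an illustration rather than the validation XP Cluster Extension itself performs.

```python
# Characters listed in the CAUTION above: \ / : * ? " < > |
FORBIDDEN = set('\\/:*?"<>|')


def resource_name_problems(name):
    """Return a list of rule violations for a proposed resource name."""
    problems = []
    if not name:
        problems.append("name is empty")
    if any(ch.isspace() for ch in name):
        problems.append("name contains whitespace")
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append("name contains forbidden characters: " + " ".join(bad))
    return problems


print(resource_name_problems("clx_fileshare"))  # []
print(resource_name_problems("clx fileshare"))  # ['name contains whitespace']
```

Returning all violations at once, rather than stopping at the first, makes it easier to correct a name in a single pass.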

Adding an XP Cluster Extension resource using the Cluster Administrator GUI (Windows Server 2003)
Use the procedure in this section to add a resource using the Cluster Administrator GUI. For instructions on using the CLI, see Adding an XP Cluster Extension resource using the Microsoft CLI cluster commands on page 44.
1. Add a resource group in the Cluster Administrator GUI, as described in your Microsoft documentation.
2. From the File menu, select File > New > Resource.
3. Enter the following values, and then click Next:
   Name: The name of the resource.
   Description: As appropriate for the resource.
   Resource type: Select Cluster Extension XP from the list.
   Group: Select a group to associate with the resource.
4. Add or remove possible resource owners, and then click Next. The Dependencies window appears.
5. Do not add any dependencies. Click Next to open the Parameters window. The Parameters window contains values entered during the XP Cluster Extension configuration steps.
6. Modify the resource property values of the new XP Cluster Extension resource as needed, and then click Finish.

Adding an XP Cluster Extension resource using the Failover Cluster Management GUI (Windows Server 2008/2008 R2)
Use the procedure in this section to add a resource using the Failover Cluster Management GUI. For instructions on using the CLI, see Adding an XP Cluster Extension resource using the Microsoft CLI cluster commands on page 44.
1. Add a service or application in the Failover Cluster Management GUI, as described in your Microsoft documentation.
2. Right-click the service or application and select Add a resource > More resources > Add Cluster Extension XP.

Adding an XP Cluster Extension resource using the Microsoft CLI cluster commands
You can use the cluster commands in this section with Windows Server 2003, Windows Server 2008/2008 R2, Server Core, and Hyper-V Server. Use the following command to add an XP Cluster Extension resource:
cluster resource resource_name /create /group:service_or_application_name /type:"Cluster Extension XP"

Example
This command adds an XP Cluster Extension resource called clx_fileshare to the CLX_SHARE service or application.
cluster resource clx_fileshare /create /group:CLX_SHARE /type:"Cluster Extension XP"
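When many resources are created from a script, the command line can be assembled from the resource and group names. The Python sketch below builds the string in the form shown above and rejects names that are not a single word; the function name cluster_create_command is hypothetical, and the sketch prints the command rather than running it.

```python
def cluster_create_command(resource_name, group_name):
    """Build the documented 'cluster resource ... /create' command line."""
    for label, value in (("resource", resource_name), ("group", group_name)):
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"{label} name must be a single word: {value!r}")
    return (f"cluster resource {resource_name} /create "
            f'/group:{group_name} /type:"Cluster Extension XP"')


print(cluster_create_command("clx_fileshare", "CLX_SHARE"))
# cluster resource clx_fileshare /create /group:CLX_SHARE /type:"Cluster Extension XP"
```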

Changing an XP Cluster Extension resource name


When changing resource names, observe the following rules:
• Do not change an XP Cluster Extension resource name when the resource is online and the pair/resync monitor is enabled for the resource. Changing the name when the resource is online might cause problems with the pair/resync monitor functionality.
• Changing the resource name is allowed when a resource is offline, or when a resource is online and the pair/resync monitor is disabled.
• XP Cluster Extension resource names and service and application names must consist of only one word. The pair/resync monitor cannot interface with a resource or a service and application that has space characters in its name.
• Do not use the following characters in XP Cluster Extension resource names: \ / : * ? " < > |. Using these characters might affect the creation of the resourcename.online file, which is used for the XP Cluster Extension resource health check mechanism. If the file creation fails and the pair/resync monitor is not configured, the cluster will report a failed state for the XP Cluster Extension resource.

Changing an XP Cluster Extension resource name (Windows Server 2003)


In this procedure, you use the Cluster Administrator GUI to change a resource name. For instructions on using CLI commands to change a resource name, see Setting XP Cluster Extension resource properties using the CLI on page 63.

1. Open Cluster Administrator.
2. Open the resource Properties window and click the General tab.
3. Enter a new name in the Name field.
4. Click OK to save your changes and close the window.

Changing an XP Cluster Extension resource name (Windows Server 2008/2008 R2)


In this procedure, you use the Failover Cluster Management GUI to change a resource name. For Server Core or Hyper-V Server, use the MMC to run the Failover Cluster Management GUI from a remote node or use cluster commands in the CLI to change the resource name. See Setting XP Cluster Extension resource properties using the CLI on page 63 for instructions.
1. Open Failover Cluster Management.
2. Open the resource Properties window and click the General tab.
3. Enter a new name in the Resource Name field.
4. Click OK to save your changes and close the window.

Configuring XP Cluster Extension resources


IMPORTANT: In Cluster Administrator (Windows Server 2003), resources are added to resource groups. In Failover Cluster Management (Windows Server 2008/2008 R2), the term resource groups changed to services and applications. In this guide, the term services and applications refers to resource groups for Windows Server 2003 and services and applications for Windows Server 2008/2008 R2.

XP Cluster Extension resource properties are configured using Cluster Administrator (Windows Server 2003), Failover Cluster Management (Windows Server 2008/2008 R2), or cluster commands in the CLI. For information about MSCS and Microsoft Failover Cluster Service properties that affect XP Cluster Extension, see Setting Microsoft cluster-specific resource and service or application properties on page 46. For information about XP Cluster Extension-specific properties, see Setting XP Cluster Extension-specific resource properties on page 49.

Before configuring XP Cluster Extension resources, review the XP Cluster Extension objects described in Chapter 8 on page 123.

When configuring XP Cluster Extension resources, note the following:
• If the Cluster Administrator or Failover Cluster Management GUI is used to configure an XP Cluster Extension resource, configuring the resource using a user configuration file (UCF) is not required.
• If the resource is not configured to use the pair/resync monitor, XP Cluster Extension creates a file called resource_name.online in the directory specified by the ApplicationDir resource property. If the resource is taken offline, this file will be removed, or the device group associated with the service or application will be removed from the pair/resync monitor list, if the pair/resync monitor is configured for that resource. If the device group is the last monitored disk pair, and you take the resource offline, the pair/resync monitor will be stopped.
• Windows Server 2008 only: If an XP Cluster Extension resource is not configured, the resource icon in the Failover Cluster Management GUI shows the message not configured next to the resource status.
• The XP Cluster Extension resource must be the first resource for all disk resources in the dependency list of a resource cluster group.
• If you have an application in a cluster that uses more than one physical disk from the same device group, configure a single XP Cluster Extension resource, and ensure that all of the application disks depend on that resource. If the disks are split into different device groups, you must configure multiple XP Cluster Extension resources since an XP Cluster Extension resource operates at the device-group level.
• The PendingTimeout value must be greater than the ResyncWaitTimeout value.
• The PendingTimeout must be greater than twice the wait time of all remote XP RAID Manager instances multiplied by the number of remote systems. Otherwise, the XP Cluster Extension resource will fail to go online if there is a complete remote data center failure.
  \[ t_{\text{online}} > n_{\text{remote systems}} \times 2 \times t_{\text{WT}} \]
  where:
  $t_{\text{online}}$ = resource online timeout
  $n_{\text{remote systems}}$ = number of remote systems configured to run XP RAID Manager instances
  $t_{\text{WT}}$ = wait time until remote error will be reported by local XP RAID Manager instance
• If a post-executable is specified, the PendingTimeout must be greater than the number of remote systems multiplied by three times $t_{\text{WT}}$.
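The PendingTimeout rules above reduce to a small calculation. The Python sketch below returns the value that PendingTimeout must exceed; the function name min_pending_timeout and the sample figures are hypothetical, and all inputs are assumed to be in the same time unit.

```python
def min_pending_timeout(n_remote_systems, t_wt, resync_wait_timeout,
                        post_exec=False):
    """Return the bound that PendingTimeout must be strictly greater than.

    n_remote_systems     remote systems running XP RAID Manager instances
    t_wt                 wait time until the local instance reports a remote error
    resync_wait_timeout  configured ResyncWaitTimeout value
    post_exec            True if a post-executable is specified (factor 3, not 2)
    """
    factor = 3 if post_exec else 2
    return max(n_remote_systems * factor * t_wt, resync_wait_timeout)


# Hypothetical figures: 2 remote systems, 30 s wait time, ResyncWaitTimeout 90 s.
print(min_pending_timeout(2, 30, 90))                  # 120
print(min_pending_timeout(2, 30, 90, post_exec=True))  # 180
```

Taking the maximum of the two bounds covers both stated conditions at once: PendingTimeout has to exceed ResyncWaitTimeout as well as the remote-instance wait time product.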

Setting Microsoft cluster-specific resource and service or application properties


Microsoft allows you to set specific failover parameter and threshold values for a service or application, as well as for a resource. Some of these values must be changed for XP Cluster Extension to enable manual recovery actions in case of a disaster.

To set Microsoft cluster-specific resource properties:
• For Windows Server 2003, use the Cluster Administrator GUI or cluster commands in the CLI.
• For Windows Server 2008/2008 R2, use the Failover Cluster Management GUI or cluster commands in the CLI.
• For Server Core or Hyper-V Server, use cluster commands in the CLI.

TIP: You can use the GUI option for Server Core or Hyper-V Server by using the MMC to manage a cluster remotely. For more information about using the MMC, see your Microsoft documentation.

XP Cluster Extension requirements for Cluster Administrator and Failover Cluster Management resource properties are described in Table 1 on page 47. If there is no required value for a property, the valid and/or default values are specified. Set these properties in the resource properties window or the CLI. If you use the CLI, use the following command: cluster.exe resource ResourceName /prop PropertyName="PropertyValue".
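Several properties can be set in one pass by generating one command per property in the form given above. The Python sketch below only builds the command strings; the function name cluster_prop_commands is hypothetical, and the property values shown are the CLI defaults quoted in Table 1 rather than recommendations.

```python
def cluster_prop_commands(resource_name, properties):
    """Build one 'cluster.exe resource ... /prop Name="Value"' line per property."""
    return [
        f'cluster.exe resource {resource_name} /prop {name}="{value}"'
        for name, value in properties.items()
    ]


for line in cluster_prop_commands("clx_fileshare",
                                  {"IsAlivePollInterval": 60000,
                                   "LooksAlivePollInterval": 5000}):
    print(line)
```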


For more information about setting resource properties, see your Microsoft documentation.

Table 1 Setting resource properties and values in the GUI

Property: Thorough Resource Health Check Interval (Windows Server 2008/2008 R2); Is Alive poll interval (Windows Server 2003); IsAlivePollInterval (CLI)
Format: Integer
Description: Used to poll Alive state for the resource. Decreasing this value allows faster resource failure detection but also consumes more system resources. Set this value in the Advanced Policies tab of the resource properties window in Failover Cluster Management, or in the Advanced tab of the resource properties window in Cluster Administrator.
Value:
- Windows Server 2008/2008 R2 GUI: 01:00 mm:ss (Default)
- Windows Server 2003 GUI: 60000 milliseconds (Default)
- CLI: 60000 milliseconds (Default)

Property: Basic Resource Health Check Interval (Windows Server 2008/2008 R2); Looks Alive poll interval (Windows Server 2003); LooksAlivePollInterval (CLI)
Format: Integer
Description: Used to poll Alive state for the resource. Decreasing this value allows for faster resource failure detection but also consumes more system resources. Set this value in the Advanced Policies tab of the resource properties window in Failover Cluster Management, or in the Advanced tab of the resource properties window in Cluster Administrator.
Value:
- Windows Server 2008/2008 R2 GUI: 00:05 mm:ss (Default)
- Windows Server 2008/2008 R2 CLI: 5000 milliseconds (Default)
- Windows Server 2003 GUI: 60000 milliseconds (Default)
- Windows Server 2003 CLI: 60000 milliseconds (Default)

Property: If a resource fails, attempt restart on current node: Maximum restarts in the specified period (Windows Server 2008/2008 R2); Restart Threshold (Windows Server 2003); RestartThreshold (CLI)
Format: Integer
Description: Defines whether a resource can be automatically restarted after it has failed. Set this value in the Policies tab of the resource properties window in Failover Cluster Management, or in the Advanced tab of the resource properties window in Cluster Administrator.
Value:
- 0 (Required)

Property: If restart is unsuccessful, fail over all resources in this service or application (Windows Server 2008/2008 R2); RestartAction (Windows Server 2003); RestartAction (CLI)
Format: Integer
Description: Defines whether resources will be failed over if a restart is unsuccessful. Set this value in the Policies tab of the resource properties window in Failover Cluster Management, or in the Advanced tab of the resource properties window in Cluster Administrator. Windows Server 2003 only: This value must affect the group. This ensures that the resource group fails over to another system if a resource is reported FAILED.
Value:
- Windows Server 2008/2008 R2: Check (Required)
- Windows Server 2003: Restart and affect the group (Required, Default)
- CLI: 2 restart and affect the group (Required)

Property: If a resource fails, attempt restart on current node: Period for restarts (Windows Server 2008/2008 R2); RestartPeriod (Windows Server 2003); RestartPeriod (CLI)
Format: Integer
Description: Determines the amount of time for restart. Set this value in the Policies tab of the resource properties window in Failover Cluster Management, or in the Advanced tab of the resource properties window in Cluster Administrator.
Value:
- Windows Server 2008/2008 R2: 15:00 mm:ss (Default)
- Windows Server 2003: 900 seconds (Default)
- CLI: 900000 milliseconds (Default)

Property: Pending timeout (GUI); PendingTimeout (CLI)
Format: Integer
Description: Used to specify the timeout for status resolution. For more information, see Timing considerations for Microsoft Cluster Service on page 72. Set this value in the Policies tab of the resource properties window in Failover Cluster Management, or in the Advanced tab of the resource properties window in Cluster Administrator.
Value:
- Windows Server 2008/2008 R2: 03:00 mm:ss
- Windows Server 2003: 180 seconds (Default)
- CLI: 180000 milliseconds (Default)
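Because the GUI shows several of these values as mm:ss while the CLI expects milliseconds, a small conversion helper can reduce unit mistakes. The following Python sketch is illustrative only: the helper names are hypothetical, and the generated string simply follows the cluster.exe resource ResourceName /prop PropertyName="PropertyValue" syntax given earlier in this section.

```python
def mmss_to_milliseconds(value: str) -> int:
    """Convert a GUI-style 'mm:ss' value to the milliseconds used by the CLI."""
    minutes, seconds = value.split(":")
    return (int(minutes) * 60 + int(seconds)) * 1000


def resource_prop_command(resource: str, prop: str, value) -> str:
    """Build a command string in the documented /prop syntax (it is not executed here)."""
    # Quote resource names that contain spaces.
    name = f'"{resource}"' if " " in resource else resource
    return f'cluster.exe resource {name} /prop {prop}="{value}"'


if __name__ == "__main__":
    # 03:00 mm:ss is the GUI form of the 180000 ms PendingTimeout default.
    timeout_ms = mmss_to_milliseconds("03:00")
    print(resource_prop_command("clx_fileshare", "PendingTimeout", timeout_ms))
```

The sketch only formats text; review the resulting command before running it on a cluster node.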

XP Cluster Extension requirements for service or application properties are described in Table 2 on page 48. If no specific value is required, the default value is listed. Set these values in the Failover tab of the service or application properties window (Windows Server 2008/2008 R2), the resource group properties window (Windows Server 2003), or in the CLI. For more information about setting service or application properties, see your Microsoft documentation. TIP: To change the properties in Table 2 on page 48 with the CLI, use the following command: cluster group groupname /prop propertyname="propertyvalue".

Table 2 Service or application properties and values

Property: GUI: Failback (Prevent failback or Allow failback); CLI: AutoFailbackType
Format: Integer
Description: Prevents automatic fail back of a service or application to its primary system. Transfer the service or application back manually after the failure has been recovered. This allows for recovery of all possible failure sources and pair resynchronization (if necessary) while the application service is still running.
Value:
- GUI: Prevent failback
- CLI: 0 (required)

Property: GUI: Period; CLI: FailoverPeriod
Format: String
Description: Determines the time (in hours) over which the cluster service attempts to fail over a service or application. See Timing considerations for Microsoft Cluster Service on page 72 for more information.
Value:
- 6 (Default)

Property: GUI: Maximum failures in the specified period (Windows Server 2008/2008 R2), Threshold (Windows Server 2003); CLI: FailoverThreshold
Format: Integer
Description: Determines the number of failover attempts. The default value allows the cluster service to transfer the service or application to each system once in case of subsequent system failure. Due to the nature of this parameter, it is possible that the service or application automatically restarts on a system several times if all cluster systems are not members of the cluster at that time. If this value is set to a number higher than the current number of clustered systems for the cluster group, the service or application will continue to restart until either the FailoverThreshold value or the FailoverPeriod timeout value is reached.
Value:
- Windows Server 2008/2008 R2: Number of nodes in the cluster minus 1.
- Windows Server 2003, CLI: 10 (Default)
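The Table 2 guidance can be sketched in a few lines of Python. The helper names below are hypothetical; the command string follows the cluster group groupname /prop propertyname="propertyvalue" syntax shown above, and the threshold check only mirrors the FailoverThreshold description (a value higher than the number of clustered systems can lead to repeated restarts).

```python
def prevent_failback_command(group: str) -> str:
    """Build the command that sets AutoFailbackType to 0 (Prevent failback)."""
    return f'cluster group {group} /prop AutoFailbackType="0"'


def threshold_exceeds_nodes(failover_threshold: int, node_count: int) -> bool:
    """Return True when FailoverThreshold is higher than the number of clustered systems."""
    return failover_threshold > node_count


if __name__ == "__main__":
    print(prevent_failback_command("CLX_SHARE"))
    # A threshold of 10 on an assumed four-node cluster is flagged.
    print(threshold_exceeds_nodes(10, 4))
    # Four nodes minus 1 gives 3, which is not flagged.
    print(threshold_exceeds_nodes(4 - 1, 4))
```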


Setting XP Cluster Extension-specific resource properties


Changes to resource properties take effect when the resource is brought online again. For instructions on changing resource properties, see:
- Setting XP Cluster Extension resource properties using the Cluster Administrator GUI (Windows Server 2003), page 49
- Setting XP Cluster Extension resource properties using the GUI (Windows Server 2008/2008 R2, Server Core, and Hyper-V Server), page 54
- Setting XP Cluster Extension resource properties using the MMC, page 62
- Setting XP Cluster Extension resource properties using the CLI, page 63
- Setting XP Cluster Extension properties using a UCF, page 64

Setting XP Cluster Extension resource properties using the Cluster Administrator GUI (Windows Server 2003)
You can set XP Cluster Extension properties by using the Parameters tab in the Cluster Administrator GUI.

Configuring XP RAID Manager instance numbers for XP RAID Manager service
Use the Cluster Administrator Properties window to change XP RAID Manager instance numbers.
1. Open Cluster Administrator and double-click the resource you want to edit.
2. Click the Parameters tab.
3. To remove an instance, select it and click Remove.
4. To add an instance:
   a. Click Add to open the Add RAID Manager instances window.
   b. Select one or more instances, and then click OK.
5. Click OK to save your changes and close the window.

Configuring the XP RAID Manager device group details
1. Open Cluster Administrator and double-click the resource you want to edit.
2. Click the Parameters tab.
3. To change the device group details, select a new value in the RM XP device group menu.
4. Click OK to save your changes and close the window.

Configuring XP RAID Manager device group advanced properties
The Parameters tab of the XP Cluster Extension resource offers basic settings and is used to enter environment data, such as XP RAID Manager instances. The more advanced settings can be accessed through additional buttons in the Parameters tab.
1. Open Cluster Administrator and double-click the resource you want to edit.
2. Click the Parameters tab.
3. Click the Advanced button to open the Advanced Fence Level Failover Behavior window. The available settings in this window depend on the fence level used with your device groups.
   - For the DATA fence level, you can update the Data lose mirror and DATA lose data center values. See DataLoseDataCenter on page 133 and DataLoseMirror on page 134 for more information about these values.
   - For the ASYNC fence level, you can update the ASYNC takeover timeout value. See AsyncTakeoverTimeout on page 130 for more information about this value.
   - For the journal fence level, you can update the Journal data currency on S-VOL and ASYNC takeover timeout values. See JournalDataCurrency on page 136 and AsyncTakeoverTimeout on page 130 for more information about these values.
4. Update the settings as needed, and then click OK to close the window.
5. Click OK to save your changes and close the window.

Notes
- After a device group is configured in the resource configuration utility, do not change the device group name or swap the name with another device group name in the HORCM file. If you do this, restart the HORCM manager instance and reconfigure the XP Cluster Extension resource.
- Do not use HORCM commands to change the device group property for a device group that is configured for an XP Cluster Extension resource. If you do this, the changed property is not reflected immediately in the Parameters tab. To work around this situation, re-select the device group from the XP RM device group menu in the Parameters tab.

Configuring server data center assignments
1. Open Cluster Administrator and double-click the resource you want to edit.
2. Click the Parameters tab.
3. To remove a data center assignment, select the assignment, and then click Remove.
4. To modify a data center assignment, select the assignment, and then click Modify. Enter the new Data center name in the Modify Node in Data Center List window, and then click OK.
5. To add a data center assignment, click Add. Select a host and a data center, and then click OK.
6. Click OK to save your changes and close the window.

Changing failover and failback behavior
1. Open Cluster Administrator and double-click the resource you want to edit.
2. Click the Parameters tab.
3. Click Failover/Failback to display the Failover/Failback window.
4. Update the ApplicationStartup and AutoRecover values as needed, and then click OK.
5. Click OK to save your changes and close the window.

Activating the pair/resync monitor
The pair/resync monitor detects and responds to suspended XP Continuous Access links if the ResyncMonitor object is set to YES. If the ResyncMonitorAutoRecover object is set to YES, automatic disk pair resynchronization is also activated. When the resource is taken offline, the monitor is stopped for the XP RAID Manager device group used for this resource.

CAUTION: If a resource cannot be taken offline manually, and goes into a failed state, the cluster administrator must disable monitoring of the device group for this resource. To avoid data corruption, this task must be part of the recovery procedure when XP Cluster Extension is deployed in an MSCS/Failover Cluster Service environment. See Stopping the pair/resync monitor on page 115. You must ensure that the pair/resync monitor does not monitor and resynchronize the disk pair (device group) from both disk array sites.

To use the pair/resync monitor with an XP Cluster Extension resource:
1. Open Cluster Administrator and double-click the resource you want to edit.
2. Click the Parameters tab.
3. Click Pair ResyncMon to open the Pair/Resync Monitor Properties window.
4. Select the Use pair/resync monitor check box to set the ResyncMonitor object to YES.
5. Select the Pair/resync monitor autoRecovery check box to set the ResyncMonitorAutoRecover object to YES.
6. If you want to change the monitoring interval (ResyncMonitorInterval), enter a value in the Monitor interval box.
7. Click OK to save your changes and close the Pair/Resync Monitor Properties window.
8. Click OK to save your changes and close the Properties window.

TIP: You can activate ResyncMonitor from cluster commands in the CLI. For example, if your XP Cluster Extension resource is clx_fileshare, enter the following command: cluster resource clx_fileshare /privprop ResyncMonitor=yes.

Configuring takeover actions
Pre-executables and post-executables can be defined to be executed before or after XP Cluster Extension invokes its takeover functions.
1. Open Cluster Administrator and double-click the resource you want to edit.
2. Click the Parameters tab.
3. Click Pre/Post Exec to display the Pre/Post Executable Properties window.
4. Update the PreExecScript, PostExecScript, and PostExecCheck values as needed, and then click OK. When configuring pre/post takeover executable paths, enter the full path to the script. If a script fails, the XP Cluster Extension resource will fail.
5. Click OK to save your changes and close the Properties window or Resource Configuration tool.

Configuring rolling disaster protection
To configure rolling disaster protection for an XP Cluster Extension resource:
1. Open Cluster Administrator and double-click the resource you want to edit.
2. Click the Parameters tab.
3. Click Rolling Disaster to display the Rolling Disaster Protection window.
4. Add mirror units to each data center:
   a. Click Add MU # to DC A.
   b. Select mirror units from the list, and click OK.
   c. Repeat the previous steps for Data Center B.
5. Update the BCResyncEnabledA, BCResyncEnabledB, BCResyncMuListA, and BCResyncMuListB values as needed, and then click OK.
6. Click OK to save your changes and close the Properties window.

NOTE: For more information, see Setting objects to enable rolling disaster protection on page 141.

Setting XP Cluster Extension resource properties using the GUI (Windows Server 2008/2008 R2, Server Core, and Hyper-V Server)
This section describes the procedures for setting XP Cluster Extension resource properties with a GUI. You can perform these procedures through the resource configuration utility using the Failover Cluster Management GUI or the standalone resource configuration tool. For instructions on using the two GUI options, see the following sections:
- Using Failover Cluster Management to set resource properties (Windows Server 2008/2008 R2), page 55
- Using the resource configuration tool to set resource properties (Server Core and Hyper-V Server), page 55


TIP: For information on managing XP Cluster Extension resources from a remote management station through the MMC, see Setting XP Cluster Extension resource properties using the MMC on page 62.

Using Failover Cluster Management to set resource properties (Windows Server 2008/2008 R2)
For Windows Server 2008/2008 R2, use the Failover Cluster Management GUI to set resource properties.
1. Open Failover Cluster Management.
2. Double-click the XP Cluster Extension resource in the summary pane to open the Properties window.
3. Click the Parameters tab.
4. Make the necessary parameter changes, and then click OK.

Using the resource configuration tool to set resource properties (Server Core and Hyper-V Server)
For Server Core or Hyper-V Server, use the XP Cluster Extension resource configuration tool to set resource properties. When using the resource configuration tool:
- You must run the tool on a Server Core or Hyper-V cluster node. You cannot run the tool on a remote management station.
- You cannot use the resource configuration tool to add or delete a resource.
- You can use the tool to configure multiple resources at one time. This saves time because you can switch resources from the tool menu.
- The resource configuration tool is recommended for Hyper-V and Server Core environments because the properties you enter are validated. When you configure XP Cluster Extension resource properties from a remote management station or through the CLI, the properties you enter are not validated.


To use the resource configuration tool:
1. Open a command window and enter ClxXpResConfig.exe.
2. Select the resource you want to change in the XP CLX resource menu.
3. Make the necessary parameter changes, and then click OK.

Configuring XP RAID Manager instance numbers for XP RAID Manager service
To configure XP RAID Manager instance numbers from the Failover Cluster Management Parameters tab or the resource configuration tool:
1. To add an instance:
   a. Click Add to open the Add RAID Manager instances window.
   b. Select one or more instances and click OK.
2. To remove an instance, select it and click Remove.
3. Click OK to save your changes and close the window.


Configuring the XP RAID Manager device group details
To configure XP RAID Manager device group details from the Failover Cluster Management Parameters tab or the resource configuration tool:
1. Select a value in the RM XP device group menu.
2. Click OK to save your changes and close the window.

Configuring XP RAID Manager device group advanced properties
The Parameters tab of the XP Cluster Extension resource offers basic settings and is used to enter environment data, such as XP RAID Manager instances. The more advanced settings can be accessed through additional buttons in the Parameters tab.
To configure XP RAID Manager advanced properties from the Failover Cluster Management Parameters tab or the resource configuration tool:
1. Click the Advanced button to open the Advanced Fence Level Failover Behavior window. The available settings in this window depend on the fence level used with your device groups.
   - For the DATA fence level, you can update the Data lose mirror and DATA lose data center values. See DataLoseDataCenter on page 133 and DataLoseMirror on page 134 for more information about these values.
   - For the ASYNC fence level, you can update the ASYNC takeover timeout value. See AsyncTakeoverTimeout on page 130 for more information about this value.
   - For the journal fence level, you can update the Journal data currency on S-VOL and ASYNC takeover timeout values. See JournalDataCurrency on page 136 and AsyncTakeoverTimeout on page 130 for more information about these values.
2. Update the settings as needed, and then click OK to close the window.
3. Click OK to save your changes and close the window.

Notes
- After a device group is configured in the resource configuration utility, do not change the device group name or swap the name with another device group name in the HORCM file. If you do this, restart the HORCM manager instance and reconfigure the XP Cluster Extension resource.
- Do not use HORCM commands to change the device group property for a device group that is configured for an XP Cluster Extension resource. If you do this, the changed property is not reflected immediately in the Parameters tab. To work around this situation, re-select the device group from the XP RM device group menu in the Parameters tab.

Configuring server data center assignments
To configure server data center assignments from the Failover Cluster Management Parameters tab or the resource configuration tool:
1. To remove a data center assignment, select the assignment, and then click Remove.
2. To modify a data center assignment, select the assignment, and then click Modify. Enter the new Data center name in the Modify Node in Data Center List window, and then click OK.
3. To add a data center assignment, click Add. Select a host and a data center, and then click OK.
4. Click OK to save your changes and close the window.

Changing failover and failback behavior
To configure failover and failback behavior from the Failover Cluster Management Parameters tab or the resource configuration tool:
1. Click Failover/Failback to display the Failover/Failback window.
2. Update the ApplicationStartup and AutoRecover values as needed, and then click OK.
3. Click OK to save your changes and close the Properties window or Resource Configuration tool.

Activating the pair/resync monitor
The pair/resync monitor detects and responds to suspended XP Continuous Access links if the ResyncMonitor object is set to YES. If the ResyncMonitorAutoRecover object is set to YES, automatic disk pair resynchronization is also activated. When the resource is taken offline, the monitor is stopped for the XP RAID Manager device group used for this resource.

CAUTION: If a resource cannot be taken offline manually, and goes into a failed state, the cluster administrator must disable monitoring of the device group for this resource. To avoid data corruption, this task must be part of the recovery procedure when XP Cluster Extension is deployed in an MSCS environment. See Stopping the pair/resync monitor on page 115. You must ensure that the pair/resync monitor does not monitor and resynchronize the disk pair (device group) from both disk array sites.

To activate the pair/resync monitor from the Failover Cluster Management Parameters tab or the resource configuration tool:
1. Click Pair ResyncMon to open the Pair/Resync Monitor Properties window.
2. Select the Use pair/resync monitor check box to set the ResyncMonitor object to YES.
3. Select the Pair/resync monitor autoRecovery check box to set the ResyncMonitorAutoRecover object to YES.
4. If you want to change the monitoring interval (ResyncMonitorInterval), enter a value in the Monitor interval box.
5. Click OK to save your changes and close the Pair/Resync Monitor Properties window.
6. Click OK to save your changes and close the Properties window or Resource Configuration tool.

TIP: You can activate ResyncMonitor from the Microsoft CLI. For example, if your XP Cluster Extension resource is clx_fileshare, enter the following command: C:\>cluster resource clx_fileshare /privprop ResyncMonitor=yes.

Configuring takeover actions
Pre-executables and post-executables can be defined to be executed before or after XP Cluster Extension invokes its takeover functions.
To configure takeover actions from the Failover Cluster Management Parameters tab or the resource configuration tool:
1. Click Pre/Post Exec to display the Pre/Post Executable Properties window.
2. Update the PreExecScript, PostExecScript, and PostExecCheck values as needed, and then click OK. When configuring pre/post takeover executable paths, enter the full path to the script. If a script fails, the XP Cluster Extension resource will fail.
3. Click OK to save your changes and close the Properties window or Resource Configuration tool.

Configuring rolling disaster protection
To configure rolling disaster protection from the Failover Cluster Management Parameters tab or the resource configuration tool:
1. Click Rolling Disaster to display the Rolling Disaster Protection window.
2. Add mirror units to each data center:
   a. Click Add MU # to DC A.
   b. Select mirror units from the list, and click OK.
   c. Repeat the previous steps for Data Center B.
3. Update the BCResyncEnabledA, BCResyncEnabledB, BCResyncMuListA, and BCResyncMuListB values as needed, and then click OK.
4. Click OK to save your changes and close the Properties window or Resource Configuration tool.

NOTE: For more information, see Setting objects to enable rolling disaster protection on page 141.

Setting XP Cluster Extension resource properties using the MMC


If you are using Server Core or Hyper-V Server, you can manage a cluster remotely by using the MMC to run Failover Cluster Management.


NOTE: When you configure XP Cluster Extension resource properties from a remote management station through the MMC, which uses the standard Microsoft Properties tab, the properties you enter are not validated, so you must enter the property values accurately, and verify them against the XP Cluster Extension documentation. When you use this option, you will see the default Microsoft properties page instead of the XP Cluster Extension Parameters tab. For more information about using the MMC, see Remote management of XP Cluster Extension resources in a cluster (Windows Server 2008/2008 R2) on page 73 and your Microsoft documentation.

Setting XP Cluster Extension resource properties using the CLI


The cluster commands in this section can be used with Windows Server 2003, Windows Server 2008/2008 R2, Server Core, or Hyper-V Server. The MSCS default properties for a resource can be changed using the following command:
cluster resource resource_name /privprop [object_name=value|"value1 value2 ..."]

NOTE: When you configure XP Cluster Extension resource properties using the CLI, the properties you enter are not validated, so you must enter the property values accurately, and verify them against the XP Cluster Extension documentation.

You can display all attributes of the XP Cluster Extension resource clx_fileshare with the following command:
cluster resource clx_fileshare /privprop


The following example changes the FenceLevel property of the XP Cluster Extension resource clx_fileshare:
C:\>cluster resource clx_fileshare /privprop FenceLevel=data

The following example changes the XP RAID Manager instance used for the XP Cluster Extension resource clx_fileshare from 10 to 99, and then adds instance 22 to provide redundancy:
C:\>cluster resource clx_fileshare /privprop RaidManagerInstances="99 22"

The following example changes the name of XP Cluster Extension resource XP Cluster Extension resource1 to XP Cluster Extension resource2:
cluster resource "XP Cluster Extension resource1" /ren:"XP Cluster Extension resource2"
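Because properties entered through the CLI are not validated, it can help to generate the /privprop command strings programmatically so that multi-value properties are quoted consistently. The following Python sketch is illustrative only; the helper name is hypothetical, and the output format simply reproduces the examples above.

```python
def privprop_command(resource: str, name: str, values) -> str:
    """Build a 'cluster resource ... /privprop' command string (it is not executed here)."""
    # Accept a single value or a list of values.
    if isinstance(values, (str, int)):
        values = [values]
    joined = " ".join(str(v) for v in values)
    # Several values are passed as one quoted, space-separated string.
    if len(values) > 1:
        joined = f'"{joined}"'
    # Quote resource names that contain spaces.
    res = f'"{resource}"' if " " in resource else resource
    return f"cluster resource {res} /privprop {name}={joined}"


if __name__ == "__main__":
    print(privprop_command("clx_fileshare", "FenceLevel", "data"))
    print(privprop_command("clx_fileshare", "RaidManagerInstances", [99, 22]))
```

The two calls reproduce the FenceLevel and RaidManagerInstances commands shown in the examples above.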

Setting XP Cluster Extension properties using a UCF


You can use a UCF to configure certain XP Cluster Extension properties for Windows. Properties that you can configure in a UCF include:
- LogLevel
- ClusterNotifyCheckTime
- ClusterNotifyWaitTime
- LocalDCLMForNonPAIRDG
- StatusRefreshInterval

IMPORTANT: If you plan to use the default values for these properties, no UCF is required.

To configure properties using a UCF:
1. Take the XP Cluster Extension resource offline.
2. Open the sample UCF.cfg file located in %HPCLX_PATH%\sample.
3. Update the file with the property values you want to use. For more information on the available properties, see Chapter 8 on page 123.
4. Save the file and copy it to the following directory on all cluster nodes: %HPCLX_PATH%\conf.
5. Bring the XP Cluster Extension resource online.

Adding dependencies on an XP Cluster Extension resource


XP Cluster Extension Software must be the first resource in the resource chain of an MSCS service or application. All resources that depend on the disk resource, such as a file share, and all disk resources (physical disks), must be configured for dependency on the XP Cluster Extension resource.

When adding dependencies:
- For Windows Server 2003, use the Cluster Administrator GUI or cluster commands in the CLI.
- For Windows Server 2008/2008 R2, use the Failover Cluster Management GUI, cluster commands in the CLI, or the MMC for remote management.
- For Server Core or Hyper-V Server, use cluster commands in the CLI or the MMC.

Adding dependencies using Cluster Administrator (Windows Server 2003)


1. Open Cluster Administrator.
2. Select the Resources folder in the console-tree.
3. Double-click the disk resource you want to edit.
4. Click the Dependencies tab, then click Modify.
5. Add the XP Cluster Extension resource to the Dependencies of the disk resource.
6. Click OK to finish your modifications.

Adding dependencies using Failover Cluster Management (Windows Server 2008/2008 R2)
You can add dependencies with the GUI on a local node or by using the MMC to run the Failover Cluster Management application.
1. Open Failover Cluster Management.
2. Select a service or application that has an XP Cluster Extension resource.
3. Double-click a disk in the summary pane.
4. Click the Dependencies tab, and then click Insert.
5. Select the XP Cluster Extension resource in the Resource menu.
6. Click OK to add the selected dependency.


Adding dependencies using the CLI


The cluster commands in this section can be used with Windows Server 2003, Windows Server 2008/2008 R2, Server Core, or Hyper-V Server. To add a dependency on an XP Cluster Extension resource using the CLI, use the following command:
cluster resource physical_disk_resource /adddependency:Cluster_Extension_XP_resource

The following command adds a dependency on the XP Cluster Extension clx_fileshare resource to the physical disk resource Disk_32b_00b:
cluster resource Disk_32b_00b /adddependency:clx_fileshare
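When an application uses several physical disks from the same device group, every disk resource needs the same dependency. The following Python sketch generates one /adddependency command per disk. The helper name and the second disk name (Disk_32b_00c) are hypothetical; only the command syntax comes from this section.

```python
def add_dependency_commands(disk_resources, clx_resource: str):
    """Return one '/adddependency' command string per physical disk resource."""
    return [
        f"cluster resource {disk} /adddependency:{clx_resource}"
        for disk in disk_resources
    ]


if __name__ == "__main__":
    # Disk_32b_00c is an assumed second disk in the same device group.
    for command in add_dependency_commands(["Disk_32b_00b", "Disk_32b_00c"], "clx_fileshare"):
        print(command)
```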

Disaster-tolerant configuration example using a file share


The following example describes a configuration in which:
- Your environment consists of four systems (host1_DCA, host2_DCA, host3_DCB and host4_DCB).
- Your environment includes two XP disk arrays with serial numbers 35014 and 35013.
- You have configured clxfileshare as the device group in the XP RAID Manager c:\windows\horcm101.conf file and in the c:\windows\horcm102.conf file.
- A pre-executable clxpre.exe will be invoked by XP Cluster Extension.
- You use the default failover behavior for the cluster group.
- The resource CLX_FILESHARE is part of the service group CLX_SHARE and must be brought online before the physical disk resource Disk_32b_00b.

Figure 4 on page 67 illustrates failover options and shows a second cluster group CLX_IIS. Figure 5 on page 67 is a sample CLX_FILESHARE resource screen shot, and Figure 6 on page 68 is an example of the resource tree for service or application CLX_SHARE.


Figure 4 Service or application example (quorum service control disks not shown)

Figure 5 CLX_FILESHARE resource sample

Figure 6 XP Cluster Extension resource tree for CLX_SHARE

XP Cluster Extension is configured as a single resource to enable read/write access to the physical disk resource used for the CLX_SHARE cluster group. The physical disk resource depends on the XP Cluster Extension resource and can be brought online only when the XP Cluster Extension resource is already online. Independent of this resource tree, the network card will be configured with the CLX_SHARE service or application's (resource group's) IP address and network name. If all those resources have been brought online, the file share can be started.

To configure the XP Cluster Extension resource according to the configuration in Figure 6 on page 68:

1. Log in to the host3_DCB system with the Administrator account.
2. Create the file share service or application with all previously mentioned resources and its dependencies, except the XP Cluster Extension resource on host3_DCB.
3. Create a new resource of type XP Cluster Extension and add systems host2_DCA, host3_DCB, and host4_DCB to its possible owners.
4. Change the restart behavior of the XP Cluster Extension resource so that the resource can be restarted and so that the restart affects the group. Set the number of restarts to 0.
5. Edit the resource properties, including the following information:
   - XP RAID Manager instances
   - XP RAID Manager device group details
   - Server data center assignments
6. Click the Pre/Post Exec button and add clxpre.exe with its full path. (The clxpre.exe program is an example. It is not included in the XP Cluster Extension product.)
7. Add a dependency on the XP Cluster Extension resource CLX_FILESHARE to the physical disk resource Disk_32b_00b.
8. Check the cluster service, group, and resource settings with the following commands:
   C:\>cluster group CLX_SHARE /prop
   C:\>cluster resource CLX_FILESHARE /prop
9. For Windows Server 2003 only: Set the XP Cluster Extension resource property RestartAction to zero (0), or check the Do not restart check box in the resource's Advanced tab window, and then use the following commands to check if the value has changed. For example:
   C:\>cluster resource CLX_FILESHARE /prop RestartAction=0
   C:\>cluster resource CLX_FILESHARE /prop
   If you are using the CLI to set resource properties, the equivalent command is cluster res CLX_FILESHARE /prop RestartAction=0.
10. For Windows Server 2008/2008 R2 only: Enable the XP Cluster Extension resource property If restart is unsuccessful, fail over all resources in this service or application. This value is set in the Policies tab in the Failover Cluster Management Properties window. If you are using the CLI to set resource properties, the equivalent command is cluster res CLX_FILESHARE /prop RestartAction=0.
11. Bring the service or application online on host3_DCB by using the Failover Cluster Management GUI, Cluster Administrator GUI, or the following cluster command in the CLI:
   C:\>cluster group CLX_SHARE /online:host3_DCB
12. Verify that the XP Cluster Extension resource and all other CLX_SHARE application resources are brought online:
   C:\>cluster group CLX_SHARE
13. Take the service or application offline, and verify that all resources are stopped:
   C:\>cluster group CLX_SHARE /offline
   C:\>cluster group CLX_SHARE
14. Bring the service or application online again and verify that all resources are available:
   C:\>cluster group CLX_SHARE /online:host3_DCB
   C:\>cluster group CLX_SHARE
15. Check the cluster service settings of system host4_DCB, and the group and resource settings.
16. Move the service or application to system host4_DCB and verify that all resources are available:
   C:\>cluster group CLX_SHARE /moveto:host4_DCB
   C:\>cluster group CLX_SHARE
17. Check the cluster service settings of system host2_DCA, and the group and resource settings.
18. Move the service or application to system host2_DCA and verify that all resources are available:
   C:\>cluster group CLX_SHARE /moveto:host2_DCA
   C:\>cluster group CLX_SHARE
19. Check the cluster service settings of system host1_DCA, and the group and resource settings.
20. Take the service or application offline, and verify that all resources are stopped:
   C:\>cluster group CLX_SHARE /offline
   C:\>cluster group CLX_SHARE
21. Change the XP Cluster Extension resource to be able to restart on another system:
   C:\>cluster resource CLX_FILESHARE /prop RestartAction=2
   C:\>cluster resource CLX_FILESHARE /prop
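The online, offline, and move checks in the procedure above follow a fixed pattern, so the command lines can be generated before a test window and reviewed as a checklist. The following is a minimal sketch in Python, assuming the cluster command forms shown in this section; the function name failover_test_commands is hypothetical and is not part of XP Cluster Extension or of the Microsoft tools. The sketch only builds strings and does not run any command.

```python
def failover_test_commands(group, nodes):
    """Build the cluster.exe command lines for a manual failover test.

    group: name of the service or application (for example, CLX_SHARE).
    nodes: node names in the order in which they should own the group;
           the first node is where the group is brought online.
    Hypothetical helper: it returns strings only and runs nothing.
    """
    if not nodes:
        raise ValueError("at least one node name is required")
    status = f"cluster group {group}"  # status query used after every change
    first = nodes[0]
    commands = [
        f"cluster group {group} /online:{first}",
        status,
        f"cluster group {group} /offline",
        status,
        f"cluster group {group} /online:{first}",
        status,
    ]
    for node in nodes[1:]:
        commands += [f"cluster group {group} /moveto:{node}", status]
    commands += [f"cluster group {group} /offline", status]
    return commands


if __name__ == "__main__":
    plan = failover_test_commands("CLX_SHARE", ["host3_DCB", "host4_DCB", "host2_DCA"])
    for line in plan:
        print(line)
```

The generated list is meant to be pasted into a test plan; the settings checks between moves still have to be done by hand.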

Managing XP Cluster Extension resources


You can manage resources by bringing them online and offline, or by deleting them.

Bringing a resource online


Resources are usually brought online automatically when the service or application is brought online. You might need to move the service or application to the node where you want to bring the resource online.

When bringing resources online:
- For Windows Server 2008/2008 R2, use the GUI, MMC, or CLI.
- For Server Core or Hyper-V Server, use the CLI or the MMC.
- For Windows Server 2003, use the GUI or CLI.

For more information on bringing resources online, see your Microsoft documentation.

Taking a resource offline


Resources are usually taken offline automatically when the service or application is taken offline. Taking a resource offline causes resources that depend on that resource to go offline.

When taking resources offline:
- For Windows Server 2008/2008 R2, use the GUI, MMC, or CLI.
- For Server Core or Hyper-V Server, use the CLI or the MMC.
- For Windows Server 2003, use the GUI or CLI.

For more information on taking resources offline, see your Microsoft documentation.

Deleting a resource
Deleting a running resource causes the resource and its dependents to go offline.

CAUTION: Deleting a running XP Cluster Extension resource does not remove the resource_name.online file and does not remove the device group from the list of monitored device groups if the pair/resync monitor is used to monitor the XP Continuous Access Software link. Therefore, the device group must be deleted from the list of monitored device groups manually using the clxchkmon command after deleting the XP Cluster Extension resource. See Stopping the pair/resync monitor on page 115.

CAUTION: Failure to delete the monitored device group from the list of monitored device groups can cause data corruption if the ResyncMonitorAutoRecover attribute is set to YES.

When deleting resources:
- For Windows Server 2008/2008 R2, use the GUI or CLI.
- For Server Core or Hyper-V Server, use the CLI or the MMC.
- For Windows Server 2003, use the GUI or CLI.

For more information on deleting resources, see your Microsoft documentation.

Using Hyper-V Live Migration with XP Cluster Extension


Live migration is a managed failover of VM resources. Live migration should be performed when all of the solution constituents are in a healthy state, all the servers and systems are running, and all the links are up. Ensure that the underlying infrastructure is in a healthy state before performing live migration.

XP Cluster Extension can discover storage-level conditions that are unfavorable for live migration. In response to these conditions, XP Cluster Extension stops or cancels the live migration process and informs the user. This is accomplished with no VM downtime. For example, if live migration is initiated while VM data residing on the storage arrays is still merging and not in sync, XP Cluster Extension proactively cancels the live migration and informs the user to wait until the merge is complete. Without this feature, live migration might fail or the VM might come online in the remote data center with inconsistent data.

The XP Cluster Extension StatusRefreshInterval property, which you can configure in a UCF for each application, specifies the time interval between consecutive array status gathering operations before the live migration to the target cluster node occurs. By adjusting this property, you can increase the probability of getting the correct XP array status to ensure a successful live migration. The default StatusRefreshInterval value is 300 seconds. For more information about configuring this property, see Setting XP Cluster Extension properties using a UCF on page 64.

XP Cluster Extension cancels live migration operations within the local data center when the device group is not in PAIR status. Use the LocalDCLMForNonPAIRDG property, which can be configured in a UCF for each application, to change the setting to allow live migration to occur within the local data center even if the device group is not in PAIR status.

Hyper-V Live Migration is supported with XP Cluster Extension for Windows Server 2008 R2 using only the synchronous fence levels DATA and NEVER. The asynchronous and journal fence levels are not supported. Using Hyper-V Live Migration with Cluster Shared Volumes is not supported with XP Cluster Extension.

TIP: For more information about using Hyper-V Live Migration with XP, see the white paper Live Migration across data centers and disaster tolerant virtualization architecture with HP StorageWorks Cluster Extension and Microsoft Hyper-V on the white papers website: www.hp.com/storage/whitepapers.
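The cancel-or-allow behavior described in this section can be summarized as a small decision function. The following is a sketch in Python under stated assumptions, not the product's logic: the function and parameter names are hypothetical, the status string COPY stands for any non-PAIR device group status, and the handling of a non-PAIR device group during a migration to the remote data center is inferred from the merge example in this section.

```python
def live_migration_allowed(pair_status, target_in_local_dc,
                           local_dc_lm_for_non_pair_dg=False):
    """Sketch of the live migration decision described in this section.

    pair_status: device group status as a string (for example, "PAIR").
    target_in_local_dc: True if the target node is in the same data center.
    local_dc_lm_for_non_pair_dg: models the LocalDCLMForNonPAIRDG property.
    Assumption: any non-PAIR status cancels a migration to the remote
    data center; the source states this only for the merge example.
    """
    if pair_status == "PAIR":
        return True
    # The device group is not in PAIR status from here on.
    if target_in_local_dc and local_dc_lm_for_non_pair_dg:
        return True
    return False


if __name__ == "__main__":
    print(live_migration_allowed("PAIR", target_in_local_dc=False))
    print(live_migration_allowed("COPY", target_in_local_dc=True))
    print(live_migration_allowed("COPY", True, local_dc_lm_for_non_pair_dg=True))
```

The sketch is useful mainly as a review aid when deciding whether to set LocalDCLMForNonPAIRDG for an application.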


Timing considerations for MSCS


XP Cluster Extension gives priority to XP disk array operations over cluster software operations. If XP Cluster Extension invokes a disk pair resynchronization operation or gathers information about the remote XP disk array, XP Cluster Extension waits until the requested status information is reported. This ensures the priority of data integrity over cluster software failover processes. This behavior can lead to failed XP Cluster Extension resources as described below:

- XP Cluster Extension uses XP RAID Manager instances to communicate with the remote XP disk array. Depending on the setting of the XP RAID Manager instance timeout parameter and the number of remote instances, the online operation could time out. This can occur if the local XP RAID Manager instance cannot reach the remote XP RAID Manager instance.
- XP Cluster Extension tries to resynchronize disk pairs and waits until the XP RAID Manager device group is in PAIR state if the ApplicationStartup resource property is set to RESYNCWAIT. XP RAID Manager and the XP firmware fully support delta resynchronization; however, the delta between the primary and secondary disks could be large enough for the copy process to exceed the resource PendingTimeout value. The ResyncWaitTimeout object can cause XP Cluster Extension resources to fail if its value is higher than the resource PendingTimeout value.
- If running in fence level ASYNC, the default value of AsyncTakeoverTimeout can cause the resource to fail because its value exceeds the resource PendingTimeout value. The takeover process for fence level ASYNC can take much longer when slow communications links are in place. To prevent takeover commands from being terminated by the resource PendingTimeout, measure the time required to copy the installed XP disk array cache and adjust the resource PendingTimeout value. When measuring the copy time, measure only the slowest link used for XP Continuous Access Software. This ensures that the XP disk array cache can be transferred from the remote XP disk array, even in the event of a single surviving replication link between the XP disk arrays.

In general, because the failover environment is dispersed into two (or more) data centers, the failover time cannot be expected to be the same as that in a single data center with a single shared disk device. Therefore, the following values of the XP Cluster Extension resource and the service or application using that resource must be adjusted, based on failover tests performed to verify the proper configuration setup: FailoverPeriod, RestartPeriod, PendingTimeout, LookAlive, and IsAlive. In addition, the service or application's FailoverPeriod value must be higher than the resource's RestartPeriod value, and both must be higher than the resource's PendingTimeout value.

MSCS provides two parameters to adjust state change recognition/resolution:
- IsAlive
- LookAlive

XP Cluster Extension automatically calls the IsAlive function whenever the cluster service calls the LookAlive function. Therefore, both parameters must be set to the same value.
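The ordering rules in this section (FailoverPeriod above RestartPeriod, both above PendingTimeout, equal LookAlive and IsAlive values, and the two XP Cluster Extension timeouts not exceeding PendingTimeout) can be checked mechanically before a failover test. The following is a minimal sketch in Python; the function name check_mscs_timing is hypothetical, and the sketch assumes the caller has already converted every duration to one common unit, because the cluster may store these values in different units.

```python
def check_mscs_timing(failover_period, restart_period, pending_timeout,
                      look_alive, is_alive,
                      resync_wait_timeout=None, async_takeover_timeout=None):
    """Return a list of findings; an empty list means no rule was violated.

    All durations must be expressed in the same unit (for example, seconds).
    Hypothetical helper based on the rules stated in this section.
    """
    findings = []
    if failover_period <= restart_period:
        findings.append("FailoverPeriod must be higher than RestartPeriod")
    if failover_period <= pending_timeout:
        findings.append("FailoverPeriod must be higher than PendingTimeout")
    if restart_period <= pending_timeout:
        findings.append("RestartPeriod must be higher than PendingTimeout")
    if look_alive != is_alive:
        findings.append("LookAlive and IsAlive must be set to the same value")
    if resync_wait_timeout is not None and resync_wait_timeout > pending_timeout:
        findings.append("ResyncWaitTimeout is higher than PendingTimeout")
    if async_takeover_timeout is not None and async_takeover_timeout > pending_timeout:
        findings.append("AsyncTakeoverTimeout exceeds PendingTimeout")
    return findings


if __name__ == "__main__":
    # Illustrative numbers only; they are not recommended settings.
    for finding in check_mscs_timing(21600, 900, 600, 60, 30, 300, 1800):
        print(finding)
```

The numbers in the usage example are placeholders; real values come from the failover tests this section calls for.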

Bouncing service or application


If the ApplicationStartup property has been set to FASTFAILBACK and no remote system is available, the service or application alternates (starts and fails) between local nodes until the service or application restart limit has been reached. For more information, see ApplicationStartup on page 129. The FastFailbackEnabled property is not used by the XP Cluster Extension integration with MSCS.


Administration
XP Cluster Extension administration includes remote management of resources and monitoring of system resources and logs.

Remote management of XP Cluster Extension resources in a cluster (Windows Server 2008/2008 R2)
You can use the MMC with Failover Cluster Management to manage clusters and configure XP Cluster Extension resources. Note the following when configuring XP Cluster Extension resources by using the MMC from a remote management station:

- When you use the MMC to remotely configure XP Cluster Extension resource properties in a Server Core or Hyper-V Server cluster node, the Failover Cluster Management GUI on the remote management station displays the standard Microsoft Properties tab instead of the customized XP Cluster Extension Parameters tab. For more information about the Properties tab, see Setting XP Cluster Extension resource properties using the MMC on page 62.
- When you install XP Cluster Extension into a Windows Server 2008/2008 R2 environment, the resource extension DLL is registered by default, which prevents you from configuring an XP Cluster Extension resource from a remote management station. If you need to remotely configure an XP Cluster Extension resource in a Windows Server 2008/2008 R2-based cluster, unregister clxmscsex.dll from the cluster node, which allows you to configure the XP Cluster Extension resource using the standard Microsoft Properties tab. Use the command cluster /UNREGADMINEXT:clxmscsex.dll to unregister the DLL.

  CAUTION: Configuring XP Cluster Extension resources using the MMC from a remote management station is supported using only the standard Microsoft Properties tab. Do not try to use the customized XP Cluster Extension Parameters tab for this purpose.

- If you see the customized XP Cluster Extension Parameters tab when you try to configure an XP Cluster Extension resource from a remote management station using the MMC, you must unregister clxmscsex.dll from the cluster node. Use the command cluster /UNREGADMINEXT:clxmscsex.dll to unregister the DLL. Unregistering the DLL allows you to configure the resource using the standard Microsoft Properties tab. This situation might occur if you have a cluster with both Server Core or Hyper-V Server and Windows Server 2008/2008 R2 cluster nodes.
- When you configure XP Cluster Extension resource properties from a remote management station through the MMC, which uses the standard Microsoft Properties tab, the properties you enter are not validated, so you must enter the property values accurately, and verify them against the XP Cluster Extension documentation.

Remote management of XP Cluster Extension resources in a cluster (Windows Server 2003)


You can use Cluster Administrator to manage clusters and configure XP Cluster Extension resources. When using Cluster Administrator to configure XP Cluster Extension resources from a remote management station, note the following:

- In a Windows Server 2003 cluster with XP Cluster Extension installed, when you try to use Cluster Administrator to configure an XP Cluster Extension resource from a remote management station, you will see the customized XP Cluster Extension Parameters tab. The customized tab is displayed because the resource extension DLL is registered by default on Windows Server 2003 cluster nodes, which prevents you from configuring the XP Cluster Extension resource from a remote management station. If you need to configure an XP Cluster Extension resource remotely for a Windows Server 2003-based cluster, unregister clxmscsex.dll from the cluster node, which allows you to remotely configure an XP Cluster Extension resource using the standard Microsoft Properties tab. Use the command cluster /UNREGADMINEXT:clxmscsex.dll to unregister the DLL.

  NOTE: Configuring XP Cluster Extension resources by using Cluster Administrator from a remote management station is supported using only the standard Microsoft Properties tab. Do not try to use the customized XP Cluster Extension Parameters tab for this purpose.

- When you configure XP Cluster Extension resource properties from a remote management station through the Cluster Administrator, which uses the standard Microsoft Properties tab, the properties you enter are not validated, so you must enter the property values accurately, and verify them against the XP Cluster Extension documentation.

System resources
Monitor the system resources on a regular basis as part of Windows administration. If any system resource usage by the cluster service is reaching maximum levels, stop and then restart the cluster service. This action automatically fails over the resources and resets system resources. See the MSCS documentation for information about how to stop a cluster service. An alternate method is to manually move all resources to another node in the cluster before stopping the cluster service. After all resources are successfully moved to another node, stop and then restart the cluster service; then, manually move back all resources.

Logs
If the XP Cluster Extension log files need to be cleared and reset (for example, to reduce disk space usage), you can delete the files. XP Cluster Extension automatically creates new log files. TIP: Archive the log files before deleting them.

Hyper-V Live Migration log entries


In the XP Cluster Extension log file (clxmscs.log), live migration messages include the prefix CLX_LM to help you differentiate live migration issues from XP Cluster Extension log messages. For example:
[10/12/09 20:13:02][2136][CLX_LM: CLXVMDISK04-App01][INFO] CLX detected that Live Migration for VM "Virtual Machine VM04" has begun.
[10/12/09 20:13:02][2136][CLX_LM: CLXVMDISK04-App01][INFO] CLX started gathering VM "Virtual Machine VM04" specific storage information.
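Because live migration messages carry the CLX_LM prefix, they can be filtered out of clxmscs.log with a short script. The following is a minimal sketch in Python, assuming every entry has the four bracketed fields seen in the example above (timestamp, a numeric identifier, a source field, and a severity) followed by the message text. The field names and the function name are hypothetical; the meaning of the numeric field is not stated in this guide.

```python
import re

# Four bracketed fields followed by free text, as in the example entries.
LOG_LINE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]"
    r"\[(?P<ident>\d+)\]"
    r"\[(?P<source>[^\]]+)\]"
    r"\[(?P<level>[A-Z]+)\]\s*(?P<message>.*)"
)


def live_migration_entries(lines):
    """Return parsed entries whose source field starts with CLX_LM."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("source").startswith("CLX_LM"):
            entries.append(match.groupdict())
    return entries


if __name__ == "__main__":
    sample = [
        '[10/12/09 20:13:02][2136][CLX_LM: CLXVMDISK04-App01][INFO] '
        'CLX detected that Live Migration for VM "Virtual Machine VM04" has begun.',
    ]
    for entry in live_migration_entries(sample):
        print(entry["timestamp"], entry["level"], entry["message"])
```

Lines that do not match the assumed layout are skipped rather than reported, which keeps the filter tolerant of other log content.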


4 Configuring XP Cluster Extension for Solaris


HP StorageWorks XP Cluster Extension for VCS provides a resource agent to VCS. This allows cluster administrators to configure the XP disk array-specific failover behavior as easily as any other resource in VCS. XP Cluster Extension objects are configured as attributes of a resource in VCS. For information about how to install XP Cluster Extension, see the HP StorageWorks XP Cluster Extension installation guide. For supported configurations, see the HP SPOCK website: http://www.hp.com/storage/spock.

Configuration of the XP Cluster Extension agent


The XP Cluster Extension agent is preconfigured to fit most of your cluster configurations. It comes with a sample configuration that can be modified to fit your VCS and disk array environments. Before configuring the XP Cluster Extension agent, review VCS resource attributes of the XP Cluster Extension resource type.

Disaster tolerant configuration example using a web server


The following example describes a configuration in which:

- There are four systems: sunrise, dawn, sunset, and dusk.
- There are two disk arrays with serial numbers 35014 and 35013.
- web is configured as a device group in the XP RAID Manager /etc/horcm11.conf file.
- A pre-executable web_pre.sh and a post-executable web_post.sh will be invoked by XP Cluster Extension.
- You are using the default failover behavior for the service group.

The resource clx_web is part of the service group CLX_WEB_SERVER, and must be brought online before the DiskGroup resources webdg and httpddg. The XP RAID Manager device group web includes all disks for the VxVM disk groups webdg and httpddg in the following example. Figure 7 on page 76 illustrates failover options and shows a second service group, CLX_ORACLE.


Example of the clx_web resource


ClusterExtensionXP clx_web (
    XPSerialNumbers = { 35014, 35013 }
    RaidManagerInstances = { 11 }
    DeviceGroup = web
    PreExecScript = "/etc/opt/hpclx/web_pre.sh"
    PostExecScript = "/etc/opt/hpclx/web_post.sh"
    DC_A_Hosts = { sunrise, dawn }
    DC_B_Hosts = { sunset, dusk }
    )
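When several service groups need similar resources, a resource block like the clx_web example can be generated from a dictionary of attributes. The following is a sketch in Python that only mirrors the layout of that example; the function name render_clx_resource and the quoting rule (quote a scalar string unless it consists only of letters, digits, and underscores) are assumptions, and the sketch does not implement the full main.cf grammar.

```python
import re


def render_clx_resource(name, attributes):
    """Render a ClusterExtensionXP resource block in the style of clx_web.

    attributes: mapping of attribute name to a scalar or a list of values.
    Hypothetical helper; verify the result with the VCS syntax check
    before using it in a real configuration.
    """
    lines = [f"ClusterExtensionXP {name} ("]
    for key, value in attributes.items():
        if isinstance(value, (list, tuple)):
            rendered = "{ " + ", ".join(str(item) for item in value) + " }"
        elif isinstance(value, str) and not re.fullmatch(r"\w+", value):
            rendered = f'"{value}"'  # quote paths and other non-word strings
        else:
            rendered = str(value)
        lines.append(f"    {key} = {rendered}")
    lines.append("    )")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_clx_resource("clx_web", {
        "XPSerialNumbers": [35014, 35013],
        "RaidManagerInstances": [11],
        "DeviceGroup": "web",
        "PreExecScript": "/etc/opt/hpclx/web_pre.sh",
        "PostExecScript": "/etc/opt/hpclx/web_post.sh",
        "DC_A_Hosts": ["sunrise", "dawn"],
        "DC_B_Hosts": ["sunset", "dusk"],
    }))
```

Generating the block keeps attribute names and host lists consistent across resources, but the output is still plain text that has to be placed in main.cf by hand.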

Figure 7 VERITAS Cluster Service configuration example



Figure 8 on page 77 shows an example resource graph of the CLX_WEB_SERVER service group. XP Cluster Extension is configured as a single resource to enable read/write access to the disk groups used for the web server service group. The DiskGroup resources depend on the XP Cluster Extension resource, and the Mount resources can be brought online only when the DiskGroup resources and the XP Cluster Extension resource are already online. Independent of this resource tree, the network card will be configured with the web server service group IP address. When all these resources have been brought online, the web server can be started.


Figure 8 Sample resource graph of the CLX_WEB_SERVER service group



Configuring the XP Cluster Extension agent according to Figure 7


1. Log in to system sunrise as root.
2. Create the XP Cluster Extension resource (for example, clx_web) in the $VCS_CONF/config/main.cf file, using the previous example.
3. Link the new resource as a child resource to all disk resources in the service group.
4. Edit the attributes in the file $VCS_CONF/config/main.cf to configure your XP Cluster Extension resource. Enter the XP RAID Manager instances, the XP RAID Manager device group, the XP serial numbers, DC_A_Hosts, and the DC_B_Hosts.
5. Verify the syntax of the file $VCS_CONF/config/main.cf:
   # hacf -verify $VCS_CONF/config
6. Start the VCS engine (had) on sunrise:
   # hastart
7. Verify that the XP Cluster Extension and all other web server service group resources are brought online:
   # hagrp -display
8. Take the service group offline, and verify that all resources are stopped:
   # hagrp -offline CLX_WEB_SERVER -sys sunrise
   # hagrp -display
9. Bring the service group online again, and verify that all resources are available:
   # hagrp -online CLX_WEB_SERVER -sys sunrise
   # hagrp -display
10. Start the VCS engine on dawn:
   # hastart
11. Start the VCS engine on sunset and dusk, and switch the web server service group to dawn and later to sunset and dusk. Before you switch the service group to the remote data center, make sure that the XP Continuous Access Software links are configured for bidirectional mirroring and that XP RAID Manager instances include the device group configured for the web server service group. To switch the web server service group, enter:
   # hagrp -switch CLX_WEB_SERVER -to system_name
12. Verify that all XP Cluster Extension and web server service group resources are brought online:
   # hagrp -display

Configuring the pair/resync monitor


The pair/resync monitor is a service that verifies that disks are in the pair state, and resyncs them when necessary. The pair/resync monitor checks whether a requesting server is allowed to access it. To allow access to the pair/resync monitor, you must update the remote access hosts file and configure the pair/resync monitor port.

Updating the remote access hosts file


Enter the names of the remote systems in a remote access hosts file.

1. Open the /etc/opt/hpclx/conf/clxhosts file.
2. Enter each host name on a separate line. You can leave blank lines, but do not enter comments. For example:

# cat /etc/opt/hpclx/conf/clxhosts
dcBserver
dcAserver
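The format rules for the remote access hosts file (one host name per line, blank lines allowed, no comments) are easy to check before the file is deployed. The following is a minimal sketch in Python; the function name read_clxhosts is hypothetical, and treating a line that starts with # or contains more than one word as an error is an assumption about how strictly the rules should be enforced.

```python
def read_clxhosts(text):
    """Return the host names from clxhosts-style text, or raise ValueError.

    Rules taken from this section: one host name per line, blank lines
    are allowed, comments are not.
    """
    hosts = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are allowed
        if line.startswith("#"):
            raise ValueError(f"line {number}: comments are not allowed")
        if len(line.split()) != 1:
            raise ValueError(f"line {number}: expected one host name per line")
        hosts.append(line)
    return hosts


if __name__ == "__main__":
    print(read_clxhosts("dcBserver\n\ndcAserver\n"))
```

To check a real file, read its contents and pass the text to the function; the sketch does not verify that the host names resolve.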

Configuring the pair/resync monitor port


Specify the port that the pair/resync monitor will use.

1. Open the /etc/services file.
2. Choose the port that the pair/resync monitor will use, and then add the following line to the services file:
   clxmonitor nnnnn/tcp
   where nnnnn is the port number. For example:

clxmonitor 22222/udp # CLX Pair/Resync Monitor
clxmonitor 22222/tcp # CLX Pair/Resync Monitor
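After editing the services file, it can be useful to confirm that the clxmonitor entries are present and use the same port on every cluster node. The following is a minimal sketch in Python that parses services-style text (name, port/protocol, optional comment); the function name find_service_ports is hypothetical.

```python
def find_service_ports(services_text, name="clxmonitor"):
    """Return {protocol: port} for the named service in services-style text."""
    ports = {}
    for raw in services_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or fields[0] != name:
            continue
        port, _, protocol = fields[1].partition("/")
        if port.isdigit() and protocol:
            ports[protocol] = int(port)
    return ports


if __name__ == "__main__":
    text = (
        "clxmonitor 22222/udp # CLX Pair/Resync Monitor\n"
        "clxmonitor 22222/tcp # CLX Pair/Resync Monitor\n"
    )
    print(find_service_ports(text))
```

Comparing the returned dictionaries from each node is a quick way to spot a node where the entry is missing or uses a different port.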


Including the XP Cluster Extension resource type


See Importing the XP Cluster Extension resource types configuration file in the HP StorageWorks XP Cluster Extension Software Installation Guide.

Configuring the XP Cluster Extension resource


For VCS, you can configure an XP Cluster Extension resource using either the VCS CLI or the VCS Cluster Manager GUI. The XP Cluster Extension resource gathers all necessary information about the service group and the XP disk arrays when the XP Cluster Extension resource is brought online. Consider the following:

- If you use the default values for XP Cluster Extension COMMON objects, no user configuration file is required.
- If configured, a pair/resync monitor is started to monitor the XP Cluster Extension resource. If the resource is not configured to use the pair/resync monitor, a file is created in the directory specified by the ApplicationDir attribute: resource_name.online.
- If the resource is taken offline, the file is removed or the device group associated with the service group is removed from the pair/resync monitor list. If the device group is the last monitored disk pair, the monitor is stopped also.
- The resource type definition file, ClusterExtensionXPTypes.cf, must be included in the VCS configuration file main.cf.

The XP Cluster Extension resource type definition is preconfigured for the most typical cluster configurations. It comes with a sample configuration that can be modified to fit your VCS and disk array environment.

XP Cluster Extension resource types


Before configuring the objects in the user configuration file, review the XP Cluster Extension objects described in Chapter 8 on page 123.

Resource type definition


To configure an XP Cluster Extension resource, use the following object definitions:
type ClusterExtensionXP (
    static int MonitorInterval = 15
    static int OfflineMonitorInterval = 3600
    static int OnlineTimeout = 100
    static str ArgList[] = { ApplicationDir, DeviceGroup,
        ResyncMonitorInterval, ResyncMonitor, ResyncMonitorAutoRecover,
        RaidManagerInstances, XPSerialNumbers, FenceLevel, DataLoseMirror,
        DataLoseDataCenter, AsyncTakeoverTimeout, JournalDataCurrency,
        AutoRecover, ApplicationStartup, ResyncWaitTimeout,
        FastFailbackEnabled, PostExecCheck, PreExecScript, PostExecScript,
        DC_A_Hosts, DC_B_Hosts, BCMuListA, BCMuListB, BCResyncMuListA,
        BCResyncMuListB, BCEnabledA, BCEnabledB, BCResyncEnabledA,
        BCResyncEnabledB }
    str ApplicationDir = "/etc/opt/hpclx/"
    str XPSerialNumbers[]
    str RaidManagerInstances[]
    str DeviceGroup
    str DC_A_Hosts[]
    str DC_B_Hosts[]
    str FenceLevel = never
    str DataLoseMirror = no
    str DataLoseDataCenter = yes
    str JournalDataCurrency = yes
    int AsyncTakeoverTimeout = 1800
    str ApplicationStartup = fastfailback
    int ResyncWaitTimeout = 300
    str FastFailbackEnabled = yes
    str AutoRecover = no
    str ResyncMonitor = no
    str ResyncMonitorAutoRecover = no
    str ResyncMonitorInterval = 60
    str PreExecScript
    str PostExecScript
    str PostExecCheck = no
    str BCMuListA[]
    str BCMuListB[]
    str BCResyncMuListA[]
    str BCResyncMuListB[]
    str BCEnabledA = no
    str BCEnabledB = no
    str BCResyncEnabledA = no
    str BCResyncEnabledB = no
)

Adding an XP Cluster Extension resource


These procedures add a resource to an existing service group.

Adding an XP Cluster Extension resource using the VCS CLI


Syntax
hares -add resource_name ClusterExtensionXP service_group

The following example adds an XP Cluster Extension resource called clx_web to service group CLX_WEB_SERVER:
hares -add clx_web ClusterExtensionXP CLX_WEB_SERVER

Adding an XP Cluster Extension resource using the VCS Cluster Manager GUI
1. Open the Cluster Explorer.
2. Select the service group, right-click, and choose Add Resource; or, click Add Resource in the Cluster Explorer toolbar.
3. Enter the resource name in the Resource name box.
4. Select ClusterExtensionXP from the Resource Type list.
5. To change attribute values of the new XP Cluster Extension resource, click the button in the Edit column of the value you want to change, and modify the values as desired in the Edit Attribute window.
6. Select the Critical and Enabled boxes in the Add Resource window.
7. Click OK.


Changing XP Cluster Extension attributes


XP Cluster Extension resource attributes can be changed after the configuration has been write-enabled. To change attribute values of the XP Cluster Extension resource, you must first take the resource offline. Changing an attribute of the XP Cluster Extension resource while the resource is running is not supported.

Changing an attribute value using the VCS CLI


Syntax
hares -modify ClusterExtensionXP_resource attribute [ -add | -update ] value

The following example changes the FenceLevel attribute of the XP Cluster Extension resource clx_web from its default value:
# hares -modify clx_web FenceLevel data

The following commands change the XP RAID Manager instance used for the XP Cluster Extension resource clx_web, and then add an additional instance to provide redundancy:
# hares -display clx_web -attribute RaidManagerInstances
# hares -modify clx_web RaidManagerInstances -update 90
# hares -modify clx_web RaidManagerInstances -add 22

The following example displays all attributes of the XP Cluster Extension resource clx_web:
# hares -display clx_web

Changing an attribute value using the VCS Cluster Manager GUI


1. Open the Cluster Explorer.
2. Click the resource name.
3. Click the Properties tab in the View Panel.
4. Click Edit for the attribute you want to change.
5. Enter changes to the attribute value. For nonscalar attributes, use the + and x buttons to add or remove elements. Do not change the attribute's scope to local; all XP Cluster Extension attributes are global in scope.
6. Click OK to accept the change.

Linking an XP Cluster Extension resource


XP Cluster Extension must be the first resource in the resource chain of a VCS service group. All resources depending on the disk resource (for example, Mount), including the disk resources (DiskGroup, Disk, DiskReservation), must be parent resources to the XP Cluster Extension resource.

CAUTION: XP Cluster Extension does not support ServiceGroupHB resources in XP Continuous Access Software configurations because of the read/write mode differences between the primary and secondary disk in an XP disk array.

Linking other resources to the XP Cluster Extension resource


Syntax
hares -link disk_group_resource ClusterExtensionXP_resource

The following example links the disk group resource netscapedg to the clx_web_server resource:
# hares -link netscapedg clx_web_server
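The linking rule (the XP Cluster Extension resource sits at the bottom of the resource chain, and every disk resource is its parent) can be checked against a list of planned links before they are created. The following is a minimal sketch in Python; the function name check_clx_links, the data layout, and the resource name web_mnt in the usage example are hypothetical, and a disk resource is accepted if it reaches the XP Cluster Extension resource through any chain of parent-to-child links.

```python
DISK_TYPES = {"DiskGroup", "Disk", "DiskReservation"}


def check_clx_links(resource_types, links, clx_resource):
    """Return a list of problems with the planned dependency links.

    resource_types: mapping of resource name to VCS resource type.
    links: iterable of (parent, child) pairs, as given to hares -link.
    clx_resource: name of the XP Cluster Extension resource.
    """
    children = {}
    for parent, child in links:
        children.setdefault(parent, set()).add(child)

    problems = []
    if children.get(clx_resource):
        problems.append(f"{clx_resource} must not depend on other resources")

    def reaches_clx(start):
        # Depth-first walk down the parent-to-child links.
        seen, stack = set(), [start]
        while stack:
            current = stack.pop()
            for child in children.get(current, ()):
                if child == clx_resource:
                    return True
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return False

    for name, rtype in sorted(resource_types.items()):
        if rtype in DISK_TYPES and not reaches_clx(name):
            problems.append(f"{name} is not a parent of {clx_resource}")
    return problems


if __name__ == "__main__":
    types = {"clx_web": "ClusterExtensionXP", "webdg": "DiskGroup",
             "httpddg": "DiskGroup", "web_mnt": "Mount"}
    planned = [("webdg", "clx_web"), ("web_mnt", "webdg")]
    for problem in check_clx_links(types, planned, "clx_web"):
        print(problem)
```

In the usage example the check reports httpddg, because no link from that disk group to clx_web has been planned yet.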


Linking other resources using the VCS Cluster Manager GUI


1. Open the Cluster Explorer.
2. Click the Resources View tab on the View Panel.
3. Click the resource icon of the resource that is to be the parent resource.
4. Move the yellow line to the resource that is to be the (child) XP Cluster Extension resource and click to link the child to the parent resource selected in step 3.
5. Click YES in the dialog box to confirm the dependency.

Bringing an XP Cluster Extension resource online


Resources are usually brought online automatically when the service group is brought online. To bring a resource online manually, the service group must be enabled on the system and auto-enabled in the cluster. Finally, the resource must be enabled.

Enabling and bringing an XP Cluster Extension resource online using the CLI
Syntax
hares -modify ClusterExtensionXP_resource Enabled 1
hares -online ClusterExtensionXP_resource -sys system_name

The following example enables and brings the XP Cluster Extension resource clx_web online:

# hares -modify clx_web Enabled 1
# hares -online clx_web -sys sunrise
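The add, modify, link, enable, and online steps described in this chapter can be assembled into one reviewable command list. The following is a sketch in Python that only builds the argument lists and does not execute them; the function name clx_resource_commands is hypothetical. The haconf -makerw and haconf -dump -makero commands are standard VCS commands for opening and saving the configuration, but they do not appear in this guide's examples, so their placement here is an assumption.

```python
import shlex


def clx_resource_commands(resource, group, attributes, parents, system):
    """Build VCS command lines for adding and starting a CLX resource.

    attributes: mapping of attribute name to a scalar or a list of values.
    parents: resources (for example, disk groups) that depend on `resource`.
    Hypothetical helper: returns argument lists only and runs nothing.
    """
    commands = [
        ["haconf", "-makerw"],  # assumption: write-enable the configuration
        ["hares", "-add", resource, "ClusterExtensionXP", group],
    ]
    for name, value in attributes.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        commands.append(["hares", "-modify", resource, name,
                         *[str(item) for item in values]])
    for parent in parents:
        commands.append(["hares", "-link", parent, resource])
    commands.append(["hares", "-modify", resource, "Enabled", "1"])
    commands.append(["haconf", "-dump", "-makero"])  # assumption: save, set read-only
    commands.append(["hares", "-online", resource, "-sys", system])
    return commands


if __name__ == "__main__":
    plan = clx_resource_commands(
        "clx_web", "CLX_WEB_SERVER",
        {"DeviceGroup": "web", "RaidManagerInstances": [11]},
        ["webdg", "httpddg"], "sunrise")
    for argv in plan:
        print(shlex.join(argv))
```

Printing the plan first makes it possible to review the order of operations, in particular that links are created before the resource is brought online.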

Enabling and bringing an XP Cluster Extension resource online using the VCS Cluster Manager GUI
1. Open the Cluster Explorer.
2. Right-click the resource name.
3. Select online, and then select the system where you want to bring the resource online.
4. Click YES in the dialog box to confirm your selection.

Taking an XP Cluster Extension resource offline


Resources are usually taken offline automatically when the service group is taken offline. There are two ways to manually take a resource offline:

- Take only the specified resource offline.
- Propagate the offline request to all parent resources, which takes all parent resources offline before the specified resource.

Syntax
hares -offline ClusterExtensionXP_resource -sys system_name
hares -offprop ClusterExtensionXP_resource -sys system_name

The following example takes the XP Cluster Extension resource clx_web offline or propagates the offline request to all its parent resources before taking it offline:

# hares -offline clx_web -sys sunrise
# hares -offprop clx_web -sys sunrise

Taking an XP Cluster Extension resource offline using the VCS Cluster Manager GUI
1. Open the Cluster Explorer.
2. Right-click the resource name.
3. Select Offline or Offline Prop, and then select the system where you want to take the resource offline.
4. Click YES in the dialog box to confirm.


Deleting an XP Cluster Extension resource


These procedures remove an XP Cluster Extension resource from an existing service group.

Deleting a resource using the VCS CLI


Syntax
hares -delete ClusterExtensionXP_resource

CAUTION: Deleting a running XP Cluster Extension resource does not remove the resource_name.online file and does not remove the device group from the list of monitored device groups if the pair/resync monitor is used to monitor the XP Continuous Access Software link. Therefore, after deleting the XP Cluster Extension resource, you must manually delete the device group from the list of monitored device groups using the clxchkmon command. See Stopping the pair/resync monitor on page 115.

CAUTION: Failure to delete the monitored device group from the list of monitored device groups can cause data corruption if the ResyncMonitorAutoRecover attribute is set to YES.

Deleting a resource using the VCS Cluster Manager GUI


1. Open the Cluster Explorer.
2. Right-click the name of the resource you want to delete.
3. Select Delete.
4. Click YES in the dialog box to confirm. The resource is deleted.

Disabling the XP Cluster Extension agent


Before you can disable the agent, you must first stop the service group or switch the service group to another system. To remove the XP Cluster Extension resource from the service group, you must first confirm whether the service group is online:
• If the service group is online, take the service group offline or switch the service group using one of the following commands from the VCS command line:
#hagrp -state service_group -sys system_name
#hagrp -switch service_group -to system_name
or
#hagrp -offline service_group -sys system_name
• If the service group is offline, you can remove the XP Cluster Extension resource from the service group. To remove the resource, see Deleting an XP Cluster Extension resource on page 87.

Pair/resync monitor integration


The pair/resync monitor is used to detect and react to suspended XP Continuous Access Software links. It is activated if the ResyncMonitor attribute value is set to YES. The automatic disk pair resynchronization feature is also activated if the ResyncMonitorAutoRecover attribute value is YES. When the resource is taken offline, the monitor is stopped for the XP RAID Manager device group used for this resource.

The pair/resync monitor is not started if the ResyncMonitor attribute is changed to YES while the resource is online. However, a running ResyncMonitor is disabled for the resource if the ResyncMonitor attribute is changed to NO while the resource is online.

CAUTION: If the resource group cannot be taken offline gracefully, the cluster administrator must disable monitoring of the device group for this resource. To avoid data corruption, this task must be part of the recovery procedure when XP Cluster Extension is deployed in the VCS environment. See Stopping the pair/resync monitor on page 115.

VCS automatically attempts to stop the pair/resync monitor for the resource if it is running on more than one system.

CAUTION: Ensure that the pair/resync monitor does not monitor and resynchronize the disk pair (device group) from both disk array sites.

Log-level reporting
The default setting for the pair/resync monitor's log facility is log level WARNING in the syslog. Solaris does not log warning messages to syslog by default. To receive messages from the pair/resync monitor in case of XP Continuous Access Software link failures, add the following line to the /etc/syslog.conf file:
user.warning /var/adm/messages
This line ensures that you will be notified of XP Continuous Access Software link failures if you use the pair/resync monitor.
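Before editing the file, it can help to check whether the rule is already present. The following is a minimal sketch; the function name has_user_warning_rule is hypothetical. It looks only for an uncommented line with exactly the user.warning selector and the /var/adm/messages target, so a system that already captures warnings through a broader selector is reported as missing the rule.

```shell
#!/bin/sh
# Hypothetical check: does a syslog.conf-style file contain an uncommented
# rule that sends user.warning messages to /var/adm/messages?
# The file is passed as an argument so a copy can be checked first.
# usage: has_user_warning_rule file
has_user_warning_rule() {
    # Selector at the start of the line, whitespace, then the target file.
    grep -Eq '^user\.warning[[:space:]]+/var/adm/messages[[:space:]]*$' "$1"
}
```

For example, has_user_warning_rule /etc/syslog.conf returns 0 when the line is present and a nonzero status otherwise.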

Timing considerations for VCS


XP Cluster Extension gives priority to XP disk array operations over cluster software operations; if XP Cluster Extension invokes disk pair resynchronization operations or gathers information about the remote XP disk array, XP Cluster Extension waits until the requested status information is reported. This feature prioritizes data integrity over the cluster software's failover behavior.


In some cases, this behavior could lead to failed XP Cluster Extension resources:
• XP Cluster Extension uses XP RAID Manager instances to communicate with the remote XP disk array. Depending on the settings of the XP RAID Manager instance timeout parameter and the number of remote instances, the online operation could time out. This can happen if the local XP RAID Manager instance cannot reach the remote XP RAID Manager instance.
• XP Cluster Extension tries to resynchronize disk pairs and waits until the XP RAID Manager device group is in PAIR state if the ApplicationStartup attribute is set to RESYNCWAIT. Depending on the XP RAID Manager version and the XP firmware version, this could be a full resynchronization and may take longer than the online timeout interval. Even if the XP RAID Manager version and the XP firmware version allow a delta resynchronization, the delta between the primary and the secondary could be big enough for the copy process to exceed the online timeout value.
• The ResyncWaitTimeout attribute can automatically lead to failed XP Cluster Extension resources when set higher than the online timeout interval.
• If running in fence level ASYNC, the default value of the AsyncTakeoverTimeout can cause the resource to fail because its value is set beyond the resource online timeout interval. This is done because the takeover process for fence level ASYNC can take much longer when slow communications links are in place. To prevent takeover commands from being terminated by the takeover timeout before finishing, measure the time to copy the installed XP disk array cache and adjust the resource online timeout interval according to the measured copy time. When measuring the copy time, measure only the slowest link used for XP Continuous Access Software. This ensures that the XP disk array cache can be transferred from the remote XP disk array, even in the event of a single surviving replication link between the XP disk arrays.
Because the failover environment is dispersed into two (or more) data centers, the failover time cannot be expected to be the same as it would be in a single data center with a single shared disk device. Therefore, adjust the online timeout values, the monitor interval of the XP Cluster Extension resource, and the service group using the XP Cluster Extension resource based on failover tests performed to verify the proper configuration setup.

Enabling/disabling service groups


Based on the XP disk array status information, XP Cluster Extension can change the cluster software behavior to automatically fail over (or fail back) the service group faster to the remote data center. For example, when the remote disk state is S-VOL_SSUS and the SSWS option has been set to indicate a prior takeover to the secondary disk set, if you have set the ApplicationStartup object to FASTFAILBACK, XP Cluster Extension disables the service group for all systems in the respective data center(s), and VCS transfers the service group back to the remote site rather than waiting for a pair resynchronization to finish before the service group can start on the local site. This can happen only if you have not recovered the suspended disk pair after a prior takeover, where the PAIR state could not be maintained because of, for example, an XP Continuous Access Software link failure.

This feature reduces application downtime because the service group (and the application) is not brought online on each system in the service group's system list. Instead, it is moved to the first available system in the service group's system list that is connected to the remote XP disk array. To do this, XP Cluster Extension makes the VCS configuration file (main.cf) writable, disables the service group for all systems contained in either the DC_A_Hosts object or the DC_B_Hosts object, and then saves (dumps) the VCS configuration file.

This feature can be disabled. If the FastFailbackEnabled object is set to NO, the standard VCS process is used and the XP Cluster Extension resource fails on the local system (and so does the service group). VCS then tries to bring the service group online on the next system in the service group's system list (which should be a local system). This fails because the state of the local XP disk array has not changed. The service group fails until it is brought online on a system connected to the remote XP disk array. The service group online process takes longer, but it does not access the VCS configuration file.

Restrictions for VCS with XP Cluster Extension


The following restrictions apply for VERITAS Cluster Server configurations when XP Cluster Extension is used to enable failover between two XP disk arrays:
• The XP Cluster Extension resource must be the first (child) resource for all other disk resources.
• Heartbeat disks cannot be used because of the P/S-VOL read/write behavior of XP Continuous Access Software.
• No service group heartbeat disks are allowed in the service group. The ServiceGroupHB resource is not supported in XP Cluster Extension configurations because of the P/S-VOL read/write behavior of XP Continuous Access Software.
• Only one XP Cluster Extension resource is allowed to be configured per service group.
• XP Cluster Extension must not be used with parallel service groups. If XP Cluster Extension is used in a parallel service group, all systems configured for this service group must be connected to the same XP disk arrays. A failover operation to the secondary XP disk array must be done manually only. In such a case, all active service groups must be brought offline before any of those service groups can be brought online on the secondary XP disk array.
• The ApplicationDir attribute value must not be changed when the resource is online. The ApplicationDir attribute defines the location of the application_dir/resource_name.online file. This file is created when the resource is brought online (if the ResyncMonitor attribute is set to NO). The XP Cluster Extension resource monitors the file in the location specified by ApplicationDir. Changing this attribute can cause the XP Cluster Extension resource to fail.
• The resource online timeout must be greater than the value specified for the ResyncWaitTimeout attribute.
• The resource online timeout should be greater than twice the wait time of all remote XP RAID Manager instances times the number of remote systems. Otherwise, the XP Cluster Extension resource fails to go online when there is a complete remote data center failure.
$$t_{\text{online}} > n_{\text{remote systems}} \times 2 \times t_{\text{WT}}$$
where:
$t_{\text{online}}$ = resource online timeout
$n_{\text{remote systems}}$ = number of remote systems configured to run XP RAID Manager instances
$t_{\text{WT}}$ = wait time until a remote error is reported by the local XP RAID Manager instance
• If a post-executable is specified, the resource online timeout should be greater than the number of remote systems multiplied by three times $t_{\text{WT}}$:
$$t_{\text{online}} > n_{\text{remote systems}} \times 3 \times t_{\text{WT}}$$
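The timeout rule can be evaluated with simple integer arithmetic. The following is a minimal sketch; the function name min_online_timeout and its argument order are hypothetical. It returns the smallest whole number of seconds that is strictly greater than the number of remote systems multiplied by two (or by three when a post-executable is specified) and by the wait time. The result must still be compared with the ResyncWaitTimeout value, which the online timeout must also exceed.

```shell
#!/bin/sh
# Hypothetical calculator for the smallest whole-second online timeout with
#   t_online > n_remote_systems x factor x t_WT
# factor is 2 by default and 3 when a post-executable is specified.
# usage: min_online_timeout n_remote_systems t_wt_seconds [yes|no]
min_online_timeout() {
    n=$1
    twt=$2
    factor=2
    if [ "${3:-no}" = "yes" ]; then
        factor=3
    fi
    # The timeout must be strictly greater than the product: add one second.
    echo $(( n * factor * twt + 1 ))
}
```

For example, with two remote systems and a wait time of 30 seconds, min_online_timeout 2 30 prints 121, and min_online_timeout 2 30 yes prints 181. Treat the result as a lower bound and confirm it with failover tests.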

Unexpected offline conditions


In rare cases, XP Cluster Extension resources go offline after the following conditions all occur at the same time:
• The cluster has been stopped forcibly (without taking the resources offline).
• The XP Continuous Access links have failed.
• A remote XP RAID Manager instance is not available due to a full network outage.


XP Cluster Extension resources go offline because the primary volume state changes from P-VOL_PAIR to P-VOL_PSUE and the secondary volume state changes from S-VOL_PAIR to EX_NORMT. The state combination P-VOL_PSUE and EX_NORMT is not designed to be handled automatically because the remote side (remote XP RAID Manager/disk array), which has no status information available, could have more current data than the primary (P-VOL_PSUE) site. In this particular case, you are required to investigate data currency and determine the appropriate action to be taken.

Bringing the XP Cluster Extension resources online


Use one of the following procedures to bring XP Cluster Extension resources online:
1. Recover the XP Continuous Access link error and the network error, and restart XP RAID Manager on the remote site.
2. Manually resynchronize the affected disk pairs.
3. Bring the XP resources online. See Bringing an XP Cluster Extension resource online on page 85 for instructions.

or

1. Create the forceflag file resource_name.forceflag in the ApplicationDir path. Default: /etc/opt/hpclx/
2. Bring the XP Cluster Extension resources online.
3. Depending on the attributes set for the resources, you might need to manually resynchronize the XP Continuous Access disk pairs.
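For the forceflag procedure, the file can be created from a shell. The following is a minimal sketch under an assumption that this guide does not confirm: that an empty file is sufficient. The function name clx_create_forceflag is hypothetical; only the file name pattern and the default directory come from the text above. Investigate data currency first, as described in Unexpected offline conditions.

```shell
#!/bin/sh
# Hypothetical helper: create resource_name.forceflag in the ApplicationDir
# path. Assumption: an empty file is enough (the guide only says "create").
# usage: clx_create_forceflag resource_name [application_dir]
clx_create_forceflag() {
    if [ -z "$1" ]; then
        echo "usage: clx_create_forceflag resource_name [application_dir]" >&2
        return 2
    fi
    # Fall back to the default ApplicationDir when none is given.
    flag="${2:-/etc/opt/hpclx}/$1.forceflag"
    touch "$flag" || return 1
    echo "$flag"    # print the path so the caller can log it
}
```

For example, clx_create_forceflag clx_web creates /etc/opt/hpclx/clx_web.forceflag and prints its path.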


5 Configuring XP Cluster Extension for Linux


XP Cluster Extension supports integration with the following cluster software for Linux:
• Configuring XP Cluster Extension with RHCS, page 95
• Configuring XP Cluster Extension with SLE HA, page 102

NOTE: For a list of XP Cluster Extension versions and the cluster software versions they support, see the HP SPOCK website: http://www.hp.com/storage/spock.

XP Cluster Extension for Linux: Sample configuration


Figure 9 on page 93 shows a sample configuration with RHCS or SLE HA and XP Cluster Extension.

Figure 9 Sample configuration



The configuration example in Figure 9 on page 93 assumes the following information about the cluster:
• There are four nodes in the cluster: Host1, Host2, Host3, and Host4.
• There are two XP disk arrays with serial numbers 30047 and 30053.
• The device group clxwebvgs is configured in the XP RAID Manager /etc/horcm101.conf file.
• XP Cluster Extension invokes the pre-executable script clxweb_pre_takeover.sh and the post-executable script clxweb_post_takeover.sh. These files can be an executable script or a program of your choice.
• For RHCS, the configuration file /etc/opt/hpclx/conf/CLXXP.config is associated with the RHCS service CLXWEB that is configured to use the XP Cluster Extension resource agent script. RHCS invokes the resource agent script to start the CLXWEB service, which checks the disk pair states before the volume groups vgweb and vghtdocs are activated and the web server is started. The XP RAID Manager device group clxwebvgs includes all disks for the LVM volume groups vgweb and vghtdocs. The sample CLXXP.config file shows the contents of the configuration file with the described failover behavior.
• For SLE HA, the XP Cluster Extension resource configuration file /etc/opt/hpclx/conf/CLXXP.config is associated with the SLE HA resource CLXWEB. SLE HA invokes the resource agent script, /usr/lib/ocf/resource.d/heartbeat/CLXXP, which checks the disk pair states before the volume groups vgweb and vghtdocs are activated and the web server is started. The XP RAID Manager device group clxwebvgs includes all disks for the LVM volume groups vgweb and vghtdocs. The sample CLXXP.config file shows the contents of the configuration file with the described failover behavior.

Sample configuration file:

COMMON
LogLevel                info          # values: error|info (optional)

APPLICATION             CLXWEB        # == service (RHCS) or resource group (SLE HA)
XPSerialNumbers         30047 30053
RaidManagerInstances    101
DeviceGroup             clxwebvgs     # raid manager device group
DC_A_Hosts              Host1 Host2   # systems in data center A
DC_B_Hosts              Host3 Host4   # systems in data center B
#optional parameter (only necessary if other than default)
FenceLevel              data          # values: data | never | async
ApplicationStartup      resyncwait    # values: fastfailback | resyncwait
AutoRecover             yes           # possible values: yes | no
DataLoseMirror          no            # possible values: yes | no
DataLoseDataCenter      no            # possible values: yes | no
PreExecScript           /etc/opt/hpclx/clxweb_pre_takeover.sh
PostExecScript          /etc/opt/hpclx/clxweb_post_takeover.sh

The ApplicationStartup object is set to RESYNCWAIT to configure the service (RHCS) or resource group (SLE HA) to wait for a pair resynchronization in the event that the service (RHCS) or resource group (SLE HA) fails over to an adoptive node. The AutoRecover object is set to YES, which means that you use XP Cluster Extension capabilities to automatically recover suspended disk pair states. The DataLoseMirror object and DataLoseDataCenter object are set to NO, which means XP Cluster Extension does not allow you to start the service (RHCS) or resource group (SLE HA) automatically if the disk pair is suspended or a takeover operation leads to a suspended disk pair.
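To check a single setting without opening the file, the configuration can be read with awk. The following is a minimal sketch, assuming the layout shown in the sample file: one object name and its value per line, # starting a comment, and an APPLICATION line opening each application section. The function name clx_config_get is hypothetical, and the sketch does not validate the file.

```shell
#!/bin/sh
# Hypothetical reader for a CLXXP.config-style file. Prints the value of an
# object inside the APPLICATION section with the given name.
# usage: clx_config_get config_file application_name object_name
clx_config_get() {
    awk -v app="$2" -v key="$3" '
        { sub(/#.*/, "") }                        # strip comments
        $1 == "COMMON"      { in_app = 0; next }  # leave any application section
        $1 == "APPLICATION" { in_app = ($2 == app); next }
        in_app && $1 == key {
            $1 = ""                               # drop the object name
            sub(/^[ \t]+/, "")                    # trim the leading separator
            print
            exit
        }
    ' "$1"
}
```

With the sample file above, clx_config_get /etc/opt/hpclx/conf/CLXXP.config CLXWEB DC_B_Hosts prints the host list for data center B. The same call with an application name that is not defined prints nothing, which is a quick way to confirm that a service or resource group name matches an APPLICATION entry.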


XP Cluster Extension enables read/write access to the disk groups used for the web server service or resource group. Activation of the volume groups depends on a successful return code from XP Cluster Extension. The logical volumes can be mounted only when their volume groups are active and XP Cluster Extension allows read/write access to the disk group. After the file system for the web server's executables and content data is mounted and checked, the NIC is configured with the web server's IP address.

Configuring XP Cluster Extension with RHCS


XP Cluster Extension Software is integrated with the RHCS using an RHCS shared resource. XP Cluster Extension provides a resource agent script (clxxp.sh) that allows you to manage XP Cluster Extension resources. The executable clxxplxcs is called by XP Cluster Extension before volume group activation. This checks the status of the XP RAID Manager device group. If necessary, XP Cluster Extension takes appropriate actions to allow access to the disks before the cluster software accesses them.

Configuration overview
1. Create an RHCS shared resource. For instructions, see Creating an RHCS XP Cluster Extension shared resource on page 95.
2. Create an RHCS service using the XP Cluster Extension shared resource. For instructions, see Creating an RHCS service using the XP Cluster Extension shared resource on page 97.
3. Configure the pair/resync monitor if you plan to use the pair/resync feature (optional). For instructions, see Configuring the pair/resync monitor on page 108.
4. Activate the pair/resync monitor (optional). For instructions, see Activating the pair/resync monitor on page 109.

Creating an RHCS XP Cluster Extension shared resource


After XP Cluster Extension is installed, as described in the XP Cluster Extension Installation Guide, use Conga or the Cluster Configuration Tool (system-config-cluster) to create an XP Cluster Extension shared resource. This procedure is required as part of the initial XP Cluster Extension configuration procedure. After you complete this procedure, you do not need to repeat it when you add services. Use one of the following procedures:
• Using Conga to create a shared resource, page 95
• Using system-config-cluster to create a shared resource, page 96

Using Conga to create a shared resource


To create an XP Cluster Extension shared resource using Conga:
1. Log in to Conga.
2. Click the Cluster tab, and then select Cluster List.
3. Click the name of the cluster you want to administer.
4. Click Resources.
5. Click Add a Resource.
6. Select Script in the Select a Resource Type box.
7. Enter a name for the XP Cluster Extension shared resource in the Name box. For example: CLXXP.
8. Enter /usr/share/cluster/clxxp.sh in the Full path to script file box.
9. Click Submit.

Using system-config-cluster to create a shared resource


To create an XP Cluster Extension shared resource using system-config-cluster:
1. Start system-config-cluster.
2. Click the Cluster Configuration tab.
3. Expand the Managed Resources tree.
4. Select the Resources tree.
5. Click Create a Resource to open the Resource Configuration dialog box.
6. Select Script in the Select a Resource Type box.
7. Enter CLXXP in the Name box.
8. Enter /usr/share/cluster/clxxp.sh in the File (with path) box.
9. Click OK.
10. Select File > Save to save the configuration changes. The service configuration in /etc/cluster/cluster.conf is updated.
11. Click Send to Cluster to propagate the cluster configuration to the other cluster nodes.


Creating an RHCS service using the XP Cluster Extension shared resource


After you create a shared resource, create an RHCS service using the XP Cluster Extension shared resource.

Configuration overview
1. Create a service at the root of the dependency tree using the XP Cluster Extension shared resource created in Creating an RHCS XP Cluster Extension shared resource on page 95. This ensures that the XP Cluster Extension resource is the first resource to start in a service. All other resources in this service should be configured as child resources to XP Cluster Extension. Use one of the following procedures:
• Using Conga to create a service, page 97
• Using system-config-cluster to create a service, page 98
2. Create a configuration file. For instructions, see Creating the XP Cluster Extension resource configuration file, page 99.
3. Test the service configuration. For instructions, see Testing the service configuration, page 100.

Using Conga to create a service


To create an XP Cluster Extension service using Conga:
1. Log in to Conga.
2. Click the Cluster tab, and then select Cluster List.
3. Click the name of the cluster you want to administer.
4. Click Services.
5. Click Add a Service. The Add a Service page appears.
6. Enter the service name in the Service name box.
IMPORTANT: The service name must match the name that is defined for the APPLICATION property in the XP Cluster Extension configuration file CLXXP.config.
7. Select a failover domain. For information about the failover domain requirements, see Failover domains on page 18.
8. Select Relocate for the recovery policy.
9. Click Add a resource to this service to add the XP Cluster Extension shared resource. The Add a resource page appears.
10. Select an XP Cluster Extension shared resource from the Use an existing global resource menu.
11. Click Submit. Conga saves the configuration information and updates all of the other cluster nodes.
NOTE: To add additional resources to the service, use the Add a child feature.

Using system-config-cluster to create a service


To create an XP Cluster Extension service using system-config-cluster:
1. Start the Cluster Configuration tool.
2. Click the Cluster Configuration tab.
3. Expand the Managed Resources tree.
4. Select Services. The Service properties page appears.
5. Click Create a Service. The Add a Service dialog box appears.
6. Enter the service name in the Name box, and then click OK.
IMPORTANT: The service name must match the name that is defined for the APPLICATION property in the configuration file CLXXP.config.
The Service Management dialog box appears.
7. Click Add a Shared Resource to this service. The Resource Configuration dialog box appears.
8. Select CLXXP in the Select a Resource Type menu, and then click OK.
9. To add additional resources to the service, select the XP Cluster Extension resource and click Attach a new Private Resource to the Selection. Select the resource to be configured and provide the required resource agent parameters.
10. Click Close to close the Service Management window.
11. Select File > Save to save the configuration changes. The service configuration in /etc/cluster/cluster.conf is updated.
12. Click Send to Cluster to propagate the cluster configuration to the other cluster nodes.

Creating the XP Cluster Extension resource configuration file


The procedure in this section is based on the sample configuration in XP Cluster Extension for Linux: Sample configuration on page 93. Use this procedure as a guide for configuring your environment.
1. Log in to system Host1 as root.
2. Create the configuration file CLXXP.config in the /etc/opt/hpclx/conf directory by copying and editing the sample file CLXXP.config provided in the /opt/hpclx/sample directory.
# cp /opt/hpclx/sample/CLXXP.config /etc/opt/hpclx/conf/CLXXP.config
3. In the configuration file (CLXXP.config), enter the appropriate values for:
• XPSerialNumbers
• RaidManagerInstances
• DeviceGroup
• DC_A_Hosts
• DC_B_Hosts
• ResyncMonitor
• FenceLevel
• DataLoseMirror
• DataLoseDataCenter
NOTE: For more information about these values, see Chapter 8 on page 123.
For example:
APPLICATION             CLXWEB
XPSerialNumbers         30060 30080
RaidManagerInstances    101
DeviceGroup             vgnetscape
DC_A_Hosts              sys1A sys2A
DC_B_Hosts              sys1B sys2B
ResyncMonitor           yes
FenceLevel              never
DataLoseMirror          yes
DataLoseDataCenter      yes
IMPORTANT: If you are using Device Mapper Multipath, configure the multipath_rescan.sh script as a PostExecScript. For more information, see Rescanning multipath devices on page 106.
4. Copy the updated CLXXP.config file to the other cluster nodes.

Testing the service configuration


The procedure and commands in this section are based on the sample configuration in XP Cluster Extension for Linux: Sample configuration on page 93. Use this procedure as a guide for configuring your environment.
1. Use the Cluster User Service Administration Utility (clusvcadm) to start the service on Host1.
#clusvcadm -e CLXWEB -m Host1
2. Verify that the service started successfully.
#clustat -s CLXWEB
3. Stop the service and verify that the service stopped successfully.
#clusvcadm -s CLXWEB
Or
#clusvcadm -d CLXWEB
4. Start the service on Host2.
#clusvcadm -e CLXWEB -m Host2
5. Relocate the service to a remote data center node.
a. Verify that the disks CLXWEB uses are in the PAIR state:
#export HORCMINST=101
#pairdisplay -fcx -g clxwebvgs
b. Move the service CLXWEB to Host3. Verify that the service has successfully moved and started on Host3:
#clusvcadm -r CLXWEB -m Host3
#clustat -s CLXWEB
c. Verify that the disk pairs are now in read/write mode on the remote storage system:
#pairdisplay -fcx -g clxwebvgs
d. After verifying that the service CLXWEB, including XP Cluster Extension, can be run on each system in the cluster, move the service back to its primary system:
#clusvcadm -r CLXWEB -m Host1
#clustat -s CLXWEB
#pairdisplay -fcx -g clxwebvgs
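When the relocation test is repeated for several nodes, it can help to generate the command sequence first and review it. The following is a minimal sketch; the function name clx_relocation_plan is hypothetical, and it only prints the clusvcadm, clustat, and pairdisplay commands used in this procedure. Nothing is executed.

```shell
#!/bin/sh
# Hypothetical helper: print the relocation test commands for a service.
# Nothing is executed; review the output, then run the lines as root.
# usage: clx_relocation_plan service device_group horcm_instance node...
clx_relocation_plan() {
    if [ $# -lt 4 ]; then
        echo "usage: clx_relocation_plan service device_group horcm_instance node..." >&2
        return 2
    fi
    svc=$1
    dg=$2
    inst=$3
    shift 3
    echo "export HORCMINST=$inst"
    echo "pairdisplay -fcx -g $dg"          # pair state before the first move
    for node in "$@"; do
        echo "clusvcadm -r $svc -m $node"   # relocate the service
        echo "clustat -s $svc"              # confirm it started on the node
        echo "pairdisplay -fcx -g $dg"      # check the pair state after the move
    done
}
```

For the sample configuration, clx_relocation_plan CLXWEB clxwebvgs 101 Host3 Host1 prints the commands for moving the service to Host3 and then back to Host1.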

Managing XP Cluster Extension services (RHCS)


This section includes the instructions for starting or stopping an RHCS service.
• Starting an RHCS service, page 101
• Stopping or disabling an RHCS service, page 101

Starting an RHCS service


To start an XP Cluster Extension service using Cluster User Service Administration (clusvcadm), enter one of the following commands. The second form starts the service on a specific cluster node:
#clusvcadm -e service_name
#clusvcadm -e service_name -m cluster_node
For instructions on starting an XP Cluster Extension service using Conga or the Cluster Configuration Tool, see the RHCS documentation.

Stopping or disabling an RHCS service


To stop an XP Cluster Extension service using Cluster User Service Administration (clusvcadm), enter the following command:
#clusvcadm -s service_name
To disable an XP Cluster Extension service using Cluster User Service Administration (clusvcadm), enter the following command:
#clusvcadm -d service_name
NOTE: For instructions on stopping or disabling an XP Cluster Extension service using Conga or the Cluster Configuration Tool, see the RHCS documentation.

Configuring XP Cluster Extension with SLE HA


XP Cluster Extension Software is integrated with SLE HA using a configuration file and a custom resource agent. The executable clxxplxcs is called by XP Cluster Extension before volume group activation. This checks the status of a device group configured for use with an SLE HA agent. If necessary, XP Cluster Extension takes appropriate actions to allow access to the volume group's physical disks before attempting to activate and mount the logical volume on a cluster node.

Configuration overview
1. Create and configure an XP Cluster Extension resource. For instructions, see Creating and configuring an XP Cluster Extension resource on page 102.
2. Configure the pair/resync monitor if you plan to use the pair/resync feature (optional). For instructions, see Configuring the pair/resync monitor on page 108.
3. Activate the pair/resync monitor (optional). For instructions, see Activating the pair/resync monitor on page 109.

Creating and configuring an XP Cluster Extension resource


Use the following procedure to create an XP Cluster Extension SLE HA resource:
1. Create the configuration file. For instructions, see Creating the XP Cluster Extension resource configuration file on page 102.
2. Create an XP Cluster Extension resource using the SLE HA GUI. Use one of the following procedures:
• Creating an XP Cluster Extension resource for Pacemaker, page 103
• Creating an XP Cluster Extension resource for Heartbeat, page 105
3. Test the configuration. For instructions, see Testing the configuration, page 106.

Creating the XP Cluster Extension resource configuration file


The procedure in this section is based on the sample configuration in XP Cluster Extension for Linux: Sample configuration on page 93. Use this procedure as a guide for configuring your environment.
1. Log in to system Host1 as root.
2. Create the XP Cluster Extension resource configuration file CLXXP.config in the /etc/opt/hpclx/conf directory by copying and editing the sample file CLXXP.config provided in the /opt/hpclx/sample directory.
# cp /opt/hpclx/sample/CLXXP.config /etc/opt/hpclx/conf/CLXXP.config
3. In the configuration file (CLXXP.config), enter the appropriate values for:
• XPSerialNumbers
• RaidManagerInstances
• DeviceGroup
• DC_A_Hosts
• DC_B_Hosts
• ResyncMonitor
• FenceLevel
• DataLoseMirror
• DataLoseDataCenter
NOTE: For more information about these values, see Chapter 8 on page 123.
For example:
APPLICATION             CLXWEB
XPSerialNumbers         30060 30080
RaidManagerInstances    101
DeviceGroup             vgnetscape
DC_A_Hosts              sys1A sys2A
DC_B_Hosts              sys1B sys2B
ResyncMonitor           yes
FenceLevel              never
DataLoseMirror          yes
DataLoseDataCenter      yes
IMPORTANT: If you are using Device Mapper Multipath, configure the multipath_rescan.sh script as a PostExecScript. For more information, see Rescanning multipath devices on page 106.
4. Copy the updated file to the other cluster nodes.

Creating an XP Cluster Extension resource for Pacemaker


This procedure uses the Linux HA Management Client and Pacemaker. For specific instructions on using the GUI, see the SuSE Linux Enterprise High Availability Extension documentation.
1. Start the Linux HA Management Client.
2. Select Add group from the Resources menu, and enter a group ID.
3. Add XP Cluster Extension as a primitive resource and as the group's first resource.
NOTE: The resource hierarchy depends on the order in which resources are added. Always add XP Cluster Extension resources as the first resource in a group.
4. Select the following options for the XP Cluster Extension resource:
Name        Value
Class       ocf
Provider    heartbeat
Type        CLXXP
5. Configure the instance attributes for the resource by selecting the app parameter. In the Value box, enter the APPLICATION tag name configured in the XP Cluster Extension configuration file (/etc/opt/hpclx/conf/CLXXP.config).
6. Configure the start, stop, and monitor operations for the XP Cluster Extension resource.
7. Add additional primitive resources to the group. For example, LVM and File System can be used as the second and third resources of the group; the Summary dialog box then lists all three resources.
[Figure: Summary dialog box]
8. Add a resource colocation constraint between the resource group ID assigned in Step 2 and the last resource in the group hierarchy.
9. Set location constraints for the group ID to achieve the required failover order for the group.
10. Set the operation defaults to control failover behavior. To specify that when a resource fails, the resource attempts to restart on the same node or another node in the cluster, use the following settings:
Name        Value
requires    nothing
on-fail     restart
timeout     30
11. Set the migration-threshold value. This value defines the number of failures that can occur on a node before the node becomes ineligible to host the resource and the resource fails over to another node. Set this value to 1 for XP Cluster Extension.
12. Disable automatic failback by using resource constraints and setting resource-stickiness to the lowest value compared with the other resource location constraints.


Creating an XP Cluster Extension resource for Heartbeat


This procedure uses the Linux HA Management Client and Heartbeat. For specific instructions on using the GUI, see the Linux HA Management Client documentation.

1. Start the Linux HA Management Client.
2. Add a group resource for XP Cluster Extension with the following settings:
   Name         Value
   ID           Enter a resource group ID.
   Ordered      true
   Collocated   true
   The Linux HA Management Client prompts you to enter the resource type details.
3. Set the value of the app parameter to the APPLICATION tag name configured in the XP Cluster Extension resource configuration file (/etc/opt/hpclx/conf/CLXXP.config).
   NOTE: The resource hierarchy depends on the order in which resources are added. Always add XP Cluster Extension resources as the first resource in a group.
4. Add an LVM resource to the group created in Step 2. Set the value of the volgrpname parameter to the name of the volume group managed by the XP Cluster Extension resource.
5. Add a Filesystem resource. Set the following values as appropriate for your environment:
   device
   directory
   fstype
6. Configure the start, stop, and monitor operations for the XP Cluster Extension resource and all other resources.
7. Add a location constraint to the resource group ID assigned in Step 2.
8. Add an Expression to the location constraint. For information on the settings to enter, see the SLE HA documentation.
9. Select the Expression you added in Step 8 and enter a value in the Score box. A high score indicates a high priority for the selected location constraint. For example, if there are three nodes N1, N2, and N3, and N1 has the highest priority followed by N2 and then N3, create three location constraints for the same resource and assign the scores 1000, 500, and 200, respectively.
10. Add a resource colocation constraint between the resource group ID assigned in Step 2 and the last resource in the group hierarchy.
11. Right-click the resource in the Linux HA Management Client GUI, and then select Start.

Testing the configuration


The procedure and commands in this section are based on the sample configuration in XP Cluster Extension for Linux: Sample configuration on page 93. Use this procedure as a guide for configuring your environment. Test the configuration by migrating the resource group to the remote data center nodes:

1. Verify that the disk pairs are in read-only mode on the remote storage system.
2. In the SLE HA GUI, click Management in the left pane. Right-click the XP Cluster Extension resource and select Start. This brings the resource group online on one of the cluster hosts, based on the configured resource constraints.
3. To migrate the resource, click Management in the left pane. Right-click the XP Cluster Extension resource, and then select Migrate Resource. Select a target node in the remote data center in the Migrate Resource dialog box, and then click OK.
4. Verify that the disk pair is in source mode on the remote storage system.
5. Migrate the resource to a node in the same data center and verify that the disk pair status has not changed.

Managing XP Cluster Extension services (SLE HA)


To manage an XP Cluster Extension resource:

1. Click Management in the left pane of the Linux HA Management Client.
2. Right-click the XP Cluster Extension resource and select Start or Stop to automatically initiate the requested operation on each resource in the dependency tree.

For more information, see the SLE HA documentation.

Rescanning multipath devices


IMPORTANT: The information in this section applies to Device Mapper Multipath Software users only.

When a device group takeover occurs, the permission settings of the LUs in the device group change from read-only to read-write at the destination site. In Linux configurations with the Device Mapper Multipath Software, the hosts do not dynamically detect the LU permission change. In this situation, the disks used in the XP Cluster Extension setup fail to come online when the host OS does not detect the LU permission change. As a workaround, configure the XP Cluster Extension script multipath_rescan.sh as a PostExecScript to rescan the disks before they are brought online.

Configuring the rescan script


The multipath rescan script is available as /opt/hpclx/sample/multipath_rescan.sh. To configure the script to run as a PostExecScript:

1. Copy the multipath_rescan.sh script to the /etc/opt/hpclx/conf folder, and rename the file as follows:
   RHCS: multipath_rescan_ServiceName.sh
   SLE HA: multipath_rescan_ResourceGroupName.sh
2. Open the script file and enter the user-friendly names of all multipath devices that are in the volume groups configured for the RHCS service or SLE HA resource group. For instructions on finding the user-friendly name of a multipath device, see Finding the user-friendly name of a multipath device on page 107. In the following example, you specify the user-friendly names (mpathab, mpathac, and mpathad) for the variable MULTIPATH_DEVICES:
   MULTIPATH_DEVICES=( mpathab mpathac mpathad )
3. Enter the multipath_rescan.sh script for the PostExecScript object in the Cluster Extension resource configuration file. You must specify the full path name of the multipath_rescan.sh script.

Finding the user-friendly name of a multipath device


The multipath_rescan.sh script requires that you enter the user-friendly names of the multipath devices. To obtain the user-friendly name of a multipath device:

1. Run the pvs command to view the multipath device names for your volume groups. In the following example, dm-21 and dm-23 are the multipath devices for the volume group vg01:
   [root@node1 ]# pvs
   PV          VG    Fmt   Attr  PSize  PFree
   /dev/dm-21  vg01  lvm2  a-    1.82G  0
   /dev/dm-23  vg01  lvm2  a-    1.82G  0
   /dev/dm-24  vg02  lvm2  a-    1.82G  0
2. Obtain the SCSI ID for a multipath device. Use the scsi_id command for SUSE Linux Enterprise Server, and the hp_scsi_id command for Red Hat Enterprise Linux.
   SUSE Linux Enterprise Server:
   [root@node1 ]# scsi_id -guns /block/dm-21
   360060e8014424600000142460000039d
   Red Hat Enterprise Linux:
   [root@node1]# hp_scsi_id -guns /block/dm-14
   360060e8014424600000142460000039d
3. Use the multipath command to obtain the user-friendly name for the multipath device's generated SCSI ID. In the following example, mpathq is the user-friendly name of a multipath device:
   [root@node1]# multipath -ll | grep 360060e8014424600000142460000039d | awk '{print $1}'
   mpathq
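
The lookup in the last step can be wrapped in a small helper so that it is easy to repeat for each device. The following is a minimal sketch and not part of XP Cluster Extension: the function name friendly_name_for_wwid is hypothetical, and the sketch assumes that multipath -ll prints the user-friendly name as the first field of the line that contains the SCSI ID, as in the example above.

```shell
# Hypothetical helper: print the user-friendly name for a given SCSI ID.
# It reads "multipath -ll" style output on stdin, so it can be tried
# against saved output without touching real devices.
friendly_name_for_wwid() {
    wwid="$1"
    # Keep only lines containing the SCSI ID (fixed-string match), then
    # print the first field of the first matching line.
    grep -F -- "$wwid" | awk '{ print $1; exit }'
}

# Example (run as root on a host with Device Mapper Multipath):
#   multipath -ll | friendly_name_for_wwid 360060e8014424600000142460000039d
```

Reading from stdin rather than calling multipath inside the function keeps the helper usable on saved command output.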

Configuring the pair/resync monitor


The pair/resync monitor is a service that verifies that disks are in the pair state, and resyncs them when necessary. The pair/resync monitor determines whether the requesting server is allowed access to the pair/resync monitor. To access the pair/resync monitor, you must update the remote access hosts file and configure the pair/resync monitor port.

Updating the remote access hosts file


Enter the names of the remote systems in a remote access hosts file.

1. Open the /etc/opt/hpclx/conf/clxhosts file.
2. Enter each host name on a separate line. You can leave blank lines, but do not enter comments. For example:
   # cat /etc/opt/hpclx/conf/clxhosts
   dcBserver
   dcAserver
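
Because the file accepts host names and blank lines only, a quick check before distributing it can catch stray comments. The following is a minimal sketch under that reading of the rules: the function name check_clxhosts is hypothetical, and treating any line with a '#' or with more than one word as invalid is an assumption, not documented product behavior.

```shell
# Hypothetical check: succeed only if the hosts file contains one host
# name per line, with blank lines allowed and no comments.
check_clxhosts() {
    file="${1:-/etc/opt/hpclx/conf/clxhosts}"
    # Collect lines that contain '#' or more than one word.
    bad=$(awk '/#/ || NF > 1 { print NR ": " $0 }' "$file")
    if [ -n "$bad" ]; then
        printf 'Invalid lines in %s:\n%s\n' "$file" "$bad" >&2
        return 1
    fi
    return 0
}

# Example:
#   check_clxhosts /etc/opt/hpclx/conf/clxhosts && echo "clxhosts looks valid"
```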

Configuring the pair/resync monitor port


Enter the port that the pair/resync monitor will monitor.

1. Open the /etc/services file.
2. Choose the port that the pair/resync monitor will use, and then add the following line to the services file:
   clxmonitor nnnnn/tcp
   where nnnnn is the port number. For example:
   clxmonitor 22222/udp # CLX Pair/Resync Monitor
   clxmonitor 22222/tcp # CLX Pair/Resync Monitor
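
To confirm that the entry is present on every node, the services file can be queried for the clxmonitor TCP port. The following is a minimal sketch: the function name clx_monitor_port is hypothetical, and it assumes the entry follows the clxmonitor nnnnn/tcp form shown above.

```shell
# Hypothetical helper: print the TCP port registered for clxmonitor and
# return non-zero if no such entry exists.
clx_monitor_port() {
    services="${1:-/etc/services}"
    awk '{ sub(/#.*/, "") }                      # drop trailing comments
         $1 == "clxmonitor" && $2 ~ /^[0-9]+\/tcp$/ {
             split($2, parts, "/")               # "22222/tcp" -> "22222"
             print parts[1]
             found = 1
             exit
         }
         END { exit !found }' "$services"
}

# Example:
#   clx_monitor_port /etc/services || echo "no clxmonitor/tcp entry" >&2
```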

Activating the pair/resync monitor


The pair/resync monitor detects and reacts to suspended XP Continuous Access Software links. To activate the pair/resync monitor, set the ResyncMonitor object to YES. To activate automatic disk pair resynchronization, set the ResyncMonitorAutoRecover object to YES.

When an RHCS service or SLE HA resource group is stopped, the pair/resync monitor is stopped for the XP RAID Manager device group that the service or resource group uses.

The pair/resync monitor does not allow online changes in the XP Cluster Extension resource configuration file when the corresponding RHCS service or SLE HA resource group is online. If the ResyncMonitor object is changed to YES while the RHCS service or SLE HA resource group is running, the pair/resync monitor is not started. If the ResyncMonitor object is changed to NO while the RHCS service or SLE HA resource group is running, a running pair/resync monitor is not stopped.

CAUTION: If an RHCS service or SLE HA resource group cannot be stopped gracefully, disable monitoring of the device group for the service or resource group. To avoid data corruption, this task must be part of the recovery procedure when XP Cluster Extension is deployed in the RHCS or SLE HA environment. See Stopping the pair/resync monitor on page 115. Ensure that the pair/resync monitor does not monitor and resynchronize the disk pair (device group) from both disk array sites.

Timing considerations
XP Cluster Extension gives priority to XP disk array operations over cluster software operations. If XP Cluster Extension invokes disk pair resynchronization or gathers information about the remote XP disk array, XP Cluster Extension waits until the requested status information is reported. This ensures the priority of data integrity over cluster software failover processes. This behavior can lead to failed resources, as follows:

   XP Cluster Extension uses XP RAID Manager instances to communicate with the remote XP disk array. Depending on the setting of the XP RAID Manager instance timeout parameter and the number of remote instances, the service or resource group start operation can time out. This can occur if the local XP RAID Manager instance cannot reach the remote XP RAID Manager instance. In an SLE HA environment, the timeout value defined for the start operation can be adjusted to avoid this situation. In an RHCS environment, the timeout value depends on the timeout value specified in the script resource agent (/usr/share/cluster/script.sh).

   XP Cluster Extension tries to resynchronize disk pairs and waits until the XP RAID Manager device group is in the PAIR state if the ApplicationStartup object is set to RESYNCWAIT. XP RAID Manager and the XP firmware fully support delta resynchronization; however, the delta between the primary and secondary disks can be large enough for the copy process to exceed the service or resource group startup timeout value. The ResyncWaitTimeout object can cause the resource to fail if its value is higher than the resource startup timeout value.

   If running in fence level ASYNC, the default value of AsyncTakeoverTimeout can cause the resource to fail if it exceeds the recommended startup timeout value. The default is high because the takeover process for fence level ASYNC can take longer when communication links are slow.

   To prevent the takeover timeout from terminating the takeover commands, measure the time required to copy the installed XP disk array cache and adjust the resource startup timeout interval. When measuring the copy time, measure only the slowest link used for XP Continuous Access Software. This ensures that the XP disk array cache can be transferred from the remote XP disk array, even in the event of a single surviving replication link between the XP disk arrays.

NOTE: Because the failover environment is dispersed over two or more data centers, the failover time cannot be expected to be the same as that of a single data center with a single shared disk device. Therefore, you must adjust the service or resource group startup timeout value and the monitor interval of the XP RAID Manager device group based on failover tests you perform to verify the proper configuration setup.
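
The relationship between the timeouts described in this section can be checked before a failover test. The following is a minimal sketch: the function name check_resync_timeout is hypothetical, both values are supplied by you, and the sketch assumes they are whole numbers expressed in the same unit.

```shell
# Hypothetical pre-check: fail when ResyncWaitTimeout is higher than the
# service or resource group startup timeout. Both arguments are assumed
# to be whole numbers in the same unit.
check_resync_timeout() {
    startup_timeout="$1"
    resync_wait_timeout="$2"
    if [ "$resync_wait_timeout" -gt "$startup_timeout" ]; then
        echo "ResyncWaitTimeout ($resync_wait_timeout) exceeds startup timeout ($startup_timeout)" >&2
        return 1
    fi
    return 0
}

# Example with illustrative values only:
#   check_resync_timeout 300 90 && echo "timeouts are consistent"
```

The same comparison can be repeated for AsyncTakeoverTimeout when the device group runs in fence level ASYNC.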


6 XP Cluster Extension and CLI


XP Cluster Extension allows integration into almost any cluster software for commercial UNIX, Linux, and Windows operating systems. Use the XP Cluster Extension clxrun command to check proper functionality of XP Cluster Extension prior to integration with the cluster software. The CLI also allows integration of XP Continuous Access Software. For information on supported platforms, see the HP SPOCK website: http://www.hp.com/storage/spock.

Configuring the CLI


Using the XP Cluster Extension CLI requires the following configuration steps:

1. Create the XP Continuous Access Software environment.
2. Create the XP RAID Manager configuration.
3. Create and configure the user configuration file.

Creating the Continuous Access environment and configuring XP RAID Manager


HP support personnel are trained and authorized to set up XP Continuous Access Software. You can, however, configure and change XP disk pairs and XP RAID Manager instances using HP StorageWorks XP LUN Manager, HP StorageWorks Command View XP, XP Remote Web Console, Command View XP Advanced Edition Software, and XP RAID Manager. For detailed information on using these programs, see the HP StorageWorks XP LUN Manager User's Guide, HP StorageWorks Command View XP User's Guide, HP StorageWorks XP Remote Web Console User's Guide, HP StorageWorks Command View XP Advanced Edition Software Device Manager Web Client User's Guide, and HP StorageWorks XP RAID Manager User's Guide.

Timing considerations
XP Cluster Extension is designed to prioritize XP disk array operations over application service startup operations. If XP Cluster Extension invokes disk pair resynchronization operations or gathers information about the remote XP disk array, XP Cluster Extension waits until the requested status information is reported. This prioritizes data integrity over application service startup and failover behavior. Because the takeover timing depends on the configuration of your XP RAID Manager environment and the settings in UCF.cfg, these considerations must be evaluated:

   XP Cluster Extension uses XP RAID Manager instances to communicate with the remote XP disk array. Depending on the settings of the XP RAID Manager instance timeout parameter and the number of remote instances, the online operation could time out. This can also happen if clxrun is used in a script or called by another program and the local XP RAID Manager instance cannot reach the remote XP RAID Manager instance. See Setting up XP RAID Manager on page 20 for more information.

   If the ApplicationStartup attribute is set to RESYNCWAIT, XP Cluster Extension tries to resynchronize disk pairs and waits until the XP RAID Manager device group is in PAIR state. In some versions of XP RAID Manager and XP firmware, a full resynchronization is done. Depending on the amount of data to be transferred, it could take hours to resynchronize. If this is the case, clxrun may take some time to return. Do not stop clxrun; use it to check the status of the associated XP RAID Manager device groups. Even if the XP RAID Manager version and the XP firmware version allow a delta resynchronization, the amount of delta data to be transferred between the primary and the secondary could be large enough for the copy process to take a while.

   If running in fence level ASYNC, the default value of the AsyncTakeoverTimeout is set to a very high number. This is done because the takeover process for fence level ASYNC can take much longer when slow communications links are in place; adjust this value after measuring the XP Continuous Access Software environment. See AsyncTakeoverTimeout on page 130 for more details.

   To prevent premature termination of the takeover commands by the takeover timeout, measure the time to copy the installed XP family disk array cache and adjust the resource online timeout interval according to the measured copy time. Use only the slowest XP Continuous Access Software link to measure the copy time. This ensures that the XP disk array cache can be transferred from the remote XP disk array, even in the event of a single surviving replication link between the XP family disk arrays.

In general, because the failover environment is dispersed into two (or more) data centers, the failover time cannot be expected to be the same as it would be in a single data center with a single shared disk device.

Restrictions for customized XP Cluster Extension implementations


The following restrictions apply when using the XP Cluster Extension CLI:

   The XP Cluster Extension CLI call clxrun must be invoked before the associated disk resources are activated.

   Associated disk resources must not be activated on any other system. If other disk resources are activated, XP Cluster Extension may remove write-access rights for those disk devices (putting them in read-only mode).

Creating and configuring the user configuration file


The CLI expects as an argument the name configured as the APPLICATION tag value. You do not need to specify the SearchObject object. The following is an example of a customized user configuration file when using clxrun:

   # /etc/opt/hpclx/conf/UCF.cfg
   # This is the XP Cluster Extension User Configuration File (UCF.cfg).
   # The COMMON tag specifies the configuration for the
   # XP Cluster Extension core environment
   #
   COMMON
   LogLevel info                  #show disk state info in the logs

   # The APPLICATION tag specifies the configuration for the
   # XP Cluster Extension failover behavior
   APPLICATION sap                #the application service
   DeviceGroup sapdg              #RM dev group for the app service
   RaidManagerInstances 22 90     #RM instance number for dev group
   XPSerialNumbers 34001 34005    #local and remote XP Serial Numbers
   DC_A_Hosts eserv1 eserv2       #data center A hostnames
   DC_B_Hosts eserv3 eserv4       #data center B hostnames
   FenceLevel data                #FenceLevel changed from default

   APPLICATION netscape           #the application service
   DeviceGroup netscapedg         #RM dev group for the app service
   RaidManagerInstances 22 90     #RM instance number for dev group
   XPSerialNumbers 34001 34005    #local and remote XP Serial Numbers
   DC_A_Hosts eserv1 eserv2       #data center A hostnames
   DC_B_Hosts eserv3 eserv4       #data center B hostnames

CLI commands
This section describes the following CLI commands:
   clxrun, page 113
   clxchkmon, page 114

clxrun
Check disk set

Description
clxrun can be used to manually prepare the application service's disk set before an existing application service start procedure is invoked. When using clxrun, the status of the associated XP RAID Manager device group is checked to ensure that access to the disk set will occur under data consistency and concurrency situations only. clxrun must be invoked before the application service disk set can be activated; it is considered an online-only program. However, the CLI features provide the same disaster tolerance features as the integrated versions of XP Cluster Extension. NOTE: Execution of clxrun does not start the pair/resync monitor.

Syntax
clxrun [-version] [-forceflag] app_name

Arguments
   -version     Displays the XP Cluster Extension version
   -forceflag   Forces startup
   app_name     The application name configured in the user configuration file (UCF.cfg)

The clxrun program expects only one parameter as the default setting. This parameter is used to uniquely identify the application service in the APPLICATION section of the user configuration file. clxrun first checks for the forceflag option. When using clxrun, it is not necessary to create an application_name.forceflag file. This option, however, must be specified first if used.


CAUTION: The forceflag option is implemented as an emergency switch to manually activate your XP disk set. If the forceflag option has been specified, XP Cluster Extension will not check any consistency or concurrency rules before activating the XP disk set.

Return codes
clxrun exits with one of the following return codes:
   0 OK           Application service can be started.
   ERROR_GLOBAL   Application service should not start on any system in either site on either disk array.
   ERROR_DC       Application service should not start on any system in the local site on the local disk array.
   ERROR_LOCAL    Application service should not start on this system.

Example 1
   # clxrun sap
Example 1 is based on the assumption that you have defined an APPLICATION tag named sap in the UCF.cfg file and you have specified all necessary objects, including the DeviceGroup object, to map the XP disk set to the application service sap. XP Cluster Extension will check the disk set mapped to the application service sap, run the necessary takeover procedure, and return one of the return codes mentioned in the return code table.

Example 2
   # clxrun -forceflag sap
Example 2 is based on the assumption that you have defined an APPLICATION tag named sap in the UCF.cfg file and you have specified all necessary objects, including the DeviceGroup object, to map the XP disk set to the application service sap. XP Cluster Extension will check the XP disk set mapped to the application service sap, and run the necessary takeover procedure to enable read/write access to the XP disk set.
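
Because clxrun must be invoked before the disk set is activated, a start script can gate the existing start procedure on its return code. The following is a minimal sketch: the function name start_after_clxrun and the CLXRUN override variable are hypothetical, and the sketch relies only on return code 0 meaning that the application service can be started; the numeric values of the ERROR_ codes are not assumed.

```shell
# Hypothetical wrapper: run clxrun for the application, and run the
# existing start procedure only if clxrun returns 0.
# CLXRUN can be set to another command for dry runs.
start_after_clxrun() {
    app_name="$1"
    shift
    "${CLXRUN:-clxrun}" "$app_name"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "clxrun returned $rc for $app_name; not activating the disk set" >&2
        return "$rc"
    fi
    # The remaining arguments are your existing start command.
    "$@"
}

# Example (the start script path is a placeholder for your own procedure):
#   start_after_clxrun sap /path/to/your/start_script
```

Passing the non-zero return code back to the caller lets the calling script decide whether to retry on another system or site.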

clxchkmon
Pair/resync monitor access program

Description
The clxchkmon utility program allows starting and stopping of the resynchronization features and queries to gather state information of the monitored device groups.

To update or remove a specific resource, use clxchkmon -n resource_name -g device_group. If -clx is not specified, the command is applied only to non-XP Cluster Extension resources. To update all non-XP Cluster Extension resources, use clxchkmon -t. To update XP Cluster Extension resources, use clxchkmon -clx -t.

Displaying resources
The following command displays all resources:
   clxchkmon -show
The following command displays XP Cluster Extension resources only:
   clxchkmon -clx -show

Removing resources
The following command removes only non-XP Cluster Extension resources:
   clxchkmon -remove
The following command removes all XP Cluster Extension resources:
   clxchkmon -clx -remove

Stopping the pair/resync monitor


The pair/resync monitor is stopped when all resources are removed from monitoring.

1. To check whether the pair/resync monitor is running, execute the following command:
   clxchkmon -show
2. Select the application and device group combination you want to remove from the pair/resync monitor and remove it with the following command:
   clxchkmon -n [application_name | resource_group_name | resource_name] -g device_group_name -remove
   where application_name | resource_group_name | resource_name is the resource name (as defined by the APPLICATION tag in the UCF.cfg file) of the XP Cluster Extension resource and should match the clxchkmon output. If the -clx option is not specified, the command is executed only for non-XP Cluster Extension resources.

CAUTION: If you respond Y (yes) to remove the combination, the resource will be removed from the list of resources to be monitored in the pair/resync monitor. If this is not an emergency removal attempt and the XP Cluster Extension resource is online, the previous procedure will lead to a failed resource, which will take all dependent resources offline and eventually force your application offline. Do not use this command to take your XP Cluster Extension resources offline.


Syntax
clxchkmon [-clx] [-s hostname] [-n resource_name -g device_group] [[-t monitor_interval | -autorecover mode | -remove [-force] | -show | -pid | -stopsrv | -log [error | warning | info | trace]]] [-p port_number]

where:
   -s hostname           Specifies the name of a host.
   -n resource_name      Specifies the resource (application) name as used in XP Cluster Extension.
   -g device_group       Specifies an XP RAID Manager group name.
   -t monitor_interval   Specifies interval in seconds to update registered monitor resources.
   -autorecover mode     Specify YES to enable autorecovery, or NO to disable autorecovery for registered monitor resource.
   -clx                  Executes the command only for XP Cluster Extension resources.
   -remove               Removes the resource from the monitor list.
   -force                Disables user confirmation to remove resource.
   -show                 Displays monitored resources.
   -pid                  Returns the process ID of the pair/resync monitor.
   -stopsrv              Stops the pair/resync monitor socket server.
   -log                  Sets the log level for the pair/resync monitor.
   -p port_number        Specifies the port number to be used.

Return codes
clxchkmon exits with one of the following return codes:
   0     Successful, or device group is in PAIR state.
   1     Device group is not in PAIR state.
   2     Resource/device group is not registered with the pair/resync monitor.
   3     Pair/resync monitor (clxchkd) is not running.
   4     Device group's pair status is pending.
   10    Pair/resync monitor internal error.
   11    Invalid argument to pair/resync monitor.
   12    Pair/resync monitor received signal (control-c) interrupt.
   13    Unknown status for device group.
   14    No port number is specified in services file for clxmonitor.
   16    Invalid use of the clx option on a non-XP Cluster Extension resource or XP Cluster Extension resource specified without the clx option.
   100   XP RAID Manager error.

Related information
For more information, see Monitoring and resynchronizing device groups on page 143.
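
When clxchkmon is called from a script, the documented return codes can be translated into readable log messages. The following is a minimal sketch: the function name clxchkmon_rc_text is hypothetical, and the messages are taken from the clxchkmon return code list in this section.

```shell
# Hypothetical helper: translate a clxchkmon return code into the
# description given in the return code list.
clxchkmon_rc_text() {
    case "$1" in
        0)   echo "Successful, or device group is in PAIR state" ;;
        1)   echo "Device group is not in PAIR state" ;;
        2)   echo "Resource/device group is not registered with the pair/resync monitor" ;;
        3)   echo "Pair/resync monitor (clxchkd) is not running" ;;
        4)   echo "Device group's pair status is pending" ;;
        10)  echo "Pair/resync monitor internal error" ;;
        11)  echo "Invalid argument to pair/resync monitor" ;;
        12)  echo "Pair/resync monitor received signal (control-c) interrupt" ;;
        13)  echo "Unknown status for device group" ;;
        14)  echo "No port number is specified in services file for clxmonitor" ;;
        16)  echo "Invalid use of the clx option" ;;
        100) echo "XP RAID Manager error" ;;
        *)   echo "Undocumented return code: $1" ;;
    esac
}

# Example: run clxchkmon with the options you need, then
#   clxchkmon_rc_text $?
```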


7 XP Cluster Extension recovery procedures


XP disk pair states
Table 3 on page 119 provides basic XP disk pair state information. The XP disk pair state transition process is complex; see the HP StorageWorks XP Continuous Access and HP StorageWorks XP Continuous Access XP Journal user guides for more information.

Table 3 XP disk pair states

State: P-VOL
Description: The primary (master) disk of a disk pair.

State: S-VOL
Description: The secondary (slave) disk of a disk pair.

State: SMPL
Description: A disk with no pair affinity to any other disk. (This could be shown in pairdisplay outputs for your XP Continuous Access Software disk if you accidentally exported the XP Business Copy Software environment variable HORCC_MRCF. In such a case, the MU number field will not be empty.)

State: PAIR
Description: The disk is either a primary disk or a secondary disk. If both (P-VOL and S-VOL) disks are in PAIR state, XP Continuous Access Software updates the secondary disk based on the primary disk. If you see only one disk in PAIR state (while the second disk is in another state), one of the following has occurred:
   The pair affinity on only one site of the disk pair was deleted.
   A takeover command has been invoked on the secondary site, while no data has been written to the primary site and the XP Continuous Access Software link was down.
   A takeover command has been invoked on the primary site with the fence level configured to DATA to release the fenced disk, while the XP Continuous Access Software link was down. (The secondary disk would stay in PAIR state.)

State: PSUS
Description: The pair affinity has been manually suspended or a takeover operation has been invoked on the secondary site with the fence level configured to NEVER. (In this case, the secondary disk would have the state SSUS-SSWS.)
   The pair affinity has been manually suspended or a takeover operation has been invoked on the secondary site. In this case, the secondary disk would have the state SSWS if you invoke pairdisplay with the -fc option. In fence level ASYNC, the disk could also show PFUL or PFUS when using the -fc option.

State: SSUS
Description: Only the secondary disk could show SSUS. With the -fc option of pairdisplay, you can check whether somebody manually suspended the pair or a takeover command had been invoked.

State: SSUS - SSWS
Description: A prior takeover command is indicated by the SSWS state. In this case, the secondary disk is mandatory and a resynchronization can be done only from the S-VOL site.

State: PSUE
Description: The disk is in a failure mode. Either the XP Continuous Access Software link is down, or the disk must be replaced.

State: PDUB
Description: The disk is in a failure mode. Either the XP Continuous Access Software link is down, or the disk must be replaced. This is a special state of PSUE. If you have configured several disks into a LUSE configuration, where several LDEVs are combined to create an extended size disk and one or more disks are in an error condition, this state will be shown.

State: PFUL
Description: This state is used to indicate that a threshold of the side file area in the XP disk array cache has been reached. This state can be seen with fence level ASYNC only. See the HP XP Continuous Access Software documentation for more information.

State: PFUS
Description: This state is used to indicate that the side file is full and the XP disk array was not able to transfer the cache content to the remote XP disk array for a certain time. The XP disk pair has been suspended to continue processing host I/O. This state can be seen with fence level ASYNC only. See the HP XP Continuous Access Software documentation for more information.

Recovery sequence
To recover from a certain server or XP Continuous Access Software link failure:

1. Start the XP RAID Manager instances on both local and remote servers:
   Linux/UNIX
   export HORCMINST=instance_number
   horcmstart.sh instance_number
   Windows
   set HORCMINST=instance_number
   HORCMSTART instance_number
2. Gather general pair status information:
   pairdisplay -g device_group
3. Display the pair status information after a failed swap-takeover (the S-VOL state is SSWS):
   pairdisplay -g device_group -fc
4. To recover from these states, invoke the following command from the S-VOL side:
   pairresync -swaps -c 15 -g device_group
   If the pair needs to be used on the old primary side, the following commands must be invoked from the primary side:
   pairresync -swapp -c 15 -g device_group
   horctakeover -g device_group
5. Display the pair status information after a P-VOL takeover (local P-VOL PSUS; remote S-VOL PAIR):
   pairdisplay -g device_group -fc
   To recover from these states, invoke the following command from the P-VOL side:
   pairresync -c 15 -g device_group
   CAUTION: The application must be shut down and the file systems unmounted before a fenced disk in fence level DATA can be set in read/write mode again. After the P-VOL takeover, the file system must be checked before it can be mounted. Any other recovery procedure could lead to unrecoverable file systems.

If a horctakeover command results in the S-VOL or P-VOL becoming SMPL and none of the disks in the device group has been written to, you can recover from the situation by splitting the remaining P-VOL or S-VOL to SMPL:
   pairsplit [-S | -R] -g device_group
After splitting the pair, the pair can be re-created without copying its content using:
   paircreate -nocopy -c 15 -f fence_level -g device_group -v[r | l]

If a horctakeover command results in the S-VOL or P-VOL becoming SMPL and data was written to one of the disks in the device group, you can recover from the situation by splitting the remaining P-VOL or S-VOL to SMPL:
   pairsplit [-S | -R] -g device_group
After being split, the pair can be re-created with a full copy using:
   paircreate -c 15 -f fence_level -g device_group -v[r | l]

To ensure that a certain pair state has been established, invoke the event wait command:
   pairevtwait -g device_group -t time_to_wait -s pair_state


8 User configuration file and XP Cluster Extension objects


Objects (also called properties in this document) define the disk array environment and failover/failback behavior. Information comes directly from the cluster software, indirectly from the disk array through XP RAID Manager, and from a configuration file created by users. This file describes the dependencies between application services and XP RAID Manager device groups in one file for all application services in the cluster. The user configuration file provides customized and default values for supported parameters. You can specify all customizable XP Cluster Extension objects in the file, and a copy must exist on all nodes using XP Cluster Extension. XP Cluster Extension uses the information objects to match current disk states and configuration parameters and to invoke actions, including preparing disks to be activated or stopping the application startup.

User configuration file location


The user configuration file is placed in the configuration directory:
• UNIX (AIX, Solaris, and Linux): /etc/opt/hpclx/conf
• Windows: %ProgramFiles%\Hewlett-Packard\Cluster Extension XP\conf
For more information, see:
• Basic configuration example on page 139
• Creating and configuring the user configuration file on page 112

HACMP
The UCF.cfg file is required for IBM HACMP. You must maintain the UCF.cfg file and copy it to all systems running XP Cluster Extension. The UCF.cfg file includes a COMMON section to configure the XP Cluster Extension environment and an APPLICATION section to configure the application service-dependent failover/failback behavior. The APPLICATION section is a multitag component; the APPLICATION tag and application-related objects can appear numerous times in the UCF.cfg file. For more information, see User configuration file for HACMP on page 28.

MSCS
XP Cluster Extension integration with MSCS does not require a user configuration file when the standard environment for XP Cluster Extension is used. The XP Cluster Extension objects that are integrated with MSCS can be configured as resource-specific properties in the cluster software. For more information, see Configuring XP Cluster Extension resources on page 45.


RHCS and SLE HA
XP Cluster Extension integration with RHCS and SLE HA uses an XP Cluster Extension resource configuration file. The objects and format in the configuration file are the same as in the UCF.cfg file. For more information, see Chapter 5 on page 93.

VERITAS Cluster Server
Integrating XP Cluster Extension with VERITAS Cluster Server does not require a user configuration file when the standard environment for XP Cluster Extension is used. The XP Cluster Extension objects that are integrated with VERITAS Cluster Server are configurable as resource attributes in the cluster software. For more information, see Chapter 4 on page 75.

File structure
The configuration file consists of a COMMON section and an APPLICATION section. These sections are distinguished by control tags. XP Cluster Extension uses the following objects as control tags:
• COMMON
• APPLICATION
Objects have one of the following formats:
tag       A definition of an object; for example, COMMON or APPLICATION
integer   A number; for example, a timeout value
string    A name, which can include alphabetic and numeric characters and underscores; for example, an application startup value
list      A list of space-separated strings; for example, a list of host names (lists of numbers are stored as lists of strings)

Text that is a comment starts with the pound (#) symbol and continues until the end of the line. Comments can start on a new line or be part of a line specifying an object.
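To make the file structure concrete, the following Python sketch reads text in this layout into nested dictionaries. It is not the parser used by XP Cluster Extension; the function name and the sample text are made up for illustration, and it assumes one object per line, which this guide does not state explicitly.

```python
def parse_ucf(text):
    """Parse UCF-style text into {'COMMON': {...}, 'APPLICATION': {name: {...}}}."""
    config = {"COMMON": {}, "APPLICATION": {}}
    section = None  # dictionary that receives the objects that follow a tag
    for raw in text.splitlines():
        # A comment starts at '#' and runs to the end of the line.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, *values = line.split()
        if name == "COMMON":
            section = config["COMMON"]
        elif name == "APPLICATION":
            if not values:
                raise ValueError("APPLICATION requires an application service name")
            section = config["APPLICATION"].setdefault(values[0], {})
        elif section is None:
            raise ValueError("object %r appears before any control tag" % name)
        else:
            # Every value is kept as a list of strings, as described for lists.
            section[name] = values
    return config


SAMPLE = """\
# hypothetical sample text
COMMON
LogLevel info            # comment on the same line as an object
APPLICATION sap
DeviceGroup sapdg
DC_A_Hosts host1a host2a
"""

parsed = parse_ucf(SAMPLE)
print(parsed["APPLICATION"]["sap"]["DC_A_Hosts"])
```

Keeping every value as a list of strings mirrors the note that lists of numbers are stored as lists of strings; a caller converts integers such as timeout values where needed.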

Specifying object values


When using the default configuration, you must provide values for the following objects:
• DeviceGroup: An XP RAID Manager device group
• DC_A_Hosts: A list of the cluster nodes in data center A
• DC_B_Hosts: A list of the cluster nodes in data center B
• RaidManagerInstances: A list of XP RAID Manager instances that XP Cluster Extension can use to communicate with the disk array
• XPSerialNumbers: The serial numbers of the primary and secondary XP disk arrays
You do not need to change the default settings unless you want to change the degree of protection for your paired disks. If you change an object, you may need to change additional related objects. For example, if you change the FenceLevel object to DATA, you might need to change the DataLoseMirror object.
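A check for the required objects can be sketched in a few lines of Python. The function name and the dictionary layout (object name mapped to a list of strings) are assumptions made for this example; the object names come from the list above.

```python
# Object names are taken from this guide; everything else is illustrative.
REQUIRED_OBJECTS = (
    "DeviceGroup",
    "DC_A_Hosts",
    "DC_B_Hosts",
    "RaidManagerInstances",
    "XPSerialNumbers",
)


def missing_required(app_objects):
    """Return the required objects that are absent or have no value."""
    return [name for name in REQUIRED_OBJECTS if not app_objects.get(name)]


# Hypothetical application section with one required object left out.
app = {
    "DeviceGroup": ["sapdg"],
    "DC_A_Hosts": ["host1a", "host2a"],
    "DC_B_Hosts": ["host3b", "host4b"],
    "RaidManagerInstances": ["22"],
}
print(missing_required(app))  # prints ['XPSerialNumbers']
```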


Objects are supported according to the requirements or capabilities of the cluster software, as shown in Table 4 on page 125.

Table 4 Cluster software supported objects
Columns: Object; CLI; HACMP; MSCS; VCS; RHCS, SLE HA
[The per-system "Supported" marks of this table are not reproduced here.]

COMMON
   LogDir, LogLevel, SearchObject, VcsBinPath
APPLICATION
   ApplicationDir, ApplicationStartup, AsyncTakeoverTimeout, AutoRecover, BCEnabledA, BCEnabledB, BCMuListA, BCMuListB, BCResyncEnabledA, BCResyncEnabledB, BCResyncMuListA, BCResyncMuListB, ClusterNotifyCheckTime, ClusterNotifyWaitTime, DataLoseDataCenter, DataLoseMirror, DC_A_Hosts, DC_B_Hosts, DeviceGroup, FastFailbackEnabled, FenceLevel, Filesystems, JournalDataCurrency, LocalDCLMForNonPAIRDG, PostExecCheck, PostExecScript, PreExecScript, RaidManagerInstances, ResyncMonitor, ResyncMonitorAutoRecover, ResyncMonitorInterval, ResyncWaitTimeout, StatusRefreshInterval, Vgs, XPSerialNumbers

COMMON objects
The COMMON section is used to set the environment of XP Cluster Extension. The COMMON tag can appear in the configuration file only once. The COMMON object does not require any value. Objects of the type COMMON can appear only one time. Those objects must be placed after the COMMON tag in the configuration file. If the default values fit your environment, there is no need to specify them in the file.


COMMON
Format: tag
Description: Distinguishes between general (common) and application-specific objects.

LogDir
Format: String
Description: (Optional) Defines the path to the XP Cluster Extension log file.
Default value:
   Linux/UNIX: /var/opt/hpclx/log
   Windows: %ProgramFiles%\Hewlett-Packard\Cluster Extension XP\log

LogLevel
Format: String
Description: (Optional) Defines the logging level used by XP Cluster Extension.
Valid values:
• error (default): Logs only error messages for events that are unrecoverable.
• warning: Logs error messages and warning messages for events that are recoverable.
• info: Logs error messages, warning messages, and additional information, such as disk status.
• debug: Logs error messages, warning messages, info messages, and messages that report on execution status; useful for troubleshooting.

SearchObject (HACMP only)


Format: String
Description: (Optional) Searches for the application service if the user configuration file specifies multiple applications. This object is used for HACMP only.
Default value: Vgs

VcsBinPath (VCS only)


Format: String
Description: (Optional) Defines the path to the VCS binaries. This object is used for VCS only.
Default value: /opt/VRTSvcs/bin


APPLICATION objects
The APPLICATION section defines the failover and failback behavior of XP Cluster Extension for each application service. APPLICATION is a multitag that can appear in the configuration file once for each application service using XP Cluster Extension. The APPLICATION object requires the name of the application service as its value. The objects specified after an APPLICATION tag must appear only once per application. As with the COMMON objects, the APPLICATION objects have predefined default values.

XP Cluster Extension uses the following rules to define objects:
• If you use the default value, you do not have to specify the object.
• XP Cluster Extension uses objects depending on the setting of other objects. For example, if you set the FenceLevel object to DATA, XP Cluster Extension uses the values specified for the DataLoseMirror or DataLoseDataCenter object. However, these objects are ignored if the FenceLevel object is set to NEVER.
• The pre-execution and post-execution functions in XP Cluster Extension are not processed if the associated object values are empty. (This is the default setting.)

When setting APPLICATION object values:
• Use the VCS GUI for VCS.
• Use a user configuration file for the CLI and HACMP.
• Use the Microsoft Cluster Administrator GUI (Windows Server 2003) or the Failover Cluster Management GUI (Windows Server 2008/2008 R2) for MSCS.
• Use an XP Cluster Extension configuration file for RHCS and SLE HA.

APPLICATION objects
This section describes the available APPLICATION objects for XP Cluster Extension.

APPLICATION
Format: Tag
Description: Distinguishes between general and application-specific objects. Specify the name of the application service. The format of its value is equivalent to a string value.

ApplicationDir
Format: String
Description: Specifies the directory where XP Cluster Extension searches for application-specific files, such as the force flag or online file. If ApplicationDir is set to a nonexistent drive and PairResyncMonitor is not enabled, XP Cluster Extension is unable to create the online file and cannot put the resource online.
   Windows: If ApplicationDir is not set, XP Cluster Extension uses the local %HPCLX_PATH% values as defined in the registry.
Default values:
   Linux/UNIX
      online file: /etc/opt/hpclx
      force flag file: /etc/opt/hpclx/conf
   Windows
      %HPCLX_PATH%
Files:
   resource_name.createsplitbrain
   resource_name.forceflag
   resource_name.online
   If specified in a user configuration file, resource_name is the value of the APPLICATION tag; otherwise, resource_name is the value of the XP Cluster Extension resource name.

ApplicationStartup
Format: String
Description: (Optional) Specifies where a cluster group should be brought online. The ApplicationStartup object can be customized to determine whether an application service starts locally or is transferred back to the remote data center (if possible) to start immediately without waiting for resynchronization.

This object is used only if an application service has already been transferred to the secondary site and no recovery procedure has been applied to the disk set (the disk pair has not been recovered and is not in PAIR state). This process is considered a failback attempt without prior disk pair recovery.

XP Cluster Extension can detect the most current copy of your data based on the disk state information. If XP Cluster Extension detects that the remote XP disk array has the most current data, it either orders a resynchronization of the local disk from the remote disk, or it stops the startup process to enable the cluster software to fail back to the remote XP disk array.

If a resynchronization is ordered, XP Cluster Extension monitors the progress of the copy process. If the application service was running on a secondary XP disk array without a replication link, a large number of records may need to be copied. If the copy process takes longer than the configured application startup timeout value, the application startup will fail.

MSCS
If the ApplicationStartup resource property is set to FASTFAILBACK and the FailoverThreshold value is set to a number higher than the current number of clustered systems for the service or application, the service or application will restart on configured nodes until one of the following conditions is met:
• The resource is brought online in the remote data center.
• The resource failed because the FailoverThreshold value has been reached.
• The resource failed because the FailoverPeriod timeout value has been reached.

CAUTION: Disable subsequent automated failover procedures for recovery failback operations.

Valid values:

FASTFAILBACK (default)
The cluster group is brought online in the remote data center (if possible) without waiting for resynchronization. The application startup process is stopped locally and XP Cluster Extension reports a data center error. Depending on the cluster software, the application service cannot start on any system in the local data center, and the cluster software transfers the application service back to the remote data center. Use this value to provide the highest level of application service availability.
Depending on the value configured for the AutoRecover object, XP Cluster Extension attempts to update the former primary disk based on the secondary disk and swaps the personalities of the disk pair so that the local disk becomes the primary disk.
In a two-node cluster, this process does not work because the target failback system is not available. In this case, the application service must be started manually, or the ApplicationStartup object must be set to RESYNCWAIT.
In an XP Cluster Extension for MSCS integration, XP Cluster Extension can detect when there is no target failback system available in the remote data center. In this case, XP Cluster Extension behaves as if the ApplicationStartup resource property is set to RESYNCWAIT.

RESYNCWAIT
The local cluster group is brought online only after the disk status is PAIR. XP Cluster Extension initiates a resynchronization of the local disk based on the remote disk. The copy process is monitored; if no copy progress is made before a monitoring interval expires, the copy process is considered failed and XP Cluster Extension returns a global error.
If RESYNCWAIT is specified for the ApplicationStartup object, specify the ResyncWaitTimeout object if XP Cluster Extension should wait for resynchronization progress for more or less than the default of 90 seconds.

AsyncTakeoverTimeout
Format: Integer
Description: (Optional) Specifies the horctakeover command timeout in seconds. Must be adjusted based on disk mirroring link speed. This object is used only if the FenceLevel object value is ASYNC.

The takeover operation for fence level ASYNC (XP Continuous Access Software) offers the option to stop the data transfer process after a specified time value. This is used to allow access to the remote copy if the data transfer process is stopped due to an XP Continuous Access Software link failure. All data that has been copied up to the moment the timeout value is reached is consistent and available to access at the secondary site.

CAUTION: Measure or calculate the full XP disk array cache copy time and use the result for the AsyncTakeoverTimeout object. After a takeover command has been invoked, XP Continuous Access Software copies the side file area residing in the XP disk array cache to the site where the takeover command has been issued (the secondary disks). The side file area cannot exceed the installed cache size. The maximum time for the AsyncTakeoverTimeout object is the time to fully copy the amount of cache size data. The takeover timeout value is used to terminate the copy process to provide access to the secondary disks; for example, if all links or the primary XP disk array are unavailable to copy the side file area. The copy time depends on the performance of the XP Continuous Access Software link between your sites.

The takeover or resynchronization operation could take longer than the timeout value for application service startup in the cluster software. The application service startup might fail in this case. However, the takeover or resynchronization command will continue in the background.

Default value: 3600
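The upper bound described in the caution is a simple division of cache size by link throughput. The following Python sketch shows the arithmetic; the function name and the figures (64 GiB of cache, 100 MiB/s sustained link throughput) are hypothetical and must be replaced with values measured in your environment.

```python
import math


def full_cache_copy_seconds(cache_gib, link_mib_per_s):
    """Estimate the seconds needed to copy a full cache over the replication link."""
    if cache_gib <= 0 or link_mib_per_s <= 0:
        raise ValueError("cache size and link throughput must be positive")
    # 1 GiB = 1024 MiB; round up so the estimate never undershoots.
    return math.ceil(cache_gib * 1024 / link_mib_per_s)


# Hypothetical figures: 64 GiB cache, 100 MiB/s sustained throughput.
print(full_cache_copy_seconds(64, 100))  # prints 656
```

With these example figures, the estimate of 656 seconds is well below the default of 3600, so a lower AsyncTakeoverTimeout value could be considered; slower links push the estimate up proportionally.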

AutoRecover
Format: String
Description: (Optional) Recovers a suspended or deleted disk pair when the resource is brought online at application service startup time.

If the AutoRecover object is set to YES, XP Cluster Extension will try to resynchronize the remote disk at application startup time. XP Cluster Extension will ignore the return code of the resynchronization command and allow access to the disk, ensuring highest application availability. If the resynchronization attempt fails, XP Cluster Extension will not fail. The internal logic will first apply the concurrency and consistency rules to allow access to the disk set.

If you configure fence level DATA for the device group and set the FenceLevel object to DATA, the AutoRecover object will change XP Cluster Extension's behavior. XP Cluster Extension will attempt to re-establish the PAIR state and wait for the PAIR state before it allows access to the disk. If the resynchronization or takeover process fails, XP Cluster Extension returns a global error.
Valid values: YES (default), NO

BCEnabledA
Format: String
Description: (Optional) Enables rolling disaster protection for data center A.
Valid values: YES, NO (default)

BCEnabledB
Format: String
Description: (Optional) Enables rolling disaster protection for data center B.
Valid values: YES, NO (default)

BCMuListA
Format: List
Description: (Optional) Space-separated list that defines the MU numbers of the XP Business Copy Software disk pairs in data center A.

BCMuListB
Format: List
Description: (Optional) Space-separated list that defines the MU numbers of the XP Business Copy Software disk pairs in data center B.

BCResyncEnabledA
Format: String
Description: (Optional) Enables automatic resynchronization of XP Business Copy Software disk pairs in data center A. The automatic resynchronization function is supported only when the split XP Business Copy Software pair is located in the same data center where XP Cluster Extension is started.
Valid values: YES, NO (default)

BCResyncEnabledB
Format: String
Description: (Optional) Enables automatic resynchronization of XP Business Copy Software disk pairs in data center B. The automatic resynchronization function is supported only when the split XP Business Copy Software pair is located in the same data center where XP Cluster Extension is started.
Valid values: YES, NO (default)

BCResyncMuListA
Format: List
Description: (Optional) Space-separated list that defines the MU numbers of the XP Business Copy Software disk pairs in data center A to be resynchronized.

BCResyncMuListB
Format: List
Description: (Optional) Space-separated list that defines the MU numbers of the XP Business Copy Software disk pairs in data center B to be resynchronized.

ClusterNotifyCheckTime
Format: Integer
Description: Specifies how often XP Cluster Extension will check for VM live migration state changes.
Default value: 10 seconds

ClusterNotifyWaitTime
Format: Integer
Description: Specifies the amount of time that XP Cluster Extension will monitor for VM live migration state changes.
Default value: 5 seconds

DataLoseDataCenter
Format: String
Description: (Optional) Specifies whether a resource should be brought online while the disk pair is (or will be) suspended or deleted and there is no connection (XP Continuous Access and IP network) to the remote data center. Used only if the FenceLevel object value is DATA.

XP RAID Manager is able to access its remote peer to invoke takeover actions for XP Continuous Access Software device groups. It is also able to invoke a swap-takeover operation of the device group from the secondary site. If no configured remote XP RAID Manager instance replies to a request of the local XP RAID Manager instance (remote status EX_ENORMT), all network connections between the local and the remote data center are considered DOWN. If the swap-takeover operation leads to a suspended state for the device group, the XP Continuous Access Software links are considered DOWN.

Because redundant networks and XP Continuous Access Software links are necessary to build a disaster-tolerant environment, this situation can be considered a data center failure. The DataLoseDataCenter object is used to allow or prohibit automatic application service startup in this particular case.

Setting the DataLoseMirror object to YES and the DataLoseDataCenter object to NO is contradictory.
Valid values: YES (default), NO

DataLoseMirror
Format: String
Description: (Optional) Specifies whether a resource should be brought online while the disk pair is suspended or deleted. Used only if the FenceLevel object value is DATA and local and remote XP disk status information can be gathered. If the remote XP disk state information is not available (remote state EX_ENORMT), the setting of the DataLoseDataCenter object will be used.

Depending on the value configured for the AutoRecover object, XP Cluster Extension will attempt to recover the PAIR state for the device group. XP Cluster Extension waits until the PAIR state has been established. If this operation fails, XP Cluster Extension returns a global error. Because the DATA fence level ensures no loss of concurrency, manual intervention is required to recover the PAIR state. The PAIR state must be re-established for all disks in the device group before you can start the application service.

Setting the DataLoseMirror object to YES and the DataLoseDataCenter object to NO is contradictory.
Valid values: YES, NO (default)

DC_A_Hosts (Required)
Format: List
Description: This space-separated list defines the cluster nodes in data center A.
   VCS: This object is a string-vector element. Add a new element to the list for each system name.

DC_B_Hosts (Required)
Format: List
Description: This space-separated list defines the cluster nodes in data center B.
   VCS: This object is a string-vector element. Add a new element to the list for each system name.


DeviceGroup (Required)
Format: String
Description: XP RAID Manager device group containing the application service disk set.
Files:
   Linux/UNIX: /etc/horcmX.conf
   Windows: \winnt\horcmX.conf
            %systemroot%\horcmX.conf
   where X is the XP RAID Manager instance number.

FastFailbackEnabled (VCS only)


Format: String
Description: (Optional) Disables VCS service groups for the data center. This allows the service group to be transferred back to the remote data center immediately. To allow this operation, the VCS configuration file (main.cf) will be write-enabled and saved later. The service group will be disabled for all systems contained in either the DC_A_Hosts object or the DC_B_Hosts object. Then, the VCS configuration file will be saved (dumped).
Valid values: YES (default), NO

FenceLevel
Format: String
Description: (Optional) The FenceLevel object specifies the fence level configured for the device group. XP Cluster Extension checks whether the current fence level reported by the XP disk array is the same as the configured (expected) fence level. This object is also used to make sure your configurations are supported based on consistency considerations. Different failover and recovery procedures are used for different fence levels. If you change the FenceLevel object value, also review the values of these objects: DataLoseMirror, DataLoseDataCenter, and AsyncTakeoverTimeout.
Valid values: DATA, NEVER (default), ASYNC (includes JOURNAL)
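The dependencies between FenceLevel and the related objects can be expressed as a small consistency check. The Python sketch below is not part of XP Cluster Extension; the function name and the flat dictionary of string values are assumptions made for this example, while the rules and defaults are those stated in the object descriptions in this chapter.

```python
def fence_level_findings(objs):
    """Return a list of findings for one application's settings (name -> string)."""
    findings = []
    fence = objs.get("FenceLevel", "NEVER")  # NEVER is the documented default
    if fence not in ("DATA", "NEVER", "ASYNC"):
        findings.append("FenceLevel must be DATA, NEVER, or ASYNC")
    if fence != "DATA":
        # Both DataLose objects are used only with fence level DATA.
        for name in ("DataLoseMirror", "DataLoseDataCenter"):
            if name in objs:
                findings.append(name + " is used only when FenceLevel is DATA")
    if fence != "ASYNC" and "AsyncTakeoverTimeout" in objs:
        findings.append("AsyncTakeoverTimeout is used only when FenceLevel is ASYNC")
    # Documented defaults: DataLoseMirror NO, DataLoseDataCenter YES.
    mirror = objs.get("DataLoseMirror", "NO")
    data_center = objs.get("DataLoseDataCenter", "YES")
    if mirror == "YES" and data_center == "NO":
        findings.append("DataLoseMirror YES with DataLoseDataCenter NO is contradictory")
    return findings


for finding in fence_level_findings({"FenceLevel": "NEVER", "DataLoseMirror": "YES"}):
    print(finding)
```

The check reports findings instead of raising errors because objects that do not apply to the configured fence level are ignored rather than rejected, according to the rules for APPLICATION objects.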

Filesystems (CLI and HACMP only)


Format: List
Description: Space-separated list of file systems.

JournalDataCurrency
Format: String
Description: (Optional) Specifies whether a resource should be brought online while a potentially large amount of data remains in the P-VOL journal and cannot be transmitted to the secondary site because the XP Continuous Access Software link is down. Used only if the FenceLevel object value is ASYNC and the local device is an S-VOL.

XP Cluster Extension checks whether the XP Continuous Access Software link is available by using the minimum active paths (MINAP) value returned by the XP RAID Manager pairvolchk command. A MINAP value of 0 indicates that the XP Continuous Access Software link is unavailable and that any data still located in the primary journal will not be replicated to the secondary volume. In that case, if JournalDataCurrency is set to YES, XP Cluster Extension will not perform the takeover operation and will not allow the application to access the data.
Valid values: YES (default), NO
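The JournalDataCurrency rule can be written as a single predicate. The following Python sketch is an interpretation of the description above, not product code: the function and parameter names are hypothetical, the MINAP value is assumed to have been obtained separately, and the behavior for the NO setting (takeover not blocked by this rule) is inferred rather than stated in this guide.

```python
def takeover_blocked_by_journal(fence_level, local_is_svol, minap,
                                journal_data_currency="YES"):
    """Return True when the JournalDataCurrency rule would stop the takeover."""
    # The object is used only for fence level ASYNC with a local S-VOL.
    if fence_level != "ASYNC" or not local_is_svol:
        return False
    # MINAP of 0 means no active replication path, so journal data on the
    # primary side cannot reach the secondary volume.
    link_down = (minap == 0)
    return link_down and journal_data_currency == "YES"


print(takeover_blocked_by_journal("ASYNC", True, 0))
print(takeover_blocked_by_journal("ASYNC", True, 2))
```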

LocalDCLMForNonPAIRDG
Format: String
Description: Specifies whether a live migration operation within the local data center is allowed when the device group is not in PAIR state.

Set this property to YES to allow live migration operations in the local data center when the device group is not in PAIR state, the latest data is in the local data center, and the XP Cluster Extension resource can come online. For example, if the device group state is PVOL_COPY in the local data center and SVOL_COPY in the remote data center, setting this property to YES allows you to perform live migration to nodes within the local data center.

Set this property to NO if you want to cancel live migration operations within the local data center when the device group is not in PAIR state.

NOTE: Configure this parameter for each XP Cluster Extension resource associated with the VM cluster resource and the corresponding application cluster resource in the UCF file. If the VM group contains more than one XP Cluster Extension resource, and you want to use this parameter, you must set this parameter to the same value for each XP Cluster Extension resource. If you do not set the parameter to the same value, this parameter will default to a value of NO.
Valid values: YES, NO (default)

PostExecCheck
Format: String
Description: (Optional) The PostExecCheck object is used to configure XP Cluster Extension to gather XP disk pair status information after the takeover procedure. That information will be passed to the post-executable. In case of a remote data center failure, it could be time consuming to gather that information, especially if your post-executable does not need any XP status information. The arguments passed to the post-executable will include only the local disk status if the PostExecCheck object is set to NO. See Setting up XP RAID Manager on page 20.
Valid values: YES, NO (default)

PostExecScript
Format: String
Description: (Optional) Specifies an executable with its full path name to be invoked after the takeover action or failover procedure.

PreExecScript
Format: String
Description: (Optional) Specifies an executable with its full path name to be invoked before the takeover action or failover procedure.

RaidManagerInstances (Required)
Format: List
Description: A space-separated list of XP RAID Manager instances that XP Cluster Extension can use to communicate with the disk array. The instance numbers must be the same among all cluster systems. XP Cluster Extension can alternate between the specified instances.
   VCS: This object is a string-vector element. Add a new element to the list for each system name.
Files:
   Linux/UNIX: /etc/horcmX.conf
   Windows: %systemroot%\horcmX.conf
   where X is the XP RAID Manager instance number.

ResyncMonitor
Format: String
Description: (Optional) Starts the pair/resync monitor to monitor the disk pair status and resynchronize disk pairs if the ResyncMonitorAutoRecover attribute is set to YES.
Valid values: YES, NO (default)

ResyncMonitorAutoRecover
Format: String
Description: (Optional) Automatically recovers disk pair states if the disk pairs are monitored by the pair/resync monitor.
Valid values: YES, NO (default)

ResyncMonitorInterval
Format: Integer
Description: (Optional) Specifies the interval (in seconds) at which the pair/resync monitor checks the disk pair status.
Default value: 60

ResyncWaitTimeout
Format: Integer
Description: (Optional) Specifies the timeout value (in seconds) for a disk pair resynchronization. It may take some time to resynchronize disks. The timer times out if there is no change in the percentage value of the copy status for the device group in the specified time interval. The timeout value is used if the ApplicationStartup object is set to RESYNCWAIT.
Default value: 90
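The timeout described for ResyncWaitTimeout is a stall timer: it restarts whenever the copy percentage changes and expires only when no progress is seen for the whole interval. The Python sketch below illustrates that idea. It is not the XP Cluster Extension implementation; wait_for_pair, its parameters, and the poll callback (which would wrap a pair status query) are hypothetical.

```python
import time


def wait_for_pair(poll, timeout_s=90, interval_s=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Wait for PAIR state; give up when the copy percentage stops changing.

    poll() must return a (state, percent) tuple. Returns True once the state
    is PAIR and False if no progress is seen for timeout_s seconds.
    """
    state, last_percent = poll()
    last_change = clock()
    while state != "PAIR":
        if clock() - last_change >= timeout_s:
            return False  # stalled: the percentage did not change in time
        sleep(interval_s)
        state, percent = poll()
        if percent != last_percent:
            # Progress was made, so the stall timer starts over.
            last_percent = percent
            last_change = clock()
    return True


# Scripted readings stand in for real pair status queries.
readings = iter([("COPY", 10), ("COPY", 50), ("PAIR", 100)])
print(wait_for_pair(lambda: next(readings), sleep=lambda seconds: None))
```

Because the timer measures time since the last progress rather than total elapsed time, a long resynchronization that keeps advancing is not cut off, which is why the value can stay small even for large device groups.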

StatusRefreshInterval
Format: Integer
Description: Specifies how often XP Cluster Extension will gather XP storage array information.
Default value: 300 seconds


Vgs (CLI and HACMP only)


Format: List
Description: List of volume groups.

XPSerialNumbers (Required)
Format: List
Description: A space-separated list of at least two serial numbers must be specified: the serial numbers of the primary and secondary XP disk arrays. XP Cluster Extension checks whether the local disk array is contained in this list. Specify the serial numbers of the disk arrays of the connected cluster nodes (at least two).
   VCS: This object is a string-vector element. Add a new element to the list for each system name.

Basic configuration example


The following is an example of a UCF.cfg file:
#/etc/opt/hpclx/conf/UCF.cfg
#This is the XP Cluster Extension User Configuration File (UCF.cfg).
#The COMMON tag specifies the configuration for the
#XP Cluster Extension core environment
COMMON
LogLevel info                   #default (not necessary)
APPLICATION sap                 #the application service
Vgs sapdatavg saptmpvg          #the volume groups (not necessary)
Filesystems /sapdata /saptmp    #the filesystems
DeviceGroup sapdg               #RM dev group for the app service
RaidManagerInstances 22         #RM instance number for dev group
DC_A_Hosts host1a host2a        #Data center A
DC_B_Hosts host3b host4b        #Data center B


9 Advanced XP Cluster Extension configuration


This chapter describes advanced XP Cluster Extension configuration procedures.

Implementing rolling disaster protection


To implement rolling disaster protection, create XP Business Copy Software disk pairs for the local XP Continuous Access Software disks. Create the XP Business Copy Software disk pairs using the -m noread option of the paircreate command. This option ensures that XP Business Copy Software disks are unavailable to other services and reserved for rolling disaster protection only.

Map the XP Business Copy Software S-VOLs to a backup server, not to the local cluster node. When XP Cluster Extension suspends the XP Business Copy Software pairs, they become available to the local server, which could result in duplicated volumes, disk group IDs, or signatures.

CAUTION: You must ensure that at least one XP Business Copy Software disk pair is in PAIR state. If rolling disaster protection is enabled and none of the XP Continuous Access Software mirrored disk pairs has an XP Business Copy Software disk pair that is in PAIR state, XP Cluster Extension returns a global error, and you will not be able to activate the application service. You can use forceflag to start the application service. See Enabling write access regardless of disk pair state on page 144. In this case, XP Cluster Extension disables rolling disaster protection.

Using XP RAID Manager with rolling disaster protection


Rolling disaster protection does not require that you define XP Business Copy Software disk pairs in the XP RAID Manager horcmX.conf files. XP Cluster Extension uses the MU number to monitor and control associated XP Business Copy Software pairs. You must create an XP RAID Manager configuration file to control the XP Business Copy Software disk pairs that are outside XP Cluster Extension control. XP Cluster Extension Software cannot suspend XP Business Copy Software disk pairs on the remote XP disk array in the remote data center if the XP RAID Manager instance in the remote data center is not running or not reachable.

Setting XP Cluster Extension objects to enable rolling disaster protection


To enable rolling disaster protection with XP Business Copy Software, set the BCEnabledA and BCEnabledB objects for data centers A and B. When these objects are set to YES, rolling disaster protection is enabled and XP Cluster Extension checks whether the configured XP Business Copy Software disk pairs are in PAIR state. Before initiating the resynchronization operation, XP Cluster Extension suspends specified XP Business Copy Software disk pairs that are in PAIR state. For information on setting XP Cluster Extension objects, see Chapter 8 on page 123.

When using rolling disaster protection, note the following:

- If the BCEnabledA and BCEnabledB objects are set to YES, you must configure specific XP Business Copy Software disk pairs using MU numbers. The MU number defines one of the many disk pair relationships you can create with XP Business Copy Software disk pairs. You can specify as many MU numbers as the XP Business Copy Software supports. Disk pair MU numbers are specified by the BCMuListA and BCMuListB objects for data centers A and B.
- To enable resynchronization of XP Business Copy Software disk pairs that have been split by XP Cluster Extension, use the BCResyncEnabledA and BCResyncEnabledB objects for data centers A and B. XP Cluster Extension maintains a list of all associated XP Business Copy Software disk pairs that were in PAIR state before a resynchronization attempt. If pairs were suspended, XP Cluster Extension automatically resynchronizes those disk pairs after the XP Continuous Access Software remote mirrored disk pairs have been paired. This feature supports automatic resynchronization of locally split XP Business Copy Software disk pairs only.
- You must specify MU numbers for resynchronization by using the BCResyncMuListA and BCResyncMuListB objects for data centers A and B.
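The dependencies between these objects can be checked before a configuration is put into service. The following is a minimal sketch, not part of XP Cluster Extension. It assumes the objects have already been read into a Python dictionary, that an unset object counts as NO, and that MU lists are whitespace-separated. The function name check_rolling_dr_objects is hypothetical.

```python
def check_rolling_dr_objects(objects):
    """Return a list of problems found in the rolling disaster protection objects.

    `objects` maps object names (for example "BCEnabledA") to string values.
    Assumption: unset objects behave like NO / an empty list.
    """
    problems = []
    for dc in ("A", "B"):
        # BCEnabledX = YES requires MU numbers in BCMuListX.
        enabled = objects.get(f"BCEnabled{dc}", "NO").upper() == "YES"
        if enabled and not objects.get(f"BCMuList{dc}", "").split():
            problems.append(f"BCEnabled{dc} is YES but BCMuList{dc} is empty")

        # BCResyncEnabledX = YES requires MU numbers in BCResyncMuListX.
        resync = objects.get(f"BCResyncEnabled{dc}", "NO").upper() == "YES"
        if resync and not objects.get(f"BCResyncMuList{dc}", "").split():
            problems.append(
                f"BCResyncEnabled{dc} is YES but BCResyncMuList{dc} is empty"
            )
    return problems
```

An empty result means that every enabled feature has at least one MU number configured. It does not confirm that the pairs exist or are in PAIR state.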

Setting automatic recovery for rolling disaster protection


If the AutoRecover object is set to YES, XP Cluster Extension automatically resynchronizes the XP Continuous Access Software disk pairs to update the remote disks. If rolling disaster protection is also enabled, it suspends the XP Business Copy Software disk pair that is attached to the remote XP Continuous Access Software disk. If the remote XP RAID Manager instance is not running or cannot be reached, the remote XP Business Copy Software disk pair cannot be suspended. If this occurs, XP Cluster Extension continues the application service activation without automatically resynchronizing the XP Continuous Access Software disk pair and without suspending the XP Business Copy Software disk pair. In this case, the XP Continuous Access Software disk pair must be recovered manually.

Using the pair/resync monitor with rolling disaster protection


If the ResyncMonitor object is set to YES, the pair/resync monitor does not use XP Business Copy Software pairs to recover suspended or failed XP Continuous Access Software disk pairs. To protect the remote volume of an out-of-sync XP Continuous Access Software disk pair against rolling disasters, use the pair/resync monitor's default settings. Resynchronize the XP Continuous Access Software disk pair manually after splitting off the XP Business Copy Software disk pair.

Restoring server operation for rolling disaster protection


Rolling disaster protection automatically recovers the PAIR state of the XP Continuous Access Software disk pair of an application service. Before you fail over (or fail back) an application service from one data center to the other, you must restore the server operation. After you restart the server, also start the XP RAID Manager instance used to manage the XP Continuous Access Software disk pairs on those servers. This enables rolling disaster protection to work correctly during a recovery failover/failback operation. Figure 10 on page 143 depicts a fully configured XP Cluster Extension environment that uses rolling disaster protection. The XP Business Copy Software disk pairs are specified as 0 in the XP Cluster Extension BCMuListA and BCMuListB objects. See APPLICATION objects on page 128 for more information about these objects.


Figure 10 Disaster-tolerant configuration with rolling disaster protection



Monitoring and resynchronizing device groups


The pair/resync monitor can either only monitor or both monitor and resynchronize the state of the XP RAID Manager device group for an application service.


CAUTION: If the application service stops, the cluster software or your customized solution must be able to stop the monitoring or resynchronization utility. Without this ability, the use of the pair/resync monitor is not supported. HP recommends that you disable application service failover during a disk pair recovery (resynchronization). When the pair/resync monitor is enabled, XP Cluster Extension takes immediate action to recover any reported suspended disk pair. If, at any time, the resynchronization process is running on both disk array sites, data corruption might occur.

Turn the pair/resync monitor (clxchkd) on or off using the ResyncMonitor object. For information on setting XP Cluster Extension objects, see Chapter 8 on page 123.

If the ResyncMonitorAutoRecover object is set to YES, the monitor tries to resynchronize the remote disk based on the local disk. Resynchronization occurs only if the disks are in a P-VOL/S-VOL or S-VOL/P-VOL relationship. If one or both disk pairs are in the SMPL state or the device group state is mixed, automatic resynchronization is not attempted.

The ResyncMonitorAutoRecover object set to YES is supported only if the minimum disk array firmware version is 01-11-xx (XP512/XP48) or 21.01.xx (XP128/XP1024), and the minimum XP RAID Manager version is 01.04.00.

The monitor interval is specified with the ResyncMonitorInterval object. Do not set the monitor interval below the XP RAID Manager timeout parameter (HORCM_MON in the horcmX.conf file).

If the link for the device group is broken, the pair/resync monitor notifies you by using the syslog facility (Linux/UNIX) and the Event Log (Windows). The monitor recognizes a broken link only when data is to be written to disk; otherwise, the data is the same on the primary and secondary disk, and the device group state is reported as PAIR.
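The conditions under which automatic resynchronization is attempted can be summarized as a small predicate. This is an illustrative sketch only, not the logic of clxchkd itself. It assumes that the volume role of every disk in the device group is already known for both sites, and it interprets a "mixed" device group state as disks on one site not all reporting the same role. The function name auto_resync_possible is hypothetical.

```python
def auto_resync_possible(local_roles, remote_roles):
    """Return True if the roles allow an automatic resynchronization attempt.

    local_roles / remote_roles: one role string per disk in the device
    group, for example ["P-VOL", "P-VOL"] and ["S-VOL", "S-VOL"].
    """
    # Any disk in SMPL state rules out automatic resynchronization.
    if "SMPL" in local_roles or "SMPL" in remote_roles:
        return False

    # Assumption: "mixed" means the disks on one site disagree on their role.
    if len(set(local_roles)) != 1 or len(set(remote_roles)) != 1:
        return False

    # The two sites must form a P-VOL/S-VOL or S-VOL/P-VOL relationship.
    return {local_roles[0], remote_roles[0]} == {"P-VOL", "S-VOL"}
```

For example, a group whose local disks are all P-VOL and whose remote disks are all S-VOL qualifies, whereas a group with SMPL disks on either site does not.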

Enabling write access regardless of disk pair state


The force flag forces XP Cluster Extension to skip the internal logic and enables write access to the local volume, regardless of the disk pair state. This flag can be set when you are sure that the local volume contains the latest data, even though a previous application service startup process failed because XP Cluster Extension discovered a disk pair status that could not be handled automatically.

To use the force flag:
1. Ensure that the application service is not running.
2. Create a file called application_name.forceflag in the directory specified by the ApplicationDir object.
3. Start the application service.

XP Cluster Extension removes the forceflag file after detecting it.

You cannot use the force flag if the local disk state is S-VOL_COPY, which indicates that a copy operation is in progress. When a copy operation is in progress, a disk cannot be activated, and XP Cluster Extension returns a global error. Using the force flag does not enable the automatic recovery features of XP Cluster Extension. After using the force flag, you must recover the suspended or broken disk pairs using XP RAID Manager commands as described in Recovery sequence on page 120.
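Step 2 of the force flag procedure can be scripted. The sketch below assumes that an empty file is sufficient (the guide only says to create the file) and that the administrator has already confirmed the application service is stopped and has looked up the local disk state. The function name create_force_flag and its parameters are hypothetical.

```python
from pathlib import Path


def create_force_flag(application_dir, application_name, local_disk_state):
    """Create <application_name>.forceflag in the ApplicationDir directory.

    Refuses to create the flag while the local disk is in S-VOL_COPY
    state, because a disk cannot be activated during a copy operation.
    """
    if local_disk_state == "S-VOL_COPY":
        raise ValueError(
            "force flag cannot be used while a copy operation is in progress"
        )
    flag = Path(application_dir) / f"{application_name}.forceflag"
    flag.touch()  # assumption: file contents do not matter, only its presence
    return flag
```

After the flag has been used, the suspended or broken disk pairs still have to be recovered manually with XP RAID Manager commands.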


Executing programs before and after an XP Cluster Extension takeover


XP Cluster Extension can invoke other programs, such as Perl scripts, before or after an XP Cluster Extension takeover. These programs can be any executable, and must be able to provide return codes to XP Cluster Extension. If the programs add significant execution time to the application service startup, the timeout values for the startup must be adjusted in the cluster software.

XP Cluster Extension transfers information as command-line arguments to the pre-execution and post-execution programs. Pre-executables and post-executables must be specified by full path in the PreExecScript and PostExecScript objects. If no executable is specified (empty value for the object), no preprocessing or postprocessing is done. The pre-executable and post-executable path names can include spaces and environment variables. The environment variables will be expanded to form the full path name for the executable.

To use Perl scripting with MSCS, the Perl script must be called from a Windows batch file; therefore, two scripts are needed: the calling batch file and the called Perl script. In the following example, c:\tmp\preExec.bat is the calling batch file, and c:\tmp\preExec.pl is the called Perl script:
Windows batch file: c:\tmp\preExec.bat

@echo off
c:\perl\bin\perl.exe c:\tmp\preExec.pl %3 %4 %5
exit /B %ERRORLEVEL%

Arguments
The following arguments are transferred to the scripts in this order:
1. Name
2. Vgs (HACMP only)
3. RaidManagerInstances
4. DeviceGroup
5. local device group state (check)
   Pre-executable status before failover and post-executable status after failover
6. local device group state (display)
   Pre-executable status before failover and post-executable status after failover
   IMPORTANT: An empty string is returned if parameter #5 is not SSWS, PSUE, or PDUB.
7. remote device group state (check)
   Pre-executable status before failover and post-executable status after failover
8. remote device group state (display)
   Pre-executable status before failover and post-executable status after failover
   IMPORTANT: An empty string is returned if parameter #7 is not SSWS, PSUE, or PDUB.
9. current fence level
10. disk array serial numbers (local)
11. reserved
12. reserved
13. disk array firmware version (local)
14. XP RAID Manager version (local)
15. application directory path (ApplicationDir object)
16. log file location (LogDir object)
17. DC_A_Hosts node names
18. DC_B_Hosts node names
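Because the arguments are positional, a pre-executable or post-executable usually starts by giving them names. The following sketch shows one way to do this in a Python executable. The key names in ARGUMENT_NAMES are chosen here for illustration and are not defined by XP Cluster Extension; only their order follows the list above. Missing trailing arguments are filled with empty strings, which is an assumption about how a script might want to behave.

```python
# Illustrative names for the 18 positional arguments, in the documented order.
ARGUMENT_NAMES = [
    "name", "vgs", "raid_manager_instances", "device_group",
    "local_state_check", "local_state_display",
    "remote_state_check", "remote_state_display",
    "fence_level", "local_serial_numbers", "reserved_11", "reserved_12",
    "local_firmware_version", "local_raid_manager_version",
    "application_dir", "log_dir", "dc_a_hosts", "dc_b_hosts",
]


def parse_arguments(argv):
    """Map the positional arguments to a dictionary keyed by name.

    `argv` is the argument list without the program name, for example
    sys.argv[1:]. Extra arguments are ignored; missing ones become "".
    """
    values = list(argv[:len(ARGUMENT_NAMES)])
    values += [""] * (len(ARGUMENT_NAMES) - len(values))
    return dict(zip(ARGUMENT_NAMES, values))
```

A script would typically call parse_arguments(sys.argv[1:]) and then finish with sys.exit() using one of the return codes described in the following sections.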

Pre-executable return codes


Pre-executables must give a return code. These return codes determine whether a takeover function must be called.

0  PRE_OK_TAKEOVER      Pre-executable OK and takeover action allowed.
1  PRE_ERROR_GLOBAL     Pre-executable failed; no takeover; stop application service cluster-wide.
2  PRE_ERROR_DC         Pre-executable failed; no takeover; stop application service in this data center.
3  PRE_ERROR_LOCAL      Pre-executable failed; no takeover; stop application service on this system.
4  PRE_ERROR_TAKEOVER   Pre-executable failed; takeover action allowed.
5  PRE_OK_NOTKVR_NOPST  Pre-executable OK; no takeover; no post-exec.

CAUTION: If the pre-execution program returns 1, 2, 3, or 5, a post-executable will not be executed. If a takeover function fails, the post-executable will not be executed.
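The return codes and the rule in the CAUTION above can be expressed compactly. This is a sketch for use inside a Python pre-executable or a test harness; the numeric values and names come from the list above, while the helper functions takeover_allowed and post_exec_runs are hypothetical.

```python
from enum import IntEnum


class PreExec(IntEnum):
    PRE_OK_TAKEOVER = 0
    PRE_ERROR_GLOBAL = 1
    PRE_ERROR_DC = 2
    PRE_ERROR_LOCAL = 3
    PRE_ERROR_TAKEOVER = 4
    PRE_OK_NOTKVR_NOPST = 5


def takeover_allowed(return_code):
    """True for the two codes that permit the takeover action (0 and 4)."""
    return PreExec(return_code) in (
        PreExec.PRE_OK_TAKEOVER,
        PreExec.PRE_ERROR_TAKEOVER,
    )


def post_exec_runs(return_code, takeover_succeeded):
    """True if a post-executable would run after this pre-exec result.

    No post-executable runs for codes 1, 2, 3, or 5, or when the
    takeover function itself fails.
    """
    return takeover_allowed(return_code) and takeover_succeeded
```

A pre-executable written in Python could end with sys.exit(int(PreExec.PRE_OK_TAKEOVER)) to report success.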

Post-executable return codes


Post-executables must give a return code. These return codes determine whether the application is stopped.

0  POST_OK              Post-executable OK; continue.
1  POST_ERROR_GLOBAL    Post-executable failed; stop application service cluster-wide.
2  POST_ERROR_DC        Post-executable failed; stop application service in this data center.
3  POST_ERROR_LOCAL     Post-executable failed; stop application service on this system.
4  POST_ERROR_CONTINUE  Post-executable failed; continue without error.
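The post-executable codes map each result to the scope in which the application service is stopped. The sketch below restates that mapping; the scope strings and the helper stop_scope are hypothetical names used for illustration, and None stands for "continue".

```python
from enum import IntEnum


class PostExec(IntEnum):
    POST_OK = 0
    POST_ERROR_GLOBAL = 1
    POST_ERROR_DC = 2
    POST_ERROR_LOCAL = 3
    POST_ERROR_CONTINUE = 4


# Scope in which the application service is stopped; None means continue.
STOP_SCOPE = {
    PostExec.POST_OK: None,
    PostExec.POST_ERROR_GLOBAL: "cluster",
    PostExec.POST_ERROR_DC: "data center",
    PostExec.POST_ERROR_LOCAL: "system",
    PostExec.POST_ERROR_CONTINUE: None,
}


def stop_scope(return_code):
    """Return where the application service is stopped, or None to continue."""
    return STOP_SCOPE[PostExec(return_code)]
```

Note that code 4 reports a failed post-executable but still lets the application service continue.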


10 Troubleshooting
To troubleshoot problems with XP Cluster Extension, you must understand XP Continuous Access Software environments. Many issues can be attributed to incompatible disk pair states. See the XP Continuous Access Software and XP RAID Manager documentation before assuming that a problem has been caused by XP Cluster Extension. For more information on XP Continuous Access Software, see the HP StorageWorks XP Continuous Access Software user guide.

XP Cluster Extension logs messages to the cluster-specific log location. However, it always keeps its own log file in its default log location.

CAUTION: XP Cluster Extension is not able to handle XP device group states automatically and correctly when they result from manual manipulations. XP Cluster Extension will try to automatically recover suspended XP RAID Manager device group states if the AutoRecover object is set to YES. However, if the recovery procedure experiences a problem, XP Cluster Extension will not stop unless fence level DATA is used or the ApplicationStartup object is set to RESYNCWAIT. Therefore, ensure that the device group PAIR state has been recovered before the next failure occurs. Always disable automatic application service failover when resynchronizing disk pairs. A failure of the resynchronization source while resynchronizing can lead to unrecoverable data on the resynchronization target. The resynchronization process does not copy data in transactional order. For more information, see Implementing rolling disaster protection on page 141.

XP Cluster Extension log facility


The logging module of XP Cluster Extension provides log messages to the cluster software as well as to the XP Cluster Extension log file. The XP Cluster Extension log file includes disk status information. The XP Cluster Extension log file is located in the following directories:

Linux/UNIX
/var/opt/hpclx/log

Windows
By default, the location is defined as: %ProgramFiles%\Hewlett-Packard\Cluster Extension XP\log\
For the configuration tool, the clxXPcfg.log file resides in: %ProgramFiles%\Hewlett-Packard\Cluster Extension XP\log\

If the log file needs to be cleared and reset, for example, to reduce disk space usage, archive the log file and then delete it. A new log file is generated automatically. For information about log levels, see LogLevel on page 127.
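The archive-then-delete step can be scripted. The following is a minimal sketch under the assumption that a timestamped copy in a separate directory is an acceptable archive; the function name archive_and_clear_log and the naming scheme of the archived copy are hypothetical. The log file name passed in depends on the cluster integration (for example, clxhacmp.log).

```python
import shutil
import time
from pathlib import Path


def archive_and_clear_log(log_file, archive_dir):
    """Copy the log file to archive_dir with a timestamp suffix, then delete it.

    XP Cluster Extension generates a new log file automatically, so the
    original is simply removed after the copy succeeds.
    """
    log_file = Path(log_file)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = archive_dir / f"{log_file.name}.{stamp}"

    shutil.copy2(log_file, target)  # keeps timestamps of the original file
    log_file.unlink()
    return target
```

If the copy fails, an exception is raised before the original log file is deleted.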


Start errors
Start errors can occur when the path to the XP RAID Manager binaries has not been set in the PATH environment variable. If a user configuration file is not found in the correct directory location, XP Cluster Extension returns a local error. A start error occurs if the APPLICATION name tag value in the XP Cluster Extension resource configuration file does not match the service name (RHCS) or the App value of the XP Cluster Extension resource (SLE HA). XP Cluster Extension returns a local error if it does not find the XP Cluster Extension resource configuration file in the correct directory location (RHCS and SLE HA).

Failover error handling


XP Cluster Extension automatically fails over application services if the system the application service is running on becomes unavailable. This also means that if a problem with the XP disk array state occurs, an application service startup process will be stopped. The behavior of XP Cluster Extension is highly configurable. Depending on the customer setting, XP Cluster Extension is used to prevent application services from starting automatically under the wrong conditions. Therefore, XP Cluster Extension will return local, data center-wide, or even cluster-wide errors to prevent accidental access to the XP disk array disk set. XP Cluster Extension provides the following error return codes for failover operations:
local error
Prohibits an application service startup on the local system. This can be caused by the inability of XP Cluster Extension to enable disk access, or misconfiguration of the disk array environment.

data center error
Prohibits an application service startup on any system in the local data center. This error is returned if the disk state indicates that it makes no sense to allow any other system connected to the same disk array to access the disks.

global error
A global error is returned if the configuration or the disk state does not allow an automatic application service startup process. In such cases, manual intervention is required.

When XP Cluster Extension is integrated, an error message string and integer value are displayed. For the CLI, a return code is displayed. For more information, see CLI commands on page 113.

HACMP-specific error handling


XP Cluster Extension related messages are logged by HACMP to the following locations:

/usr/adm/cluster.log
This is the general HACMP log file, which gives an overview of all events processed and whether they were successful or unsuccessful.

/tmp/hacmp.out
This is a detailed HACMP log file containing process logs of all event scripts. The output of XP Cluster Extension can also be found in this file.


The XP Cluster Extension log file is named clxhacmp.log.

Start errors
HACMP will go into a loop and wait until the problem is solved and until the file /etc/opt/hpclx/application_name.LOCK has been removed. This process has been adopted from HACMP, which will also run in an endless loop if there is a failure and until you recover all errors and start the application manually. After all errors have been recovered, you can invoke the command clruncmd to return control to the cluster software. If the program is in a very early state of processing and experiences a problem before resolution of the application name, it may return an error return code. The /etc/opt/hpclx/UNKNOWN.LOCK file is created and must be removed after the problem has been resolved.
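When HACMP appears to hang, it can help to see which lock files currently exist. The following sketch lists them; it only reads the directory and never removes anything, because a lock file should be removed only after the underlying problem has been resolved. The function name find_lock_files is hypothetical, and the default directory is the one named above.

```python
from pathlib import Path


def find_lock_files(lock_dir="/etc/opt/hpclx"):
    """Return the sorted names of all *.LOCK files in lock_dir.

    The result may include UNKNOWN.LOCK, which is created when a problem
    occurs before the application name has been resolved.
    """
    return sorted(p.name for p in Path(lock_dir).glob("*.LOCK"))
```

An empty list means that no application service is currently waiting on a lock file in that directory.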

Failover errors
As mentioned previously, the HACMP error handling of XP Cluster Extension will create a .LOCK file for the resource group (for example, /etc/opt/hpclx/OracleRG.LOCK). Messages are logged to the log files /var/opt/hpclx/log/clxhacmp.log and /tmp/hacmp.out. The file can be removed after the problem has been solved. HACMP can then continue to start the resource group. This file will be created for any error XP Cluster Extension returns. However, XP Cluster Extension will specify whether the error is a local, data center, or cluster-wide error.

The following example demonstrates the behavior of XP Cluster Extension for HACMP if it discovers a pair state that does not allow an automatic takeover operation. In this case, the pairs have been manually suspended. It is impossible for XP Cluster Extension to determine which copy of the mirrored data is the most current. The output in /tmp/hacmp.out will be similar to the following example:
clxHACMP: > Fri Dec 15 16:35:19 NFT 2000
clxHACMP: > Arguments: oracle ora1vg ora2vg 0 oracle PVOL_PSUS PSUS SVOL_SSUS SSUS DATA 30368 30380 01-11-22/00 01.04.01
clxHACMP: > number of arguments: 14
clxHACMP: > 1: oracle
clxHACMP: > 2: ora1vg ora2vg
clxHACMP: > 3: 0
clxHACMP: > 4: oracle
clxHACMP: > 5: PVOL_PSUS
clxHACMP: > 6: PSUS
clxHACMP: > 7: SVOL_SSUS
clxHACMP: > 8: SSUS
clxHACMP: > 9: DATA
clxHACMP: > 10: 30368
clxHACMP: > 11: 30380
clxHACMP: > 12:
clxHACMP: > 13: 01-11-22/00
clxHACMP: > 14: 01.04.01
clxHACMP > ===PRE===============================================
clxHACMP: pre-exec script successful (rc=0).
clxHACMP: ERROR - no takeover action found.
clxHACMP: ERROR - global cluster failure occurred - waiting!
clxHACMP: ERROR
clxHACMP: ERROR - ================================================================
clxHACMP: ERROR - XP Cluster Extension takeover procedure FAILED.
clxHACMP: ERROR -
clxHACMP: ERROR - Pair state of device group "oracle" might be
clxHACMP: ERROR - incorrect. Manual checking and correction within
clxHACMP: ERROR - Continuous Access XP is required.
clxHACMP: ERROR - Remove file "/etc/opt/hpclx/OracleRG.LOCK" in order
clxHACMP: ERROR - to continue with HACMP specific recovery actions.
=================================================================
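The numbered argument lines in this output are convenient for checking which pair states XP Cluster Extension saw at the time of the failure. The sketch below extracts them from log lines. It assumes that each message occupies its own line in /tmp/hacmp.out and that the prefix is exactly "clxHACMP: > N:" as in the example; the function name parse_argument_dump is hypothetical.

```python
import re

# Matches lines such as "clxHACMP: > 5: PVOL_PSUS" or "clxHACMP: > 12:".
ARG_LINE = re.compile(r"^clxHACMP: > (\d+): ?(.*)$")


def parse_argument_dump(lines):
    """Return {argument number: value} for the numbered argument lines.

    Lines that do not carry a numbered argument (timestamps, the
    "number of arguments" line, error text) are ignored.
    """
    args = {}
    for line in lines:
        match = ARG_LINE.match(line.strip())
        if match:
            args[int(match.group(1))] = match.group(2).strip()
    return args
```

In the example above, entries 5 and 7 of the result would show the local and remote device group states that prevented the automatic takeover.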

The last message is repeated every 5 minutes. XP Cluster Extension will stop any further processing until you remove the application_name.LOCK file to transfer control back to HACMP. This enables you to check the status of the data on each copy and decide whether it is safe to continue or not. Depending on the amount of time needed for checking the configuration and the XP disk pair status, the HACMP timeout could be reached. This will automatically cause the event config_too_long to be called by HACMP. The following message will appear in the log file /tmp/hacmp.out:
WARNING: Cluster MYCLUSTER has been running recovery program '/usr/es/sbin/cluster/events/node_up.rp' for 1110 seconds. Please check cluster status.

If you think the XP Cluster Extension configuration is correct, and the XP disk pair status allows you to manually continue the process for starting the application, remove the application lock file /etc/opt/hpclx/oracle.LOCK mentioned in the previous error message. When this file has been removed, XP Cluster Extension transfers control back to HACMP. The event get_disk_vg_fs and all the subsequent events within the main event node_up_local will be processed. Because XP Cluster Extension, as a pre-event of get_disk_vg_fs, has produced an error, the main event node_up_local will fail as well. The HACMP event event_error will then be called:
node_up_local[30] [ 0 -ne 0 ]
node_up_local[8] exit 1
Dec 15 17:07:17 EVENT FAILED:1: node_up_local
node_up[326] [ 1 -ne 0 ]
node_up[328] cl_log 650 node_up: Failure occurred while processing Resource Group OracleRG. Manual intervention required. node_up OracleRG
***************************
Dec 15 2000 17:07:17 !!!!!!!!!! ERROR !!!!!!!!!!
***************************
Dec 15 2000 17:07:17 node_up: Failure occurred while processing Resource Group OracleRG. Manual intervention required.
node_up[329] STATUS=1
node_up[337] [ AIX1 != AIX1 ]
node_up[356] exit 1
Dec 15 17:07:18 EVENT FAILED:1: node_up AIX1

To continue any further processing of HACMP, you must invoke the HACMP command clruncmd to recover from the status event_error.

Example:
# clruncmd aix1

This will bring the cluster into normal status again. All subsequent events (for example, node_up_complete) will be processed.


MSCS-specific error handling


XP Cluster Extension related messages are logged by MSCS to the following location: %ClusterLog%\cluster.log.
The XP Cluster Extension log file is named clxmscs.log.
The XP Cluster Extension configuration tool log resides in the %ProgramFiles%\Hewlett-Packard\Cluster Extension XP\log\ directory.

Resource start errors


MSCS configurations do not require a UCF.cfg file if the default COMMON objects are used (recommended). MSCS will fail the XP Cluster Extension resource on the local system if the clxpcf file is not present. If the program is in a very early state of processing, the operation might fail and XP Cluster Extension will not show the resource name in the error message.

Failover errors
XP Cluster Extension's integration with MSCS returns a local error and fails the resource if a configuration error occurs. This could be a problem with the XP RAID Manager instance configuration or an error, which will probably require starting the resource group on another system.

XP Cluster Extension resources return a data center error and fail the resource if the XP disk array status indicates that the problem experienced locally would not be solved on another system connected to the same XP disk array. This means all systems specified in the DC_A_Hosts resource property or the DC_B_Hosts resource property would fail to bring the resource group online. Depending on the resource group and resource property values, the resource tries to start on different nodes several times. If the remote data center is down, the resource group appears to alternate between the surviving systems. This happens until the previously mentioned resource and resource group property values are reached or you disable the restarting of the resource. This could also be the case if the ApplicationStartup resource property has been set to FASTFAILBACK.

If an XP disk array state has been discovered that does not allow bringing the resource group online on any system in the cluster, a cluster error would be reported and the resource would fail on all systems. This could lead to the same behavior as described for an XP Cluster Extension data center error. Examples of such a state could be a SMPL state on both primary and secondary disks, a suspended (PSUS/SSUS) state on either site, or a state mismatch in the device group for this resource group. None of the previously mentioned scenarios will allow automatic recovery because the XP Cluster Extension resource cannot decide which copy of the data is the most current copy. In those cases, a storage or cluster administrator must investigate what happened to the environment.

In any case, restarting a failed resource group without investigating the problem is not recommended. A failed XP Cluster Extension resource indicates the need to check the status of the XP disk pair on each copy and decide whether it is safe to continue or not. Figure 11 on page 154 shows examples of an incompatible XP disk pair state shown in the clxmscs.log file. The same messages can be found in the MSCS cluster log file if the XP Cluster Extension LogLevel object is set to INFO; this, however, requires creating a UCF.cfg file.


Figure 11 Incompatible XP disk pair state



Using the Domain user account (Windows Server 2008/2008 R2 only)


When using the Domain user account to manage the cluster, modifying HORCM files might not be possible, and XP Cluster Extension tools might not run as expected. If you experience any of these issues, turn off UAC. To turn off UAC, select Control panel > User Accounts, and click Turn User Account Control on or off. Clear the User Account Control (UAC) to help protect your computer check box. This might resolve the issue and allow you to use the XP Cluster Extension tools with the Domain user account.

VCS-specific error handling


XP Cluster Extension related messages are logged by VCS to the following locations:

VCS 1.3.0 or later: /var/VRTSvcs/log/engine_A.log
VCS 1.1.2: /var/VRTSvcs/log/engine.A_log
This is a general VCS engine log file, which gives an overview of all cluster-related activities and whether they were successful or unsuccessful.

VCS 1.3.0 or later: /var/VRTSvcs/log/ClusterExtensionXP_A.log
VCS 1.1.2: /var/VRTSvcs/log/ClusterExtensionXP.log_A
This XP Cluster Extension agent log file of VCS shows agent-related error information.

The XP Cluster Extension log file is named clxvcs.log.

Start errors
VCS will fail the resource and disable the service group on the local system if the clxpcf file is not present. If the program is in a very early state of processing, this operation might fail, and XP Cluster Extension will not show the service group in the error message. However, VCS will fail the resource.

Failover errors
XP Cluster Extension's integration with VCS disables service groups on the local system if a configuration error occurs. In this case, XP Cluster Extension will return a local error.

The service group is disabled in the data center if the XP disk array status indicates the problem experienced locally cannot be solved on another system connected to the same XP disk array. All systems specified in the DC_A_Hosts object or DC_B_Hosts object are prevented from bringing the service group online. This could also be the case if the ApplicationStartup object has been set to FASTFAILBACK.

If an XP disk array state has been discovered that does not allow bringing the service group online on any system in the cluster, a cluster error is reported and all systems are prevented from bringing the service group online. Such a state could be a SMPL state on both primary and secondary disks, a suspended (PSUS/SSUS) state on either site, or a state mismatch in the device group for this service group. None of the scenarios allows automatic recovery because XP Cluster Extension cannot determine which copy of the data is the most current. In these cases, a storage or cluster administrator must investigate what happened to the environment.

CAUTION: HP does not recommend that you enable the service group again and try to bring the previously failed service group online without investigating the problem. When an XP Cluster Extension resource fails, check the status of the XP disk pair on each copy, and decide whether it is safe to continue.

Examples
Figure 12 on page 156 and Figure 13 on page 156 show examples of an incompatible XP disk pair state shown in the VCS Cluster Manager Log Desk window.


Figure 12 Incompatible XP disk pair state (VCS Cluster Manager Log Desk window)

Figure 13 on page 156 shows detailed information for the current XP disk pair state, which will be displayed in the VCS Log Desk only if the XP Cluster Extension LogLevel object is set to INFO.

Figure 13 Detailed information of the XP disk pair state (VCS Log Desk)

Linux-specific error handling


XP Cluster Extension messages are logged by RHCS and SLE HA to the following location: /var/log/messages. The XP Cluster Extension log file is called clxxplxcs.log.


Failover errors
XP Cluster Extension will fail to bring an RHCS service or SLE HA resource group online on the local system if a configuration error occurs. In this case, XP Cluster Extension returns a local error.

The RHCS service or SLE HA resource group will go into a failed state after a startup attempt on any system in the same data center if the disk array status indicates that a problem experienced locally would not be solved on another system connected to the same disk array. In this case, XP Cluster Extension returns a data center error. This error could also occur if the ApplicationStartup object is set to FASTFAILBACK.

If a disk array state that does not allow starting the RHCS service or SLE HA resource group on any system in the cluster is discovered, a cluster error is reported and none of the systems will be allowed to run the service or resource group. Such a state could be an SMPL state on both primary and secondary disks, a suspended (PSUS/SSUS) state on either site, or a state mismatch in the device group for this RHCS service or SLE HA resource group. None of these scenarios allows automatic recovery because XP Cluster Extension cannot determine which copy of the data is the most current. In these cases, a storage or cluster administrator must investigate the problem.

CAUTION: Do not start the RHCS service or SLE HA resource group again or try to start the failed RHCS service or SLE HA resource group without investigating the problem. When an RHCS service or SLE HA resource group using XP Cluster Extension fails, check the status of the XP disk pair on each copy and decide whether it is safe to continue.
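The three example states that lead to a cluster error can be written as a simple check for use in an administrator's own diagnostic script. This sketch is illustrative and not exhaustive: the guide lists these states only as examples, and the real decision is made by XP Cluster Extension. It assumes the per-disk pair states of the device group are already known for both sites, and it treats a "state mismatch" as disks on one site reporting different states. The function name needs_manual_check is hypothetical.

```python
SUSPENDED_STATES = {"PSUS", "SSUS"}


def needs_manual_check(local_states, remote_states):
    """Return True if the states match one of the documented examples.

    local_states / remote_states: one pair state string per disk in the
    device group, for example ["PAIR", "PAIR"].
    """
    all_states = list(local_states) + list(remote_states)

    # SMPL on both primary and secondary disks.
    both_smpl = (
        bool(local_states) and bool(remote_states)
        and set(all_states) == {"SMPL"}
    )
    # A suspended (PSUS/SSUS) state on either site.
    suspended = any(state in SUSPENDED_STATES for state in all_states)
    # Assumption: a mismatch means disks on one site disagree with each other.
    mismatch = len(set(local_states)) > 1 or len(set(remote_states)) > 1

    return both_smpl or suspended or mismatch
```

A True result indicates only that the pair states should be reviewed by a storage or cluster administrator before the service is started again.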

The FC link is down (RHCS)


In RHCS, the detection of a storage outage due to failure of all paths to the storage depends on the monitoring capability of resources configured in the RHCS service. For example, the LVM and filesystem resource agents distributed with RHCS can detect the loss of storage and take appropriate actions.

The stop operation on a service might fail due to the inability to stop individual resources cleanly. This may be caused by the loss of paths to the storage. When the stop operation on a service fails, RHCS marks the service as failed and the service does not automatically fail over to another node.

To recover from this situation, use the following procedure:
1. Remove the node that lost access to the storage by shutting down the node.
2. Follow the steps required to bring up a service in a failed state, as documented in the RHCS administration guide. This process involves disabling the service, and then enabling it on the node where the service is allowed to come online.
3. Restart the node that was shut down.

NOTE: The time to detect a storage outage due to failure of all paths to storage depends on the setting for no_path_retry in the multipath software configuration. A value of fail does not queue I/O in the event of a failure in all paths and returns an immediate failure. For information about the recommended value for your environment, see the DM-Multipath documentation. Some resource agents, such as LVM, offer a mechanism called self_fence to take themselves out of a cluster through node reboot when an underlying logical volume can no longer be accessed. For supported options, see the RHCS documentation.
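
Step 2 of the procedure (disable the failed service, then enable it on a permitted node) can be sketched as follows. This guide does not name the commands and defers to the RHCS administration guide, so the sketch assumes the RHCS clusvcadm utility with its -d (disable), -e (enable), and -m (target member) options. The service and node names are hypothetical. The function only builds the command lines and does not run them.

```python
import shlex


def rhcs_recovery_commands(service, target_node):
    """Build the commands that recover a service RHCS has marked as failed.

    Assumes the clusvcadm utility; verify the options against the RHCS
    administration guide for your release before running anything.
    """
    return [
        ["clusvcadm", "-d", service],                     # disable the failed service
        ["clusvcadm", "-e", service, "-m", target_node],  # enable it on the chosen node
    ]


if __name__ == "__main__":
    # "app_service" and "node2" are placeholder names
    for command in rhcs_recovery_commands("app_service", "node2"):
        print(shlex.join(command))
```

Printing the commands first, rather than executing them, leaves room to confirm that the node that lost storage access has been shut down, as step 1 requires.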

A storage replication link is down (RHCS)


If an XP Cluster Extension configuration uses DR groups with failsafemode enabled, the array disables access to the disk when it cannot replicate the I/O to the remote array. In this situation, if a replication link is broken, the resource agents of configured resources, such as lvm or fs, may be able to detect the failure and take appropriate actions.

The stop operation on a service might fail due to the inability to stop individual resources cleanly because the disk is no longer accessible for read/write operations. When the stop operation on a service fails, RHCS marks the service as failed and the service does not automatically fail over to another node.

To recover from this situation, use the following procedure:
1. Remove the node that lost access to the storage by shutting down the node.
2. Follow the steps required to bring up a service in a failed state, as documented in the RHCS administration guide. This process involves disabling the service, and then enabling it on the node where the service is allowed to come online.
3. Restart the node that was shut down.

NOTE: The time to detect a storage outage due to failure of all paths to storage depends on the setting for no_path_retry in the multipath software configuration. A value of fail does not queue I/O in the event of a failure in all paths and returns an immediate failure. For information about the recommended value for your environment, see the DM-Multipath documentation. Some resource agents, such as LVM, offer a mechanism called self_fence to take themselves out of a cluster through node reboot when an underlying logical volume can no longer be accessed. For supported options, see the RHCS documentation.

A data center is down (SLE HA and RHCS)


RHCS and SLE HA expect an acknowledgement from the fencing device before services are failed over to another node. In the event of complete site failure, including fencing devices, clusters do not automatically fail over services to surviving cluster nodes at the remote site. Manual intervention is required in this situation. For instructions on bringing a service online, see the cluster software documentation.

Pair/resync monitor messages in syslog/errorlog/messages/event log


If you use the pair/resync monitor, it writes a message to the system log file of your operating system for any non-PAIR state of the device group being monitored. Those messages might indicate the following:
• The XP RAID Manager instance is not running or cannot be used to gather device group state information.
• The device group is not in the PAIR state. This could be caused by XP Continuous Access Software link failures or manual manipulation of the disk pair state.

TIP: Recover the PAIR state immediately, because your data cannot be replicated until the pair is recovered.

Check monitored XP disk pairs by invoking the following command from the command line:
clxchkmon -n application_name -g device_group -show

TIP: Disable application service failover for the time of the XP disk pair recovery (resynchronization). XP Cluster Extension's logic is based on the assumption that if the monitor is enabled, immediate action will be taken to recover a suspended XP disk pair.
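
When reviewing the system log for these messages, a small filter can list the affected device groups. The sketch below assumes the message wording quoted in the troubleshooting entry that follows ("XP Cluster Extension: device group XYZ is not in PAIR state"). The actual log line layout on your platform may differ, and the function name and sample lines are hypothetical.

```python
import re

# Assumed message wording, taken from the troubleshooting entry in this chapter
NOT_IN_PAIR = re.compile(
    r"XP Cluster Extension: device group (\S+) is not in PAIR state"
)


def device_groups_not_in_pair(lines):
    """Return the device groups reported as not in PAIR state, in first-seen order."""
    groups = []
    for line in lines:
        match = NOT_IN_PAIR.search(line)
        if match and match.group(1) not in groups:
            groups.append(match.group(1))
    return groups


if __name__ == "__main__":
    # Hypothetical log excerpt for illustration
    sample = [
        "Resource res1: XP Cluster Extension: device group dg_app1 is not in PAIR state.",
        "unrelated message",
        "Resource res1: XP Cluster Extension: device group dg_app1 is not in PAIR state.",
    ]
    print(device_groups_not_in_pair(sample))
```

Each device group the filter reports is a candidate for a follow-up check with clxchkmon.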

Problem
Resource XYZ: XP Cluster Extension: device group XYZ is not in PAIR state.
This message appears even though the device group is in PAIR state.

Solution
If you are using the pair/resync monitor, the ResyncMonitorInterval must be less than or equal to the resource monitor interval for the XP Cluster Extension resource to prevent erroneous logging.

The ResyncMonitorInterval in XP Cluster Extension defines when the pair/resync monitor checks the actual device group state. This state will be valid and shown until the next update (ResyncMonitorInterval) occurs. If the actual XP disk pair state changes between two ResyncMonitorInterval(s), the PAIR state shown by the pair/resync monitor will not be correct.

The resource monitor checks the status of the XP Cluster Extension resource at the resource monitor interval of the cluster software. The XP Cluster Extension resource reports the status of the device group at that interval based on the current state in the pair/resync monitor.

If the ResyncMonitorInterval is set to a higher value than the resource monitor interval for the XP Cluster Extension resource, the pair/resync monitor will update the device group state less often. However, the XP Cluster Extension resource logs messages only if the device group is not in PAIR state or if an XP RAID Manager error occurred (for example, if XP RAID Manager is not running).

Example
Set the XP Cluster Extension agent's MonitorInterval attribute to 60 seconds (the default value); then set the XP Cluster Extension resource ResyncMonitorInterval attribute to less than 60 seconds.
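
The interval rule from the solution above can be expressed as a simple pre-check. This is a minimal sketch: the function name is hypothetical, and both values are assumed to be expressed in seconds.

```python
def resync_interval_is_valid(resync_monitor_interval, resource_monitor_interval):
    """Check that ResyncMonitorInterval <= the resource monitor interval.

    Both arguments are assumed to be in seconds. A False result means the
    pair/resync monitor would refresh the device group state less often than
    the cluster software checks the resource, which can cause stale messages.
    """
    if resync_monitor_interval <= 0 or resource_monitor_interval <= 0:
        raise ValueError("intervals must be positive")
    return resync_monitor_interval <= resource_monitor_interval


if __name__ == "__main__":
    # MonitorInterval of 60 seconds with a ResyncMonitorInterval of 30 seconds
    print(resync_interval_is_valid(30, 60))
```

Running such a check whenever either attribute changes helps catch the configuration that leads to the erroneous log message.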

11 Support and other resources


Contacting HP
For worldwide technical support information, see the HP support website:
http://www.hp.com/support

Before contacting HP, collect the following information:
• Product model names and numbers
• Technical support registration number (if applicable)
• Product serial numbers
• Error messages
• Operating system type and revision level
• Detailed questions

Subscription service
HP recommends that you register your product at the Subscriber's Choice for Business website: http://www.hp.com/go/e-updates After registering, you will receive e-mail notification of product enhancements, new driver versions, firmware updates, and other product resources.

New and changed information in this edition


The following additions and changes have been made for this edition:
• The following information has been updated:
  • Configuring XP Cluster Extension for Linux

Related information
The following documents and websites provide related information:
• HP StorageWorks XP Cluster Extension Software Installation Guide
• HP StorageWorks XP RAID Manager User's Guide
• HP StorageWorks XP Continuous Access Software User Guide
• HP StorageWorks XP Continuous Access Software Journal User Guide
• HP StorageWorks XP Business Copy Software User Guide
• HP StorageWorks SAN Design Reference Guide

You can find these documents on the Manuals page of the HP Business Support Center website:
http://www.hp.com/support/manuals
In the Storage section, click Storage Software, and then select your product.

White papers
The following white papers are available at www.hp.com/storage/whitepapers:
• Live Migration across data centers and disaster tolerant virtualization architecture with HP StorageWorks Cluster Extension and Microsoft Hyper-V™
• Considerations in HP StorageWorks XP Cluster Extension configurations to stop automatic XP CA disk pair resynchronization when CA link is suspended
• Implementing HP StorageWorks Cluster Extension for Windows in a VMware Virtual Machine
• Migrating HP StorageWorks XP Cluster Extension Quorum Filter Service Implementations to Microsoft Majority Node Set Quorum Configurations
• Migrating an HP Serviceguard for Linux Cluster to a Novell SUSE Linux Enterprise High Availability Extension Cluster
• Migrating an HP Serviceguard for Linux Cluster to Red Hat Cluster Suite in Red Hat Enterprise Linux 5 Advanced Platform

HP websites
For additional information, see the following HP websites:
• http://www.hp.com
• http://www.hp.com/go/storage
• http://www.hp.com/service_locator
• http://www.hp.com/support/manuals
• http://www.hp.com/storage/spock
• www.hp.com/storage/whitepapers
• http://docs.hp.com/en/ha.html

Typographic conventions
Table 5 Document conventions

Convention                                  Element
Blue text: Table 5                          Cross-reference links and e-mail addresses
Blue, underlined text: http://www.hp.com    Website addresses
Bold text                                   • Keys that are pressed
                                            • Text typed into a GUI element, such as a box
                                            • GUI elements that are clicked or selected, such as menu and list items, buttons, tabs, and check boxes
Italic text                                 Text emphasis
Monospace text                              • File and directory names
                                            • System output
                                            • Code
                                            • Commands, their arguments, and argument values
Monospace, italic text                      • Code variables
                                            • Command variables
Monospace, bold text                        Emphasized monospace text

CAUTION: Indicates that failure to follow directions could result in damage to equipment or data.

IMPORTANT: Provides clarifying information or specific instructions.

NOTE: Provides additional information.

TIP: Provides helpful hints and shortcuts.

HP product documentation survey


Are you the person who installs, maintains, or uses this HP storage product? If so, we would like to know more about your experience using the product documentation. If not, please pass this notice to the person who is responsible for these activities. Our goal is to provide you with documentation that makes our storage hardware and software products easy to install, operate, and maintain. Your feedback is invaluable in letting us know how we can improve your experience with HP documentation. Please take 10 minutes to visit the following web site and complete our online survey. This will provide us with valuable information that we will use to improve your experience in the future. http://www.hp.com/support/storagedocsurvey Thank you for your time and your investment in HP storage products.

Glossary
CHA
Channel adapter. A device that provides the interface between the array and the external host system. Occasionally, this term is used synonymously with the term channel host interface processor (CHIP).

CLI
Command-line interface. An interface comprised of various commands which are used to control operating system responses.

command device
A volume on the disk array that accepts HP StorageWorks Continuous Access or HP StorageWorks Business Copy control operations which are then executed by the array.

CU
Control unit.

CVS
Custom volume size. CVS devices (OPEN-x CVS) are custom volumes configured using array management software to be smaller than normal fixed-size OPEN system volumes. Synonymous with volume size customization (VSC).

DLL
Dynamic-link library.

DSM
Device Specific Module.

failover
Disconnecting a failed unit or path and replacing it with an alternative unit or path to continue functioning.

FC
Fibre Channel. A network technology primarily used for storage networks.

fence level
A method of setting rejection of XP Continuous Access Software write I/O requests from the host according to the condition of mirroring consistency.

GUI
Graphical User Interface.

GUID
Globally unique identifier.

HACMP
High Availability Cluster Multi-Processing. An IBM application for AIX software.

HBA
Host bus adapter.

heartbeat
A periodic synchronization signal issued by cluster software or hardware to indicate that a node is an active member of the cluster.

LD, LDEV
Logical device. An LDEV is created when a RAID group is carved into pieces according to the selected host emulation mode (that is, OPEN-3, OPEN-8, OPEN-9). The number of resulting LDEVs depends on the selected emulation mode. An LDEV is also known as a volume.

LU
Logical unit.

LUN
Logical unit number.

LUSE
The LUSE feature is available when the HP StorageWorks LUN Manager product is installed, and allows a LUN, normally associated with only a single LDEV, to be associated with 1 to 36 LDEVs. Essentially, LUSE makes it possible for applications to access a single large pool of storage. See also LD, LDEV.

LVM
Logical Volume Manager.

MINAP
Minimum active paths.

MMC
Microsoft Management Console.

MNS
Majority node set.

MSCS
Microsoft Cluster Service.

MU
Mirror unit.

NIC
Network interface card. A device that handles communication between a device and other devices on a network.

path
A path is created by associating a port, a target, and a LUN ID with one or more LDEVs. Also known as a LUN.

PCF
Product Configuration File.

port
A physical connection that allows data to pass between a host and the disk array. The number of ports on an XP disk array depends on the number of supported I/O slots and the number of ports available per I/O adapter. The XP family of disk arrays supports FC ports as well as other port types. Ports are named by port group and port letter, such as CL1-A, in which CL1 is the group, and A is the port letter.

primary site
The data center location that owns the cluster group (quorum resource).

PSUS
Pair suspended-split.

P-VOL
Primary volume.

quorum
In MSCS, a cluster resource that has been configured to control the cluster, maintaining essential cluster data and recovery information. In the event of a node failure, the quorum acts as a tie-breaker and is transferred to a surviving node to ensure that data remains consistent within the cluster.

RAID
Redundant array of independent disks.

SCSI
Small Computer Systems Interface. A standard, intelligent parallel interface for attaching peripheral devices to computers, based on a device-independent protocol.

secondary site
The data center location with the mirror copy of the quorum disk pair.

SMIT
System Manager Information Tool.

SMPL
Simplex.

split-brain syndrome
A state of data corruption that can occur if a cluster is re-formed as subclusters of nodes at each site, and each subcluster assumes authority, starting the same set of applications and modifying the same data.

SPOCK
Single Point of Connectivity Knowledge website. SPOCK is the primary portal used to obtain detailed information about supported HP StorageWorks product configurations.

SPOF
Single point of failure.

S-VOL
Secondary or remote volume. The copy volume that receives the data from the primary volume.

SVP
Service processor. A notebook computer built into the disk array. The SVP provides a direct interface to the disk array and is used only by the HP service representative.

TMSCSI
Target mode SCSI.

UAC
User account control.

UCF
User configuration file.

VCS
VERITAS Cluster Server.

volume
On the XP array, a volume is a uniquely identified virtual storage device composed of a control unit (CU) component and a logical device (LDEV) component separated by a colon. For example, 00:00 and 01:00 are two uniquely identified volumes; one is identified as CU = 00 and LDEV = 00, and the other as CU = 01 and LDEV = 00; they are two unique separate virtual storage devices within the XP array.

VSC
Volume size customization. Also known as CVS.

VM
Virtual Machine.

XP Business Copy Software
HP StorageWorks XP Business Copy Software. XP Business Copy Software lets you maintain up to nine local copies of logical volumes on the disk array.

XP Continuous Access Software
HP StorageWorks XP Continuous Access Software. XP Continuous Access Software lets you create and maintain duplicate copies of local logical volumes on a remote disk array.
