Oracle9i Administrator's Reference
Release 2 (9.2.0.1.0) for UNIX Systems: AIX-Based Systems, Compaq Tru64 UNIX, HP 9000 Series HP-UX, Linux Intel, and Sun Solaris
Part No. A97297-01
Oracle Cluster Management Software (OCMS) is available with Oracle9i on Linux systems. This appendix describes its components and how to configure and start them.
OCMS is included with the Oracle9i Enterprise Edition for Linux. It provides cluster membership services, a global view of clusters, node monitoring, and cluster reconfiguration. It is a component of Oracle9i Real Application Clusters on Linux and is installed automatically when you choose Oracle9i Real Application Clusters. OCMS consists of the following components:
Watchdog Daemon
Cluster Manager
Figure F-1 shows how the Watchdog daemon provides services to the Cluster Manager.
Figure F-1 Oracle Instance and Components of OCMS
The Watchdog daemon (watchdogd) uses a software-implemented Watchdog timer to monitor selected system resources and prevent database corruption. The Watchdog timer is a feature of the Linux kernel. The Watchdog daemon is part of Oracle9i Real Application Clusters.
The Watchdog daemon monitors the Cluster Manager and passes notifications to the Watchdog timer at defined intervals. The behavior of the Watchdog timer is partially controlled by the CONFIG_WATCHDOG_NOWAYOUT configuration parameter of the Linux kernel.
If you use Oracle9i Real Application Clusters, you must set the value of the CONFIG_WATCHDOG_NOWAYOUT configuration parameter to Y. If the Watchdog timer detects an Oracle instance or Cluster Manager failure, it resets the node to avoid possible database corruption.
For information on how to set the CONFIG_WATCHDOG_NOWAYOUT parameter, see the /usr/src/linux/Documentation/configure.help file in the Linux kernel source code. For more information on Watchdog devices, see the /usr/src/linux/Documentation/watchdog.txt file in the Linux kernel source code.
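The kernel setting can be checked from a shell before you install. The following is a minimal sketch; it generates a sample config file so it is self-contained, and on a real node you would point config_file at the kernel's .config (for example, /usr/src/linux/.config — the exact path varies by distribution).

```shell
# Sketch: check a kernel configuration for CONFIG_WATCHDOG_NOWAYOUT.
# A sample config is generated here so the example is self-contained;
# on a real system, set config_file to the kernel's .config instead.
config_file=$(mktemp)
printf 'CONFIG_WATCHDOG=y\nCONFIG_WATCHDOG_NOWAYOUT=y\n' > "$config_file"

if grep -q '^CONFIG_WATCHDOG_NOWAYOUT=y' "$config_file"; then
    echo "CONFIG_WATCHDOG_NOWAYOUT is enabled"
else
    echo "CONFIG_WATCHDOG_NOWAYOUT is NOT enabled; rebuild the kernel with it set to Y"
fi
```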
The Cluster Manager maintains the status of the nodes and the Oracle instances across the cluster. The Cluster Manager process runs on each node of the Oracle9i Real Application Clusters cluster; each node has exactly one Cluster Manager, and Oracle9i Real Application Clusters does not limit the number of Oracle instances on each node. The Cluster Manager uses the following communication channels between nodes:
Private network
Quorum partition on the shared disk
During normal cluster operations, the Cluster Managers on each node of the cluster communicate with each other through heartbeat messages sent over the private network. The quorum partition is used as an emergency communication channel if a heartbeat message fails. A heartbeat message can fail for the following reasons:
The Cluster Manager terminates on a node
The private network fails
There is an abnormally heavy load on the node
The Cluster Manager uses the quorum partition to determine the reason for the failure. The Cluster Manager on each node periodically updates the designated block for that node on the quorum partition, and the other nodes check the timestamp of each block. If the heartbeat message from one of the nodes does not arrive, but the corresponding block on the quorum partition has a current timestamp, the network path between that node and the other nodes has failed.
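The decision logic can be sketched as follows. This is an illustration only: the real quorum block format is internal to the Cluster Manager, and the tick values and threshold below are hypothetical stand-ins for heartbeat intervals and the MissCount parameter.

```shell
# Sketch of the staleness decision (illustrative only; the on-disk quorum
# block format is internal to the Cluster Manager).
now=100                 # current time, in heartbeat ticks (hypothetical)
last_heartbeat=90       # tick of the last heartbeat received over the private network
last_quorum_update=99   # tick read from the node's block on the quorum partition
miss_threshold=5        # analogous to the MissCount parameter

heartbeat_age=$((now - last_heartbeat))
quorum_age=$((now - last_quorum_update))

if [ "$heartbeat_age" -ge "$miss_threshold" ] && [ "$quorum_age" -lt "$miss_threshold" ]; then
    echo "private network failure: node is still updating the quorum partition"
elif [ "$heartbeat_age" -ge "$miss_threshold" ]; then
    echo "node failure: no heartbeat and a stale quorum block"
else
    echo "node healthy"
fi
```

With these sample values the heartbeat is stale but the quorum block is current, so the first branch (a private network failure) is reported.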
Each Oracle instance registers with the local Cluster Manager. The Cluster Manager monitors the status of local Oracle instances and propagates this information to Cluster Managers on other nodes. If the Oracle instance fails on one of the nodes, the following events occur:
The Cluster Manager on the node with the failed Oracle instance informs the Watchdog daemon about the failure.
The Watchdog daemon requests the Watchdog timer to reset the failed node.
The Watchdog timer resets the node.
The Cluster Managers on the surviving nodes inform their local Oracle instances that the failed node is removed from the cluster.
Oracle instances in the surviving nodes start the Oracle9i Real Application Clusters reconfiguration procedure.
The node must be reset if its Oracle instance fails. This ensures that:
No physical I/O requests to the shared disks from the failed node occur after the Oracle instance fails.
Surviving nodes can start the cluster reconfiguration procedure without corrupting the data on the shared disk.
See Also: "Configuring Timing for Cluster Reconfiguration" and "Watchdog Daemon and Cluster Manager Starting Options" for more information on the Cluster Manager.
The following sections describe how to start OCMS:
Configuring Timing for Cluster Reconfiguration
Note: Oracle Corporation supplies the $ORACLE_HOME/oracm/bin/ocmstart.sh sample startup script. Run this script as the root user. Make sure that the ORACLE_HOME and PATH environment variables are set as described in the Oracle9i Installation Guide Release 2 (9.2.0.1.0) for UNIX Systems. After you are familiar with starting the Watchdog daemon and the Cluster Manager, you can use the script to automate the start-up process.
To start the Watchdog daemon, enter the following:
$ su root
# cd $ORACLE_HOME/oracm/bin
# watchdogd
Note: Always start the Watchdog daemon as the root user.
The default Watchdog daemon log file is $ORACLE_HOME/oracm/log/wdd.log.
The Watchdog daemon does not have configuration files. Table F-1 describes the arguments that you can use when starting the Watchdog daemon.
Table F-1 Watchdog Daemon Arguments

Argument | Valid Values | Default Value | Description
---|---|---|---
-l number | 0 or 1 | 1 | If the value is 0, no resources are registered for monitoring. This setting is used for debugging system configuration problems. If the value is 1, the Cluster Manager is registered for monitoring. Oracle Corporation recommends this setting for normal operations.
-m number | 5000 to 180000 ms | 5000 | The Watchdog daemon expects to receive heartbeat messages from all clients (oracm threads) within the time specified by this value. If a client fails to send a heartbeat message within this time, the Watchdog daemon stops sending heartbeat messages to the kernel Watchdog timer, causing the system to reset.
-d string | | /dev/watchdog | Path of the device file for the Watchdog timer.
-e string | | $ORACLE_HOME/oracm/log/wdd.log | Filename of the Watchdog daemon log file.
You must create the $ORACLE_HOME/oracm/admin/cmcfg.ora Cluster Manager configuration file on each node of the cluster before starting OCMS. Include the following parameters in this file:
PublicNodeNames
PrivateNodeNames
CmDiskFile
WatchdogTimerMargin
HostName
Before creating the cmcfg.ora file, verify that the /etc/hosts file on each node of the cluster has an entry for the public network (the public name of each node) and an entry for the private network (the private name of each node). The private network is used for Oracle9i Real Application Clusters internode communication. The CmDiskFile parameter defines the location of the Cluster Manager quorum partition. The CmDiskFile parameter on each node in a cluster must specify the same quorum partition.
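This verification can be scripted. The following is a minimal sketch; it generates a sample hosts file with hypothetical addresses and node names so it is self-contained, whereas on a real node you would set hosts_file to /etc/hosts and list your actual public and private names.

```shell
# Sketch: check that every public and private node name has a hosts entry.
# The sample file and all names/addresses below are hypothetical; on a real
# node, set hosts_file=/etc/hosts and list your cluster's node names.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
192.168.1.1   pubnode1
192.168.1.2   pubnode2
10.0.0.1      prinode1
10.0.0.2      prinode2
EOF

missing=0
for name in pubnode1 pubnode2 prinode1 prinode2; do
    if ! grep -qw "$name" "$hosts_file"; then
        echo "missing hosts entry: $name"
        missing=1
    fi
done
[ "$missing" -eq 0 ] && echo "all node names resolve via $hosts_file"
```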
The following example shows a cmcfg.ora file on the first node of a four-node cluster:

PublicNodeNames=pubnode1 pubnode2 pubnode3 pubnode4
PrivateNodeNames=prinode1 prinode2 prinode3 prinode4
CmDiskFile=/dev/raw1
WatchdogTimerMargin=1000
HostName=prinode1
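Because a missing required parameter prevents the Cluster Manager from starting correctly, it can be worth checking the file mechanically. A minimal sketch follows; it writes a sample cmcfg.ora to a temporary file so it is self-contained, and on a real node you would set cmcfg to $ORACLE_HOME/oracm/admin/cmcfg.ora.

```shell
# Sketch: confirm that cmcfg.ora defines every required Cluster Manager
# parameter. A sample file is generated so the example is self-contained;
# on a real node, set cmcfg=$ORACLE_HOME/oracm/admin/cmcfg.ora instead.
cmcfg=$(mktemp)
cat > "$cmcfg" <<'EOF'
PublicNodeNames=pubnode1 pubnode2 pubnode3 pubnode4
PrivateNodeNames=prinode1 prinode2 prinode3 prinode4
CmDiskFile=/dev/raw1
WatchdogTimerMargin=1000
HostName=prinode1
EOF

for param in PublicNodeNames PrivateNodeNames CmDiskFile WatchdogTimerMargin HostName; do
    if grep -q "^${param}=" "$cmcfg"; then
        echo "ok: $param"
    else
        echo "missing required parameter: $param"
    fi
done
```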
Table F-2 lists all of the configurable Cluster Manager parameters in the cmcfg.ora file.

Table F-2 Cluster Manager Parameters of the cmcfg.ora File
Parameter | Valid Values | Default Value | Description
---|---|---|---
CmDiskFile | Directory path, up to 256 characters in length | No default value. You must set the value explicitly. | Specifies the pathname of the quorum partition.
MissCount | 2 to 1000 | 5 | Specifies the time that the Cluster Manager waits for a heartbeat from a remote node before declaring that node inactive. The time in seconds is determined by multiplying the value of the MissCount parameter by 3.
PublicNodeNames | List of host names, up to 4096 characters in length | No default value. | Specifies the list of all host names for the public network, separated by spaces. List the host names in the same order on each node.
PrivateNodeNames | List of host names, up to 4096 characters in length | No default value. | Specifies the list of all host names for the private network, separated by spaces. List the host names in the same order on each node.
HostName | A host name, up to 256 characters in length | No default value. | Specifies the local host name for the private network. Define this name in the /etc/hosts file.
ServiceName | A service name, up to 256 characters in length | CMSrvr | Specifies the service name used for communication between Cluster Managers. If a Cluster Manager cannot find the service name in the /etc/services file, it uses the port specified by the ServicePort parameter. ServiceName is a fixed-value parameter in this release; use the ServicePort parameter if you need to choose an alternative port for the Cluster Manager to use.
ServicePort | Any valid port number | 9998 | Specifies the number of the port used for communication between Cluster Managers when the ServiceName parameter does not specify a service.
WatchdogTimerMargin | 1000 to 180000 ms | No default value | Must be the same as the value of the soft_margin parameter specified at Linux softdog startup. Note that the value of the soft_margin parameter is specified in seconds, while the value of the WatchdogTimerMargin parameter is specified in milliseconds. This parameter is part of the formula that determines the time between when the Cluster Manager on the local node detects an Oracle instance failure or join on any node and when it reports the cluster reconfiguration to the Oracle instance on the local node. See "Configuring Timing for Cluster Reconfiguration" for information on this formula.
WatchdogSafetyMargin | 1000 to 180000 ms | 5000 ms | Specifies the time between when the Cluster Manager detects a remote node failure and when the cluster reconfiguration is started. This parameter is part of the formula that determines the time between when the Cluster Manager on the local node detects an Oracle instance failure or join on any node and when it reports the cluster reconfiguration to the Oracle instance on the local node. See "Configuring Timing for Cluster Reconfiguration" for information on this formula.
To start the Cluster Manager:

1. Confirm that the Watchdog daemon is running.
2. Confirm that the host names specified by the PublicNodeNames and PrivateNodeNames parameters in the cmcfg.ora file are listed in the /etc/hosts file.
3. As the root user, start the oracm process as a background process, redirecting any output to a log file. For example, enter the following:

   $ su root
   # cd $ORACLE_HOME/oracm/bin
   # oracm </dev/null >$ORACLE_HOME/oracm/log/cm.out 2>&1 &

In the preceding example, all of the output messages and error messages are written to the $ORACLE_HOME/oracm/log/cm.out file.
The oracm process spawns multiple threads. To list all of the threads, enter the ps -elf command.

Table F-3 describes the arguments of the oracm executable.
Table F-3 Arguments for the oracm Executable

Argument | Valid Values | Default Value | Description
---|---|---|---
/a:action | 0 or 1 | 0 | Specifies the action taken when the LMON process or another Oracle process that can write to the shared disk terminates abnormally. If the value is 0, the node is not reset. If the value is 1, the node is reset.
/l:filename | Any | $ORACLE_HOME/oracm/log/cm.log | Specifies the pathname of the log file for the Cluster Manager. The maximum pathname length is 192 characters.
/? | None | None | Shows help for the arguments of the oracm executable. The Cluster Manager does not start if you specify this argument.
/m | Any | 25000000 | The size of the oracm log file in bytes.
To avoid database corruption when a node fails, there is a delay before the Oracle9i Real Application Clusters reconfiguration commences. Without this delay, simultaneous access of the same data block by the failed node and the node performing the recovery can cause database corruption. The length of the delay is defined by the sum of the following:
Value of the WatchdogTimerMargin parameter
Value of the WatchdogSafetyMargin parameter
Value of the Watchdog daemon -m command-line argument

See also: Table F-2 for more information on the WatchdogTimerMargin and WatchdogSafetyMargin parameters, and Table F-1 for more information on the Watchdog daemon -m command-line argument.
If you use the default values for the Linux kernel soft_margin and Cluster Manager parameters, the time between when the failure is detected and the start of the cluster reconfiguration is 70 seconds. For most workloads this time can be significantly reduced. The following example shows how to decrease the time of the reconfiguration delay from 70 seconds to 20 seconds:
Set the value of the WatchdogTimerMargin (soft_margin) parameter to 10 seconds (10000 ms).
Leave the value of the WatchdogSafetyMargin parameter at the default value, 5000 ms.
Leave the value of the Watchdog daemon -m command-line argument at the default value, 5000 ms.
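The delay arithmetic above can be checked directly. The sketch below assumes the Linux softdog default soft_margin of 60 seconds (60000 ms), which together with the two 5000 ms defaults accounts for the 70-second figure; all values are in milliseconds.

```shell
# Reconfiguration delay = WatchdogTimerMargin + WatchdogSafetyMargin
#                       + the watchdogd -m value (all in milliseconds).
delay_ms() {
    echo $(( $1 + $2 + $3 ))
}

# Defaults: soft_margin of 60 s (60000 ms) plus two 5000 ms margins.
default=$(delay_ms 60000 5000 5000)
echo "default delay: ${default} ms ($((default / 1000)) seconds)"

# Tuned: soft_margin of 10 s with the other two values left at their defaults.
tuned=$(delay_ms 10000 5000 5000)
echo "tuned delay: ${tuned} ms ($((tuned / 1000)) seconds)"
```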
To change the values of the WatchdogTimerMargin (soft_margin) and WatchdogSafetyMargin parameters:

1. Stop the Oracle instance.
2. Reload the softdog module with the new value of soft_margin. For example, enter:

   # /sbin/insmod softdog soft_margin=10

3. Change the value of the WatchdogTimerMargin parameter in the $ORACLE_HOME/oracm/admin/cmcfg.ora file. For example, edit the following line:

   WatchdogTimerMargin=10000

4. Restart watchdogd with the -m command-line argument set to 5000.
5. Restart the oracm executable.
6. Restart the Oracle instance.
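The cmcfg.ora edit above can be scripted. The following is a minimal sketch; a temporary sample file stands in for $ORACLE_HOME/oracm/admin/cmcfg.ora, and the new value of 10000 ms corresponds to the 10-second soft_margin.

```shell
# Sketch: rewrite WatchdogTimerMargin in a copy of cmcfg.ora.
# A temporary sample file stands in for $ORACLE_HOME/oracm/admin/cmcfg.ora.
cmcfg=$(mktemp)
echo 'WatchdogTimerMargin=60000' > "$cmcfg"

# Set the parameter to 10000 ms, matching soft_margin=10 (seconds).
sed 's/^WatchdogTimerMargin=.*/WatchdogTimerMargin=10000/' "$cmcfg" > "$cmcfg.new"
mv "$cmcfg.new" "$cmcfg"
grep '^WatchdogTimerMargin=' "$cmcfg"
```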
OCMS supports node fencing by completely resetting the node if an Oracle instance fails and the Cluster Manager thread malfunctions. This approach guarantees that the database is not corrupted.
However, it is not always necessary to reset the node if an Oracle instance fails. If the Oracle instance uses synchronous I/O, a node reset is not required. In addition, in some cases where the Oracle instance uses asynchronous I/O, a node reset is not necessary, depending on how asynchronous I/O is implemented in the Linux kernel. For a list of certified Linux kernels that do not require a node reset, see the Oracle Technology Network Web site at the following URL:
http://otn.oracle.com
The /a:action flag in the following command defines OCMS behavior when an Oracle process fails:

$ oracm /a:[action]
In the preceding example, if the action argument is set to 0, the node does not reset. By default, the Watchdog daemon starts with the -l 1 option and the oracm process starts with the /a:0 option. With these default values, the node resets only if the oracm or watchdogd process terminates. It does not reset if an Oracle process that can write to the disk terminates. This is safe if you are using a certified Linux kernel that does not require a node reset.
If the action argument is set to 1, the node resets if the oracm process, the watchdogd daemon, or an Oracle process that can write to the disk terminates. In these situations, a SHUTDOWN ABORT command on an Oracle instance resets the node and terminates all Oracle instances running on that node.
Copyright © 1996, 2002 Oracle Corporation. All rights reserved.