OVERVIEW
SBD stands for STONITH Block Device and provides a storage-based fencing mechanism.
The highest priority for highly available clusters is to protect data integrity. This protection is achieved by preventing uncoordinated concurrent access to shared data stores. Clusters use several control mechanisms to achieve this goal.
However, network partitions or software failures can result in several DCs being elected in a cluster. If this so-called "split-brain" scenario is allowed to occur, data corruption may result.
The main mechanism used to avoid this situation is node fencing via STONITH. If SBD is used as the fencing mechanism, nodes can be fenced in the event of a split-brain scenario without the need for an external power-off device.
SBD Components and Mechanisms
SBD Partition
In an environment where all nodes have access to shared storage, a small partition of the device is formatted for use with SBD. The size of this partition depends on the block size of the underlying disk (for example, a standard SCSI disk with a block size of 512 bytes needs a 1 MB partition; a DASD disk with a block size of 4 KB requires a 4 MB partition). The initialization process creates a message layout on the device, configuring message slots for up to 255 nodes.
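If the shared device does not yet contain a suitable partition, one can be created with standard tools. The following is a minimal sketch, assuming an otherwise empty shared disk (the device name /dev/disk/by-id/scsi-EXAMPLE is a placeholder) and the parted utility; a few megabytes are enough to cover both the 512-byte and 4 KB block size cases:
root # parted -s /dev/disk/by-id/scsi-EXAMPLE mklabel msdos
root # parted -s /dev/disk/by-id/scsi-EXAMPLE mkpart primary 1MiB 9MiB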
SBD Daemon
After the SBD daemon is configured, it is brought online on each node before the rest of the cluster stack is started. It terminates after all other cluster components have shut down, ensuring that cluster resources are never activated without SBD supervision.
Messages
The daemon automatically assigns one of the message slots on the partition to itself and continuously monitors that slot for messages addressed to it. Upon receiving a message, the daemon immediately complies with the request, such as initiating a power-off or reboot cycle for fencing.
In addition, the daemon continuously monitors connectivity to the storage device and terminates itself if the partition becomes unreachable. This guarantees that it never misses a fencing message. If the cluster data resides on the same logical unit in a different partition, this does not add a point of failure: the workload would terminate anyway once connectivity to the storage is lost.
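Message slots are normally claimed automatically when the daemon starts, but a slot can also be pre-allocated for a node by hand with the allocate subcommand. A sketch, using an illustrative node name:
root # sbd -d /dev/SBD allocate alice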
Watchdog
Whenever you use SBD, you must ensure that a watchdog is working properly. Modern systems support a hardware watchdog that needs to be "tickled" or "fed" by a software component. The software component (in this case, the SBD daemon) "feeds" the watchdog by periodically writing a service pulse to it. If the daemon stops feeding the watchdog, the hardware forces a system restart. This protects against failures of the SBD process itself, such as hanging or becoming stuck on an I/O error.
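Before starting SBD, verify that a watchdog device is actually available. On machines without a hardware watchdog, the softdog kernel module can provide a software watchdog as a fallback. A quick check, with illustrative output:
root # modprobe softdog
root # ls -l /dev/watchdog
crw------- 1 root root 10, 130 May  3 16:00 /dev/watchdog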
If Pacemaker integration is active, SBD will not self-fence when the majority of SBD devices is lost, as long as the node still belongs to a cluster partition that has quorum. For example, suppose your cluster contains three nodes: A, B, and C. Due to a network split, A can only see itself, while B and C can still communicate with each other. In this case, there are two cluster partitions: one with quorum because it holds the majority of nodes (B and C), and one without (A). If this happens while the majority of SBD devices is unreachable, node A immediately fences itself, while nodes B and C continue to operate.
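This Pacemaker integration is controlled through the SBD configuration file. A minimal sketch, assuming your distribution's /etc/sysconfig/sbd supports the SBD_PACEMAKER option:
SBD_PACEMAKER=yes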
SBD Usage Requirements
- Up to three SBD devices can be used for storage-based fencing. When using one to three devices, the shared storage must be accessible from all nodes.
- The path to the shared storage device must be permanent and consistent across all nodes in the cluster. Use stable device names, such as /dev/disk/by-id/dm-uuid-part1-mpath-abcedf12345 (see the lookup example after this list).
- Shared storage can be connected via Fibre Channel (FC), Fibre Channel over Ethernet (FCoE), or even iSCSI; multipathing is recommended for reliability.
- An SBD device can be shared between different clusters as long as the number of nodes sharing the device does not exceed 255.
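To find a stable name for a device, you can list the symbolic links under /dev/disk/by-id/ and see which kernel device each one resolves to. A sketch with illustrative output, reusing the example ID from the next section:
root # ls -l /dev/disk/by-id/ | grep sdb
lrwxrwxrwx 1 root root 9 May 03 15:55 scsi-ST2000DM001-0123456_Wabcdefg -> ../../sdb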
Setting up the SBD device
SBD Device Initialization
To use SBD with shared storage, you must first create the message layout on one to three block devices. The sbd create command writes a metadata header to each specified device and initializes message slots for up to 255 nodes. If the command is executed without any further options, the default timeout settings are used.
Note: Ensure that the device or devices to be used for SBD do not hold any important data. Executing the sbd create command overwrites roughly the first megabyte of the specified block device(s) without further prompting or backup.
- Decide which block device or block devices to use for SBD.
- Use the following command to initialize the SBD device:
root # sbd -d /dev/SBD create
- Please replace /dev/SBD with the actual path name, for example:
/dev/disk/by-id/scsi-ST2000DM001-0123456_Wabcdefg.
- Check what has been written to the device:
root # sbd -d /dev/SBD dump
Header version : 2.1
UUID : 619127f4-0e06-434c-84a0-ea82036e144c
Number of slots : 255
Sector size : 512
Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait) : 10
==Header on disk /dev/SBD is dumped
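The timeouts shown in the dump above are the defaults. If they do not suit your environment (for example, multipath storage usually needs a longer msgwait), they can be set at initialization time. A sketch using sbd's timeout options, where -1 sets the watchdog timeout and -4 sets the msgwait timeout (msgwait should be at least twice the watchdog timeout):
root # sbd -d /dev/SBD -1 15 -4 30 create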
To edit the SBD configuration file
- Open the file
/etc/sysconfig/sbd
- Search for the following parameter:
SBD_DEVICE
This parameter specifies the device to be monitored and used to exchange SBD messages.
Edit this line and replace SBD with your SBD device:
SBD_DEVICE="/dev/SBD"
If you need to specify more than one device, use a semicolon to separate the devices (the order of the devices is irrelevant):
SBD_DEVICE="/dev/SBD; /dev/SBD1;/dev/SBD2"
If the SBD device is not accessible, the daemon will fail to start and inhibit cluster startup.
- Search for the following parameter:
SBD_DELAY_START
Enables or disables the start delay. Set SBD_DELAY_START to yes if msgwait is relatively long but your cluster nodes boot very quickly. Setting this parameter to yes delays SBD startup at boot time. This delay is sometimes required for virtual machines.
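Taken together, a minimal /etc/sysconfig/sbd might look like the following sketch; the device path is a placeholder, and SBD_WATCHDOG_DEV names the watchdog device discussed earlier:
SBD_DEVICE="/dev/SBD"
SBD_WATCHDOG_DEV=/dev/watchdog
SBD_DELAY_START=no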
After adding the SBD device to the SBD configuration file, enable the SBD daemon. The SBD daemon is a critical part of the cluster stack and must be running whenever the cluster is running. Therefore, whenever the pacemaker service is started, the sbd service must also be started as a dependency.
Enabling and Starting the SBD Service
On each node, enable the SBD service:
root # systemctl enable sbd
Whenever the Pacemaker service is started, the SBD service will be started along with the Corosync service.
Enable SBD in the cluster configuration and restart the cluster:
root # pcs stonith sbd enable device="/dev/SBD"
root # pcs cluster restart
This action automatically triggers the start of the SBD daemon.
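To verify that the daemon is running and that a usable watchdog device is present, the following checks can be run on each node (sbd query-watchdog probes for available watchdog devices):
root # systemctl status sbd
root # sbd query-watchdog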
Testing the SBD Device
The following command dumps the node slots and their current messages from the SBD device:
root # sbd -d /dev/SBD list
You should now see all the cluster nodes that have booted with SBD listed here. For example, if you have a two-node cluster, the message slots should show clear for both nodes:
0 alice clear
1 bob clear
Try sending a test message to one of the nodes:
root # sbd -d /dev/SBD message alice test
This node will acknowledge receipt of the message in the syslog file:
May 03 16:08:31 alice sbd[66139]: /dev/SBD: notice: servant: Received command test from bob on disk /dev/SBD
This confirms that SBD is indeed running properly on the node and is ready to receive messages.
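Beyond the harmless test message, the same mechanism delivers real fencing requests. To verify full fencing behavior, you can send a reset message to a node you can afford to reboot; the target node will restart immediately:
root # sbd -d /dev/SBD message bob reset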