In this blog post, I want to expand a little bit further on my colleague, and DRBD Reactor developer, Roland’s post, High-Availability with DRBD Reactor & Promoter. As Roland covered, DRBD® Reactor and its promoter plugin can be used to build simple high-availability (HA) failover clusters.
Roland’s article includes an example with a single DRBD resource and a file system mount on top of that resource. It is a relatively simple example, as the promoter only starts a single resource. In this article, I will use a similarly simple scenario, but I will also include a Pacemaker configuration as a side-by-side comparison to demonstrate the simplicity of the DRBD Reactor promoter plugin.
As my colleague Roland did, I, too, want to stress that we do not see this as a replacement or a competing tool to Pacemaker. Pacemaker is very mature software that has been a core component of the Linux HA stack for over a decade. Pacemaker is far more robust in that it can manage resources without DRBD and implement proper node-level fencing methods. When properly configured and tested, Pacemaker can genuinely get you 99.999% uptime.
However, you might want a more straightforward tool for some very simple clusters where you only need to manage a few resources and don’t necessarily need the coveted “five nines” of uptime. This article aims to highlight just how much less complex configuring DRBD Reactor can be compared to Pacemaker.
Example 1: DRBD Reactor
The examples here will just show configurations for managing a virtual machine that uses a DRBD device as its backing disk. To set this up with DRBD Reactor, an outline of the steps is:
- Configure a DRBD resource on all three nodes.
- Install a virtual machine (VM) on the DRBD resource.
- Configure DRBD Reactor to manage these two resources.
The first step is to configure the DRBD resources. The configuration in this example is:
resource r0 {
    device /dev/drbd0;
    disk /dev/vg_drbd/lv_r0;
    meta-disk internal;

    options {
        auto-promote no;
        quorum majority;
        on-no-quorum io-error;
    }

    on kvm-0 {
        address 192.168.222.20:7777;
        node-id 0;
    }
    on kvm-1 {
        address 192.168.222.21:7777;
        node-id 1;
    }
    on kvm-2 {
        address 192.168.222.22:7777;
        node-id 2;
    }

    connection-mesh {
        hosts kvm-0 kvm-1 kvm-2;
    }
}
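After saving this configuration file on all three nodes, the resource still has to be initialized and brought up. The commands below are a minimal sketch of that step, assuming the backing logical volume /dev/vg_drbd/lv_r0 already exists on each node:

# Run on each of the three nodes:
drbdadm create-md r0   # initialize DRBD metadata on the backing disk
drbdadm up r0          # attach the backing disk and connect to the peer nodes
drbdadm status r0      # all nodes should show as connected; the disks will remain
                       # Inconsistent until the first (forced) promotion triggers a sync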
Then you need to promote the DRBD resource to the primary role on one of the nodes and install a VM on the DRBD device. The details are omitted here, but once the VM is created, you copy the resulting /etc/libvirt/qemu/kvm-test.xml file to all three nodes. For good measure, I’d suggest testing a manual failover before proceeding, to verify that the DRBD resource can become primary on another node.
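A minimal sketch of that workflow, with the VM installation itself left as a placeholder comment since any installation method works, and assuming SSH access between the nodes for copying the domain definition:

# On kvm-0: promote the resource (--force starts the initial sync on a brand new resource)
drbdadm primary --force r0
# ... install the kvm-test VM here, using /dev/drbd0 as its backing disk ...

# Copy the resulting libvirt domain definition to the other two nodes:
scp /etc/libvirt/qemu/kvm-test.xml kvm-1:/etc/libvirt/qemu/
scp /etc/libvirt/qemu/kvm-test.xml kvm-2:/etc/libvirt/qemu/

# Manual failover test: stop the VM and demote on kvm-0, then promote on kvm-1
virsh shutdown kvm-test   # on kvm-0; wait for the guest to power off
drbdadm secondary r0      # on kvm-0
drbdadm primary r0        # on kvm-1
drbdadm status r0         # on kvm-1; it should now report itself as Primary
drbdadm secondary r0      # on kvm-1; demote again so DRBD Reactor can take over later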
Lastly, create a configuration snippet file for the promoter plugin in the /etc/drbd-reactor.d/ directory on all nodes, and then restart the drbd-reactor service, again on all nodes.
[[promoter]]
[promoter.resources.r0]
start = ["ocf:heartbeat:VirtualDomain p_kvm-test config=/etc/libvirt/qemu/kvm-test.xml"]
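With the snippet in place, restart the service and check what the promoter is doing. This is only a short sketch; the exact drbd-reactorctl output varies between versions:

# Run on all three nodes:
systemctl restart drbd-reactor

# Verify that the promoter plugin picked up r0 and that exactly one node
# promoted the resource and started the VM:
drbd-reactorctl status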
Example 2: Pacemaker
The scenario in this example is the same as in the previous one: a single DRBD resource and a single guest VM instance. The “short” version here only requires one additional step, Corosync, but as the example will show, the configurations for Corosync and Pacemaker are not trivial. An outline of the steps is:
- Configure a DRBD resource on all three nodes.
- Install a VM on the DRBD resource.
- Configure Corosync as the communication layer for Pacemaker.
- Configure Pacemaker to manage these two resources.
The first step is the DRBD resource. I’ll omit this configuration because it is identical to the previous example. The example also skips installing and setting up the VM on the DRBD device, because this too is identical to the previous example. Again, the resulting VM configuration needs to be copied to all nodes in the cluster.
Next, you need to configure the communication layer, Corosync, that Pacemaker will use. To do this, you create a configuration file, /etc/corosync/corosync.conf, on all three nodes.
I am aware that pcs and pcsd can automatically generate this configuration file for you. However, that adds yet another layer here, and I want to keep this example as simple as possible.
totem {
    version: 2
    secauth: off
    cluster_name: linbit-cluster
    transport: udpu
}

nodelist {
    node {
        ring0_addr: 192.168.222.20
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.222.21
        nodeid: 2
    }
    node {
        ring0_addr: 192.168.222.22
        nodeid: 3
    }
}

quorum {
    provider: corosync_votequorum
}

logging {
    to_syslog: yes
}
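With corosync.conf in place on all three nodes, the cluster stack has to be started before you can configure anything. A minimal sketch, assuming the standard corosync and pacemaker systemd units:

# Run on all three nodes:
systemctl start corosync pacemaker

# Confirm that the cluster is quorate and that all three nodes are members:
corosync-quorumtool -s

# Confirm that Pacemaker sees all three nodes online:
crm_mon -1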
Next, with Corosync and Pacemaker started, you can begin creating a cluster information base (CIB), which can be thought of as the Pacemaker configuration file. However, it’s not a file that you should be editing by hand. Sure, you could import one from a file, but that would imply you already had one lying around. So instead, you need to define the cluster resources and their ordering by using a front-end utility, usually pcs or crmsh. Granted, you could instead use lower-level tools such as the crm_* commands, but in this example, you will use the CRM shell (crmsh) in interactive mode.
crm(live)configure# primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"
crm(live)configure# ms ms_drbd_r0 p_drbd_r0 \
    meta master-max="1" master-node-max="1" \
    clone-max="3" clone-node-max="1" \
    notify="true"
crm(live)configure# primitive p_kvm-test ocf:heartbeat:VirtualDomain \
    params config=/etc/libvirt/qemu/kvm-test.xml \
    op monitor interval="30s" timeout="30s" \
    op start interval="0" timeout="240s" \
    op stop interval="0" timeout="120s"
crm(live)configure# order o_drbd_promote_before_virtdom_kvm-test \
    ms_drbd_r0:promote p_kvm-test:start
crm(live)configure# colocation c_virtdom_kvm-test_on_drbd_master \
    inf: p_kvm-test ms_drbd_r0:Master
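One detail that is easy to miss: in crmsh interactive mode, nothing takes effect until the staged changes are committed. A short sketch of finishing the session and checking the result:

crm(live)configure# verify   # check the staged configuration for errors
crm(live)configure# commit   # load the changes into the live CIB
crm(live)configure# quit

# Back in a regular shell, confirm that DRBD was promoted on one node
# and that the kvm-test VM was started there:
crm status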
Conclusion
Again, I want to stress that I am by no means implying that using DRBD Reactor and its promoter plugin is in any way better than using Pacemaker, just that it’s simpler. Simplicity does come at a price, though. While the example above doesn’t implement node-level fencing or STONITH, Pacemaker can do that; DRBD Reactor has no option for these features. So, if the DRBD-backed VM in this example doesn’t stop cleanly, there is no way to escalate, and a systems administrator would have to step in and force things to stop.
Still, the whole DRBD Reactor promoter configuration shown above is just three lines, compared to the 20+ lines needed to do essentially the same thing in Corosync and Pacemaker.
If you want to learn more about DRBD Reactor and using it for HA scenarios, there are how-to technical guides available for downloading on the LINBIT® website:
- Deploying an HA NFS Cluster with DRBD and DRBD Reactor on RHEL 9 or AlmaLinux 9
- Using LINSTOR and DRBD Reactor to Deploy a Highly Available MariaDB Service
Or if you want to discuss using DRBD Reactor for the particulars of your needs and environment, you can also contact the experts at LINBIT.
This video by Yusuf Yıldız, LINBIT Solution Architect, reviews the scope of functionality, differences in handling, and unique strengths of DRBD Reactor and Pacemaker, and includes demos of both.