In this blog post, I want to expand a little further on Roland’s post, DRBD Reactor – Promoter. As Roland covered, the promoter plugin of DRBD Reactor can be used to build simple HA failover clusters.
Roland’s article includes an example with a single DRBD resource and a filesystem mounted on top of it. That example is relatively simple, as it only starts one resource. In this article, we will also include an equivalent Pacemaker configuration as a side-by-side comparison to demonstrate the simplicity of DRBD Reactor’s promoter plugin.
As my colleague Roland did, I, too, want to stress that we do not see this as a replacement or competing tool compared to Pacemaker. Pacemaker is very mature software that has been a core component of the Linux HA stack for over a decade. Pacemaker is far more robust in that it can manage resources without DRBD and implement proper node-level fencing methods. When properly configured and tested, Pacemaker can genuinely get you 99.999% uptime.
A more straightforward tool might be desired for some very simple clusters where we only need to manage a few resources and don’t necessarily need the coveted “five nines” of uptime. This article aims to highlight just how much less complex DRBD Reactor can be compared to Pacemaker.
Example 1: DRBD Reactor
Our examples here will just be managing a virtual machine that uses a DRBD device as its backing disk. To set this up with DRBD Reactor, the process, in short, is:
- Configure a DRBD resource on all three nodes
- Install a VM on the DRBD resource
- Configure DRBD Reactor to manage these two resources
The first step is the DRBD resource. The configuration in our example is:
resource r0 {
  device /dev/drbd0;
  disk /dev/vg_drbd/lv_r0;
  meta-disk internal;

  options {
    auto-promote no;
    quorum majority;
    on-no-quorum io-error;
  }

  on kvm-0 {
    address  192.168.222.20:7777;
    node-id  0;
  }
  on kvm-1 {
    address  192.168.222.21:7777;
    node-id  1;
  }
  on kvm-2 {
    address  192.168.222.22:7777;
    node-id  2;
  }

  connection-mesh {
    hosts kvm-0 kvm-1 kvm-2;
  }
}
Then we need to promote that DRBD resource to primary on one of the nodes and install a VM on the DRBD device. We’ll omit the details here, but once that is created, we copy the resulting `/etc/libvirt/qemu/kvm-test.xml` file to all three nodes. For good measure, I’d suggest testing a manual failover before proceeding.
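In case it’s helpful, here is a rough sketch of what those omitted steps might look like. The VM name (kvm-test), the memory and CPU sizing, and the installation ISO path are assumptions for illustration only; adjust them for your environment.

# On kvm-0: promote the DRBD resource and install the guest onto it
drbdadm primary r0
virt-install --name kvm-test --memory 2048 --vcpus 2 \
  --disk path=/dev/drbd0 --cdrom /tmp/install.iso --os-variant generic

# Copy the resulting libvirt definition to the other nodes
scp /etc/libvirt/qemu/kvm-test.xml kvm-1:/etc/libvirt/qemu/
scp /etc/libvirt/qemu/kvm-test.xml kvm-2:/etc/libvirt/qemu/

# Manual failover test: stop everything on kvm-0 (wait for the guest to power off) ...
virsh shutdown kvm-test
drbdadm secondary r0

# ... then bring it up on kvm-1
drbdadm primary r0
virsh define /etc/libvirt/qemu/kvm-test.xml
virsh start kvm-test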
Lastly, we create a configuration in the `/etc/drbd-reactor.d/` directory (on all nodes) and restart the drbd-reactor service everywhere.
[[promoter]]
# Specify which resource should be watched. For example resource 'foo':
[promoter.resources.r0]
start = [
    "ocf:heartbeat:VirtualDomain p_kvm-test config=/etc/libvirt/qemu/kvm-test.xml"
]
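Assuming the standard systemd unit name that ships with the drbd-reactor package, picking up the new configuration and verifying it might look like this:

# Run on all three nodes
systemctl restart drbd-reactor

# Show the promoter's view of resource r0 and the service it starts
drbd-reactorctl status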
Example 2: Pacemaker
This is the same as the last example: just a DRBD resource and a single guest VM instance. The “short” version here only requires one additional step, Corosync, but as we’ll see, neither the Corosync nor the Pacemaker configuration is trivial.
- Configure a DRBD resource on all three nodes
- Install a VM on the DRBD resource
- Configure Corosync as the communication layer for Pacemaker
- Configure Pacemaker to manage these two resources
The first step is the DRBD resource. I’ll omit this configuration, as it is identical to the previous example. We’ll also skip installing and setting up the VM on the DRBD device, since that, too, is identical. Again, the resulting VM configuration needs to be copied to all nodes in the cluster.
Next, we need to configure the communication layer, Corosync, that Pacemaker will use. To do this, we create a `/etc/corosync/corosync.conf` configuration on all three nodes.
I am aware that pcs and pcsd can automatically generate this configuration file for you. However, that adds yet another layer here, and I want to keep this example as simple as possible.
totem {
  version: 2
  secauth: off
  cluster_name: linbit-cluster
  transport: udpu
  #rrp_mode: passive
}

nodelist {
  node {
    ring0_addr: 192.168.222.20
    nodeid: 1
  }
  node {
    ring0_addr: 192.168.222.21
    nodeid: 2
  }
  node {
    ring0_addr: 192.168.222.22
    nodeid: 3
  }
}

quorum {
  provider: corosync_votequorum
  #two_node: 1
}

logging {
  to_syslog: yes
}
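With that file in place on all three nodes, the cluster stack can be started and membership verified before moving on to the CIB. For example:

# Run on all three nodes
systemctl start corosync pacemaker

# Confirm that all three nodes have joined and quorum is reached
corosync-quorumtool -s

# One-shot view of cluster status
crm_mon -1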
Next, you can start Pacemaker. Once it is running, you can begin building up a CIB (cluster information base), which can be thought of as the Pacemaker configuration, though it’s not a file you should be editing by hand. Sure, you can import one from a file, but that would imply you already had one lying around. So instead, you define the cluster resources and their ordering via some front end (usually pcs or crmsh). Granted, you could instead use the lower-level crm_* commands, but we’ll use the crmsh interactive shell for this example.
crm(live)configure# primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"
crm(live)configure# ms ms_drbd_r0 p_drbd_r0 \
    meta master-max="1" master-node-max="1" \
    clone-max="3" clone-node-max="1" \
    notify="true"
crm(live)configure# primitive p_kvm-test ocf:heartbeat:VirtualDomain \
    params config=/etc/libvirt/qemu/kvm-test.xml \
    op monitor interval="30" timeout="30s" \
    op start interval="0" timeout="240s" \
    op stop interval="0" timeout="120s"
crm(live)configure# order o_drbd_promote_before_kvm-test \
    ms_drbd_r0:promote p_kvm-test
crm(live)configure# colocation c_kvm-test_on_drbd_master \
    inf: p_kvm-test ms_drbd_r0:Master
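Because no STONITH devices are configured in this example, Pacemaker also needs stonith-enabled set to false before it will start resources; committing the configuration then activates everything. A sketch of those final steps inside the same crmsh session:

crm(live)configure# property stonith-enabled=false
crm(live)configure# commit

After committing, `crm_mon -1` on any node should show `ms_drbd_r0` promoted on one node with the VM running alongside it.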
Conclusion
Again, I want to stress that I am by no means implying that DRBD Reactor’s promoter plugin is in any way better than Pacemaker, just that it’s simpler. Simplicity does come at a price, though. While the example above doesn’t implement node-level fencing (STONITH), Pacemaker is capable of it; DRBD Reactor has no option for this. So, if the DRBD-backed VM in this example doesn’t stop cleanly, there is no way to escalate, and an admin would have to step in and force things to stop.
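In the DRBD Reactor case, that manual intervention would amount to something like the following on the node where the VM is stuck, shown here only as an illustration:

# Hard power-off of the hung guest
virsh destroy kvm-test

# Demote the resource so another node can promote it and take over
drbdadm secondary r0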
Still, the whole DRBD Reactor configuration is a simple three lines compared to the 20+ lines to do essentially the same thing in Corosync and Pacemaker.