DRBD® is an important open source software component that can replicate data across multiple servers and can help make applications highly available. Traditionally, system administrators have used tools such as Pacemaker and DRBD Reactor to manage DRBD resources in a cluster and to facilitate high availability (HA) by handling fail-over situations. There are several use cases though where you might want (or need) to avoid these cluster resource managers (CRMs).
The point of this post is to demonstrate how to safely manage DRBD resources with systemd as an alternative to using a CRM. Whether due to certain requirements or because you want a simplified design, this tutorial will walk you through setting up DRBD with the systemd units provided by drbd-utils.
drbd-utils is an open source collection of user space utilities for DRBD. It includes the CLI client, drbdadm, as well as various scripts and systemd unit files. You can install the drbd-utils collection of utilities from the project’s open source repository hosted on GitHub, or else by using a package manager on a supported RPM or DEB based system1.
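For example, on a system where the drbd-utils package is available in your configured repositories, installation might look like the following (package names and repositories can differ by distribution, so treat this only as a sketch):
# Debian or Ubuntu based systems:
apt install drbd-utils
# RPM based systems:
dnf install drbd-utils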
If you have developed your own Linux cluster management solution and need a unified way to bootstrap DRBD and its dependent services, I recommend deploying the systemd template units shipped with drbd-utils. You can use these units to manage DRBD resources efficiently, because they provide the means for starting, stopping, and monitoring DRBD resources.
These units were initially intended to be used with DRBD Reactor, an open source program that automates the management of DRBD resources and events. However, you can use these same systemd units in your environment without needing to use DRBD Reactor. By using systemd units to manage DRBD resources, you can automate the starting and stopping of the DRBD services within your cluster. This can increase both reliability and maintainability.
Additionally, this solution makes it easier if, later on, you decide to implement DRBD Reactor in your system. Doing so would involve very few changes, because your setup would already be compatible with DRBD Reactor’s requirements. This not only makes for smooth operations now, but also lets you scale up your infrastructure and integrate more advanced DRBD management tools later, whenever you might need to.
Terminology
Distributed Replicated Block Device (DRBD): A distributed replicated storage system for the Linux platform. DRBD mirrors block devices between multiple hosts, functioning transparently to applications on the host systems. This replication can involve any type of block device, such as hard disk drives, partitions, RAID setups, or logical volumes.
Pacemaker: An open source CRM that orchestrates services across multiple servers, ensuring they run reliably and handle failovers for high availability. It is complex software and not easy to set up for the first time.
DRBD Reactor: An open source LINBIT®-developed service and DRBD events manager that automates managing DRBD resources, for example, to handle failover events. DRBD Reactor reads DRBD events and works with systemd to simplify control and monitoring. You might think of it as a “lightweight Pacemaker with a focus on DRBD.”
systemd: A system and service manager for Linux that initializes the system and manages processes and services during operation. systemd is a software suite that provides an array of system components for Linux operating systems. The main aim is to unify service configuration and behavior across Linux distributions.
Preparing the System
You need to have at least one DRBD resource to work with. To install DRBD, you have several options. You can learn about different ways to install DRBD in the DRBD 9 User Guide.
After successfully installing DRBD and initializing DRBD kernel modules, you need to configure your DRBD resources. Refer to the DRBD 9 User Guide for details.
To continue, this article assumes that you have completed the following setup steps:
- Configured a DRBD resource. (A sketch of an example resource configuration follows this list.)
- Disabled the automatic promotion of DRBD resources by setting the following DRBD configuration option:
options { auto-promote no; }
- Completed (or skipped if no preexisting data) the initial resource synchronization.
- Created a file system on top of your DRBD virtual block device. This article uses XFS as an example file system.
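For reference, a resource configuration that matches the later examples in this article might look like the following sketch. The node names, IP addresses, port, and LVM backing devices are assumptions on my part; adjust them to your environment:
# /etc/drbd.d/s0.res -- example sketch only
resource s0 {
    options {
        auto-promote no;
    }
    volume 0 {
        device    /dev/drbd100;
        disk      /dev/vg_drbd/s0_0;
        meta-disk internal;
    }
    volume 1 {
        device    /dev/drbd101;
        disk      /dev/vg_drbd/s0_1;
        meta-disk internal;
    }
    on node-a {
        node-id 0;
        address 192.168.222.10:7900;
    }
    on node-b {
        node-id 1;
        address 192.168.222.11:7900;
    }
    connection-mesh {
        hosts node-a node-b;
    }
}
After the initial synchronization, you could create the example XFS file systems with mkfs.xfs /dev/drbd/by-res/s0/0 and mkfs.xfs /dev/drbd/by-res/s0/1 on a node where you have temporarily promoted the resource.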
📝 NOTE: To avoid data divergence, you can use the DRBD 9 quorum feature. The quorum feature requires at least three nodes, although one of the nodes can be an intentionally diskless client node, to be used as a tiebreaker and witness node only.
If you really wanted to, you could use a 2-node setup. However, this can be a complicated cluster environment to set up. To prevent data divergence in a 2-node cluster, you would need to set up some sort of fencing, and then freeze I/O whenever your cluster loses a peer node, until the fencing handlers have determined that it is safe to continue writing I/O to the backing storage. Due to the setup complexity and the risk of data divergence in a 2-node setup, a 3-node setup that implements quorum is recommended in most cases.
The example setup that I use in this article starts with not using fencing or quorum, so that I can show data divergence scenarios later. This will also serve to show why taking precautions to avoid data divergence (so-called “split-brain” scenarios) is a good idea.
Later, you can add a third “witness” or “tiebreaker” node and enable DRBD quorum, to show how that changes behavior in your cluster.
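When you do add that third node and enable quorum, the relevant DRBD options might look like the following sketch (the on-no-quorum policy shown is only one possible choice):
# excerpt of a resource options section with quorum enabled -- sketch only
options {
    auto-promote no;
    quorum majority;
    on-no-quorum io-error;   # alternatively: suspend-io
}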
In this example, the DRBD resource is named s0 and has two volumes: 0 and 1. I like to name my example or “scratch” resources starting with s, but if you want, you can think of “service” instead. I try to use $res and $vol variables everywhere though, as placeholders for my resource name and volume number.
If the drbd-udev rules are in effect and working properly, these volumes will show up as /dev/drbd/by-res/$res/$vol. Verify that you installed the drbd-udev package before loading DRBD into your kernel by using a modprobe command.
Next, you need to prepare some mount points for your DRBD volumes. In this example, I use /ha/$res/$vol. You might use another naming scheme that makes sense to you.
To verify that the setup is healthy, try to start the DRBD resource manually by creating a shell script and then running it.
For example, create the file health.sh with the following contents:
#!/bin/bash
set -xe
res=s0
# bring the resource configuration into effect and promote it
drbdadm adjust $res
drbdadm primary $res
for vol in 0 1; do
    mnt=/ha/$res/$vol
    mkdir -p $mnt
    mount /dev/drbd/by-res/$res/$vol $mnt
    mkdir -p $mnt/demo
    df -TPh $mnt/demo
    umount $mnt
done
# demote the resource and take it down again
drbdadm secondary $res
drbdadm down $res
echo "ALL GOOD"
❗ IMPORTANT: Replace occurrences of $res, $vol, and $mnt to match the naming in your environment, or else set environment variables for these.
Running the shell script should complete without any errors and leave the DRBD resource in a down state on the node that you run the script on.
Configuring systemd to Start a DRBD Resource
The journalctl command is widely used to troubleshoot and investigate systemd units and events. Before configuring systemd to start a DRBD resource, here are some tricks to have journalctl show only new entries, in case you might need to troubleshoot:
# use a journal "cursor"
cursor=$(journalctl -n0 --show-cursor | sed -ne 's/^-- cursor: //p')
journalctl --after-cursor "$cursor"
# use a timestamp
t0=$(date "+%F %H:%M:%S")
journalctl --since "$t0"
You might also want to have an extra journalctl -f command running somewhere, so that you can monitor what happens behind the scenes and in real time, now that you are going to hand over control to systemd.
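For example, to follow only the units related to the s0 resource used in this article (journalctl accepts glob patterns for unit names), something like this should work:
res=s0
journalctl -f -u "drbd*@$res*"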
To use systemd to start the DRBD resource, enable and start the DRBD resource systemd target unit by entering the following command:
res=s0; systemctl enable --now drbd@$res.target
You can verify that this brings the DRBD resource up by entering the following command:
res=s0; drbdsetup status $res
Output from the command should show the s0 resource as up and in a secondary role.
Repeat this process on peer nodes.
Using systemd to Ensure Network is Online
The drbd@.service template requires the network to be online. You need to enable one of the following services to satisfy that requirement:
systemctl enable systemd-networkd-wait-online.service
systemctl enable NetworkManager-wait-online.service
For more details, refer to man systemd-networkd-wait-online.service and man NetworkManager-wait-online.service. (I want to believe that someone is reading manual pages!)
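If you are unsure which of the two applies to a node, a small check along these lines can help (a sketch only; it assumes that one of these two network stacks is in use):
# enable the wait-online service that matches the active network stack
if systemctl is-active --quiet NetworkManager.service; then
    systemctl enable NetworkManager-wait-online.service
else
    systemctl enable systemd-networkd-wait-online.service
fi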
Using systemd to Ensure LVM Backing Devices Are Available
In this example, the backing devices of s0 are all LVM logical volumes, so you also need to enter the following command:
res=s0; systemctl enable drbd-lvchange@$res.service
This service helps systemd get the start order correct, that is, LVM logical volumes are made available before systemd tries to start the DRBD resources that they back. It can also surface some useful error messages where you might expect to look for them, in case the back-end logical volumes cannot be activated for some reason. This has the side effect that a systemctl stop drbd@$res.target command will also deactivate the backing LVM logical volumes. Enabling this service is optional. If your setup ensures that the backing devices are available before drbd@$res.service is even started, you can ignore this drbd-lvchange@$res.service.
These systemd template units all use SyslogIdentifier=drbd-%I. In this case, you can get the relevant info from res=s0; journalctl -t drbd-$res.
To avoid being confused by messages from previous attempts, always check the timestamp for applicability to the attempt that you are concerned with. Or use a journalctl --after-cursor command as mentioned earlier.
Now that systemd knows how to start a DRBD resource, you can next give systemd the ability to mount and promote the resource.
Configuring systemd to Promote and Mount a DRBD Resource
For this use case, I prefer an explicitly promoted DRBD resource rather than an “auto-promoted” one. To do this, you can use the drbd-promote@.service systemd unit template.
The drbd-promote@.service template pulls in drbd@.target if necessary. To avoid a premature promotion attempt that would just fail, you can enable the drbd-wait-promotable@.service instance for your resource:
res=s0; systemctl enable drbd-wait-promotable@$res.service
Without this, trying to promote a DRBD resource might fail because DRBD was not connected to its peer yet, and possibly only in a “Consistent” state (if you configured fencing), or not yet quorate (when using the DRBD quorum feature).
Enabling this service has the effect that the drbd@$res.target is only reached once the heuristics in this script show that the DRBD resource is “promotable”. The drbd-promote@.service template requires drbd@.target, and so the service instance specific to a DRBD resource would not even attempt to promote the resource until that is the case.
This is another thing that DRBD Reactor would normally take care of for you.
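If you want to see how these template instances relate to each other on your system, you can inspect the dependency graph, for example:
res=s0
systemctl list-dependencies drbd-promote@$res.service
systemctl list-dependencies --reverse drbd-wait-promotable@$res.service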
Promoting the DRBD Resource Before Mounting the File System
Next, tell systemd that your mount needs to first promote the DRBD resource.
You could use an entry in /etc/fstab and have systemd generate the mount units, using “magic options” like x-systemd.requires and similar. Refer to the systemd.mount manual page for details if you want to learn more. I prefer to use BindsTo, and not rely on Requires, for more flexibility.
To do this, create explicit systemd.mount units by using the following create-mount-units.sh script:
#!/bin/bash
res=s0
for vol in 0 1; do
cat > /etc/systemd/system/ha-$res-$vol.mount <<EOF
[Unit]
Documentation=man:fstab(5) man:systemd.mount(5)
After=drbd-promote@$res.service
BindsTo=drbd-promote@$res.service
PartOf=drbd-services@$res.target
[Mount]
Where=/ha/$res/$vol
What=/dev/drbd/by-res/$res/$vol
Options=defaults,noauto,nofail
EOF
done
systemctl daemon-reload
When you start the mount by entering a systemctl start /ha/$res/$vol command, systemd will know about the dependency, and will promote the DRBD resource that backs the mount point first. The drbd-promote@$res.service will pull in the drbd@$res.target if necessary.
Observe that a systemctl stop /ha/$res/$vol command will unmount the mount point, but leave the DRBD resource promoted. That is for now, at least. I will cover this topic in the discussion of the drbd-services@$res.target unit soon.
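To see this in action for the first volume, something like the following should work, assuming the s0 resource and the mount units created above:
res=s0; vol=0
systemctl start /ha/$res/$vol     # promotes the resource, then mounts the volume
systemctl status ha-$res-$vol.mount drbd-promote@$res.service
systemctl stop /ha/$res/$vol      # unmounts, but leaves the resource promoted
drbdadm status $res               # should still show this node in the Primary role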
If your service needs more than one mount, the systemd way to express that is to use RequiresMountsFor.
Creating a Demonstration Service for Testing
Next, create a demonstration service that requires the mounts, shows file system disk space usage by using a df command, and then spawns one sleep process per mount, with the current working directory in the respective ./demo subdirectory.
Create another shell script called demoservice.sh with the following contents:
#!/bin/bash
res=s0
mounts=$(for vol in 0 1; do printf " %s" /ha/$res/$vol; done)
# be careful with the quoting and variable expansion in the here-document!
cat > /etc/systemd/system/my-demo.service <<EOF
[Unit]
Description=My Demo
PartOf=drbd-services@$res.target
RequiresMountsFor=$mounts
[Service]
ExecStart=/bin/bash -x -e -c 'for m in $mounts; do df -TPh \$m/demo; cd \$m/demo ; sleep 86400 & done ; echo Demo service start.; wait'
ExecStop=/bin/bash -x -e -c 'echo "Demo service stop."'
[Install]
RequiredBy=drbd-services@$res.target
EOF
systemctl daemon-reload
systemctl start my-demo.service
systemctl status my-demo.service
Run the shell script after you create it.
At this point, stopping the service (or the service failing or crashing) will not yet demote the DRBD resource on the node. You can fix that, however.
You can add the StopWhenUnneeded=true option to the explicit mount units, and to the drbd-promote@.service unit, either to the template, or only to the instance specific to your DRBD resource.
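For example, applying this to only the s0 instance of the promote service could be done with a drop-in such as the following (the drop-in file name is arbitrary, and the same pattern applies to the mount units):
mkdir -p /etc/systemd/system/drbd-promote@s0.service.d
cat > /etc/systemd/system/drbd-promote@s0.service.d/stop-when-unneeded.conf <<EOF
[Unit]
StopWhenUnneeded=true
EOF
systemctl daemon-reload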
You can also use the drbd-services@.target template (which I already used in the PartOf directives earlier).
Testing the Demonstration Service
Verify that the demonstration service works as intended. First, to have the service be required by your drbd-services@$res.target unit, enable it by entering the following command:
systemctl enable my-demo.service
Then start the target by entering the following command:
res=s0; systemctl start drbd-services@$res.target
Now your node should be in a primary role for the DRBD resource, both volumes should be mounted, and there should be one sleep process each with current working directories /ha/s0/0/demo and /ha/s0/1/demo.
Next, enter the following command to stop DRBD services on your node:
res=s0; systemctl stop drbd-services@$res.target
The sleep dummy processes should be stopped, the file systems should be unmounted, and the DRBD resource should be demoted but still up. You can use journalctl, df, and drbdadm status commands to verify that this is the case.
Entering a res=s0; systemctl start drbd-services@$res.target command should return everything to a running state again. After this, you can enter a pkill sleep command, or, if you prefer to be specific, kill the specific process IDs shown by a systemctl status my-demo.service command. The my-demo.service would then be considered inactive. However, you only have a RequiredBy option on the target, so your drbd-services@$res.target is still considered active.
To propagate service problems to the target, and from there, through the PartOf directives, to the other units, you need a BindsTo option.
You cannot express the BindsTo option through symlinks the way systemctl enable does for RequiredBy. However, you can add it in a drop-in override for the drbd-services@$res.target unit.
Disable the service and remove the [Install] section’s RequiredBy option from my-demo.service by creating another shell script called changemyservice.sh with the following contents:
#!/bin/bash
res=s0
systemctl disable my-demo.service
# truncate the unit file after the ExecStop line,
# which removes the [Install] section
sed -i -e '/^ExecStop/q' /etc/systemd/system/my-demo.service
# bind the service to the target through a drop-in instead
mkdir -p /etc/systemd/system/drbd-services@$res.target.d
printf "[Unit]\n""BindsTo=my-demo.service\n" > \
/etc/systemd/system/drbd-services@$res.target.d/my-demo.conf
systemctl daemon-reload
systemctl start drbd-services@$res.target
systemctl status my-demo.service
After running this script, services should be running.
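You can verify the resulting unit definitions, including the new drop-in, by using systemctl cat:
res=s0
systemctl cat drbd-services@$res.target   # shows the template plus the my-demo.conf drop-in
systemctl cat my-demo.service             # the [Install] section should now be gone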
You can test the systemd service by entering a pkill sleep command.
Due to the use of the BindsTo option, this command causes drbd-services@$res.target to no longer remain active. Additionally, because of the PartOf option, this stop action propagates to your mount units and to the drbd-promote@$res.service. In fact, the PartOf in drbd-promote@$res.service alone would be enough, because the BindsTo and the After directives in the mount units would ensure services are stopped (and mount points unmounted) automatically.
If all processes that use the mount points are managed by systemd service units, which are either bound to or part of the appropriate target and depend on the mounts, all mount points should be able to unmount properly, and DRBD should be able to demote successfully.
If there might be other processes that use these mounted file systems and would keep them active, you have several options to end those processes when you need to demote DRBD.
For example, instead of mount units, you could use an OCF wrapper service template, and have the Filesystem resource agent (part of the Pacemaker cluster stack and also included in the LINBIT resource-agents package) try to kill all processes that are using the file system before unmounting it.
But for “well behaved” services, what I have shown is good enough for now.
Enabling and Starting DRBD Services
Next, enable and start the DRBD services systemd target unit for your DRBD resource, by entering the following command:
res=s0; systemctl enable --now drbd-services@$res.target
Or else you can keep the services target disabled, and have the DRBD resource only come up in a secondary role with the drbd@$res.target that you already enabled earlier. This way, you can decide at runtime if and when you want to start the target by using a systemctl start drbd-services@$res.target command.
If you have more than one service that depends on more than one DRBD resource, you can add an aggregate target unit that binds to all the relevant drbd-services@<resource>.target units and then enable that aggregate target.
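A minimal sketch of such an aggregate target, assuming two resources named s0 and s1 and a made-up unit name my-ha.target, might look like this:
# /etc/systemd/system/my-ha.target -- sketch only
[Unit]
Description=All DRBD-backed services on this node
BindsTo=drbd-services@s0.target drbd-services@s1.target
After=drbd-services@s0.target drbd-services@s1.target
[Install]
WantedBy=multi-user.target
After a systemctl daemon-reload, enabling and starting my-ha.target would pull in both resource-specific targets.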
Controlling Which Node Will Run Services
Other options to control which node will run services include the various systemd condition checks. For example, you can use ConditionPathExists=/run/my-demo-primary. You can simply create a file at this path and use its existence as a required condition for a node to become primary in the cluster.
Optionally, you can combine that with a systemd.path unit on that same file. Sometimes, though, it is easier for some automation logic to create that file by using a touch /run/my-demo-primary command, rather than using a systemctl start drbd-services@$res.target command.
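As a rough sketch of the condition check approach: one possible placement, and an assumption on my part, is a drop-in on the drbd-promote@s0.service instance. Because the mount units use BindsTo together with After on that service, systemd will not leave them active if the promote service was skipped due to an unmet condition.
mkdir -p /etc/systemd/system/drbd-promote@s0.service.d
cat > /etc/systemd/system/drbd-promote@s0.service.d/primary-condition.conf <<EOF
[Unit]
# only promote DRBD on this node if the flag file exists
ConditionPathExists=/run/my-demo-primary
EOF
systemctl daemon-reload
With that in place, a touch /run/my-demo-primary command marks the node as eligible before you start drbd-services@s0.target, and removing the file keeps the node from promoting on the next start attempt.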
After systemd considers the DRBD units to be “active”, systemd will not notice a manually entered drbdadm down command. For this reason, you could end up in a situation where systemd thinks drbd@$res.target is reached, while in fact you already took the DRBD resource down manually, bypassing this systemd unit logic.
If that happens, you can clean up the system by entering the following command:
res=s0; systemctl stop drbd@$res.service
This should bring down the s0 DRBD resource, and systemd should take care of bringing down all services that depend on it, provided that systemd realizes they are active.
Of course, you could add various monitoring scripts or hacks here and there, or create additional systemd service units to detect specific situations and trigger actions. However, at that point, if you need to respond to DRBD state changes, you are better off using DRBD Reactor to respond to these changes automatically. At some point, trying to force everything through systemd alone might no longer make sense.
Conclusion
I believe this introduction to managing DRBD with systemd units should cover most use cases.
For more information on this topic or if you might need support for your particular use case, you can reach out to the LINBIT team. You might also ask for help within the LINBIT Community Forums. If you are an existing LINBIT customer, just check in with the team through the LINBIT customer portal if you need help.
- LINBIT publishes RPM and DEB packages in its official customer repositories. There are also DEB packages for LINBIT open source software in a couple of public package repositories, including one for Proxmox VE and a personal package archive (PPA) for Ubuntu Linux. While convenient for testing and homelab setups, LINBIT does not maintain packages in its public repositories to the same level as those in its customer repositories. Packages in LINBIT public repositories also fall outside the scope of LINBIT support. However, LINBIT sometimes pushes software release candidates to the public repositories for community testing. While LINBIT does not support public repository packages, feedback from the community, especially about release candidates, is always appreciated.↩︎