Highly Available NFS Exports with DRBD & Pacemaker

Updated December 2023 based on user-submitted suggestions and a recent LINBIT technical review.

This blog post explains how to configure a highly available (HA) active/passive NFS server on a three-node Linux cluster by using DRBD® and Pacemaker. You can implement this solution on DEB-based or RPM-based Linux distributions, for example, Ubuntu or Red Hat Enterprise Linux (RHEL).

NFS is well-suited for many use cases because it:

  • Enables many computers to access the same files, so everyone on the network can use the same data.
  • Reduces storage costs by having computers share applications rather than needing local disk space for each user application.

The system preparation requirements for this use case are:

  • Two diskful nodes for data replication, one diskless node for quorum purposes.
  • A separate network link for DRBD replication. (This is best practice, not mandatory.)
  • Pacemaker and Corosync are installed on all nodes and their services are enabled to start at system boot time.
  • Open Cluster Framework (OCF) resource agents are installed on all nodes. (If you are a LINBIT® customer, install the resource-agents and drbd-pacemaker packages to install the resource agents used in this article.)
  • A virtual IP address, required for the NFS server. (In this article, the OCF IPaddr2 resource agent is used to automatically set this to 172.16.16.102 in the Pacemaker configuration.)
  • The latest version of DRBD is installed on all nodes and loaded into the kernel. (Available from GitHub, or through LINBIT customer repositories. See the DRBD 9.0 User’s Guide for more details.)
  • An NFS server is installed on all nodes for redundancy. (The NFS server service should not be enabled to start at boot, because Pacemaker will start it when necessary.)
  • All cluster nodes can resolve each other’s hostnames. (Check /etc/hosts or your local DNS server; an example setup sketch follows this list.)
  • SELinux and any firewalls in use are configured to allow appropriate traffic on all nodes. (Refer to the DRBD 9.0 User’s Guide for more information.)
  • The crmsh and pcs CLI utilities are installed on all nodes, for editing and managing the Pacemaker configuration.
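
For example, assuming the host names and IP addresses used later in this article (drbd1, drbd2, and drbd3 on the 172.16.16.0/24 network), the name resolution and service setup on each node might look like the following sketch. Package and service names vary between distributions, so adjust them for your environment.

# /etc/hosts entries so that all cluster nodes can resolve each other
172.16.16.111   drbd1
172.16.16.112   drbd2
172.16.16.113   drbd3

# enable the cluster services to start at boot, but leave the NFS server
# service disabled, because Pacemaker will start it when necessary
# (the NFS service is named nfs-kernel-server on Debian and Ubuntu)
systemctl enable corosync pacemaker
systemctl disable nfs-server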

After completing these initial preparation steps, you can create your high availability NFS cluster.

Creating a Logical Volume and Directory for the NFS Share

Before creating DRBD resources in the cluster, you need to create an LVM physical volume on top of the physical device (drive). Enter these commands as the root user, or else preface them with sudo.

To do that, enter:

pvcreate /dev/sdx

Here, “x” in sdx corresponds to the letter identifying your physical device.

Then create the volume group, named ha_vg, by entering:

vgcreate ha_vg /dev/sdx

Next, create the logical volume that DRBD will consume. You can replace "300G" with a size appropriate for your use, or else use the -l 100%FREE option rather than -L 300G in the following command if you want the logical volume to use all of the free space in your volume group.

lvcreate -L 300G -n ha_HA_lv ha_vg
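
Alternatively, to have the logical volume use all of the remaining free space in the ha_vg volume group rather than a fixed 300GiB size, you could enter:

lvcreate -l 100%FREE -n ha_HA_lv ha_vg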

After creating the logical volume, create the directory that will serve as the NFS share and recursively give the directory appropriate access permissions.

mkdir -p /nfsshare/exports/HA
chmod -R 777 /nfsshare

Also, create the “tickle” directory on all diskful nodes. This directory will be used by the portblock OCF resource agent.

mkdir -p /srv/drbd-nfs/nfstest/

Configuring DRBD

After preparing your backing storage device and a file system mount point on your nodes, you can next configure DRBD to replicate the storage device across the nodes.

Creating a DRBD Resource File

DRBD resource configuration files are located in the /etc/drbd.d/ directory. Resource files need to be created on all cluster nodes. You can create a resource file on one node and then use the rsync command to distribute the file to the other nodes (an example follows the resource file below). Each DRBD resource defined in a resource configuration file needs a different TCP port. Because only one resource is defined in this configuration, only one TCP port (7003) is used here. Use the text editor of your choice to create the DRBD resource file as shown below. Change the host names and IP addresses to reflect your network configuration.

📝 NOTE: The third cluster node only serves a quorum function in the cluster. It is not involved in DRBD replication. In the configuration, it is identified as “diskless”.

vi /etc/drbd.d/ha_nfs.res

resource ha_nfs {
  device "/dev/drbd1003";
  disk "/dev/ha_vg/ha_HA_lv";
  meta-disk internal;
  options {
    on-no-quorum suspend-io;
    quorum majority;
  }
  net {
    protocol C;
    timeout 10;
    ko-count 1;
    ping-int 1;
  }
  connection-mesh {
    hosts "drbd1" "drbd2" "drbd3";
  }
  on "drbd1" {
    address 172.16.16.111:7003;
    node-id 0;
  }
  on "drbd2" {
    address 172.16.16.112:7003;
    node-id 1;
  }
  on "drbd3" {
    disk none;
    address 172.16.16.113:7003;
    node-id 2;
  }
}
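
For example, after creating the resource file on the first node, you could distribute it to the other nodes with commands such as these (host names match this article's example configuration):

rsync -av /etc/drbd.d/ha_nfs.res drbd2:/etc/drbd.d/
rsync -av /etc/drbd.d/ha_nfs.res drbd3:/etc/drbd.d/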

Initializing DRBD Resources

After creating the DRBD resource configuration file, you need to initialize the DRBD resources. To do that, enter the following commands as the root user or use sudo. Run both commands on both diskful cluster nodes.

drbdadm create-md ha_nfs 
drbdadm up ha_nfs

As this is a new volume with no data on it yet, you can skip the initial synchronization. However, use caution with this command, because it can effectively discard any existing data on the logical volume. Enter the following command on one of the diskful nodes:

drbdadm new-current-uuid --clear-bitmap ha_nfs/0

Enter and run the following commands on only one of the two diskful cluster nodes. The first command forces the node to become primary, and the second creates the file system. DRBD will then replicate the file system to the other diskful cluster node.

drbdadm primary --force ha_nfs
mkfs.ext4 /dev/drbd1003

Next, check the output of the drbdadm status and lsblk commands.
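
For example, on either diskful node:

drbdadm status ha_nfs
lsblk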

The drbdadm status command should show that the DRBD resource is in sync and UpToDate. If everything looks fine, use the following command to switch the DRBD resource from primary back to secondary:

drbdadm secondary ha_nfs

Creating NFS Exports and Pacemaker Resources

There are two ways to create Pacemaker resources. The first is by directly editing the Pacemaker configuration by using the interactive crm shell. The second is by using the pcs command-line tool. This example uses the crm shell to edit the Pacemaker configuration.

To enter the crm shell, enter crm. Next, edit the Pacemaker configuration by entering configure edit. Press the “i” key (if your default editor is Vi or Vim) to enter insert mode and edit the configuration.

In insert mode, delete everything in the configuration and paste in the following configuration. Remember to change host names, subnets, and IP addresses to match your network configuration.

node 1: drbd1
node 2: drbd2
node 3: drbd3
primitive p_virtip IPaddr2 \
        params \
            ip=172.16.16.102 \
            cidr_netmask=32 \
        op monitor interval=0s timeout=40s \
        op start interval=0s timeout=20s \
        op stop interval=0s timeout=20s
primitive p_drbd_attr ocf:linbit:drbd-attr
primitive p_drbd_ha_nfs ocf:linbit:drbd \
        params \
            drbd_resource=ha_nfs \
        op monitor timeout=20 interval=21 role=Slave \
        op monitor timeout=20 interval=20 role=Master
primitive p_expfs_nfsshare_exports_HA exportfs \
        params \
            clientspec="172.16.16.0/24" \
            directory="/nfsshare/exports/HA" \
            fsid=1003 unlock_on_stop=1 options=rw \
        op monitor interval=15s timeout=40s \
        op_params OCF_CHECK_LEVEL=0 \
        op start interval=0s timeout=40s \
        op stop interval=0s timeout=120s
primitive p_fs_nfsshare_exports_HA Filesystem \
        params \
            device="/dev/drbd1003" \
            directory="/nfsshare/exports/HA" \
            fstype=ext4 \
            run_fsck=no \
        op monitor interval=15s timeout=40s \
        op_params OCF_CHECK_LEVEL=0 \
        op start interval=0s timeout=60s \
        op stop interval=0s timeout=60s
primitive p_nfsserver nfsserver
primitive p_pb_block portblock \
        params \
            action=block \
            ip=172.16.16.102 \
            portno=2049 \
            protocol=tcp
primitive p_pb_unblock portblock \
        params \
            action=unblock \
            ip=172.16.16.102 \
            portno=2049 \
            tickle_dir="/srv/drbd-nfs/nfstest/.tickle" \
            reset_local_on_unblock_stop=1 protocol=tcp \
        op monitor interval=10s timeout=20s
ms ms_drbd_ha_nfs p_drbd_ha_nfs \
        meta master-max=1 master-node-max=1 \
        clone-node-max=1 clone-max=3 notify=true
clone c_drbd_attr p_drbd_attr
colocation co_ha_nfs inf: \
        p_pb_block \
        p_virtip \
        ms_drbd_ha_nfs:Master \
        p_fs_nfsshare_exports_HA \
        p_expfs_nfsshare_exports_HA \
        p_nfsserver p_pb_unblock
property cib-bootstrap-options: \
        have-watchdog=false \
        cluster-infrastructure=corosync \
        cluster-name=nfscluster \
        stonith-enabled=false

After you have finished editing the configuration, save the file and exit the editor by entering :x (if your default editor is Vi or Vim). Next, commit the changes by entering the configure commit command in the crm shell. Pacemaker will then try to start the NFS server service on one of the nodes. Enter quit to leave the crm shell. Then run the following command to clean up the state of the cluster resources:

pcs resource cleanup

Next, enter pcs status to verify that all cluster resources have started and are running as expected.
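
To recap, the sequence of commands for creating and verifying the Pacemaker resources looks like this (lines starting with # are comments, not commands):

crm
# inside the crm shell:
configure edit
# paste the configuration, save, and exit the editor, then:
configure commit
quit
# back in the system shell:
pcs resource cleanup
pcs status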

With that, you have configured an NFS high availability cluster, and the NFS share is ready to be used by clients on your network. You can verify the availability of your NFS share by entering the command showmount -e 172.16.16.102 from any host that is on the 172.16.16.0/24 network.
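
For example, from a client host on that network, you could list the exports and then mount the share. The /mnt/ha_nfs mount point used here is only an illustrative placeholder; any suitable mount point on the client will do.

showmount -e 172.16.16.102
mkdir -p /mnt/ha_nfs
mount -t nfs 172.16.16.102:/nfsshare/exports/HA /mnt/ha_nfs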

With DRBD and Pacemaker running in your cluster stack, if one cluster node fails, the other cluster node takes over seamlessly. This is because you have prepared redundant services and because DRBD ensures real-time, up-to-date data replication.
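
If you want to observe this failover behavior for yourself, one quick sanity check, assuming a recent version of the pcs utility, is to put the currently active node into standby mode, confirm that the resources move to the other diskful node, and then bring the node back online:

pcs node standby drbd1
pcs status
pcs node unstandby drbd1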

If you need more details or help from our experienced team, check out our HA for NFS solutions or contact the experts at LINBIT.

Yusuf Yıldız

After nearly 15 years of system and storage management experience, Yusuf started working as a solution architect at LINBIT. Yusuf's main focus is on customer success and contributing to product development and testing. As part of the solution architects team, he is one of its backbones and a key supporter of the sales team.
