Highly Available NFS Exports with DRBD & Pacemaker

This blog post explains how to configure a highly available (HA) active/passive NFS server on a two-node Linux cluster using DRBD® and Pacemaker. NFS is preferred for many use cases because it:

  • Enables multiple computers to access the same files, so everyone on the network can use the same data.
  • Reduces storage costs by having computers share applications instead of needing local disk space for each user application.

This use case requires that your system include the following components:

  • Two diskful nodes for data replication, and one diskless node for quorum purposes.
  • A separate network link for the replication (This is best practice, not mandatory.)
  • A virtual IP address, required for the NFS server (An IP address in the 172.16.16.0/24 subnet is used here.)
  • Pacemaker and resource agents are installed on all nodes and enabled to start as a service.
  • The latest version of DRBD is installed on all nodes and loaded into the kernel (Available from GitHub, or through LINBIT® customer repositories. See the DRBD 9.0 User’s Guide for more details.)
  • An NFS server is installed on all nodes (not enabled to start, because Pacemaker will start the service when necessary)
  • All cluster nodes can resolve each other’s hostnames (Check /etc/hosts or your local DNS server; see the example entries after this list.)
  • SELinux and any firewalls in use are configured to allow appropriate traffic on all nodes (Please check the DRBD 9.0 User’s Guide for more information.)
  • The `crmsh` and `pcs` CLI utilities are installed on all nodes (for editing and managing the Pacemaker configuration)
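
For reference, here is a minimal /etc/hosts sketch, assuming the hostnames and IP addresses used in the DRBD resource configuration later in this post:

172.16.16.111 drbd1
172.16.16.112 drbd2
172.16.16.113 drbd3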

After completing these initial preparation steps, you can create your cluster.

Create a Logical Volume and Directory for the NFS Share

Before creating DRBD resources in the cluster, you need to create an LVM physical volume on top of the physical device (drive). Complete the steps in this section on both diskful nodes. Enter these commands as the ‘root’ user or else preface them with `sudo`.

To do that, enter:

pvcreate /dev/sdx

where “x” in “sdx” corresponds to the letter identifying your physical device.

Then create the volume group, named ‘ha_vg,’ by entering:

vgcreate ha_vg /dev/sdx

Next, create the logical volume that DRBD will consume. You can replace “300G” with a size appropriate for your use case.

lvcreate -L 300G -n ha_HA_lv ha_vg

After creating the logical volume, create the directory that will serve as the NFS share and recursively give the directory appropriate access permissions.

mkdir -p /nfsshare/exports/HA
chmod 777 -R /nfsshare
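
Optionally, you can verify that the volume group and logical volume were created as expected before moving on:

vgs ha_vg
lvs ha_vg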

Create a DRBD Resource File

DRBD resource configuration files are located in the `/etc/drbd.d/` directory. Resource files need to exist on all cluster nodes. You can create a resource file on one node and then use the `rsync` command to distribute it to the other nodes, as shown after the configuration below. Each DRBD resource defined in a resource configuration file needs its own TCP port; because this configuration defines only one resource, it uses a single TCP port (7003). Use the text editor of your choice to create the DRBD resource file as shown below, changing the host names and IP addresses to reflect your network configuration. Please note that the third cluster node only serves a quorum function in the cluster. It is not involved in DRBD replication, and in the configuration it is identified as “diskless”.

vi /etc/drbd.d/ha_HA_lv.res

resource ha_HA_lv {
  device "/dev/drbd1003";
  disk "/dev/ha_vg/ha_HA_lv";
  meta-disk internal;
  options {
    on-no-quorum suspend-io;
    quorum majority;
  }
  net {
    protocol C;
    timeout 10;
    ko-count 1;
    ping-int 1;
  }
  connection-mesh {
    hosts "drbd1" "drbd2" "drbd3";
  }
  on "drbd1" {
    address 172.16.16.111:7003;
    node-id 0;
  }
  on "drbd2" {
    address 172.16.16.112:7003;
    node-id 1;
  }
  on "drbd3" {
    disk none;
    address 172.16.16.113:7003;
    node-id 2;
  }
}
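
With the resource file saved on the first node, you can distribute it to the other nodes. A minimal sketch, assuming the hostnames drbd2 and drbd3 from the configuration above and SSH access between the nodes:

rsync -av /etc/drbd.d/ha_HA_lv.res drbd2:/etc/drbd.d/
rsync -av /etc/drbd.d/ha_HA_lv.res drbd3:/etc/drbd.d/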

Initialize DRBD Resources

After creating the DRBD resource configuration file, you need to initialize the DRBD resources. To do that, enter the following commands as the ‘root’ user or use `sudo`. The first two commands below must be entered and run on both diskful cluster nodes. On the diskless quorum node, only the `drbdadm up` command is needed, so that it can join the resource and provide quorum.

drbdadm create-md ha_HA_lv 
drbdadm up ha_HA_lv

As this is a new filesystem with no data content, you can skip the initial synchronization. However, use caution with this command, because skipping the initial synchronization on a backing device that already holds data can effectively destroy that data.

drbdadm new-current-uuid --clear-bitmap ha_HA_lv/0

Enter and run the following commands on only one of the two diskful cluster nodes. The first command forces the node to become primary; the second creates the filesystem. DRBD will then replicate the filesystem to the other diskful node.

drbdadm primary --force ha_HA_lv
mkfs.ext4 /dev/drbd1003

Now check the output of the `drbdadm status` and `lsblk` commands.
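
For example, you can run the following on either diskful node, using the resource name defined earlier:

drbdadm status ha_HA_lv
lsblk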

`drbdadm status` should show that DRBD is in sync and “UpToDate”. If everything looks fine, use the following command to demote the DRBD resource on this node from primary back to secondary.

drbdadm secondary ha_HA_lv

Create NFS Exports and Pacemaker Resources

There are two ways to create Pacemaker resources. The first way is by directly editing the Pacemaker configuration file using the interactive CRM Shell. The second way is by using the `pcs` command-line tool. In this example, we’ll use the CRM Shell to edit the Pacemaker configuration.

To edit the cluster configuration in the CRM Shell, enter `crm conf edit`, then press “i” to enter editing mode.

In this mode, delete everything in the configuration file and paste in the configuration below. Please do not forget to change hostnames, subnets, and IP addresses to match your network configuration.

node 1: drbd1
node 2: drbd2
node 3: drbd3
primitive p_drbd_attr ocf:linbit:drbd-attr
primitive p_HA_lv_nfs ocf:linbit:drbd \
  params drbd_resource=ha_HA_lv \
  op monitor interval=11s timeout=20s role=Master \
  op monitor interval=13s timeout=20s role=Slave
primitive p_nfs_HA_fs Filesystem \
  params device="/dev/drbd1003" directory="/nfsshare/exports/HA" \
    fstype=ext4 run_fsck=no \
  op monitor interval=15 timeout=40 \
  op start timeout=40 interval=0 \
  op stop timeout=40 interval=0
primitive p_nfs_HA_exp exportfs \
  params fsid=10003 unlock_on_stop=1 options=rw directory="/nfsshare/exports/HA" \
    clientspec="172.16.16.0/24" \
  op monitor interval=15 timeout=40 \
  op start timeout=40 interval=0 \
  op stop timeout=40 interval=0
primitive p_nfs_nfs_ip IPaddr2 \
  params ip=172.16.16.102 cidr_netmask=32 \
  op monitor interval=15 timeout=40 \
  op start timeout=40 interval=0 \
  op stop timeout=40 interval=0
primitive p_nfs_server nfsserver
primitive pb_b portblock \
  params action=block ip=172.16.16.102 portno=2049 protocol=tcp
primitive pb_u portblock \
  params action=unblock ip=172.16.16.102 portno=2049 \
    tickle_dir="/srv/drbd-nfs/nfstest/.tickle" \
    reset_local_on_unblock_stop=1 protocol=tcp \
  op monitor interval=10s timeout=20s
ms ha_HA_lv_clone p_HA_lv_nfs \
  meta clone-max=3 notify=true master-max=1
clone c_drbd_attr p_drbd_attr
colocation co_nfs_nfstest inf: pb_b p_nfs_nfs_ip ha_HA_lv_clone:Master \
  p_nfs_server p_nfs_HA_fs p_nfs_HA_exp pb_u
location lo_nfs_nfstest { p_nfs_HA_fs } resource-discovery=never \
  rule -inf: #uname ne drbd1 and #uname ne drbd2 and #uname ne drbd3
order o_nfs_nfstest pb_b p_nfs_nfs_ip ha_HA_lv_clone:promote p_nfs_server \
  p_nfs_HA_fs p_nfs_HA_exp pb_u
property cib-bootstrap-options: \
  have-watchdog=false \
  cluster-infrastructure=corosync \
  cluster-name=nfscluster \
  stonith-enabled=false

After you have finished editing the configuration, save the file. Next, commit the changes by entering the `commit` command in the CRM Shell. Pacemaker will then try to start the NFS server service on one of the nodes. Enter `exit` to leave the CRM Shell, then run the following command to clean up (in effect, refresh) the cluster resources.

pcs resource cleanup

Next, enter `pcs status` to verify that everything is working as expected.
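
As a further check, you could mount the export from a client machine on your network. A minimal sketch, assuming the virtual IP address and export path configured above, and that an NFS client is installed on that machine:

mount -t nfs 172.16.16.102:/nfsshare/exports/HA /mnt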

With that, you have configured an NFS HA cluster, and the NFS share is ready to be used by clients on your network. Having DRBD and Pacemaker running on your cluster stack ensures that should one cluster node fail, the other cluster node will take over with an up-to-date copy of your data.
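
To see failover in action, one option is to put the currently active node into standby mode and watch Pacemaker move the resources to the other node. A sketch, assuming drbd1 is the active node (older pcs versions use `pcs cluster standby` rather than `pcs node standby`):

pcs node standby drbd1
pcs status
pcs node unstandby drbd1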

If you need more details and help from our experienced team, please contact the experts at LINBIT Support.

Yusuf Yıldız

After nearly 15 years of system and storage management, Yusuf started working as a solution architect at LINBIT. Yusuf’s main focus is on customer success and contributing to product development and testing. As part of the solution architect team, he is a backbone and supporter of the sales team.
