Replication Speed and CPU Efficiency Improvements in DRBD 9.3.1

DRBD® 9.3.1 introduces a “streaming I/O” feature that allows DRBD to allocate compound pages from the kernel as I/O buffers. When applications make large or streaming I/O requests, using compound pages can improve performance because DRBD uses CPU resources more efficiently.

Background on DRBD memory allocation

Historically, LINBIT® developers have optimized DRBD for replicating data for high availability use cases, with a focus on workloads that have small, frequent I/O patterns. While DRBD is not limited to replicating data writes with these I/O patterns, making applications such as databases and messaging queues highly available is a common use case for DRBD, because its performance excels in this area. There are, of course, use cases that make large I/O requests, such as video streaming and transcoding, large file copies, and others.

Linux supports I/O requests from 512 bytes to 1 MiB in 512-byte increments. Before version 9.3.1, DRBD handled its memory allocation by requesting one 4 KiB page at a time, which could mean up to 256 allocations for a single 1 MiB write.
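To make that allocation count concrete, here is the arithmetic as a small Python sketch (the constants are taken from the figures above):

```python
PAGE_SIZE = 4096           # size of an order-0 page on most Linux systems
MAX_REQUEST = 1024 * 1024  # 1 MiB, the largest I/O request size

# Pre-9.3.1 strategy: one kernel allocation call per 4 KiB page
allocations = MAX_REQUEST // PAGE_SIZE
print(allocations)  # 256 separate allocations to buffer one 1 MiB write
```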

The state of storage and networks today is not what it was more than 25 years ago when LINBIT CEO Philipp Reisner created DRBD as a master's thesis project in 1999 while attending university. Networking speeds are reaching 400 Gbps, and 800 Gbps is being standardized in IEEE 802.3df. Storage device technology, for example, SSD and NVMe, is getting faster with each generation. Technical advances in these hardware categories highlight the importance of DRBD becoming more CPU-efficient and not becoming a bottleneck for highly available applications.

Background on compound pages in the Linux kernel

For technical background about compound pages, you can read an interesting article about the pagemap process. In brief, a compound page is a group of 2^n physically contiguous order-0 pages that the kernel treats as a single larger page, similar to the large pages supported by hardware. Grouping smaller contiguous pages into one larger unit makes page table and allocation operations more CPU-efficient.
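The “order” of such an allocation is the n in 2^n. As a quick illustration of how a buffer size maps to the allocation order needed to cover it (a sketch based on the description above, not kernel code):

```python
PAGE_SIZE = 4096  # order-0 page size

def order_for(nbytes):
    """Smallest order n such that a 2**n-page compound page covers nbytes."""
    pages = -(-nbytes // PAGE_SIZE)  # ceiling division
    return (pages - 1).bit_length()  # smallest n with 2**n >= pages

print(order_for(4096))         # 0: a single order-0 page suffices
print(order_for(1024 * 1024))  # 8: 2**8 = 256 contiguous pages, i.e. 1 MiB
```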

With the streaming I/O feature introduced in DRBD 9.3.1, DRBD uses a new memory allocation strategy: it first attempts a single higher-order allocation (that is, it tries to use a compound page), and falls back to allocating order-0 pages, the traditional DRBD memory allocation strategy, on failure.
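The strategy can be simulated in userspace. The following is a Python sketch, not DRBD source code; `try_alloc_order()` is a hypothetical stand-in for a kernel higher-order page allocation, which can fail when physical memory is fragmented:

```python
PAGE_SIZE = 4096

def try_alloc_order(order, fragmented=False):
    """Hypothetical allocator: one call returns 2**order contiguous pages,
    or None when contiguous memory of that order is unavailable."""
    if fragmented and order > 0:
        return None
    return bytearray(PAGE_SIZE * (2 ** order))

def alloc_receive_buffer(nbytes, fragmented=False):
    pages_needed = -(-nbytes // PAGE_SIZE)
    order = (pages_needed - 1).bit_length()
    # First attempt: a single higher-order (compound page) allocation.
    buf = try_alloc_order(order, fragmented)
    if buf is not None:
        return [buf]  # one allocation call covers the whole request
    # Fallback: the traditional strategy, one order-0 page at a time.
    return [try_alloc_order(0) for _ in range(pages_needed)]

fast = alloc_receive_buffer(1024 * 1024)
slow = alloc_receive_buffer(1024 * 1024, fragmented=True)
print(len(fast), len(slow))  # 1 allocation vs. 256 allocations
```

Either path yields a buffer of the same total size; the difference is only in how many allocation calls the kernel must service.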

This new allocation strategy reduces DRBD CPU use, particularly on secondary nodes, which must make fresh memory allocations for every incoming write replicated from the primary node. The primary node does not need to allocate a receive buffer for its own writes; it only reads from a buffer already created by the application doing the data writes.

📝 NOTE: On pre-5.1 kernels, DRBD always allocates order-0 pages, because those kernels lack the bio_for_each_bvec helper needed to safely handle multi-page allocations; without it, there would be a buffer overflow risk.

Early performance benchmark test results

DRBD 9.3.1 is already used on LINBIT internal production systems. To compare the impact of the streaming I/O DRBD feature, LINBIT Solutions Architect Matt Kereczman ran the following before-and-after benchmark test on a system with an Intel® Xeon® Silver 4112 CPU @ 2.60GHz. The backing storage device Kereczman used to run the fio tests was a Samsung M.2 NVMe PCIe SM963, model number MZQKW480HMHQ-00003.

Fio command used as a benchmark test on both systems:

fio --name=test0 \
  --readwrite=write \
  --bs=1m \
  --direct=1 \
  --numjobs=1 \
  --filename=/dev/drbd10

Before running both tests, output from a drbdadm status command run on node0 (primary role) showed node1 in a secondary role:

r0 role:Primary
  disk:UpToDate open:no
  node1 role:Secondary
    peer-disk:UpToDate

Summary of parameters:

Parameter                 DRBD 9.3.0-rc.1                  DRBD 9.3.1
DRBD Version              9.3.0-rc.1                       9.3.1
Transport                 TCP                              TCP
Read/Write Mode           Write (sequential, 1 MiB block)  Write (sequential, 1 MiB block)
I/O Engine                psync                            psync
I/O Depth                 1                                1
Total Data Transferred    40.0 GiB                         40.0 GiB

Summary of fio benchmark test results:

Metric                 DRBD 9.3.0-rc.1   DRBD 9.3.1   Delta
Throughput (MiB/s)     816               882          +8.09%
Throughput (MB/s)      855               925          +8.19%
IOPS (avg)             816               882          +8.09%
Run time (ms)          50,210            46,434       −7.52%
clat avg (usec)        1,179.33          1,089.33     −7.63%
clat min (usec)        803               690          −14.07%
clat max (usec)        21,899            9,692        −55.74%
clat stdev (usec)      412.20            471.61       +14.41%
lat avg (usec)         1,223.73          1,131.71     −7.52%
clat p50 (usec)        1,123             979          −12.82%
clat p90 (usec)        1,319             1,270        −3.71%
clat p99 (usec)        3,032             3,392        +11.87%
clat p99.9 (usec)      5,473             5,604        +2.39%
clat p99.99 (usec)     12,649            7,701        −39.12%
BW avg (KiB/s)         836,075           903,991      +8.12%
BW stdev (KiB/s)       56,337            9,969        −82.30%
IOPS stdev             55.02             9.74         −82.30%
CPU usr (%)            3.77              3.71         −0.06 pp
CPU sys (%)            7.30              7.55         +0.25 pp
drbd10 util (%)        92.01             91.90        −0.11 pp
nvme0n1 util (%)       42.86             41.41        −1.45 pp

These benchmark test results show the positive impact that the DRBD streaming I/O feature had, including:

  • About an 8% higher average throughput and correspondingly shorter test run time
  • Significantly lower and more consistent latency at typical percentiles (p50 improved by about 13%)
  • Significantly more stable throughput: the bandwidth standard deviation dropped by 82%, indicating that DRBD 9.3.1 with the streaming I/O feature sustained its speed much more evenly throughout the test
  • Much better worst-case latency, where the absolute maximum completion latency was reduced by more than half, from 21.9 ms to 9.7 ms

The one minor caveat is a slight uptick in the p99 latency tail (3,392 µs in DRBD 9.3.1 compared with 3,032 µs in 9.3.0-rc.1).

📝 NOTE: Performance benchmark results will vary depending on the hardware in systems. The benefits of the DRBD streaming I/O feature might be much higher or lower depending on the ratio of CPU speed compared to network and backing storage device performance.

Independent benchmarking results

An engineer at SIOS, a LINBIT partner in Japan, has also shared some exciting early results with the LINBIT team. After running benchmark tests comparing DRBD 9.2.7 and 9.3.1 using fio against a RAM disk, the engineer confirmed a throughput increase from about 1.4 GB/s to 2.0 GB/s and “a drastic reduction in latency.”

Conclusion

The DRBD streaming I/O optimization improves how DRBD allocates memory for receiving a write request. Previously, DRBD allocated 4 KiB (4,096-byte) pages one at a time until it had enough to buffer the write request; for a 1 MiB write request, that meant 256 separate allocations. With the new code in DRBD 9.3.1, DRBD tries to allocate the full 1 MiB in a single kernel call. That takes less time and consumes fewer CPU cycles. The effect becomes more significant when the CPU is slow, or when the network and backing block device throughput is very high.

As mentioned earlier, the LINBIT team now uses DRBD 9.3.1 exclusively on its internal production systems. If you are running an earlier DRBD version, the LINBIT team invites you to upgrade to realize the potential performance benefits in your deployments. Let us know your benchmarking results if you do any before-and-after testing, or share them with the community of LINBIT software users in the LINBIT Community Forum.

Michael Troutman

Michael Troutman has an extensive background working in systems administration, networking, and technical support, and has worked with Linux since the early 2000s. Michael's interest in writing goes back to a childhood filled with avid reading. Somewhere he still has the rejection letter from a publisher for a choose-your-own-adventure style novella, set in the world of a then popular science fiction role-playing game, cowritten with his grandmother (a romance novelist and travel writer) at the tender age of 10. In the spirit of the open source community in which LINBIT thrives, Michael works as a Documentation Specialist to help LINBIT document its software and its many uses so that it may be widely understood and used.

