Before MySQL version 5.5, released in December of 2010, MySQL only natively supported asynchronous replication, which had the potential to lose data under certain conditions. Due to this risk, DRBD® was often used to transparently provide synchronous replication at the block level, underneath the database. Because DRBD is situated in the Linux kernel as a block device driver, it offers low-latency I/O to the file system or application storing data on the DRBD device, which is crucial for transactional database workloads that make small but frequent updates at a low queue depth. Because DRBD creates a virtual block device on top of a physical block device, I/O isn’t bottlenecked by traversing additional file or object storage layers.
LINBIT®’s DRBD-based solutions for transparently adding high availability (HA) capabilities to MySQL databases remained fairly common even after MySQL began supporting its own database-specific replication. For standalone MySQL databases, this might have been because the failover and failback mechanisms used with DRBD (Heartbeat, Pacemaker, and Keepalived) were easier to use than the relatively new MySQL native failover and failback processes. Another reason could be that when replicating a database using DRBD, you can also replicate related file systems or logs using DRBD. Additionally, if you’re replicating a database and a file system which need to remain “in step” with one another, such as a database and its transaction logs, you can use a multi-volume DRBD resource to enforce consistency between the two devices.
Choosing a Cluster Resource Manager
When replicating the storage underneath your MySQL database, you need someone or something to “activate” a standby node when the original active node becomes unavailable. Typically in Linux HA clustering this is the job of the cluster resource manager (CRM). LINBIT supports both Pacemaker and DRBD Reactor as CRMs for customers with HA MySQL clusters.
Pacemaker is an open source CRM stack maintained by the ClusterLabs organization. It supports clusters of 1 to 32 nodes (without using additional software such as pacemaker-remoted to scale beyond 32 nodes). DRBD Reactor is an open source CRM developed and maintained by LINBIT which relies on DRBD’s quorum state to promote resources, and because of this supports n-node clusters where n is equal to the number of DRBD peers for any given resource. Both Pacemaker and DRBD Reactor can use systemd units or Open Cluster Framework (OCF) resource agents to start, stop, and monitor resources in a cluster.
Pacemaker supports complex ordering, location, and colocation rules for resources running in the cluster. You could use these rules to configure host preferences within the cluster. For example, if you had a 7-node cluster running a handful of KVM virtual machines (VMs), and a transactional database used by those VMs, you could configure Pacemaker to prefer running the VMs and the database on different nodes to avoid competition for CPU and memory. However, you could also configure the rules so that the VMs and database could run together on the same node if they absolutely needed to. Much of the time, complex rules aren’t needed, and administrators will configure resources using resource groups, which enforce linear ordering and colocation of resources within the group.
DRBD Reactor aims to be simpler to configure than Pacemaker. It only supports linear ordering and colocation of cluster resources, based on where the resources’ DRBD volume has been promoted to Primary. Because DRBD Reactor relies on DRBD’s quorum for promotion, a DRBD Reactor cluster has a “built-in” fencing mechanism, whereas fencing can be difficult to configure in Pacemaker clusters.
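To illustrate why quorum-based promotion acts like fencing, here is a minimal Python sketch of a strict-majority rule. It is a simplification for illustration only: the function name and parameters are hypothetical, and DRBD’s actual quorum handling takes more into account than a simple node count.

```python
def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """Return True if a partition holds a strict majority of the cluster.

    reachable_nodes counts the local node plus every peer it can still
    reach. This is a simplified majority rule (an assumption for
    illustration), not DRBD's real implementation.
    """
    if not 0 <= reachable_nodes <= total_nodes:
        raise ValueError("reachable_nodes must be between 0 and total_nodes")
    return reachable_nodes > total_nodes / 2


if __name__ == "__main__":
    # A three-node cluster splits into a two-node side and an isolated node.
    for reachable in (2, 1):
        side = f"{reachable} of 3 nodes reachable"
        verdict = "may promote" if has_quorum(reachable, 3) else "must not promote"
        print(f"{side}: {verdict}")
```

Under this rule, only one side of a network partition can ever hold a majority, so at most one node is allowed to promote the resource and start the database.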
Regardless of which CRM you choose, LINBIT has documentation resources that can help with the installation and configuration of the cluster stack.
- MariaDB HA Clustering Using LINSTOR, DRBD, and DRBD Reactor
- MySQL HA Clustering Using DRBD and Pacemaker on RHEL 8
Comparing MySQL Performance on Replicated Storage Solutions
Replication isn’t free in terms of performance. Additional latency from the replication process is expected when distributing blocks to various nodes on a network. LINBIT often claims that DRBD is well suited to providing storage for highly transactional workloads, such as databases, because of its relatively low impact on performance while replicating synchronously.
To compare results and validate LINBIT’s claim, you can use sysbench. Sysbench is an open source benchmarking tool designed to evaluate the performance of various system components, and is commonly used to test the performance of database systems. The results below were gathered from separate tests. Each test ran using the same underlying hardware and benchmarking parameters for the following environments:
- Single-node database for establishing the baseline
- Three-node DRBD backed database cluster
- Three-node Ceph RBD backed database cluster
Environment | Reads | Writes | Other | TPS¹ | Min Lat² | Max Lat³ | Avg Lat⁴ |
---|---|---|---|---|---|---|---|
Baseline | 1252790 | 357871 | 178943 | 1490.81 | 1.04ms | 57.49ms | 6.71ms |
DRBD | 1020208 | 291413 | 145714 | 1213.87 | 1.26ms | 57.28ms | 8.24ms |
Ceph RBD | 798504 | 228091 | 114051 | 950.10 | 2.08ms | 87.61ms | 10.52ms |
- ¹ TPS: the transactions per second measured during the test
- ² Min Lat: the minimum latency measured during the test
- ³ Max Lat: the maximum latency measured during the test
- ⁴ Avg Lat: the average latency of all transactions measured during the test
For both Ceph RBD and DRBD, I configured three nodes for each of their “storage pools” (three OSDs for Ceph, and three replicas for DRBD), to make sure that I was replicating writes to the same number of nodes in each test. The sysbench test that I ran was the oltp_read_write test using 10 threads.
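For readers who want to run a similar test, the following Python sketch builds sysbench command lines for the oltp_read_write test with 10 threads. Only the test name and thread count come from this article. The host, user, database name, table count, table size, and duration are placeholder assumptions, not the parameters used for the results above.

```python
import shlex


def sysbench_command(phase: str, threads: int = 10) -> list[str]:
    """Build the argument list for one phase of an oltp_read_write test."""
    if phase not in ("prepare", "run", "cleanup"):
        raise ValueError(f"unknown sysbench phase: {phase}")
    return [
        "sysbench",
        "oltp_read_write",
        "--db-driver=mysql",
        "--mysql-host=127.0.0.1",  # assumption: database reachable locally
        "--mysql-user=sbtest",     # assumption: placeholder user
        "--mysql-db=sbtest",       # assumption: placeholder database
        "--tables=10",             # assumption: not stated in the article
        "--table-size=100000",     # assumption: not stated in the article
        "--time=120",              # assumption: run duration in seconds
        f"--threads={threads}",    # the article's test used 10 threads
        phase,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed.
    for phase in ("prepare", "run", "cleanup"):
        print(shlex.join(sysbench_command(phase)))
```

Keeping the parameters in one place makes it easier to run identical tests against each storage back end, which is what makes the results comparable.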
The results show that the TPS measured while using DRBD was 81.4% of the baseline, while Ceph RBD measured 63.7% of the baseline. The maximum latencies stand out the most: DRBD’s maximum latency was almost equal to the baseline’s, while Ceph RBD’s was about 50% higher than both. However, the minimum and maximum are less useful than the average latency, where the results show that DRBD adds 22.8% to latency and Ceph RBD adds 56.8%.
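The percentages above can be reproduced from the table with a few lines of Python. This is a small sketch using only the TPS and average latency values from the results table. The helper names are my own.

```python
# TPS and average latency (ms) copied from the results table.
RESULTS = {
    "Baseline": {"tps": 1490.81, "avg_lat_ms": 6.71},
    "DRBD": {"tps": 1213.87, "avg_lat_ms": 8.24},
    "Ceph RBD": {"tps": 950.10, "avg_lat_ms": 10.52},
}


def percent_of_baseline(value: float, baseline: float) -> float:
    """Express value as a percentage of the baseline value."""
    return 100 * value / baseline


def percent_increase(value: float, baseline: float) -> float:
    """Express how much larger value is than the baseline, in percent."""
    return 100 * (value - baseline) / baseline


if __name__ == "__main__":
    base = RESULTS["Baseline"]
    for name in ("DRBD", "Ceph RBD"):
        row = RESULTS[name]
        tps = percent_of_baseline(row["tps"], base["tps"])
        lat = percent_increase(row["avg_lat_ms"], base["avg_lat_ms"])
        print(f"{name}: {tps:.1f}% of baseline TPS, +{lat:.1f}% average latency")
```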
Concluding Thoughts
Using DRBD to add HA capabilities underneath MySQL databases is certainly not new. Because DRBD replicates at the block level, you can replicate any application that writes to persistent storage, even applications that come with their own methods for replication. When you need to replicate a database, a file system, or any other persistent storage in your software stack, it can be easier on operations to use the same technology stack for all of that replication. DRBD can accommodate that replication without much additional write latency, as the results of this performance benchmark testing show.
If you’re looking for low-latency replicated storage for a transactional workload of your own, or you’re new to DRBD or HA clustering in general and have specific questions you’d like to ask, don’t hesitate to reach out to LINBIT directly. If you’d rather ask your questions or share your experiences asynchronously, you can join the LINBIT community forums and write to us there.