Pacemaker was never designed to operate across a WAN or over any high-latency network. However, there has always been a need to orchestrate active/passive failovers between data centers and across long distances. To address this, the Booth Service add-on for Pacemaker was conceived in late 2011. LINBIT® has been involved in the development of Booth since 2013, and has offered it as a supported solution since 2015.
Booth addresses this shortcoming of Pacemaker by introducing the concept of “tickets”. We constrain particular resources to a ticket, and only the site that holds the ticket may start those resources. This is similar to the token ring networks of the past. To ensure that the cluster never splits and that two sites never hold the ticket at the same time, Booth uses arbitration nodes to achieve quorum and sets an expiration period on each ticket. If a site loses communication with the rest of the Booth cluster, its ticket will not renew, and the site will stop the ticket's resources within the expected time frame.
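The ticket rules described above can be sketched as a small toy model. This is not Booth's implementation or API: the `Ticket` class, the function names, and the 60-second expiry are all hypothetical, and the model keeps a single shared view of the ticket in one process purely to illustrate how a majority requirement and an expiring lease combine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ticket:
    """Toy model of a ticket with a lease (hypothetical, not Booth's data structure)."""
    name: str
    expire_seconds: float            # assumed lease length
    holder: Optional[str] = None     # site currently granted the ticket
    granted_at: float = 0.0          # time of the last grant or renewal

    def is_valid(self, now: float) -> bool:
        # The lease is valid only while it has a holder and has not expired.
        return self.holder is not None and now - self.granted_at < self.expire_seconds


def has_quorum(reachable: int, total: int) -> bool:
    # A strict majority of all members (sites plus arbitrators) is required.
    return reachable > total // 2


def try_acquire_or_renew(ticket: Ticket, site: str, reachable: int,
                         total: int, now: float) -> bool:
    """Grant or renew the ticket for `site` if it can reach a majority."""
    if not has_quorum(reachable, total):
        return False  # isolated site: no renewal, so its lease runs out
    if ticket.is_valid(now) and ticket.holder != site:
        return False  # another site still holds an unexpired lease
    ticket.holder = site
    ticket.granted_at = now
    return True


def may_run_resources(ticket: Ticket, site: str, now: float) -> bool:
    # Resources tied to the ticket may run only at the site with a valid lease.
    return ticket.holder == site and ticket.is_valid(now)


# Two sites plus one arbitrator: three members in total.
ticket = Ticket(name="ticket-A", expire_seconds=60)
try_acquire_or_renew(ticket, "site-a", reachable=3, total=3, now=0)   # granted
try_acquire_or_renew(ticket, "site-a", reachable=1, total=3, now=30)  # refused: no quorum
try_acquire_or_renew(ticket, "site-b", reachable=2, total=3, now=45)  # refused: lease still valid
try_acquire_or_renew(ticket, "site-b", reachable=2, total=3, now=61)  # granted after expiry
```

In this sketch, the isolated site cannot renew and must stop its resources once the lease expires, and the other site can take the ticket only after that point. This ordering is what prevents both sites from running the service at once.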
While Pacemaker with Booth addresses high availability across a WAN, one problem that has always proven difficult is redirecting client traffic to the new site. In most of our demonstrations of Booth we have simply used round-robin DNS (such as in my demonstration here: Booth Geo Cluster Demo). While round-robin DNS is simple and easy to configure, it is quite inefficient: roughly every other request goes to the site that is not running the service and is discarded. Plenty of other specialty options exist, such as a load balancer (HAProxy, a hardware load balancing appliance, and so on) or a DNS update solution (Route 53, DynDNS, and so on).
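The cost of round-robin DNS in an active/passive setup can be illustrated with a short calculation. The sketch below assumes an idealized resolver that alternates strictly between two site addresses, with only one site running the service. The host names and the `failed_fraction` helper are hypothetical, and real resolvers and client-side caching behave less regularly.

```python
from itertools import cycle


def failed_fraction(site_addresses, active_site, requests):
    """Fraction of requests that land on a site not running the service."""
    rotation = cycle(site_addresses)  # idealized strict round-robin
    failures = sum(1 for _ in range(requests) if next(rotation) != active_site)
    return failures / requests


sites = ["site-a.example", "site-b.example"]  # hypothetical addresses
print(failed_fraction(sites, "site-a.example", 1000))  # prints 0.5
```

Under these assumptions, half of all requests reach the passive site, which is why a load balancer or a DNS update approach is usually preferred.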
To demonstrate this solution in detail, we have developed a tech guide that outlines, step by step, how to configure it using RHEL 8, Pacemaker, Booth, and DRBD® Proxy for data replication, to provide a highly available, geo-clustered MariaDB service. This document can be found in the documentation section of our website.