LINSTOR® has long supported using auxiliary node properties to fine-tune where in the cluster LINSTOR should place DRBD® volume replicas when you don’t explicitly name nodes. The auxiliary properties, also known as “AuxProps”, are key-value pairs placed on nodes in a LINSTOR cluster. When you combine auxiliary node properties with LINSTOR options such as --replicas-on-same and --replicas-on-different, you can influence LINSTOR’s automatic placement of volume replicas for your DRBD devices.
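To make that concrete, here is a minimal sketch of the existing behavior, using hypothetical node names (node-a, node-b) and a hypothetical rack property; it asks LINSTOR to place the two replicas of a resource group on nodes whose rack values differ:
# Tag two hypothetical nodes with different values for a "rack" aux prop:
linstor node set-property --aux node-a rack rack1
linstor node set-property --aux node-b rack rack2
# Hypothetical resource group whose two replicas must land on different racks:
linstor resource-group create rg-racks --place-count 2 --replicas-on-different rack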
Depending on the topology of a LINSTOR cluster and an organization’s high availability (HA) and disaster recovery (DR) requirements, these settings can play a crucial role in whether the cluster’s replication complies with the policies that govern those practices.
A colleague, Ryan Ronnander, wrote a blog post explaining how you can use LINSTOR’s auxiliary node properties, together with the --replicas-on-different option, to configure automatic placement of replicas spanning multiple Availability Zones. Ryan’s blog is a must-read if you’re unfamiliar with these LINSTOR options, and it will help you understand the new option we’re about to delve into.
This blog post introduces a new LINSTOR option, --x-replicas-on-different, that expands LINSTOR’s automatic placement options to give you more flexibility and control. This option lets you effectively tell LINSTOR, “I want $X replicas of a resource created in each different $Y”, where $Y is a key-value pair you’ve assigned to your LINSTOR satellite nodes. This is useful when you’re trying to support both HA and DR strategies for a storage resource, for example, by creating two replicas in each data center (X = 2 and Y = “data center”).
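In command form, the general shape is the template below; the placeholder names in angle brackets are mine, not literal CLI syntax, and concrete runs follow in the next section:
# Template: place <n-per-value> replicas in each different value of <aux-prop-key>:
linstor rg spawn --x-replicas-on-different <aux-prop-key> <n-per-value> --place-count <total-replicas> <resource-group> <new-resource> <size>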
Learning Through Examples
I believe the --x-replicas-on-different option is easier to both describe and understand through examples. This section demonstrates a practical scenario, showing the LINSTOR commands used and the resulting distribution of DRBD resource replicas.
Using a five-satellite LINSTOR cluster, with satellites named linstor-sat-0 through linstor-sat-4 spread between two data centers in two different cities, I applied an auxiliary node property to each node, using site as the key and the name of the city where the node is located as the value:
linstor node set-property --aux linstor-sat-0 site portland
linstor node set-property --aux linstor-sat-1 site portland
linstor node set-property --aux linstor-sat-2 site vienna
linstor node set-property --aux linstor-sat-3 site vienna
linstor node set-property --aux linstor-sat-4 site vienna
The commands above result in the node list below, with the Aux/site key set to the name of the city where each node is located:
linstor node list --show-aux-props
╭─────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ AuxProps ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-ctrl-0 ┊ CONTROLLER ┊ 192.168.222.250:3370 (PLAIN) ┊ ┊ Online ┊
┊ linstor-sat-0 ┊ SATELLITE ┊ 192.168.222.10:3366 (PLAIN) ┊ Aux/site=portland ┊ Online ┊
┊ linstor-sat-1 ┊ SATELLITE ┊ 192.168.222.11:3366 (PLAIN) ┊ Aux/site=portland ┊ Online ┊
┊ linstor-sat-2 ┊ SATELLITE ┊ 192.168.222.12:3366 (PLAIN) ┊ Aux/site=vienna ┊ Online ┊
┊ linstor-sat-3 ┊ SATELLITE ┊ 192.168.222.13:3366 (PLAIN) ┊ Aux/site=vienna ┊ Online ┊
┊ linstor-sat-4 ┊ SATELLITE ┊ 192.168.222.14:3366 (PLAIN) ┊ Aux/site=vienna ┊ Online ┊
╰─────────────────────────────────────────────────────────────────────────────────────────╯
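The examples that follow assume an existing resource group named rg0 backed by a storage pool; judging from the error output later in this post, that pool is named drbdpool. If you are recreating this scenario, a minimal setup sketch, assuming the drbdpool storage pool already exists on each satellite, might look like this:
# Sketch: create the resource group used in the examples, backed by the
# assumed "drbdpool" storage pool, plus a volume group for it:
linstor resource-group create rg0 --storage-pool drbdpool
linstor volume-group create rg0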
With these settings in place, I can now use the --x-replicas-on-different option while creating resources from my rg0 resource group to influence where LINSTOR places the resulting resource’s DRBD replicas. In the following command examples I will use LINSTOR’s rg spawn subcommand, which is shorthand for resource-group spawn-resources, to both save space and help direct attention to the new option. In each command, the --x-replicas-on-different option takes two positional arguments: first, the name of the “AuxProp” (in this case, site), and second, the number of replicas to place in each different site. The remaining arguments passed to the rg spawn subcommand are --place-count X, where X is the total number of replicas for the new resource, followed by the resource group’s name (rg0), the new resource’s name (resX), and the size of the resource (200M is used in all examples).
I will start with a relatively straightforward example where four “diskful” replicas are created in the cluster, while specifying that two replicas should be placed in each differently named data center:
linstor rg spawn --x-replicas-on-different site 2 --place-count 4 rg0 res0 200M
This results in four “diskful” replicas: two on nodes in Portland and two on nodes in Vienna. Also, notice that LINSTOR automatically assigned a diskless TieBreaker resource to achieve quorum:
linstor resource list --resource res0
╭─────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞═════════════════════════════════════════════════════════════════════════════════════════╡
┊ res0 ┊ linstor-sat-0 ┊ 7000 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 18:36:22 ┊
┊ res0 ┊ linstor-sat-1 ┊ 7000 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 18:36:21 ┊
┊ res0 ┊ linstor-sat-2 ┊ 7000 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 18:36:21 ┊
┊ res0 ┊ linstor-sat-3 ┊ 7000 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2024-05-30 18:36:15 ┊
┊ res0 ┊ linstor-sat-4 ┊ 7000 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 18:36:21 ┊
╰─────────────────────────────────────────────────────────────────────────────────────────╯
What could have happened in this example, but did not because of the --x-replicas-on-different site 2 option, is that three diskful replicas could have been placed in Vienna with only one replica in Portland. The option therefore ensures not only data center-level fault tolerance, but also node-level fault tolerance within each data center for resources created this way.
Another example, albeit one that is redundant to the --replicas-on-same LINSTOR option, is creating two diskful replicas while specifying that two replicas should be placed in each differently named data center:
linstor rg spawn --x-replicas-on-different site 2 --place-count 2 rg0 res1 200M
This results in two diskful replicas being placed onto nodes within the same data center, with the diskless quorum TieBreaker resource landing on any other node in the cluster:
linstor resource list --resource res1
╭─────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞═════════════════════════════════════════════════════════════════════════════════════════╡
┊ res1 ┊ linstor-sat-0 ┊ 7001 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 18:45:39 ┊
┊ res1 ┊ linstor-sat-1 ┊ 7001 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 18:45:39 ┊
┊ res1 ┊ linstor-sat-2 ┊ 7001 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2024-05-30 18:45:38 ┊
╰─────────────────────────────────────────────────────────────────────────────────────────╯
Again, this is redundant to the --replicas-on-same option, but the example demonstrates what happens when the --x-replicas-on-different value is equal to the total number of diskful replicas specified by --place-count.
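For comparison, the same “both replicas in one site” placement could be configured with the older option on a resource group; the group and resource names here are hypothetical:
# Hypothetical resource group that pins both replicas to nodes sharing
# one "site" value, then spawns a resource from it:
linstor resource-group create rg-same-site --place-count 2 --replicas-on-same site
linstor volume-group create rg-same-site
linstor rg spawn rg-same-site res-same 200M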
An example that might not be as straightforward is what happens when the specified --place-count is not divisible by the value set for --x-replicas-on-different. For example:
linstor rg spawn --x-replicas-on-different site 2 --place-count 3 rg0 res2 200M
Rather than refusing to create the resource, LINSTOR will respect the --x-replicas-on-different option up to the point where a remainder replica assignment would occur, and then fill out the “next different site” with the remaining replicas. After running the example command above, the following replica assignments are made:
linstor resource list --resource res2
╭───────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞═══════════════════════════════════════════════════════════════════════════════════════╡
┊ res2 ┊ linstor-sat-0 ┊ 7002 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 19:40:26 ┊
┊ res2 ┊ linstor-sat-1 ┊ 7002 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 19:40:26 ┊
┊ res2 ┊ linstor-sat-3 ┊ 7002 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 19:40:26 ┊
╰───────────────────────────────────────────────────────────────────────────────────────╯
Another example with an outcome that might not be obvious is what happens when --place-count is less than the value of --x-replicas-on-different:
linstor rg spawn --x-replicas-on-different site 3 --place-count 2 rg0 res3 200M
linstor resource list --resource res3
╭─────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞═════════════════════════════════════════════════════════════════════════════════════════╡
┊ res3 ┊ linstor-sat-0 ┊ 7003 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 19:52:46 ┊
┊ res3 ┊ linstor-sat-1 ┊ 7003 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2024-05-30 19:52:46 ┊
┊ res3 ┊ linstor-sat-2 ┊ 7003 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2024-05-30 19:52:45 ┊
╰─────────────────────────────────────────────────────────────────────────────────────────╯
Perhaps the result above is not that surprising considering the previous example, because again we see that LINSTOR will fill out a site with replicas until it reaches the value set by --x-replicas-on-different.
Finally, what happens when LINSTOR is asked for something impossible? Spoiler alert, it doesn’t create a black hole consuming the entire solar system or summon Godzilla from the depths of the Pacific to wreak havoc on cities specified in your AuxProps. Sorry, Hollywood. Instead, it politely refuses.
For example, the site with the most nodes in the example cluster is Vienna, with three nodes. When specifying --x-replicas-on-different site 4 and --place-count 4, output similar to the following is displayed:
linstor rg spawn --x-replicas-on-different site 4 --place-count 4 rg0 res4 200M
ERROR:
Description:
Not enough available nodes
Details:
Not enough nodes fulfilling the following auto-place criteria:
* has a deployed storage pool named [drbdpool]
* the storage pools have to have at least '204800' free space
* the current access context has enough privileges to use the node and the storage pool
* the node is online
Auto-place configuration details:
Replica count: 4
Don't place with resource
res4
Storage pool name:
drbdpool
X replicas on nodes with different property:
Aux/site: 4
Resource group: rg0
The above error lists the possible reasons auto placement did not succeed.
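When an auto-place request fails like this, a sensible first check (a general troubleshooting step, not one the error mandates) is to confirm which satellites carry the named storage pool and how much free capacity each one has:
# List storage pools per node, including free and total capacity:
linstor storage-pool list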
Closing Thoughts
This blog post has hopefully shed some light on LINSTOR’s --x-replicas-on-different option, and maybe even pointed out some of LINSTOR’s lesser-known automatic placement options. The --x-replicas-on-different option essentially tells LINSTOR’s auto-placer to choose the same site, where “site” is the arbitrary key-value pair used in my examples, X times before considering others. You might consider combining the --x-replicas-on-different setting with other auto-placer settings such as --replicas-on-different, for example, to place two replicas in both Portland and Vienna while also ensuring that the replicas in each data center land on different racks.
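As a sketch of that combination, suppose each node also carried a rack aux prop (the property values and resource name below are hypothetical, and I have not verified this exact combination, so treat it as illustrative):
# Hypothetical: nodes additionally tagged with a "rack" aux prop, for example:
# linstor node set-property --aux linstor-sat-0 rack rack-a
# Sketch: two replicas per site, with replicas within a site on different racks:
linstor rg spawn --x-replicas-on-different site 2 --replicas-on-different rack --place-count 4 rg0 res5 200M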
The example in this blog, using sites, is the scenario our customer base had in mind for --x-replicas-on-different, but perhaps there are other uses we’ve yet to realize. If there is an example scenario I missed, or a use case you think could be interesting for these options, reach out and let us know!