Striping LINSTOR Volumes Across Several Physical Devices

Introduction and considerations

In a single sentence, striping can be described as a technique used in storage systems to spread data across multiple physical disks by dividing it into small chunks, which are distributed round-robin across the drives.

Striping volumes across several physical devices in a LINSTOR® storage pool improves performance by distributing I/O operations across multiple disks. This reduces bottlenecks and increases throughput when compared to linearly allocated storage pools. Linear allocation is the default strategy for LVM, and therefore for LVM-backed LINSTOR storage pools.

Another benefit of striping data across multiple devices is avoiding “hot spots” on individual drives within a storage pool. In linearly allocated storage pools, one physical device is fully allocated before blocks from the next device are used, so the devices are used unevenly. This can eventually cause some drives to fail much sooner than others.
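As a toy illustration of this difference (no real LVM involved, and the chunk and device counts are made up for the example), the following Python sketch contrasts how linear and striped allocation distribute the same chunks across three devices:

```python
# Toy comparison: bytes landing on each of three devices when 100 chunks
# of 64 KiB are allocated linearly versus striped. Assumes each device
# can hold 48 chunks (a hypothetical capacity for illustration only).
CHUNK = 64 * 1024
DEVICES = 3
PER_DEVICE = 48  # chunks each device can hold

def linear(chunks):
    # Fill device 0 completely, then device 1, and so on.
    usage = [0] * DEVICES
    for i in range(chunks):
        usage[i // PER_DEVICE] += CHUNK
    return usage

def striped(chunks):
    # Round-robin each chunk across the devices.
    usage = [0] * DEVICES
    for i in range(chunks):
        usage[i % DEVICES] += CHUNK
    return usage

print(linear(100))   # first two devices full, third barely used
print(striped(100))  # nearly even usage across all three
```

Linear allocation leaves the last device almost untouched while the first wears out, whereas striping keeps the per-device usage within one chunk of equal.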

Sounds like striping data is all upsides, right? Not completely. When you stripe a single volume across multiple physical disks, you are also expanding the fault domain of services using that volume to include multiple disks. If any of those disks fail, the volume becomes inaccessible and so will your services, or they would if you weren’t also using DRBD® to replicate the volume to a peer. When DRBD detects an I/O error on the storage system below it, it transparently begins passing I/O operations over the network to a peer with a healthy backing device. This means you won’t have any immediate downtime, but you will still need to replace the faulty drive, and perform a full resync of the DRBD volume to recover.

Growing striped volumes once you’ve completely filled the physical devices you started with is also a consideration. Striping requires you to add physical devices in multiples of the stripe count. By contrast, adding a single physical device to extend a linearly allocated logical volume is far simpler and requires less physical space for additional devices inside your server chassis.

You will also need to consider your stripe sizes. Which applications will be reading and writing to your striped volumes, and the size of those applications’ I/O operations, should influence the chosen stripe size. An application that frequently performs large sequential I/O operations, such as media streaming or working with virtual machine disk images, would benefit from a larger stripe size (256K or 512K), while smaller random I/O operations, such as those a database or a messaging queue application might perform, would benefit from smaller stripe sizes (32K or 64K).
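The relationship between I/O size and stripe size comes down to simple chunk-to-device arithmetic. This hypothetical Python sketch (not LINSTOR or LVM code) shows which devices a single I/O operation would touch for a given stripe size:

```python
# Illustrative stripe mapping: logical offsets are divided into chunks of
# STRIPE_SIZE bytes, and chunk N lives on device (N % STRIPES).
STRIPE_SIZE = 64 * 1024   # 64 KiB stripe size
STRIPES = 3               # three physical devices

def devices_touched(offset, length, stripe_size=STRIPE_SIZE, stripes=STRIPES):
    """Return the set of device indexes a single I/O of `length` bytes
    starting at byte `offset` would touch."""
    first_chunk = offset // stripe_size
    last_chunk = (offset + length - 1) // stripe_size
    return {chunk % stripes for chunk in range(first_chunk, last_chunk + 1)}

# A 4 KiB database-style write lands on a single device:
print(devices_touched(0, 4096))          # {0}
# A 256 KiB sequential read spans all three devices in parallel:
print(devices_touched(0, 256 * 1024))    # {0, 1, 2}
```

Small random I/O stays within one stripe (one device per operation, so concurrent operations spread naturally), while large sequential I/O crosses stripe boundaries and can engage all devices at once.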

Importantly, DRBD updates its metadata on disk in 4K chunks. This means that when you’re striping, you will probably benefit from using a separate external physical device and LINSTOR storage pool to store DRBD’s metadata. That is, unless your application’s I/O operation sizes align with DRBD’s.

How to configure volume striping using LINSTOR

Now that you’ve been bombarded with things to think about, you might also want to know how it’s implemented. Assuming you have a LINSTOR cluster up and running, and each LINSTOR satellite node has four additional disks that LINSTOR will use to create striped and replicated storage volumes from, you can use the following example, adjusted for specifics in your cluster, to configure striped storage.

📝 NOTE: This blog only discusses configurations compatible with the LVM storage provider for LINSTOR. You can configure the ZFS storage provider similarly, but specifics such as LINSTOR property names and the names of options passed to ZFS will differ.

Adding physical devices to LINSTOR storage pools

Before LINSTOR can manage them, you need to add the physical devices to LINSTOR storage pools. One storage pool, backed by a single physical device, will be dedicated to DRBD’s metadata, while the other, backed by the remaining devices, will be used to create the storage volumes that store user data.

Create a storage pool named storage using three physical devices:

linstor physical-storage create-device-pool \
    --storage-pool storage --pool-name storage lvm linbit-0 /dev/vdb /dev/vdc /dev/vdd
linstor physical-storage create-device-pool \
    --storage-pool storage --pool-name storage lvm linbit-1 /dev/vdb /dev/vdc /dev/vdd
linstor physical-storage create-device-pool \
    --storage-pool storage --pool-name storage lvm linbit-2 /dev/vdb /dev/vdc /dev/vdd

Create a storage pool named meta using the remaining physical device:

linstor physical-storage create-device-pool \
    --storage-pool meta --pool-name meta lvm linbit-0 /dev/vde
linstor physical-storage create-device-pool \
    --storage-pool meta --pool-name meta lvm linbit-1 /dev/vde
linstor physical-storage create-device-pool \
    --storage-pool meta --pool-name meta lvm linbit-2 /dev/vde

💡 TIP: DRBD’s metadata size requirements are roughly 32MiB of metadata per 1TiB of usable volume size, per peer. For that reason, for your external DRBD metadata storage devices, you should use smaller capacity physical devices that have performance characteristics (latency and throughput) equal to or better than the devices that you use for data storage.
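You can turn that rule of thumb into a quick back-of-the-envelope estimate. This Python snippet applies the approximation above (roughly 32MiB per 1TiB per peer); it is an estimate for capacity planning, not DRBD’s exact metadata formula:

```python
# Rough DRBD external metadata sizing, using the ~32 MiB per TiB per peer
# rule of thumb. Results are rounded up to a whole MiB.
import math

TIB = 1024 ** 4

def estimated_md_size_mib(volume_bytes, peers):
    """Estimate external metadata size in MiB for a volume of the given
    size, replicated among `peers` peers."""
    return math.ceil(volume_bytes / TIB * 32 * peers)

# A 1 TiB volume replicated between 2 peers needs about 64 MiB:
print(estimated_md_size_mib(TIB, 2))             # 64
# A small 10 GiB volume with 2 peers rounds up to just 1 MiB:
print(estimated_md_size_mib(10 * 1024 ** 3, 2))  # 1
```

This also shows why a single small device per node is usually plenty for an external metadata pool.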

Configuring striping on the LINSTOR storage pool

You can now configure the storage pools with striping options. For striping, the relevant lvcreate options that LINSTOR will need are --stripesize or -I, and --stripes or -i.

To configure LINSTOR to create the logical volumes (LVs) used as DRBD’s backing storage with three stripes that are 64KiB in size, use the following commands:

linstor storage-pool set-property \
    linbit-0 storage StorDriver/LvcreateOptions "-i3 -I64"
linstor storage-pool set-property \
    linbit-1 storage StorDriver/LvcreateOptions "-i3 -I64"
linstor storage-pool set-property \
    linbit-2 storage StorDriver/LvcreateOptions "-i3 -I64"

❗ IMPORTANT: If you intend to use a separate storage pool for DRBD’s metadata, you must apply the StorDriver/LvcreateOptions property on the storage pool intended to store the data volumes. Otherwise, it might make more sense to apply the StorDriver/LvcreateOptions property on the LINSTOR resource group, allowing for multiple LINSTOR resource groups with different stripe settings. However, if you apply StorDriver/LvcreateOptions to a resource group and also configure the resource group to use a separate LINSTOR storage pool for DRBD metadata, the StorDriver/LvcreateOptions will be applied to both the data volumes and the metadata volumes when LINSTOR resources are created. This is likely suboptimal.

Configuring the metadata pool and the resource group

Finally, to tie everything together, you will need to create a resource group that tells LINSTOR which storage pool to use for data, which storage pool to use for metadata, and the replica count for LINSTOR resources created from the resource group.

Create the resource group, specifying the storage pool to use for DRBD’s data volumes, and the number of replicas LINSTOR should create within the cluster when a resource is spawned from the resource group:

linstor resource-group create --storage-pool storage --place-count 2 rg0

Configure LINSTOR to use the meta storage pool for DRBD metadata when creating resources from the rg0 resource group:

linstor resource-group set-property rg0 StorPoolNameDrbdMeta meta

Verifying striping of LINSTOR resources

Spawning resources from the rg0 resource group will result in LINSTOR creating striped LVM logical volumes within the storage LVM volume group that will be used as DRBD’s backing device. LINSTOR will create a single logical volume within the meta LVM volume group that will be used to persist DRBD’s metadata.

You can apply this relatively complex block device configuration to new LINSTOR resources by using a single command:

linstor resource-group spawn-resources rg0 res0 10G

There are many ways to verify this works as expected. For example, you can use the lsblk utility to list information about specified block devices. After creating a LINSTOR resource from the rg0 resource group as in the example above, the lsblk command will show the structure of the underlying block device configuration:

lsblk /dev/vd{b,c,d,e}
NAME                   MAJ:MIN  RM SIZE RO TYPE MOUNTPOINTS
vdb                    252:16    0   4G  0 disk
└─storage-res0_00000   253:1     0  10G  0 lvm
  └─drbd1000           147:1000  0  10G  0 disk
vdc                    252:32    0   4G  0 disk
└─storage-res0_00000   253:1     0  10G  0 lvm
  └─drbd1000           147:1000  0  10G  0 disk
vdd                    252:48    0   4G  0 disk
└─storage-res0_00000   253:1     0  10G  0 lvm
  └─drbd1000           147:1000  0  10G  0 disk
vde                    252:64    0   4G  0 disk
└─meta-res0.meta_00000 253:2     0   4M  0 lvm
  └─drbd1000           147:1000  0  10G  0 disk

Output should show that the DRBD device LINSTOR created, in this case /dev/drbd1000, uses four logical volumes: three logical volumes in the storage volume group, and a single logical volume in the meta volume group.

📝 NOTE: Running an lsblk command from a Diskless node in a LINSTOR cluster will show the drbd1000 device as a “standalone” device.

Another way that you can verify striping is by monitoring write activity with a utility such as iostat. This will also show that all four physical devices are being written to while using the LINSTOR-created DRBD device. For example, while running a simulated write workload on the /dev/drbd1000 device created in the example above, iostat shows that the three data devices are receiving writes at a nearly equal rate, while the meta device is being updated as needed by DRBD.

iostat -p vdb,vdc,vdd,vde 1 1
Linux 5.15.0-119-generic (linbit-0)     10/25/2024      _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.32    0.00    0.45    0.02    0.01   99.21

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
vdb               3.73         6.21        14.09         5.78   31239189   70931506   29081604
vdc               3.73         6.18        14.09         5.78   31125966   70924186   29081600
vdd               3.73         6.19        14.09         5.78   31136922   70919466   29081600
vde               1.92         0.05         4.17         0.01     226908   21000917      32768

Conclusion

Striping volumes across multiple devices in a LINSTOR storage pool can provide significant performance benefits, particularly for workloads with heavy I/O demands. By distributing read and write operations across several physical disks, striping helps avoid storage hotspots and reduces latency. However, it’s essential to consider both the complexities and the potential pitfalls that striping can introduce, such as expanding the fault domain across multiple disks. By using DRBD replication, you can mitigate this risk and add valuable resilience within your clusters. Furthermore, carefully selecting stripe sizes and planning for storage expansion will ensure optimal performance and scalability.

In conclusion, configuring LINSTOR for striped storage requires thoughtful planning but offers a powerful tool to maximize storage efficiency and throughput. Balancing this approach with DRBD replication and an understanding of your specific workloads can unlock significant advantages in a high-performance storage environment.

Matt Kereczman

Matt Kereczman is a Solutions Architect at LINBIT with a long history of Linux System Administration and Linux System Engineering. Matt is a cornerstone in LINBIT's technical team, and plays an important role in making LINBIT's and LINBIT's customers' solutions great. Matt was President of the GNU/Linux Club at Northampton Area Community College prior to graduating with Honors from Pennsylvania College of Technology with a BS in Information Security. Open Source Software and Hardware are at the core of most of Matt's hobbies.
