The LINSTOR User’s Guide

Please Read This First

This guide is intended to serve users of the software-defined storage solution LINSTOR as a definitive reference guide and handbook.

This guide assumes, throughout, that you are using the latest version of LINSTOR and related tools.

This guide is organized as follows:

LINSTOR

1. Basic administrative tasks / Setup

LINSTOR is a configuration management system for storage on Linux systems. It manages LVM logical volumes and/or ZFS ZVOLs on a cluster of nodes. It leverages DRBD for replication between different nodes and to provide block storage devices to users and applications. It manages snapshots, encryption and caching of HDD backed data in SSDs via bcache.

1.1. Concepts and Terms

This section goes over core concepts and terms that you will need to familiarize yourself with to understand how LINSTOR works and deploys storage. The section is laid out in a “ground up” approach.

1.1.1. Installable Components

linstor-controller

A LINSTOR setup requires at least one active controller and one or more satellites.

The linstor-controller relies on a database that holds all configuration information for the whole cluster. It makes all decisions that need to have a view of the whole cluster. Multiple controllers can be used for LINSTOR but only one can be active.

linstor-satellite

The linstor-satellite runs on each node where LINSTOR consumes local storage or provides storage to services. It is stateless; it receives all the information it needs from the controller. It runs programs like lvcreate and drbdadm. It acts like a node agent.

linstor-client

The linstor-client is a command line utility that you use to issue commands to the system and to investigate the status of the system.

1.1.2. Objects

Objects are the end results that LINSTOR presents to the end user or to an application such as Kubernetes/OpenShift, for example a replicated block device (DRBD), an NVMe-oF target, and so on.

Node

A node is a server or container that participates in a LINSTOR cluster. The Node object defines:

  • The LINSTOR cluster the node participates in

  • The role of the node: Controller, Satellite, Auxiliary

  • The NetInterface objects that define the node’s connectivity

NetInterface

As the name implies, this is how you define the interface/address of a node’s network interface.

Definitions

Definitions define attributes of an object; they can be thought of as profiles or templates. Objects created from them inherit the configuration defined in the definition. A definition must exist before the associated object can be created. For example, you must create a ResourceDefinition before you can create the corresponding Resource.

StoragePoolDefinition
  • Defines the name of a storage pool

ResourceDefinition

Resource definitions define the following attributes of a resource:

  • The name of a DRBD resource

  • The TCP port for DRBD to use for the resource’s connection

VolumeDefinition

Volume definitions define the following:

  • A volume of a DRBD resource

  • The size of the volume

  • The volume number of the DRBD resource’s volume

  • The meta data properties of the volume

  • The minor number to use for the DRBD device associated with the DRBD volume

StoragePool

The StoragePool identifies storage in the context of LINSTOR. It defines:

  • The configuration of a storage pool on a specific node

  • The storage back-end driver to use for the storage pool on the cluster node (LVM, ZFS, etc)

  • The parameters and configuration to pass to the storage back-end driver

Resource

LINSTOR has expanded its capabilities to manage a broader set of storage technologies beyond just DRBD. A Resource:

  • Represents the placement of a DRBD resource, as defined within the ResourceDefinition

  • Places a resource on a node in the cluster

  • Defines the placement of a ResourceDefinition on a node

Volume

Volumes are a subset of a Resource. A Resource can have multiple volumes; for example, you may wish to have your database stored on slower storage than your logs in your MySQL cluster. By keeping the volumes under a single resource you are essentially creating a consistency group. The Volume object can also define attributes on a more granular level.

1.2. Broader Context

While LINSTOR might be used to make the management of DRBD more convenient, it is often integrated with software stacks higher up. Such integrations already exist for Kubernetes, OpenStack, OpenNebula and Proxmox. Chapters specific to deploying LINSTOR in these environments are included in this guide.

The southbound drivers used by LINSTOR are LVM, thinLVM and ZFS.

1.3. Packages

LINSTOR is packaged in both the .rpm and the .deb variants:

  1. linstor-client contains the command line client program. It depends on Python, which is usually already installed. On RHEL 8 systems you will need to symlink python (see the example following this list).

  2. linstor-controller and linstor-satellite both contain systemd unit files for their services. They depend on a Java Runtime Environment (JRE) version 1.8 (headless) or higher.
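On RHEL 8 the unversioned python command is not available by default. A hedged sketch of one way to provide it, assuming Python 3 is acceptable to the client:

# yum install -y python3
# alternatives --set python /usr/bin/python3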

For further detail about these packages see the Installable Components section above.

If you have a support subscription to LINBIT, you will have access to our certified binaries via our repositories.

1.4. Installation

If you want to use LINSTOR in containers, skip this topic and use the “Containers” section below for the installation.

1.4.1. Ubuntu Linux

If you want to have the option of creating replicated storage using DRBD, you will need to install drbd-dkms and drbd-utils. These packages will need to be installed on all nodes. You will also need to choose a volume manager, either ZFS or LVM, in this instance we’re using LVM.

# apt install -y drbd-dkms drbd-utils lvm2

Whether your node is a LINSTOR controller, a satellite, or both (combined) determines which packages are required on that node. For combined nodes, you need both the controller and the satellite LINSTOR packages.

Combined node:

# apt install linstor-controller linstor-satellite linstor-client

That will make our remaining nodes our Satellites, so we’ll need to install the following packages on them:

# apt install linstor-satellite linstor-client

1.4.2. SUSE Linux Enterprise Server

SLES High Availability Extension (HAE) includes DRBD.

On SLES, DRBD is normally installed via the software installation component of YaST2. It comes bundled with the High Availability package selection.

While installing the newest DRBD module you can check that the LVM tools are up to date as well. Users who prefer a command line installation can simply issue the following command to get the newest DRBD and LVM versions:

# zypper install drbd lvm2

Whether your node is a LINSTOR controller, a satellite, or both (combined) determines which packages are required on that node. For combined nodes, you need both the controller and the satellite LINSTOR packages.

Combined node:

# zypper install linstor-controller linstor-satellite linstor-client

That will make our remaining nodes our Satellites, so we’ll need to install the following packages on them:

# zypper install linstor-satellite linstor-client

1.4.3. CentOS

CentOS has had DRBD 8 since release 5. For DRBD 9 you will need to look at EPEL and similar sources. Alternatively, if you have an active support contract with LINBIT you can use our RHEL 8 repositories. DRBD can be installed using yum. At the same time you can check that the LVM tools are up to date.

LINSTOR requires DRBD 9 if you wish to have replicated storage. This requires an external repository to be configured, either LINBIT’s or a third party’s.
# yum install drbd kmod-drbd lvm2

Whether your node is a LINSTOR controller, a satellite, or both (combined) determines which packages are required on that node. For combined nodes, you need both the controller and the satellite LINSTOR packages.

On RHEL 8 systems you will need to install python2 for the linstor-client to work.

Combined node:

# yum install linstor-controller linstor-satellite linstor-client

That will make our remaining nodes our Satellites, so we’ll need to install the following packages on them:

# yum install linstor-satellite linstor-client

1.5. Upgrading

LINSTOR does not support rolling upgrades; the controller and the satellites must run the same version, otherwise the controller will discard the satellite with a VERSION_MISMATCH. This is not a problem, as the satellite will not perform any actions as long as it is not connected to a controller, and DRBD will not be disrupted in any way.

If you are using the embedded default H2 database and the linstor-controller package is upgraded, an automatic backup file of the database will be created in the default /var/lib/linstor directory. This file is a good restore point: if a linstor-controller database migration should fail for any reason, it is recommended to report the error to LINBIT, restore the old database file and downgrade to your previous controller version.

If you use an external database or etcd, it is recommended to make a manual backup of your current database to have a restore point.

First upgrade the linstor-controller and linstor-client packages on your controller host and restart the linstor-controller. The controller should start, and all of its satellites should show OFFLINE(VERSION_MISMATCH). After that you can continue upgrading linstor-satellite on all satellite nodes and restart them. After a short reconnection time they should all show ONLINE again and your upgrade is finished.

1.6. Containers

LINSTOR is also available as containers. The base images are available in LINBIT’s container registry, drbd.io.

In order to access the images, you first have to login to the registry (reach out to sales@linbit.com for credentials):

# docker login drbd.io

The containers available in this repository are:

  • drbd.io/drbd9-rhel8

  • drbd.io/drbd9-rhel7

  • drbd.io/drbd9-sles15sp1

  • drbd.io/drbd9-bionic

  • drbd.io/drbd9-focal

  • drbd.io/linstor-csi

  • drbd.io/linstor-controller

  • drbd.io/linstor-satellite

  • drbd.io/linstor-client

An up to date list of available images with versions can be retrieved by opening http://drbd.io in your browser. Make sure to access the host via “http”, as the registry’s images themselves are served via “https”.

To load the kernel module, needed only for LINSTOR satellites, you’ll need to run a drbd9-$dist container in privileged mode. The kernel module containers either retrieve an official LINBIT package from a customer repository, use shipped packages, or they try to build the kernel modules from source. If you intend to build from source, you need to have the corresponding kernel headers (e.g., kernel-devel) installed on the host. There are four ways to execute such a module load container:

  • Building from shipped source

  • Using a shipped/pre-built kernel module

  • Specifying a LINBIT node hash and a distribution.

  • Bind-mounting an existing repository configuration.

Example building from shipped source (RHEL based):

# docker run -it --rm --privileged -v /lib/modules:/lib/modules \
  -v /usr/src:/usr/src:ro \
  drbd.io/drbd9-rhel7

Example using a module shipped with the container, which is enabled by not bind-mounting /usr/src:

# docker run -it --rm --privileged -v /lib/modules:/lib/modules \
  drbd.io/drbd9-rhel8

Example using a hash and a distribution (rarely used):

# docker run -it --rm --privileged -v /lib/modules:/lib/modules \
  -e LB_DIST=rhel7.7 -e LB_HASH=ThisIsMyNodeHash \
  drbd.io/drbd9-rhel7

Example using an existing repo config (rarely used):

# docker run -it --rm --privileged -v /lib/modules:/lib/modules \
  -v /etc/yum.repos.d/linbit.repo:/etc/yum.repos.d/linbit.repo:ro \
  drbd.io/drbd9-rhel7
In both cases (hash + distribution, as well as bind-mounting a repo) the hash or config has to be from a node that has a special property set. Feel free to contact our support team, and we will set this property.
For now (i.e., pre DRBD 9 version “9.0.17”), you must use the containerized DRBD kernel module, as opposed to loading a kernel module onto the host system. If you intend to use the containers you should not install the DRBD kernel module on your host systems. For DRBD version 9.0.17 or greater, you can install the kernel module as usual on the host system, but you need to make sure to load the module with the usermode_helper=disabled parameter (e.g., modprobe drbd usermode_helper=disabled).

Then run the LINSTOR satellite container, also privileged, as a daemon:

# docker run -d --name=linstor-satellite --net=host -v /dev:/dev --privileged drbd.io/linstor-satellite
--net=host is required for the containerized drbd-utils to be able to communicate with the host kernel via netlink.

To run the LINSTOR controller container as a daemon, mapping ports 3370, 3376 and 3377 on the host to the container:

# docker run -d --name=linstor-controller -p 3370:3370 -p 3376:3376 -p 3377:3377 drbd.io/linstor-controller

To interact with the containerized LINSTOR cluster, you can either use a LINSTOR client installed on a system via packages, or the containerized LINSTOR client. To use the LINSTOR client container:

# docker run -it --rm -e LS_CONTROLLERS=<controller-host-IP-address> drbd.io/linstor-client node list

From this point you would use the LINSTOR client to initialize your cluster and begin creating resources using the typical LINSTOR patterns.

To stop and remove a daemonized container and image:

# docker stop linstor-controller
# docker rm linstor-controller

1.7. Initializing your cluster

We assume that the following steps are accomplished on all cluster nodes:

  1. The DRBD9 kernel module is installed and loaded

  2. drbd-utils are installed

  3. LVM tools are installed

  4. linstor-controller and/or linstor-satellite and their dependencies are installed

  5. The linstor-client is installed on the linstor-controller node

Start and enable the linstor-controller service on the host where it has been installed:

# systemctl enable --now linstor-controller

If you are sure the linstor-controller service gets automatically enabled on installation you can use the following command as well:

# systemctl start linstor-controller

1.8. Using the LINSTOR client

Whenever you run the LINSTOR command line client, it needs to know where your linstor-controller runs. If you do not specify it, it will try to reach a locally running linstor-controller listening on IP 127.0.0.1 port 3376. Therefore we will use the linstor-client on the same host as the linstor-controller.

The linstor-satellite requires ports 3366 and 3367. The linstor-controller requires ports 3376 and 3377. Make sure you have these ports allowed on your firewall.
# linstor node list

should give you an empty list and not an error message.
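If this command hangs or returns a connection error instead, the ports mentioned in the note above may be blocked. A hedged example of opening them on a node that uses firewalld (open only the ports a given node actually needs; adapt to your firewall tooling):

# firewall-cmd --permanent --add-port=3366-3367/tcp --add-port=3376-3377/tcp
# firewall-cmd --reload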

You can use the linstor command on any other machine, but then you need to tell the client how to find the linstor-controller. As shown, this can be specified as a command line option, an environment variable, or in a global file:

# linstor --controllers=alice node list
# LS_CONTROLLERS=alice linstor node list

Alternatively you can create the /etc/linstor/linstor-client.conf file and populate it like below.

[global]
controllers=alice

If you have multiple linstor-controllers configured you can specify them all in a comma-separated list. The linstor-client will try them in the order listed.
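For example, a hedged /etc/linstor/linstor-client.conf listing several controllers (the host names are placeholders):

[global]
controllers=alice,bravo,charlie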

The linstor-client commands can also be used in a faster and more convenient way by writing only the starting letters of the parameters, e.g.: linstor node list → linstor n l

1.9. Adding nodes to your cluster

The next step is to add nodes to your LINSTOR cluster.

# linstor node create bravo 10.43.70.3

If the IP is omitted, the client will try to resolve the given node-name as host-name by itself.

LINSTOR will automatically detect the node’s local uname -n, which is later used for the DRBD resource.

When you use linstor node list you will see that the new node is marked as offline. Now start and enable the linstor-satellite on that node so that the service comes up on reboot as well:

# systemctl enable --now  linstor-satellite

You can also use systemctl start linstor-satellite if you are sure that the service is already enabled as default and comes up on reboot.

About 10 seconds later you will see the status in linstor node list becoming online. Of course the satellite process may be started before the controller knows about the existence of the satellite node.

In case the node which hosts your controller should also contribute storage to the LINSTOR cluster, you have to add it as a node and start the linstor-satellite as well.

If you want to have other services wait until the linstor-satellite has had a chance to create the necessary devices (i.e. after a boot), you can update the corresponding .service file and change Type=simple to Type=notify.

This will cause the satellite to delay sending the READY=1 message to systemd until the controller has connected, has sent all required data to the satellite, and the satellite has tried at least once to get the devices up and running.
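One hedged way to apply this change without editing the packaged unit file directly is a systemd drop-in, mirroring the systemctl edit pattern used later in this guide:

# systemctl edit linstor-satellite
[Service]
Type=notify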

1.10. Storage pools

StoragePools identify storage in the context of LINSTOR. To group storage pools from multiple nodes, simply use the same name on each node. For example, one valid approach is to give all SSDs one name and all HDDs another.

On each host contributing storage, you need to create either an LVM VG or a ZFS zPool. The VGs and zPools identified with one LINSTOR storage pool name may have different VG or zPool names on the hosts, but do yourself a favor and use the same VG or zPool name on all nodes.

# vgcreate vg_ssd /dev/nvme0n1 /dev/nvme1n1 [...]

These then need to be registered with LINSTOR:

# linstor storage-pool create lvm alpha pool_ssd vg_ssd
# linstor storage-pool create lvm bravo pool_ssd vg_ssd
The storage pool name and common metadata is referred to as a storage pool definition. The listed commands create a storage pool definition implicitly. You can see that by using linstor storage-pool-definition list. Creating storage pool definitions explicitly is possible but not necessary.
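The same pattern works with ZFS. A hedged example that creates a zpool and registers it with the zfs driver (device and zpool names are illustrative):

# zpool create zfs_ssd /dev/nvme0n1 /dev/nvme1n1
# linstor storage-pool create zfs charlie pool_ssd zfs_ssd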

To list your storage-pools you can use:

# linstor storage-pool list

or using the short version

# linstor sp l

Should the deletion of a storage pool (for example via the storage-pool lost command) be prevented because attached resources or snapshots still have some of their volumes in another, still functional storage pool, hints will be given in the ‘Status’ column of the corresponding list command (e.g. linstor resource list). After manually deleting the LINSTOR objects in the lost storage pool, the lost command can be executed again to ensure a complete deletion of the storage pool and its remaining objects.

1.10.1. A storage pool per backend device

In clusters where you have only one kind of storage and the capability to hot-repair storage devices, you may choose a model where you create one storage pool per physical backing device. The advantage of this model is to confine failure domains to a single storage device.

1.10.2. Physical storage command

Since linstor-server 1.5.2 and a recent linstor-client, LINSTOR can create LVM/ZFS pools on a satellite for you. The linstor-client has the following commands to list possible disks and create storage pools, but such LVM/ZFS pools are not managed by LINSTOR and there is no delete command, so such action must be done manually on the nodes.

# linstor physical-storage list

This will give you a list of available disks grouped by size and rotational type (SSD/magnetic disk).

It will only show disks that pass the following filters:

  • The device size must be greater than 1GiB

  • The device is a root device (it has no children), e.g. /dev/vda, /dev/sda

  • The device does not have any file-system or other blkid marker (wipefs -a might be needed)

  • The device is not a DRBD device

With the create-device-pool command you can create an LVM pool on a disk and also directly add it as a storage-pool in LINSTOR.

# linstor physical-storage create-device-pool --pool-name lv_my_pool LVMTHIN node_alpha /dev/vdc --storage-pool newpool

If the --storage-pool option was provided, LINSTOR will create a storage-pool with the given name.

For more options and exact command usage please check the linstor-client help.

1.11. Resource groups

A resource group is a parent object of resource definitions; all property changes made on a resource group will be inherited by its resource definition children. The resource group also stores settings for automatic placement rules and can spawn resource definitions according to the stored rules.

In simpler terms, resource groups are like templates that define characteristics of resources created from them. Changes to these pseudo templates will be applied to all resources that were created from the resource group, retroactively.

Using resource groups to define how you’d like your resources provisioned should be considered the de facto method for deploying volumes provisioned by LINSTOR. Chapters that follow which describe creating each resource from a resource-definition and volume-definition should only be used in special scenarios.
Even if you choose not to create and use resource-groups in your LINSTOR cluster, all resources created from resource-definitions and volume-definitions will exist in the ‘DfltRscGrp’ resource-group.

A simple pattern for deploying resources using resource groups would look like this:

# linstor resource-group create my_ssd_group --storage-pool pool_ssd --place-count 2
# linstor volume-group create my_ssd_group
# linstor resource-group spawn-resources my_ssd_group my_ssd_res 20G

The commands above would result in a resource named ‘my_ssd_res’ with a 20GB volume, automatically placed on two nodes that participate in the storage pool named ‘pool_ssd’.

A more useful pattern could be to create a resource group with settings you’ve determined are optimal for your use case. Perhaps you have to run nightly online verifications of your volumes’ consistency; in that case, you could create a resource group with the ‘verify-alg’ of your choice already set, so that resources spawned from the group are pre-configured with ‘verify-alg’:

# linstor resource-group create my_verify_group --storage-pool pool_ssd --place-count 2
# linstor resource-group drbd-options --verify-alg crc32c my_verify_group
# linstor volume-group create my_verify_group
# for i in {00..19}; do
    linstor resource-group spawn-resources my_verify_group res$i 10G
  done

The commands above result in twenty 10GiB resources being created each with the ‘crc32c’ ‘verify-alg’ pre-configured.

You can tune the settings of individual resources or volumes spawned from resource groups by setting options on the respective resource-definition or volume-definition. For example, if ‘res11’ from the example above is used by a very active database receiving lots of small random writes, you might want to increase the ‘al-extents’ for that specific resource:

# linstor resource-definition drbd-options --al-extents 6007 res11

If you configure a setting in a resource-definition that is already configured on the resource-group it was spawned from, the value set in the resource-definition will override the value set on the parent resource-group. For example, if the same ‘res11’ was required to use the slower but more secure ‘sha256’ hash algorithm in its verifications, setting the ‘verify-alg’ on the resource-definition for ‘res11’ would override the value set on the resource-group:

# linstor resource-definition drbd-options --verify-alg sha256 res11
A rule of thumb for the hierarchy in which settings are inherited is the value “closer” to the resource or volume wins: volume-definition settings take precedence over volume-group settings, and resource-definition settings take precedence over resource-group settings.

1.12. Cluster configuration

1.12.1. Available storage plugins

LINSTOR has the following supported storage plugins as of writing:

  • Thick LVM

  • Thin LVM with a single thin pool

  • Thick ZFS

  • Thin ZFS
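The thick LVM and ZFS variants are registered as shown in the Storage pools section; for the thin variants a hedged sketch would look like this (node, pool, volume group, thin pool and zpool names are illustrative):

# linstor storage-pool create lvmthin alpha pool_ssd_thin vg_ssd/thinpool
# linstor storage-pool create zfsthin alpha pool_zfs_thin zfs_ssd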

1.13. Creating and deploying resources/volumes

In the following scenario we assume that the goal is to create a resource ‘backups’ with a size of ‘500 GB’ that is replicated among three cluster nodes.

First, we create a new resource definition:

# linstor resource-definition create backups

Second, we create a new volume definition within that resource definition:

# linstor volume-definition create backups 500G

If you want to change the size of the volume-definition you can simply do that by:

# linstor volume-definition set-size backups 0 100G

The parameter 0 is the number of the volume in the resource backups. You have to provide this parameter because resources can have multiple volumes, which are identified by a so-called volume number. This number can be found by listing the volume-definitions.

The size of a volume-definition can only be decreased if it has no deployed resource. The size can, however, be increased even with a deployed resource.

So far we have only created objects in LINSTOR’s database, not a single LV was created on the storage nodes. Now you have the choice of delegating the task of placement to LINSTOR or doing it yourself.

1.13.1. Manual placement

With the resource create command you may assign a resource definition to named nodes explicitly.

# linstor resource create alpha backups --storage-pool pool_hdd
# linstor resource create bravo backups --storage-pool pool_hdd
# linstor resource create charlie backups --storage-pool pool_hdd

1.13.2. Autoplace

The value after --auto-place tells LINSTOR how many replicas you want to have. The --storage-pool option should be obvious.

# linstor resource create backups --auto-place 3 --storage-pool pool_hdd

Perhaps less obvious is that you may omit the --storage-pool option; LINSTOR will then select a storage pool on its own. The selection follows these rules:

  • Ignore all nodes and storage pools the current user has no access to

  • Ignore all diskless storage pools

  • Ignore all storage pools not having enough free space

The remaining storage pools will be rated by different strategies. LINSTOR currently has four strategies:

  • MaxFreeSpace: This strategy maps the rating 1:1 to the remaining free space of the storage pool. However, this strategy only considers the actually allocated space (in the case of a thinly provisioned storage pool this might grow over time without creating new resources)

  • MinReservedSpace: Unlike “MaxFreeSpace”, this strategy considers the reserved space. That is the space that a thin volume can grow to before reaching its limit. The sum of reserved space might exceed the storage pool’s capacity, which is the case with overprovisioning.

  • MinRscCount: Simply the count of resources already deployed in a given storage pool

  • MaxThroughput: For this strategy, the storage pool’s Autoplacer/MaxThroughput property is the base of the score, or 0 if the property is not present. Every volume deployed in the given storage pool will subtract its defined sys/fs/blkio_throttle_read and sys/fs/blkio_throttle_write property value from the storage pool’s max throughput. The resulting score might be negative.

The scores of the strategies will be normalized, weighted and summed up, where the scores of minimizing strategies will be converted first to allow an overall maximization of the resulting score.

The weights of the strategies can be configured with

linstor controller set-property Autoplacer/Weights/$name_of_the_strategy $weight

where the strategy names are those listed above and the weight can be an arbitrary decimal.
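For example, a hedged way to let reserved space dominate the selection would be to give that strategy a higher weight (the value is illustrative):

linstor controller set-property Autoplacer/Weights/MinReservedSpace 2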

To keep the behaviour of the autoplacer similar to the old one (due to compatibility), all strategies have a default-weight of 0, except the MaxFreeSpace which has a weight of 1.
Neither a 0 nor a negative score will prevent a storage pool from being selected; it just means the pool will be considered later.

Finally LINSTOR tries to find the best matching group of storage pools meeting all requirements. This step also considers other autoplacement restrictions such as --replicas-on-same, --replicas-on-different and others.

These two arguments, --replicas-on-same and --replicas-on-different, expect the name of a property within the Aux/ namespace. The following example shows that the client automatically prefixes testProperty with the Aux/ namespace.

linstor resource-group create testRscGrp --replicas-on-same testProperty
SUCCESS:
Description:
    New resource group 'testRscGrp' created.
Details:
    Resource group 'testRscGrp' UUID is: 35e043cb-65ab-49ac-9920-ccbf48f7e27d

linstor resource-group list
+-----------------------------------------------------------------------------+
| ResourceGroup | SelectFilter                         | VlmNrs | Description |
|-----------------------------------------------------------------------------|
| DfltRscGrp    | PlaceCount: 2                        |        |             |
|-----------------------------------------------------------------------------|
| testRscGrp    | PlaceCount: 2                        |        |             |
|               | ReplicasOnSame: ['Aux/testProperty'] |        |             |
+-----------------------------------------------------------------------------+
If everything went right, the DRBD resource has now been created by LINSTOR. This can be checked by looking for the DRBD block device with the lsblk command; it should be named drbd1000 or similar.

Now we should be able to mount the block device of our resource and start using LINSTOR.
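A hedged sketch, assuming the device turned out to be /dev/drbd1000 and that an ext4 filesystem and this mount point suit your use case:

# mkfs.ext4 /dev/drbd1000
# mkdir -p /mnt/backups
# mount /dev/drbd1000 /mnt/backups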

2. Further LINSTOR tasks

2.1. LINSTOR high availability

By default a LINSTOR cluster consists of exactly one LINSTOR controller. Making LINSTOR highly-available involves providing replicated storage for the controller database, multiple LINSTOR controllers where only one is active at a time, and a service manager that takes care of mounting/unmounting the highly-available storage and starting/stopping LINSTOR controllers.

2.1.1. Highly-available Storage

For configuring the highly-available storage we use LINSTOR itself. This has the advantage that the storage is under LINSTOR control and can, for example, be easily extended to new cluster nodes. Just create a new resource 200MB in size. It could look like the following; you certainly need to adapt the storage pool name:

# linstor resource-definition create linstor_db
# linstor resource-definition drbd-options --on-no-quorum=io-error linstor_db
# linstor resource-definition drbd-options --auto-promote=no linstor_db
# linstor volume-definition create linstor_db 200M
# linstor resource create linstor_db -s pool1 --auto-place 3

From now on we assume the resource’s name is “linstor_db”. It is crucial that your cluster qualifies for auto-quorum and uses the io-error policy (see Section AutoQuorum Policies), and that auto-promote is disabled.

After the resource is created, it is time to move the LINSTOR DB to the new storage and to create a systemd mount service. First we stop the current controller and disable it, as it will be managed by drbd-reactor later.

# systemctl disable --now linstor-controller

# cat << EOF > /etc/systemd/system/var-lib-linstor.mount
[Unit]
Description=Filesystem for the LINSTOR controller

[Mount]
# you can use the minor like /dev/drbdX or the udev symlink
What=/dev/drbd/by-res/linstor_db/0
Where=/var/lib/linstor
EOF

# mv /var/lib/linstor{,.orig}
# mkdir /var/lib/linstor
# chattr +i /var/lib/linstor  # only if on LINSTOR >= 1.14.0
# drbdadm primary linstor_db
# mkfs.ext4 /dev/drbd/by-res/linstor_db/0
# systemctl start var-lib-linstor.mount
# cp -r /var/lib/linstor.orig/* /var/lib/linstor
# systemctl start linstor-controller

Copy the /etc/systemd/system/var-lib-linstor.mount mount file to all the standby nodes for the LINSTOR controller. Again, do not systemctl enable any of these services; they are managed by drbd-reactor.

2.1.2. Multiple LINSTOR controllers

The next step is to install the LINSTOR controller on all nodes that have access to the linstor_db DRBD resource (as they need to mount the DRBD volume) and that you want to become a possible LINSTOR controller. It is important that the controllers are managed by drbd-reactor, so make sure the linstor-controller.service is disabled on all nodes! To be sure, execute systemctl disable linstor-controller on all cluster nodes and systemctl stop linstor-controller on all nodes except the one it is currently running on from the previous step. Also make sure to set chattr +i /var/lib/linstor on all potential controller nodes if you use LINSTOR version 1.14.0 or greater.

2.1.3. Managing the services

For starting and stopping the mount service and the linstor-controller service we use drbd-reactor. Install this component on all nodes that could become a LINSTOR controller and edit their /etc/drbd-reactor.d/linstor_db.toml configuration file. It should contain an enabled promoter plugin section like this:

[[promoter]]
id = "linstor_db"
[promoter.resources.linstor_db]
start = ["var-lib-linstor.mount", "linstor-controller.service"]

Depending on your requirements you might also want to set an on-stop-failure action and set stop-services-on-exit.
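As a hedged illustration only, the snippet above extended with these two options; the exact option names and accepted values depend on your drbd-reactor version, so verify them against the drbd-reactor promoter plugin documentation before using them:

[[promoter]]
id = "linstor_db"
[promoter.resources.linstor_db]
start = ["var-lib-linstor.mount", "linstor-controller.service"]
# both keys below are assumptions based on the text above; check your
# drbd-reactor version's promoter documentation for the exact names/values
on-stop-failure = "echo b > /proc/sysrq-trigger"
stop-services-on-exit = true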

After that, restart drbd-reactor and enable it on all the nodes where you configured it.

# systemctl restart drbd-reactor
# systemctl enable drbd-reactor

Check that there are no warnings from drbd-reactor in the logs by running systemctl status drbd-reactor. As there is already an active LINSTOR controller things will just stay the way they are. Run drbd-reactorctl status linstor_db to check the health of the linstor_db target unit.

The last, but nevertheless important, step is to configure the LINSTOR satellite services not to delete (and then regenerate) the resource file for the LINSTOR controller DB at startup. Do not edit the service files directly, but use systemctl edit. Edit the service file on all nodes that could become a LINSTOR controller and that are also LINSTOR satellites.

# systemctl edit linstor-satellite
[Service]
Environment=LS_KEEP_RES=linstor_db

After this change you should execute systemctl restart linstor-satellite on all satellite nodes.

Be sure to configure your LINSTOR client for use with multiple controllers as described in the section titled Using the LINSTOR client, and make sure you also configure your integration plugins (e.g., the Proxmox plugin) to be ready for multiple LINSTOR controllers.

2.2. DRBD clients

By using the --drbd-diskless option instead of --storage-pool you can have a permanently diskless DRBD device on a node. This means that the resource will appear as a block device and can be mounted without a local storage device. The data of the resource is accessed over the network from other nodes that have the same resource.

# linstor resource create delta backups --drbd-diskless
The option --diskless was deprecated. Please use --drbd-diskless or --nvme-initiator instead.

2.3. LINSTOR – DRBD consistency group/multiple volumes

The so-called consistency group is a DRBD feature. It is mentioned in this user guide because one of LINSTOR’s main functions is to manage storage clusters with DRBD. Multiple volumes in one resource form a consistency group.

This means that changes on different volumes of one resource are replicated in the same chronological order on the other satellites.

Therefore you don’t have to worry about the timing if you have interdependent data on different volumes in a resource.

To deploy more than one volume in a LINSTOR resource, create multiple volume-definitions for the same resource-definition:

# linstor volume-definition create backups 500G
# linstor volume-definition create backups 100G

2.4. Volumes of one resource to different Storage-Pools

This can be achieved by setting the StorPoolName property to the volume definitions before the resource is deployed to the nodes:

# linstor resource-definition create backups
# linstor volume-definition create backups 500G
# linstor volume-definition create backups 100G
# linstor volume-definition set-property backups 0 StorPoolName pool_hdd
# linstor volume-definition set-property backups 1 StorPoolName pool_ssd
# linstor resource create alpha backups
# linstor resource create bravo backups
# linstor resource create charlie backups
Since the volume-definition create command is used without the --vlmnr option, LINSTOR assigns volume numbers starting at 0. In the following two lines, 0 and 1 refer to these automatically assigned volume numbers.

Here the ‘resource create’ commands do not need a --storage-pool option. In this case LINSTOR uses a ‘fallback’ storage pool. To find that storage pool, LINSTOR queries the properties of the following objects in the following order:

  • Volume definition

  • Resource

  • Resource definition

  • Node

If none of those objects contain a StorPoolName property, the controller falls back to a hard-coded ‘DfltStorPool’ string as a storage pool.

This also means that if you forgot to define a storage pool prior to deploying a resource, you will get an error message that LINSTOR could not find the storage pool named ‘DfltStorPool’.

2.5. LINSTOR without DRBD

LINSTOR can be used without DRBD as well. Without DRBD, LINSTOR is able to provision volumes from LVM and ZFS backed storage pools, and create those volumes on individual nodes in your LINSTOR cluster.

Currently LINSTOR supports the creation of LVM and ZFS volumes with the option of layering some combinations of LUKS, DRBD, and/or NVMe-oF/NVMe-TCP on top of those volumes.

For example, assume we have a Thin LVM backed storage pool defined in our LINSTOR cluster named, thin-lvm:

# linstor --no-utf8 storage-pool list
+--------------------------------------------------------------+
| StoragePool | Node      | Driver   | PoolName          | ... |
|--------------------------------------------------------------|
| thin-lvm    | linstor-a | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm    | linstor-b | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm    | linstor-c | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm    | linstor-d | LVM_THIN | drbdpool/thinpool | ... |
+--------------------------------------------------------------+

We could use LINSTOR to create a Thin LVM on linstor-d that’s 100GiB in size using the following commands:

# linstor resource-definition create rsc-1
# linstor volume-definition create rsc-1 100GiB
# linstor resource create --layer-list storage \
          --storage-pool thin-lvm linstor-d rsc-1

You should then see that you have a new thin LVM volume on linstor-d. You can extract the device path from LINSTOR by listing your linstor resources with the --machine-readable flag set:

# linstor --machine-readable resource list | grep device_path
            "device_path": "/dev/drbdpool/rsc-1_00000",

If you wanted to layer DRBD on top of this volume, which is the default --layer-list option in LINSTOR for ZFS or LVM backed volumes, you would use the following resource creation pattern instead:

# linstor resource-definition create rsc-1
# linstor volume-definition create rsc-1 100GiB
# linstor resource create --layer-list drbd,storage \
          --storage-pool thin-lvm linstor-d rsc-1

You would then see that you have a new Thin LVM backing a DRBD volume on linstor-d:

# linstor --machine-readable resource list | grep -e device_path -e backing_disk
            "device_path": "/dev/drbd1000",
            "backing_disk": "/dev/drbdpool/rsc-1_00000",

The following table shows which layer can be followed by which child-layer:

+------------+----------------------------------------+
| Layer      | Child layer                            |
|------------+----------------------------------------|
| DRBD       | CACHE, WRITECACHE, NVME, LUKS, STORAGE |
| CACHE      | WRITECACHE, NVME, LUKS, STORAGE        |
| WRITECACHE | CACHE, NVME, LUKS, STORAGE             |
| NVME       | CACHE, WRITECACHE, LUKS, STORAGE       |
| LUKS       | STORAGE                                |
| STORAGE    | -                                      |
+------------+----------------------------------------+

One layer can only occur once in the layer-list
For information about the prerequisites for the LUKS layer, refer to the Encrypted Volumes section of this User’s Guide.

2.5.1. NVMe-oF/NVMe-TCP LINSTOR Layer

NVMe-oF/NVMe-TCP allows LINSTOR to connect diskless resources over NVMe fabrics to a node where the data of the same resource is stored. This has the advantage that resources can be mounted without using local storage, by accessing the data over the network. LINSTOR does not use DRBD in this case, so NVMe resources provisioned by LINSTOR are not replicated; the data is stored on one node.

NVMe-oF only works on RDMA-capable networks and NVMe-TCP on every network that can carry IP traffic. If you want to know more about NVMe-oF/NVMe-TCP visit https://www.linbit.com/en/nvme-linstor-swordfish/ for more information.

To use NVMe-oF/NVMe-TCP with LINSTOR the package nvme-cli needs to be installed on every Node which acts as a Satellite and will use NVMe-oF/NVMe-TCP for a resource:

If you are not using Ubuntu, use the suitable command for installing packages on your OS (SLES: zypper; CentOS: yum).
# apt install nvme-cli

To make a resource which uses NVMe-oF/NVMe-TCP, an additional parameter has to be given when you create the resource-definition:

# linstor resource-definition create nvmedata -l nvme,storage
By default, the -l (layer-stack) parameter is set to drbd,storage when DRBD is used. If you want to create LINSTOR resources with neither NVMe nor DRBD you have to set the -l parameter to storage only.

In order to use NVMe-TCP instead of the default NVMe-oF the following property needs to be set:

# linstor resource-definition set-property nvmedata NVMe/TRType tcp

The property NVMe/TRType can alternatively be set on resource-group or controller level.
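For example, a hedged way to make TCP the cluster-wide default transport would be to set the property on the controller:

# linstor controller set-property NVMe/TRType tcp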

Next, create the volume-definition for our resource:

# linstor volume-definition create nvmedata 500G

Before you create the resource on your nodes you have to know where the data will be stored locally and which node accesses it over the network.

First we create the resource on the node where our data will be stored:

# linstor resource create alpha nvmedata --storage-pool pool_ssd

On the nodes where the resource-data will be accessed over the network, the resource has to be defined as diskless:

# linstor resource create beta nvmedata --nvme-initiator

Now you can mount the resource nvmedata on one of your nodes.

If your nodes have more than one NIC you should force the route between them for NVMe-oF/NVMe-TCP, otherwise multiple NICs could cause trouble.

2.5.2. OpenFlex™ Layer

Since version 1.5.0 the additional layer openflex can be used in LINSTOR. From LINSTOR’s perspective, the OpenFlex Composable Infrastructure takes the role of a combined layer acting as a storage layer (like LVM) and also providing the allocated space as an NVMe target. OpenFlex has a REST API, which LINSTOR uses to operate it.

As OpenFlex combines concepts of LINSTOR’s storage and NVMe layers, both a new storage driver for the storage pools and a dedicated openflex layer, which uses the mentioned REST API, were added to LINSTOR.

In order for LINSTOR to communicate with the OpenFlex-API, LINSTOR needs some additional properties, which can be set once on controller level to take LINSTOR-cluster wide effect:

  • StorDriver/Openflex/ApiHost specifies the host or IP of the API entry-point

  • StorDriver/Openflex/ApiPort this property is glued with a colon to the previous one to form the basic http://ip:port part used by the REST calls

  • StorDriver/Openflex/UserName the REST username

  • StorDriver/Openflex/UserPassword the password for the REST user

Once that is configured, we can now create LINSTOR objects to represent the OpenFlex architecture. The theoretical mapping of LINSTOR objects to OpenFlex objects is as follows: an OpenFlex storage pool is represented by a LINSTOR storage pool. As the next object above a LINSTOR storage pool is already the node, a LINSTOR node represents an OpenFlex storage device. The OpenFlex objects above the storage device are not mapped by LINSTOR.

When using NVMe, LINSTOR was designed to run on both sides, the NVMe target side as well as the NVMe initiator side. In the case of OpenFlex, LINSTOR cannot (and should not) run on the NVMe target side as that is completely managed by OpenFlex. As LINSTOR still needs nodes and storage pools to represent the OpenFlex counterparts, the LINSTOR client has been extended with special node create commands since version 1.0.14. These commands not only accept the additionally needed configuration data, but also start a “special satellite” besides the already running controller instance. These special satellites are completely LINSTOR managed; they shut down when the controller shuts down and are started again when the controller starts. The new client command for creating a “special satellite” representing an OpenFlex storage device is:

$ linstor node create-openflex-target ofNode1 192.168.166.7 000af795789d

The arguments are as follows:

  • ofNode1 is the node name which is also used by the standard linstor node create command

  • 192.168.166.7 is the address on which the provided NVMe devices can be accessed. As the NVMe devices are accessed by a dedicated network interface, this address differs from the address specified with the property StorDriver/Openflex/ApiHost. The latter is used for the management / REST API.

  • 000af795789d is the identifier for the OpenFlex storage device.

The last step of the configuration is the creation of LINSTOR storage pools:

$ linstor storage-pool create openflex ofNode1 sp0 0
  • ofNode1 and sp0 are the node name and storage pool name, respectively, just as usual for the LINSTOR’s create storage pool command

  • The last 0 is the identifier of the OpenFlex storage pool within the previously defined storage device

Once all necessary storage pools are created in LINSTOR, the next steps are similar to using an NVMe resource with LINSTOR. Here is a complete example:

# set the properties once
linstor controller set-property StorDriver/Openflex/ApiHost 10.43.7.185
linstor controller set-property StorDriver/Openflex/ApiPort 80
linstor controller set-property StorDriver/Openflex/UserName myusername
linstor controller set-property StorDriver/Openflex/UserPassword mypassword

# create a node for openflex storage device "000af795789d"
linstor node create-openflex-target ofNode1 192.168.166.7 000af795789d

# create a usual linstor satellite. later used as nvme initiator
linstor node create bravo

# create a storage pool for openflex storage pool "0" within storage device "000af795789d"
linstor storage-pool create openflex ofNode1 sp0 0

# create resource- and volume-definition
linstor resource-definition create backupRsc
linstor volume-definition create backupRsc 10G

# create openflex-based nvme target
linstor resource create ofNode1 backupRsc --storage-pool sp0 --layer-list openflex

# create openflex-based nvme initiator
linstor resource create bravo backupRsc --nvme-initiator --layer-list openflex
In case a node should access the OpenFlex REST API through a different host than the one specified with
linstor controller set-property StorDriver/Openflex/ApiHost 10.43.7.185, you can always use LINSTOR’s inheritance mechanism for properties. That is, simply define the same property on the node level where you need it, i.e.
linstor node set-property ofNode1 StorDriver/Openflex/ApiHost 10.43.8.185

2.5.3. Writecache Layer

A DM-Writecache device is composed of two devices: one storage device and one cache device. LINSTOR can set up such a writecache device, but it needs some additional information, like the storage pool and the size of the cache device.

# linstor storage-pool create lvm node1 lvmpool drbdpool
# linstor storage-pool create lvm node1 pmempool pmempool

# linstor resource-definition create r1
# linstor volume-definition create r1 100G

# linstor volume-definition set-property r1 0 Writecache/PoolName pmempool
# linstor volume-definition set-property r1 0 Writecache/Size 1%

# linstor resource create node1 r1 --storage-pool lvmpool --layer-list WRITECACHE,STORAGE

The two properties set in the examples are mandatory, but they can also be set on controller level, which would act as a default for all resources with WRITECACHE in their --layer-list. However, please note that the Writecache/PoolName refers to the corresponding node. If the node does not have a storage-pool named pmempool you will get an error message.
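A hedged sketch of setting these two keys as cluster-wide defaults on the controller (the named pool must still exist on every node that relies on the default):

# linstor controller set-property Writecache/PoolName pmempool
# linstor controller set-property Writecache/Size 1%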

The four mandatory parameters required by DM-Writecache are either configured via properties or determined by LINSTOR. The optional properties listed in the DM-Writecache documentation can also be set via properties. Please see linstor controller set-property --help for a list of Writecache/* property keys.

When using --layer-list DRBD,WRITECACHE,STORAGE while DRBD is configured to use external metadata, only the backing device will use a writecache, not the device holding the external metadata.

2.5.4. Cache Layer

LINSTOR can also set up a DM-Cache device, which is very similar to the DM-Writecache from the previous section. The major difference is that a cache device is composed of three devices: one storage device, one cache device and one meta device. The LINSTOR properties are quite similar to those of the writecache but are located in the Cache namespace:

# linstor storage-pool create lvm node1 lvmpool drbdpool
# linstor storage-pool create lvm node1 pmempool pmempool

# linstor resource-definition create r1
# linstor volume-definition create r1 100G

# linstor volume-definition set-property r1 0 Cache/CachePool pmempool
# linstor volume-definition set-property r1 0 Cache/Cachesize 1%

# linstor resource create node1 r1 --storage-pool lvmpool --layer-list CACHE,STORAGE
Instead of Writecache/PoolName (as when configuring the Writecache layer) the Cache layer’s only required property is called Cache/CachePool. The reason for this is that the Cache layer also has a Cache/MetaPool which can be configured separately or it defaults to the value of Cache/CachePool.

Please see linstor controller set-property --help for a list of Cache/* property-keys and default values for omitted properties.

When using --layer-list DRBD,CACHE,STORAGE while DRBD is configured to use external metadata, only the backing device will use a cache, not the device holding the external metadata.

2.5.5. Storage Layer

The storage layer provides new devices from well-known volume managers like LVM, ZFS and others. Every layer combination needs to be based on a storage layer, even if the resource should be diskless; for that case there is a dedicated diskless provider type.

For a list of providers with their properties please see Storage Providers.

For some storage providers LINSTOR has special properties:

  • StorDriver/WaitTimeoutAfterCreate: If LINSTOR expects a device to appear after creation (for example after calls of lvcreate, zfs create, etc.), LINSTOR by default waits 500ms for the device to appear. These 500ms can be overridden by this property.

  • StorDriver/dm_stats: If set to true LINSTOR calls dmstats create $device after creation and dmstats delete $device --allregions after deletion of a volume. Currently only enabled for LVM and LVM_THIN storage providers.
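A hedged example of overriding these for a single storage pool, following the set-property pattern used elsewhere in this guide (node and pool names are illustrative; the timeout value is in milliseconds, and the level at which a property is best set may differ):

# linstor storage-pool set-property alpha pool_ssd StorDriver/WaitTimeoutAfterCreate 1000
# linstor storage-pool set-property alpha pool_ssd StorDriver/dm_stats true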

2.6. Storage Providers

LINSTOR has a few storage providers. The most used ones are LVM and ZFS, and for those two providers there are also sub-types for their thinly provisioned variants.

  • Diskless: This provider type is mostly required to have a storage pool that can be configured with LINSTOR properties like PrefNic as described in Managing Network Interface Cards.

  • LVM / LVM-Thin: The administrator is expected to specify the LVM volume group or the thin pool (in the form “VG/thinpool”) in order to use the corresponding storage type. These drivers support the following properties for fine-tuning:

    • StorDriver/LvcreateOptions: The value of this property is appended to every lvcreate …​ call LINSTOR executes.

  • ZFS / ZFS-Thin: The administrator is expected to specify the zpool that LINSTOR should use. These drivers support the following properties for fine-tuning:

    • StorDriver/ZfscreateOptions: The value of this property is appended to every zfs create …​ call LINSTOR executes.

  • File / FileThin: Mostly used for demonstration / experiments. LINSTOR will basically reserve a file in a given directory and will configure a loop device on top of that file.

  • OpenFlex: This special storage provider currently requires to be run on a “special satellite”. Please see OpenFlex™ Layer for more details.

  • EXOS: This special storage provider currently requires to be run on a “special satellite”. Please see the EXOS Integration chapter

  • SPDK: The administrator is expected to specify the logical volume store which LINSTOR should use. The usage of this storage provider implies the usage of the NVMe layer.

    • Remote-SPDK: This special storage provider currently requires to be run on a “special satellite”. Please see Remote SPDK Provider for more details.

2.6.1. Remote SPDK Provider

A storage pool with the type remote SPDK can only be created on a “special satellite”. For this you first need to start a new satellite using the command:

$ linstor node create-remote-spdk-target nodeName 192.168.1.110

This will start a new satellite instance running on the same machine as the controller. This special satellite will do all the REST based RPC communication towards the remote SPDK proxy. As the help message of the LINSTOR command shows, the administrator might want to use additional settings when creating this special satellite:

$ linstor node create-remote-spdk-target -h
usage: linstor node create-remote-spdk-target [-h] [--api-port API_PORT]
                                              [--api-user API_USER]
                                              [--api-user-env API_USER_ENV]
                                              [--api-pw [API_PW]]
                                              [--api-pw-env API_PW_ENV]
                                              node_name api_host

The difference between the --api-* options and their corresponding --api-*-env versions is that the versions with the -env ending will look for an environment variable containing the actual value to use, whereas the --api-* versions directly take the value, which is then stored in a LINSTOR property. Administrators might not want to save the --api-pw in plaintext, where it would be clearly visible using commands like linstor node list-property <nodeName>.

Once that special satellite is up and running the actual storage pool can be created:

$ linstor storage-pool create remotespdk -h
usage: linstor storage-pool create remotespdk [-h]
                                              [--shared-space SHARED_SPACE]
                                              [--external-locking]
                                              node_name name driver_pool_name

Whereas node_name is self-explanatory, name is the name of the LINSTOR storage pool and driver_pool_name refers to the SPDK logical volume store.

Once this remotespdk storage pool is created, the remaining procedure is quite similar to using NVMe: first the target has to be created in the form of a simple “diskful” resource, followed by a second resource with the --nvme-initiator option enabled.

2.7. Managing Network Interface Cards

LINSTOR can deal with multiple network interface cards (NICs) in a machine; they are called netif in LINSTOR speak.

When a satellite node is created, a first netif is created implicitly with the name default. Using the --interface-name option of the node create command, you can give it a different name.
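A hedged example of naming the default netif at node creation time (node name and address are illustrative):

# linstor node create charlie 192.168.43.223 --interface-name nic0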

Additional NICs are created like this:

# linstor node interface create alpha 100G_nic 192.168.43.221
# linstor node interface create alpha 10G_nic 192.168.43.231

NICs are identified by their IP address only; the name is arbitrary and is not related to the interface name used by Linux. The NICs can be assigned to storage pools so that whenever a resource is created in such a storage pool, the DRBD traffic will be routed through the specified NIC.

# linstor storage-pool set-property alpha pool_hdd PrefNic 10G_nic
# linstor storage-pool set-property alpha pool_ssd PrefNic 100G_nic

FIXME describe how to route the controller <-> client communication through a specific netif.

2.8. Encrypted volumes

LINSTOR can handle transparent encryption of DRBD volumes. dm-crypt is used to encrypt the storage provided by the storage device.

In order to use dm-crypt, please make sure cryptsetup is installed before you start the satellite.

Basic steps to use encryption:

  1. Disable user security on the controller (this will be obsolete once authentication works)

  2. Create a master passphrase

  3. Add luks to the layer-list. Note that all plugins (e.g., Proxmox) require a DRBD layer as the top most layer if they do not explicitly state otherwise.

  4. Don’t forget to re-enter the master passphrase after a controller restart.

2.8.1. Disable user security

Disabling user security on the LINSTOR controller is a one-time operation and is persisted afterwards.

  1. Stop the running linstor-controller via systemd: systemctl stop linstor-controller

  2. Start a linstor-controller in debug mode: /usr/share/linstor-server/bin/Controller -c /etc/linstor -d

  3. In the debug console enter: setSecLvl secLvl(NO_SECURITY)

  4. Stop linstor-controller with the debug shutdown command: shutdown

  5. Start the controller again with systemd: systemctl start linstor-controller

2.8.2. Encrypt commands

Below are details about the commands.

Before LINSTOR can encrypt any volume a master passphrase needs to be created. This can be done with the linstor-client.

# linstor encryption create-passphrase

create-passphrase will wait for the user to input the initial master passphrase (as do all other encryption commands when given no arguments).

If you ever want to change the master passphrase this can be done with:

# linstor encryption modify-passphrase

The luks layer can be added when creating the resource-definition or the resource itself. The former method is recommended, since it will automatically be applied to all resources created from that resource-definition.

# linstor resource-definition create crypt_rsc --layer-list luks,storage

To enter the master passphrase (after controller restart) use the following command:

# linstor encryption enter-passphrase
Whenever the linstor-controller is restarted, the user has to send the master passphrase to the controller, otherwise LINSTOR is unable to reopen or create encrypted volumes.

2.8.3. Automatic Passphrase

It is possible to automate the process of creating and re-entering the master passphrase.

To use this, either an environment variable called MASTER_PASSPHRASE or an entry in /etc/linstor/linstor.toml containing the master passphrase has to be created.

The required linstor.toml looks like this:

[encrypt]
passphrase="example"

If either one of these is set, then every time the controller starts it will check whether a master passphrase already exists. If there is none, it will create a new master passphrase as specified. Otherwise, the controller enters the passphrase.

If a master passphrase is already configured, and it is not the same one as specified in the environment variable or linstor.toml, the controller will be unable to re-enter the master passphrase and react as if the user had entered a wrong passphrase. This can only be resolved through manual input from the user, using the same commands as if the controller was started without the automatic passphrase.
In case the master passphrase is set in both an environment variable and the linstor.toml, only the master passphrase from the linstor.toml will be used.
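As a hedged illustration of the environment variable approach, the variable can be provided to a systemd-managed controller via a drop-in file (the passphrase value is an example):

# cat /etc/systemd/system/linstor-controller.service.d/override.conf
[Service]
Environment=MASTER_PASSPHRASE=example

# systemctl daemon-reload
# systemctl restart linstor-controller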

2.9. Checking the state of your cluster

LINSTOR provides various commands to check the state of your cluster. These commands use the list subcommand and provide various filtering and sorting options. The --groupby option can be used to group and sort the output in multiple dimensions.

# linstor node list
# linstor storage-pool list --groupby Size

2.10. Managing snapshots

Snapshots are supported with thin LVM and ZFS storage pools.

2.10.1. Creating a snapshot

Assuming a resource definition named ‘resource1’ which has been placed on some nodes, a snapshot can be created as follows:

# linstor snapshot create resource1 snap1

This will create snapshots on all nodes where the resource is present. LINSTOR will ensure that consistent snapshots are taken even when the resource is in active use.

By setting the resource-definition property AutoSnapshot/RunEvery, LINSTOR will automatically create snapshots every X minutes. The optional property AutoSnapshot/Keep can be used to clean up old snapshots that were created automatically. Manually created snapshots will not be cleaned up or deleted. If AutoSnapshot/Keep is omitted (or <= 0), LINSTOR will keep the last 10 snapshots by default.

# linstor resource-definition set-property resource1 AutoSnapshot/RunEvery 15
# linstor resource-definition set-property resource1 AutoSnapshot/Keep 5

2.10.2. Restoring a snapshot

The following steps restore a snapshot to a new resource. This is possible even when the original resource has been removed from the nodes where the snapshots were taken.

First define the new resource with volumes matching those from the snapshot:

# linstor resource-definition create resource2
# linstor snapshot volume-definition restore --from-resource resource1 --from-snapshot snap1 --to-resource resource2

At this point, additional configuration can be applied if necessary. Then, when ready, create resources based on the snapshots:

# linstor snapshot resource restore --from-resource resource1 --from-snapshot snap1 --to-resource resource2

This will place the new resource on all nodes where the snapshot is present. The nodes on which to place the resource can also be selected explicitly; see the help (linstor snapshot resource restore -h).

2.10.3. Rolling back to a snapshot

LINSTOR can roll a resource back to a snapshot state. The resource must not be in use. That is, it may not be mounted on any nodes. If the resource is in use, consider whether you can achieve your goal by restoring the snapshot instead.

Rollback is performed as follows:

# linstor snapshot rollback resource1 snap1

A resource can only be rolled back to the most recent snapshot. To roll back to an older snapshot, first delete the intermediate snapshots.

2.10.4. Removing a snapshot

An existing snapshot can be removed as follows:

# linstor snapshot delete resource1 snap1

2.10.5. Shipping a snapshot

Both the source and the target node must have the resource deployed for snapshot shipping. Additionally, the target resource has to be deactivated.

# linstor resource deactivate nodeTarget resource1
A resource with DRBD in its layer-list that has been deactivated can NOT be reactivated again. However, a successfully shipped snapshot of a DRBD resource can still be restored into a new resource.

To manually start the snapshot-shipping, use:

# linstor snapshot ship --from-node nodeSource --to-node nodeTarget --resource resource1

By default, the snapshot-shipping uses TCP ports from the range 12000-12999. To change this range, the property SnapshotShipping/TcpPortRange, which accepts a range in the form <from>-<to>, can be set on the controller:

# linstor controller set-property SnapshotShipping/TcpPortRange 10000-12000

A resource can also be periodically shipped. To accomplish this, it is mandatory to set the properties SnapshotShipping/TargetNode as well as SnapshotShipping/RunEvery on the resource-definition. SnapshotShipping/SourceNode can also be set, but if omitted LINSTOR will choose an active resource of the same resource-definition.

To allow incremental snapshot-shipping, LINSTOR has to keep at least the last shipped snapshot on the target node. The property SnapshotShipping/Keep can be used to specify how many snapshots LINSTOR should keep. If the property is not set (or <= 0), LINSTOR will keep the last 10 shipped snapshots by default.

# linstor resource-definition set-property resource1 SnapshotShipping/TargetNode nodeTarget
# linstor resource-definition set-property resource1 SnapshotShipping/SourceNode nodeSource
# linstor resource-definition set-property resource1 SnapshotShipping/RunEvery 15
# linstor resource-definition set-property resource1 SnapshotShipping/Keep 5

2.11. Setting options for resources

DRBD options are set using LINSTOR commands. Configuration in files such as /etc/drbd.d/global_common.conf that are not managed by LINSTOR will be ignored. The following commands show the usage and available options:

# linstor controller drbd-options -h
# linstor resource-definition drbd-options -h
# linstor volume-definition drbd-options -h
# linstor resource drbd-peer-options -h

For instance, it is easy to set the DRBD protocol for a resource named backups:

# linstor resource-definition drbd-options --protocol C backups

2.12. Adding and removing disks

LINSTOR can convert resources between diskless and having a disk. This is achieved with the resource toggle-disk command, which has syntax similar to resource create.

For instance, add a disk to the diskless resource backups on ‘alpha’:

# linstor resource toggle-disk alpha backups --storage-pool pool_ssd

Remove this disk again:

# linstor resource toggle-disk alpha backups --diskless

2.12.1. Migrating disks

In order to move a resource between nodes without reducing redundancy at any point, LINSTOR’s disk migrate feature can be used. First create a diskless resource on the target node, and then add a disk using the --migrate-from option. This will wait until the data has been synced to the new disk and then remove the source disk.

For example, to migrate a resource backups from ‘alpha’ to ‘bravo’:

# linstor resource create bravo backups --drbd-diskless
# linstor resource toggle-disk bravo backups --storage-pool pool_ssd --migrate-from alpha

2.13. DRBD Proxy with LINSTOR

LINSTOR expects DRBD Proxy to be running on the nodes which are involved in the relevant connections. It does not currently support connections via DRBD Proxy on a separate node.

Suppose our cluster consists of nodes ‘alpha’ and ‘bravo’ in a local network and ‘charlie’ at a remote site, with a resource definition named backups deployed to each of the nodes. Then DRBD Proxy can be enabled for the connections to ‘charlie’ as follows:

# linstor drbd-proxy enable alpha charlie backups
# linstor drbd-proxy enable bravo charlie backups

The DRBD Proxy configuration can be tailored with commands such as:

# linstor drbd-proxy options backups --memlimit 100000000
# linstor drbd-proxy compression zlib backups --level 9

LINSTOR does not automatically optimize the DRBD configuration for long-distance replication, so you will probably want to set some configuration options such as the protocol:

# linstor resource-connection drbd-options alpha charlie backups --protocol A
# linstor resource-connection drbd-options bravo charlie backups --protocol A

Please contact LINBIT for assistance optimizing your configuration.

2.13.1. Automatically enable DRBD Proxy

LINSTOR can also be configured to automatically enable the above-mentioned Proxy connection between two nodes. For this automation, LINSTOR first needs to know which site each node is on.

# linstor node set-property alpha Site A
# linstor node set-property bravo Site A
# linstor node set-property charlie Site B

As the Site property might also be used for other site-based decisions in future features, the DrbdProxy/AutoEnable also has to be set to true:

# linstor controller set-property DrbdProxy/AutoEnable true

This property can also be set on the node, resource-definition, resource and resource-connection level (from left to right in increasing priority, where the controller is the left-most, that is, the least prioritized level).

Once these initialization steps are completed, every newly created resource will automatically check whether it has to enable DRBD Proxy to any of its peer resources.

2.14. External database

LINSTOR can work with an external database provider like PostgreSQL or MariaDB; since version 1.1.0, the ETCD key-value store is also supported.

To use an external database there are a few additional steps to configure. You have to create a DB/schema and a user for LINSTOR to use, and configure this in /etc/linstor/linstor.toml.

2.14.1. PostgreSQL

A sample PostgreSQL linstor.toml looks like this:

[db]
user = "linstor"
password = "linstor"
connection_url = "jdbc:postgresql://localhost/linstor"
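The database and user referenced in this configuration must exist before the controller starts. A minimal sketch using the standard PostgreSQL client tools (names chosen to match the sample above; adjust for your environment):

# createuser linstor --pwprompt
# createdb --owner=linstor linstor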

2.14.2. MariaDB/MySQL

A sample MariaDB linstor.toml looks like this:

[db]
user = "linstor"
password = "linstor"
connection_url = "jdbc:mariadb://localhost/LINSTOR?createDatabaseIfNotExist=true"
The LINSTOR schema/database is created as LINSTOR so make sure the MariaDB connection string refers to the LINSTOR schema, as in the example above.

2.14.3. ETCD

ETCD is a distributed key-value store that makes it easy to keep your LINSTOR database distributed in a HA-setup. The ETCD driver is already included in the LINSTOR-controller package and only needs to be configured in the linstor.toml.

More information on how to install and configure ETCD can be found here: ETCD docs

And here is a sample [db] section from the linstor.toml:

[db]
## only set user/password if you want to use authentication, only since LINSTOR 1.2.1
# user = "linstor"
# password = "linstor"

## for etcd
## do not set user field if no authentication required
connection_url = "etcd://etcdhost1:2379,etcdhost2:2379,etcdhost3:2379"

## if you want to use TLS, only since LINSTOR 1.2.1
# ca_certificate = "ca.pem"
# client_certificate = "client.pem"

## if you want to use client TLS authentication too, only since LINSTOR 1.2.1
# client_key_pkcs8_pem = "client-key.pkcs8"
## set client_key_password if private key has a password
# client_key_password = "mysecret"

2.15. LINSTOR REST-API

To make LINSTOR’s administrative tasks more accessible and also available for web-frontends a REST-API has been created. The REST-API is embedded in the linstor-controller and since LINSTOR 0.9.13 configured via the linstor.toml configuration file.

[http]
  enabled = true
  port = 3370
  listen_addr = "127.0.0.1"  # to disable remote access

If you want to use the REST-API the current documentation can be found on the following link: https://app.swaggerhub.com/apis-docs/Linstor/Linstor/

2.15.1. LINSTOR REST-API HTTPS

The REST-API can also be run secured by HTTPS, which is highly recommended if you use any features that require authorization. To do so you have to create a Java keystore file with a valid certificate that will be used to encrypt all HTTPS traffic.

Here is a simple example of how you can create a self-signed certificate with the keytool that is included in the Java runtime:

keytool -keyalg rsa -keysize 2048 -genkey -keystore ./keystore_linstor.jks\
 -alias linstor_controller\
 -dname "CN=localhost, OU=SecureUnit, O=ExampleOrg, L=Vienna, ST=Austria, C=AT"

keytool will ask for a password to secure the generated keystore file; this password is needed for the LINSTOR-controller configuration. In your linstor.toml file you have to add the following section:

[https]
  keystore = "/path/to/keystore_linstor.jks"
  keystore_password = "linstor"

Now (re)start the linstor-controller and the HTTPS REST-API should be available on port 3371.

More information on how to import other certificates can be found here: https://docs.oracle.com/javase/8/docs/technotes/tools/unix/keytool.html

When HTTPS is enabled, all requests to the HTTP /v1/ REST-API will be redirected to HTTPS.
LINSTOR REST-API HTTPS restricted client access

Client access can be restricted by using an SSL truststore on the Controller. Basically, you create a certificate for your client and add it to your truststore; the client then uses this certificate for authentication.

First create a client certificate:

keytool -keyalg rsa -keysize 2048 -genkey -keystore client.jks\
 -storepass linstor -keypass linstor\
 -alias client1\
 -dname "CN=Client Cert, OU=client, O=Example, L=Vienna, ST=Austria, C=AT"

Then we import this certificate to our controller truststore:

keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore client.jks -destkeystore truststore_client.jks

And enable the truststore in the linstor.toml configuration file:

[https]
  keystore = "/path/to/keystore_linstor.jks"
  keystore_password = "linstor"
  truststore = "/path/to/truststore_client.jks"
  truststore_password = "linstor"

Now restart the Controller and it will no longer be possible to access the controller API without a correct certificate.

The LINSTOR client needs the certificate in PEM format, so before we can use it we have to convert the java keystore certificate to the PEM format.

# Convert to pkcs12
keytool -importkeystore -srckeystore client.jks -destkeystore client.p12\
 -storepass linstor -keypass linstor\
 -srcalias client1 -srcstoretype jks -deststoretype pkcs12

# use openssl to convert to PEM
openssl pkcs12 -in client.p12 -out client_with_pass.pem

To avoid entering the PEM file password all the time it might be convenient to remove the password.

openssl rsa -in client_with_pass.pem -out client1.pem
openssl x509 -in client_with_pass.pem >> client1.pem

Now this PEM file can easily be used in the client:

linstor --certfile client1.pem node list

The --certfile parameter can also be added to the client configuration file; see Using the LINSTOR client for more details.

2.16. Logging

LINSTOR uses SLF4J with Logback as the binding. This gives LINSTOR the ability to distinguish between the log levels ERROR, WARN, INFO, DEBUG and TRACE (in order of increasing verbosity). In the current LINSTOR version (1.1.2), the user has the following four methods to control the logging level, ordered by priority (first has highest priority):

  1. TRACE mode can be enabled or disabled using the debug console:

    Command ==> SetTrcMode MODE(enabled)
    SetTrcMode           Set TRACE level logging mode
    New TRACE level logging mode: ENABLED
  2. When starting the controller or satellite a command line argument can be passed:

    java ... com.linbit.linstor.core.Controller ... --log-level INFO
    java ... com.linbit.linstor.core.Satellite  ... --log-level INFO
  3. The recommended place is the logging section in the configuration file. The default configuration file location is /etc/linstor/linstor.toml for the controller and /etc/linstor/linstor_satellite.toml for the satellite. Configure the logging level as follows:

    [logging]
       level="INFO"
  4. As Linstor is using Logback as an implementation, /usr/share/linstor-server/lib/logback.xml can also be used. Currently only this approach supports different log levels for different components, like shown in the example below:

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration scan="false" scanPeriod="60 seconds">
    <!--
     Values for scanPeriod can be specified in units of milliseconds, seconds, minutes or hours
     https://logback.qos.ch/manual/configuration.html
    -->
     <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
       <!-- encoders are assigned the type
            ch.qos.logback.classic.encoder.PatternLayoutEncoder by default -->
       <encoder>
         <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</pattern>
       </encoder>
     </appender>
     <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
       <file>${log.directory}/linstor-${log.module}.log</file>
       <append>true</append>
       <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
         <Pattern>%d{yyyy_MM_dd HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</Pattern>
       </encoder>
       <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
         <FileNamePattern>logs/linstor-${log.module}.%i.log.zip</FileNamePattern>
         <MinIndex>1</MinIndex>
         <MaxIndex>10</MaxIndex>
       </rollingPolicy>
       <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
         <MaxFileSize>2MB</MaxFileSize>
       </triggeringPolicy>
     </appender>
     <logger name="LINSTOR/Controller" level="INFO" additivity="false">
       <appender-ref ref="STDOUT" />
       <!-- <appender-ref ref="FILE" /> -->
     </logger>
     <logger name="LINSTOR/Satellite" level="INFO" additivity="false">
       <appender-ref ref="STDOUT" />
       <!-- <appender-ref ref="FILE" /> -->
     </logger>
     <root level="WARN">
       <appender-ref ref="STDOUT" />
       <!-- <appender-ref ref="FILE" /> -->
     </root>
    </configuration>

See the Logback Manual to find more details about logback.xml.

When none of the configuration methods above is used, LINSTOR will default to the INFO log level.

2.17. Monitoring

Since LINSTOR 1.8.0, a Prometheus /metrics HTTP path is provided with LINSTOR and JVM specific exports.

The /metrics path also supports 3 GET arguments to reduce LINSTOR’s reported data:

  • resource

  • storage_pools

  • error_reports

These all default to true; to disable, for example, the error-report data: http://localhost:3370/metrics?error_reports=false

2.17.1. Health check

The LINSTOR-Controller also provides a /health HTTP path that will simply return HTTP-Status 200 if the controller can access its database and all services are up and running. Otherwise it will return HTTP error status code 500 Internal Server Error.
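For example, the health path can be queried with curl (assuming the default REST port 3370 on the controller host):

# curl -i http://localhost:3370/health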

2.18. Secure Satellite connections

It is possible to have LINSTOR use SSL-secured TCP connections between the controller and the satellites. Without going into further detail on how Java’s SSL engine works, we will give you command line snippets, using the keytool from Java’s runtime environment, showing how to configure a 3-node setup using secure connections. The node setup looks like this:

Node alpha is just the controller. Nodes bravo and charlie are just satellites.

Here are the commands to generate such a keystore setup; values should of course be edited for your environment.

# create directories to hold the key files
mkdir -p /tmp/linstor-ssl
cd /tmp/linstor-ssl
mkdir alpha bravo charlie


# create private keys for all nodes
keytool -keyalg rsa -keysize 2048 -genkey -keystore alpha/keystore.jks\
 -storepass linstor -keypass linstor\
 -alias alpha\
 -dname "CN=Max Mustermann, OU=alpha, O=Example, L=Vienna, ST=Austria, C=AT"

keytool -keyalg rsa -keysize 2048 -genkey -keystore bravo/keystore.jks\
 -storepass linstor -keypass linstor\
 -alias bravo\
 -dname "CN=Max Mustermann, OU=bravo, O=Example, L=Vienna, ST=Austria, C=AT"

keytool -keyalg rsa -keysize 2048 -genkey -keystore charlie/keystore.jks\
 -storepass linstor -keypass linstor\
 -alias charlie\
 -dname "CN=Max Mustermann, OU=charlie, O=Example, L=Vienna, ST=Austria, C=AT"

# import truststore certificates for alpha (needs all satellite certificates)
keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore bravo/keystore.jks -destkeystore alpha/certificates.jks

keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore charlie/keystore.jks -destkeystore alpha/certificates.jks

# import controller certificate into satellite truststores
keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore alpha/keystore.jks -destkeystore bravo/certificates.jks

keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore alpha/keystore.jks -destkeystore charlie/certificates.jks

# now copy the keystore files to their host destinations
ssh root@alpha mkdir /etc/linstor/ssl
scp alpha/* root@alpha:/etc/linstor/ssl/
ssh root@bravo mkdir /etc/linstor/ssl
scp bravo/* root@bravo:/etc/linstor/ssl/
ssh root@charlie mkdir /etc/linstor/ssl
scp charlie/* root@charlie:/etc/linstor/ssl/

# generate the satellite ssl config entry
echo '[netcom]
  type="ssl"
  port=3367
  server_certificate="ssl/keystore.jks"
  trusted_certificates="ssl/certificates.jks"
  key_password="linstor"
  keystore_password="linstor"
  truststore_password="linstor"
  ssl_protocol="TLSv1.2"
' | ssh root@bravo "cat > /etc/linstor/linstor_satellite.toml"

echo '[netcom]
  type="ssl"
  port=3367
  server_certificate="ssl/keystore.jks"
  trusted_certificates="ssl/certificates.jks"
  key_password="linstor"
  keystore_password="linstor"
  truststore_password="linstor"
  ssl_protocol="TLSv1.2"
' | ssh root@charlie "cat > /etc/linstor/linstor_satellite.toml"

Now just start controller and satellites and add the nodes with --communication-type SSL.
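For example, a satellite could then be added like this (illustrative address; --communication-type is an option of the node create command):

# linstor node create bravo 192.168.43.222 --communication-type SSL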

2.19. Automatisms for DRBD-Resources

2.19.1. AutoQuorum Policies

LINSTOR automatically configures quorum policies on resources when quorum is achievable. This means, whenever you have at least two diskful and one or more diskless resource assignments, or three or more diskful resource assignments, LINSTOR will enable quorum policies for your resources automatically.

Conversely, LINSTOR will automatically disable quorum policies whenever there are fewer than the minimum required resource assignments to achieve quorum.

This is controlled via the DrbdOptions/auto-quorum property, which can be applied to the linstor-controller, resource-group, and resource-definition. Accepted values for the DrbdOptions/auto-quorum property are disabled, suspend-io, and io-error.

Setting the DrbdOptions/auto-quorum property to disabled will allow you to manually, or more granularly, control the quorum policies of your resources should you so desire.

The default policies for DrbdOptions/auto-quorum are quorum majority, and on-no-quorum io-error. For more information on DRBD’s quorum features and their behavior, please refer to the quorum section of the DRBD user’s guide.
The DrbdOptions/auto-quorum policies will override any manually configured properties if DrbdOptions/auto-quorum is not disabled.

For example, to manually set the quorum policies of a resource-group named my_ssd_group, you would use the following commands:

# linstor resource-group set-property my_ssd_group DrbdOptions/auto-quorum disabled
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/quorum majority
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/on-no-quorum suspend-io

You may wish to disable DRBD’s quorum features completely. To do that, you would need to first disable DrbdOptions/auto-quorum on the appropriate LINSTOR object, and then set the DRBD quorum features accordingly. For example, use the following commands to disable quorum entirely on the my_ssd_group resource-group:

# linstor resource-group set-property my_ssd_group DrbdOptions/auto-quorum disabled
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/quorum off
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/on-no-quorum
Setting DrbdOptions/Resource/on-no-quorum to an empty value in the commands above deletes the property from the object entirely.

2.19.2. Auto-Evict

If a satellite is offline for a prolonged period of time, LINSTOR can be configured to declare that node as evicted. This triggers an automated reassignment of the affected DRBD-resources to other nodes to ensure a minimum replica count is kept.

This feature uses the following properties to adapt the behaviour.

  • DrbdOptions/AutoEvictMinReplicaCount sets the number of replicas that should always be present. You can set this property on the controller to change a global default, or on a specific resource-definition or resource-group to change it only for that resource-definition or resource-group. If this property is left empty, the place-count set for the auto-placer of the corresponding resource-group will be used.

  • DrbdOptions/AutoEvictAfterTime describes how long a node can be offline in minutes before the eviction is triggered. You can set this property on the controller to change a global default, or on a single node to give it a different behavior. The default value for this property is 60 minutes.

  • DrbdOptions/AutoEvictMaxDisconnectedNodes sets the percentage of nodes that can be unreachable (for whatever reason) at the same time. If more than the given percentage of nodes are offline at the same time, the auto-evict will not be triggered for any node, since in this case LINSTOR assumes connection problems from the controller. This property can only be set for the controller, and only accepts a value between 0 and 100. The default value is 34. If you wish to turn the auto-evict feature off, simply set this property to 0. If you want to always trigger the auto-evict, regardless of how many satellites are unreachable, set it to 100.

  • DrbdOptions/AutoEvictAllowEviction is an additional property that can stop a node from being evicted. This can be useful for various cases, for example if you need to shut down a node for maintenance. You can set this property on the controller to change a global default, or on a single node to give it a different behavior. It accepts true and false as values and per default is set to true on the controller. You can use this property to turn the auto-evict feature off by setting it to false on the controller, although this might not work completely if you already set different values for individual nodes, since those values take precedence over the global default.
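For example, the following hedged commands would raise the global offline timeout and temporarily protect a node named 'alpha' from eviction during maintenance (values are illustrative):

# linstor controller set-property DrbdOptions/AutoEvictAfterTime 120
# linstor node set-property alpha DrbdOptions/AutoEvictAllowEviction false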

After the linstor-controller loses the connection to a satellite, aside from trying to reconnect, it starts a timer for that satellite. As soon as that timer exceeds DrbdOptions/AutoEvictAfterTime and all of the DRBD-connections to the DRBD-resources on that satellite are broken, the controller will check whether or not DrbdOptions/AutoEvictMaxDisconnectedNodes has been met. If it hasn’t, and DrbdOptions/AutoEvictAllowEviction is true for the node in question, the satellite will be marked as EVICTED. At the same time, the controller will check for every DRBD-resource whether the number of resources is still above DrbdOptions/AutoEvictMinReplicaCount. If it is, the resource in question will be marked as DELETED. If it isn’t, an auto-place with the settings from the corresponding resource-group will be started. Should the auto-place fail, the controller will try again later when changes that might allow a different result, such as adding a new node, have happened. Resources where an auto-place is necessary will only be marked as DELETED if the corresponding auto-place was successful.

The evicted satellite itself will not be able to reestablish connection with the controller. Even if the node is up and running, a manual reconnect will fail. It is also not possible to delete the satellite, even if it is working as it should be. The satellite can, however, be restored. This will remove the EVICTED-flag from the satellite and allow you to use it again. Previously configured network interfaces, storage pools, properties and similar entities as well as non-DRBD-related resources and resources that could not be autoplaced somewhere else will still be on the satellite. To restore a satellite, use

# linstor node restore [nodename]

Should you wish to instead throw away everything that once was on that node, including the node itself, you need to use the node lost command.
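For example:

# linstor node lost [nodename]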

2.20. QoS Settings

2.20.1. Sysfs

LINSTOR is able to set the following Sysfs settings:

SysFs setting                                            LINSTOR property
/sys/fs/cgroup/blkio/blkio.throttle.read_bps_device      sys/fs/blkio_throttle_read
/sys/fs/cgroup/blkio/blkio.throttle.write_bps_device     sys/fs/blkio_throttle_write
/sys/fs/cgroup/blkio/blkio.throttle.read_iops_device     sys/fs/blkio_throttle_read_iops
/sys/fs/cgroup/blkio/blkio.throttle.write_iops_device    sys/fs/blkio_throttle_write_iops

If a LINSTOR volume is composed of multiple “stacked” volumes (for example, DRBD with external metadata will have 3 devices: the backing (storage) device, the metadata device and the resulting DRBD device), setting a sys/fs/* property for a volume means that only the bottom-most local “data” device will receive the corresponding /sys/fs/cgroup/…​ setting. That means, in the case of the example above, only the backing device will receive the setting. If a resource-definition has an nvme-target as well as an nvme-initiator resource, the bottom-most device of each node will receive the setting. In the case of the target, the bottom-most device will be the LVM or ZFS volume, whereas in the case of the initiator the bottom-most device will be the connected nvme device, regardless of which other layers are stacked on top of it.
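As a hedged sketch, assuming the generic set-property subcommand on the volume-definition (as used for other LINSTOR objects throughout this guide), a read throttle of 1 MiB/s could be applied to volume 0 of the resource backups like this:

# linstor volume-definition set-property backups 0 sys/fs/blkio_throttle_read 1048576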

2.21. Getting help

2.21.1. From the command line

A quick way to list available commands on the command line is to type linstor.

Further information on sub-commands (e.g., list-nodes) can be retrieved in two ways:

# linstor node list -h
# linstor help node list

Using the ‘help’ sub-command is especially helpful when LINSTOR is executed in interactive mode (linstor interactive).

One of the most helpful features of LINSTOR is its rich tab-completion, which can be used to complete basically every object LINSTOR knows about (e.g., node names, IP addresses, resource names, …​). In the following examples, we show some possible completions, and their results:

# linstor node create alpha 1<tab> # completes the IP address if hostname can be resolved
# linstor resource create b<tab> c<tab> # linstor assign-resource backups charlie

If tab-completion does not work out of the box, please try to source the appropriate file:

# source /etc/bash_completion.d/linstor # or
# source /usr/share/bash_completion/completions/linstor

For zsh shell users, linstor-client can generate a zsh completion file that has basic support for command and argument completion.

# linstor gen-zsh-completer > /usr/share/zsh/functions/Completion/Linux/_linstor

2.21.2. SOS-Report

If something goes wrong and you need help finding the cause of the issue, you can use

# linstor sos-report create

The command above will create a new sos-report in /var/log/linstor/controller/ on the controller node. Alternatively you can use

# linstor sos-report download

which will create a new sos-report and additionally download that report to your current working directory on the local machine.

This sos-report contains logs and useful debug information from several sources (LINSTOR logs, dmesg, versions of external tools used by LINSTOR, ip a, a database dump and many more). This information is stored for each node in plaintext in the resulting .tar.gz file.

2.21.3. From the community

For help from the community please subscribe to our mailing list located here: https://lists.linbit.com/listinfo/drbd-user

2.21.4. GitHub

To file a bug or feature request, please check out our GitHub page: https://github.com/linbit

2.21.5. Paid support and development

Alternatively, if you wish to purchase remote installation services, 24/7 support, access to certified repositories, or feature development please contact us: +1-877-454-6248 (1-877-4LINBIT) , International: +43-1-8178292-0 | sales@linbit.com

3. LINSTOR Volumes in Kubernetes

This chapter describes the usage of LINSTOR in Kubernetes as managed by the operator and with volumes provisioned using the LINSTOR CSI plugin.

This chapter goes into great detail regarding all the install-time options and various configurations possible with LINSTOR and Kubernetes. For those more interested in a quick start for testing, or those looking for examples for reference, we have some complete Helm Install Examples of a few common uses near the end of the chapter.

3.1. Kubernetes Overview

Kubernetes is a container orchestrator. Kubernetes defines the behavior of containers and related services via declarative specifications. In this guide, we’ll focus on using kubectl to manipulate .yaml files that define the specifications of Kubernetes objects.

3.2. Deploying LINSTOR on Kubernetes

3.2.1. Deploying with the LINSTOR Operator

LINBIT provides a LINSTOR operator to commercial support customers. The operator eases deployment of LINSTOR on Kubernetes by installing DRBD, managing Satellite and Controller pods, and other related functions.

The operator itself is installed using a Helm v3 chart as follows:

  • Create a kubernetes secret containing your my.linbit.com credentials:

    kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-email=<YOUR_EMAIL> --docker-password=<YOUR_PASSWORD>

    The name of this secret must match the one specified in the Helm values, by default drbdiocred.

  • Configure storage for the LINSTOR etcd instance. There are various options for configuring the etcd instance for LINSTOR:

    • Use an existing storage provisioner with a default StorageClass.

    • Use hostPath volumes.

    • Disable persistence for basic testing. This can be done by adding --set etcd.persistentVolume.enabled=false to the helm install command below.

  • Read the storage guide and configure a basic storage setup for LINSTOR

  • Read the section on securing the deployment and configure as needed.

  • Select the appropriate kernel module injector using --set with the helm install command in the final step.

    • Choose the injector according to the distribution you are using. Select the latest version from one of drbd9-rhel7, drbd9-rhel8,…​ from http://drbd.io/ as appropriate. The drbd9-rhel8 image should also be used for RHCOS (OpenShift). For the SUSE CaaS Platform use the SLES injector that matches the base system of the CaaS Platform you are using (e.g., drbd9-sles15sp1). For example:

      operator.satelliteSet.kernelModuleInjectionImage=drbd.io/drbd9-rhel8:v9.0.24
    • Only inject modules that are already present on the host machine. If a module is not found, it will be skipped.

      operator.satelliteSet.kernelModuleInjectionMode=DepsOnly
    • Disable kernel module injection if you are installing DRBD by other means. Deprecated in favor of DepsOnly.

      operator.satelliteSet.kernelModuleInjectionMode=None
  • Finally create a Helm deployment named linstor-op that will set up everything.

    helm repo add linstor https://charts.linstor.io
    helm install linstor-op linstor/linstor

    Further deployment customization is discussed in the advanced deployment section

LINSTOR etcd hostPath persistence

You can use the pv-hostpath Helm templates to create hostPath persistent volumes. Create as many PVs as needed to satisfy your configured etcd replicas (default 1).

Create the hostPath persistent volumes, substituting cluster node names accordingly in the nodes= option:

helm repo add linstor https://charts.linstor.io
helm install linstor-etcd linstor/pv-hostpath --set "nodes={<NODE0>,<NODE1>,<NODE2>}"

Persistence for etcd is enabled by default.

Using an existing database

LINSTOR can connect to an existing PostgreSQL, MariaDB or etcd database. For instance, for a PostgreSQL instance with the following configuration:

POSTGRES_DB: postgresdb
POSTGRES_USER: postgresadmin
POSTGRES_PASSWORD: admin123

The Helm chart can be configured to use this database instead of deploying an etcd cluster by adding the following to the Helm install command:

--set etcd.enabled=false --set "operator.controller.dbConnectionURL=jdbc:postgresql://postgres/postgresdb?user=postgresadmin&password=admin123"

3.2.2. Configuring storage

The LINSTOR operator can automate some basic storage set up for LINSTOR.

Configuring storage pool creation

The LINSTOR operator can be used to create LINSTOR storage pools. Creation is under control of the LinstorSatelliteSet resource:

$ kubectl get LinstorSatelliteSet.linstor.linbit.com linstor-op-ns -o yaml
kind: LinstorSatelliteSet
metadata:
..
spec:
  ..
  storagePools:
    lvmPools:
    - name: lvm-thick
      volumeGroup: drbdpool
    lvmThinPools:
    - name: lvm-thin
      thinVolume: thinpool
      volumeGroup: ""
    zfsPools:
    - name: my-linstor-zpool
      zPool: for-linstor
      thin: true
At install time

At install time, by setting the value of operator.satelliteSet.storagePools when running helm install.

First create a file with the storage configuration like:

operator:
  satelliteSet:
    storagePools:
      lvmPools:
      - name: lvm-thick
        volumeGroup: drbdpool

This file can be passed to the helm installation like this:

helm install -f <file> linstor-op linstor/linstor
After install

On a cluster with the operator already configured (i.e. after helm install), you can edit the LinstorSatelliteSet configuration like this:

$ kubectl edit LinstorSatelliteSet.linstor.linbit.com <satellitesetname>

The storage pool configuration can be updated like in the example above.

Preparing physical devices

By default, LINSTOR expects the referenced VolumeGroups, ThinPools and so on to be present. You can use the devicePaths: [] option to let LINSTOR automatically prepare devices for the pool. Eligible for automatic configuration are block devices that:

  • are a root device (no partitions)

  • do not contain partition information

  • are larger than 1 GiB

To enable automatic configuration of devices, set the devicePaths key on storagePools entries:

  storagePools:
    lvmPools:
    - name: lvm-thick
      volumeGroup: drbdpool
      devicePaths:
      - /dev/vdb
    lvmThinPools:
    - name: lvm-thin
      thinVolume: thinpool
      volumeGroup: linstor_thinpool
      devicePaths:
      - /dev/vdc
      - /dev/vdd

Currently, this method supports creation of LVM and LVMTHIN storage pools.

lvmPools configuration
  • name name of the LINSTOR storage pool. Required

  • volumeGroup name of the VG to create. Required

  • devicePaths devices to configure for this pool. Must be empty and >= 1GiB to be recognized. Optional

  • raidLevel LVM raid level. Optional

  • vdo Enable [VDO] (requires VDO tools in the satellite). Optional

  • vdoLogicalSizeKib Size of the created VG (expected to be bigger than the backing devices by using VDO). Optional

  • vdoSlabSizeKib Slab size for VDO. Optional

lvmThinPools configuration
  • name name of the LINSTOR storage pool. Required

  • volumeGroup VG to use for the thin pool. If you want to use devicePaths, you must set this to "". This is required because LINSTOR does not allow configuration of the VG name when preparing devices.

  • thinVolume name of the thinpool. Required

  • devicePaths devices to configure for this pool. Must be empty and >= 1GiB to be recognized. Optional

  • raidLevel LVM raid level. Optional

The volume group created by LINSTOR for LVMTHIN pools will always follow the scheme “linstor_$THINPOOL”.
zfsPools configuration
  • name name of the LINSTOR storage pool. Required

  • zPool name of the zpool to use. Must already be present on all machines. Required

  • thin true to use thin provisioning, false otherwise. Required

Using automaticStorageType (DEPRECATED)

ALL eligible devices will be prepared according to the value of operator.satelliteSet.automaticStorageType, unless they are already prepared using the storagePools section. Devices are added to a storage pool based on the device name (i.e. all /dev/nvme1 devices will be part of the pool autopool-nvme1)

The possible values for operator.satelliteSet.automaticStorageType:

  • None no automatic set up (default)

  • LVM create a LVM (thick) storage pool

  • LVMTHIN create a LVM thin storage pool

  • ZFS create a ZFS based storage pool (UNTESTED)
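For example, a hedged helm invocation enabling this (deprecated) automatic set up of thin LVM pools:

helm install linstor-op linstor/linstor --set operator.satelliteSet.automaticStorageType=LVMTHIN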

3.2.3. Securing deployment

This section describes the different options for enabling security features available when using this operator. The following guides assume the operator is installed using Helm

Secure communication with an existing etcd instance

Secure communication to an etcd instance can be enabled by providing a CA certificate to the operator in form of a kubernetes secret. The secret has to contain the key ca.pem with the PEM encoded CA certificate as value.

The secret can then be passed to the controller by passing the following argument to helm install

--set operator.controller.dbCertSecret=<secret name>
Authentication with etcd using certificates

If you want to use TLS certificates to authenticate with an etcd database, you need to set the following option on helm install:

--set operator.controller.dbUseClientCert=true

If this option is active, the secret specified in the above section must contain two additional keys:

  • client.cert PEM formatted certificate presented to etcd for authentication

  • client.key private key in PKCS8 format, matching the above client certificate

Keys can be converted into PKCS8 format using openssl:

openssl pkcs8 -topk8 -nocrypt -in client-key.pem -out client-key.pkcs8
Configuring secure communication between LINSTOR components

The default communication between LINSTOR components is not secured by TLS. If this is needed for your setup, follow these steps:

  • Create private keys in the java keystore format, one for the controller, one for all satellites:

keytool -keyalg rsa -keysize 2048 -genkey -keystore satellite-keys.jks -storepass linstor -alias satellite -dname "CN=XX, OU=satellite, O=Example, L=XX, ST=XX, C=X"
keytool -keyalg rsa -keysize 2048 -genkey -keystore control-keys.jks -storepass linstor -alias control -dname "CN=XX, OU=control, O=Example, L=XX, ST=XX, C=XX"
  • Create a trust store with the public keys that each component needs to trust:

  • Controller needs to trust the satellites

  • Nodes need to trust the controller

    keytool -importkeystore -srcstorepass linstor -deststorepass linstor -srckeystore control-keys.jks -destkeystore satellite-trust.jks
    keytool -importkeystore -srcstorepass linstor -deststorepass linstor -srckeystore satellite-keys.jks -destkeystore control-trust.jks
  • Create kubernetes secrets that can be passed to the controller and satellite pods

    kubectl create secret generic control-secret --from-file=keystore.jks=control-keys.jks --from-file=certificates.jks=control-trust.jks
    kubectl create secret generic satellite-secret --from-file=keystore.jks=satellite-keys.jks --from-file=certificates.jks=satellite-trust.jks
  • Pass the names of the created secrets to helm install

    --set operator.satelliteSet.sslSecret=satellite-secret --set operator.controller.sslSecret=control-secret
It is currently NOT possible to change the keystore password. LINSTOR expects the passwords to be linstor. This is a current limitation of LINSTOR.
Configuring secure communications for the LINSTOR API

Various components need to talk to the LINSTOR controller via its REST interface. This interface can be secured via HTTPS, which automatically includes authentication. For HTTPS+authentication to work, each component needs access to:

  • A private key

  • A certificate based on the key

  • A trusted certificate, used to verify that other components are trustworthy

The next sections will guide you through creating all required components.

Creating the private keys

Private keys can be created using java’s keytool

keytool -keyalg rsa -keysize 2048 -genkey -keystore controller.pkcs12 -storetype pkcs12 -storepass linstor -ext san=dns:linstor-op-cs.default.svc -dname "CN=XX, OU=controller, O=Example, L=XX, ST=XX, C=X" -validity 5000
keytool -keyalg rsa -keysize 2048 -genkey -keystore client.pkcs12 -storetype pkcs12 -storepass linstor -dname "CN=XX, OU=client, O=Example, L=XX, ST=XX, C=XX" -validity 5000

The clients need the private key and certificate in a different format, so we need to convert them:

openssl pkcs12 -in client.pkcs12 -passin pass:linstor -out client.cert -clcerts -nokeys
openssl pkcs12 -in client.pkcs12 -passin pass:linstor -out client.key -nocerts -nodes
The alias specified for the controller key (i.e. -ext san=dns:linstor-op-cs.default.svc) has to exactly match the service name created by the operator. When using helm, this is always of the form <release-name>-cs.<release-namespace>.svc.
It is currently NOT possible to change the keystore password. LINSTOR expects the passwords to be linstor. This is a current limitation of LINSTOR
Create the trusted certificates

For the controller to trust the clients, we can use the following command to create a truststore, importing the client certificate

keytool -importkeystore -srcstorepass linstor -srckeystore client.pkcs12 -deststorepass linstor -deststoretype pkcs12 -destkeystore controller-trust.pkcs12

For the client, we have to convert the controller certificate into a different format

openssl pkcs12 -in controller.pkcs12 -passin pass:linstor -out ca.pem -clcerts -nokeys
Create Kubernetes secrets

Now you can create secrets for the controller and for clients:

kubectl create secret generic http-controller --from-file=keystore.jks=controller.pkcs12 --from-file=truststore.jks=controller-trust.pkcs12
kubectl create secret generic http-client --from-file=ca.pem=ca.pem --from-file=client.cert=client.cert --from-file=client.key=client.key

The names of the secrets can be passed to helm install to configure all clients to use https.

--set linstorHttpsControllerSecret=http-controller  --set linstorHttpsClientSecret=http-client
Automatically set the passphrase for encrypted volumes

Linstor can be used to create encrypted volumes using LUKS. The passphrase used when creating these volumes can be set via a secret:

kubectl create secret generic linstor-pass --from-literal=MASTER_PASSPHRASE=<password>

On install, add the following arguments to the helm command:

--set operator.controller.luksSecret=linstor-pass
Helm Install Examples

All the below examples use the following sp-values.yaml file. Feel free to adjust this for your uses and environment. See Configuring storage pool creation for further details.

operator:
  satelliteSet:
    storagePools:
      lvmThinPools:
      - name: lvm-thin
        thinVolume: thinpool
        volumeGroup: ""
        devicePaths:
        - /dev/sdb

Default install. Please note this does not set up any persistence for the backing etcd key-value store.

This is not suggested for any use outside of testing.
kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-password=<YOUR_PASSWORD>
helm repo add linstor https://charts.linstor.io
helm install linstor-op linstor/linstor

Install with LINSTOR storage-pools defined at install via sp-values.yaml, persistent hostPath volumes, 3 etcd replicas, and by compiling the DRBD kernel modules for the host kernels.

This should be adequate for most basic deployments. Please note that this deployment does not use the pre-compiled DRBD kernel modules, just to make the command more portable. Using the pre-compiled binaries will make for a much faster install and deployment. Using the Compile option is not suggested for use in large Kubernetes clusters.

kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-password=<YOUR_PASSWORD>
helm repo add linstor https://charts.linstor.io
helm install linstor-etcd linstor/pv-hostpath --set "nodes={<NODE0>,<NODE1>,<NODE2>}"
helm install -f sp-values.yaml linstor-op linstor/linstor --set etcd.replicas=3 --set operator.satelliteSet.kernelModuleInjectionMode=Compile

Install with LINSTOR storage-pools defined at install via sp-values.yaml, use an already created PostgreSQL DB (preferably clustered), instead of etcd, and use already compiled kernel modules for DRBD. Additionally, we’ll disable the Stork scheduler in this example.

The PostgreSQL database in this particular example is reachable via a service endpoint named postgres. PostgreSQL itself is configured with POSTGRES_DB=postgresdb, POSTGRES_USER=postgresadmin, and POSTGRES_PASSWORD=admin123

kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-email=<YOUR_EMAIL> --docker-password=<YOUR_PASSWORD>
helm repo add linstor https://charts.linstor.io
helm install -f sp-values.yaml linstor-op linstor/linstor --set etcd.enabled=false --set "operator.controller.dbConnectionURL=jdbc:postgresql://postgres/postgresdb?user=postgresadmin&password=admin123" --set stork.enabled=false
Terminating Helm deployment

To protect the storage infrastructure of the cluster from the accidental deletion of vital components, it is necessary to perform some manual steps before deleting a Helm deployment.

  1. Delete all volume claims managed by LINSTOR components. You can use the following command to get a list of volume claims managed by LINSTOR. After checking that none of the listed volumes still hold needed data, you can delete them using the generated kubectl delete command.

    $ kubectl get pvc --all-namespaces -o=jsonpath='{range .items[?(@.metadata.annotations.volume\.beta\.kubernetes\.io/storage-provisioner=="linstor.csi.linbit.com")]}kubectl delete pvc --namespace {.metadata.namespace} {.metadata.name}{"\n"}{end}'
    kubectl delete pvc --namespace default data-mysql-0
    kubectl delete pvc --namespace default data-mysql-1
    kubectl delete pvc --namespace default data-mysql-2
    These volumes, once deleted, cannot be recovered.
  2. Delete the LINSTOR controller and satellite resources.

    Deployment of LINSTOR satellite and controller is controlled by the LinstorSatelliteSet and LinstorController resources. You can delete the resources associated with your deployment using kubectl

    kubectl delete linstorcontroller <helm-deploy-name>-cs
    kubectl delete linstorsatelliteset <helm-deploy-name>-ns

    After a short wait, the controller and satellite pods should terminate. If they continue to run, you can check the above resources for errors (they are only removed after all associated pods terminate)

  3. Delete the Helm deployment.

    If you removed all PVCs and all LINSTOR pods have terminated, you can uninstall the helm deployment

    helm uninstall linstor-op
    Due to Helm’s current policy, the Custom Resource Definitions named LinstorController and LinstorSatelliteSet will not be deleted by the command. More information regarding Helm’s current position on CRDs can be found here.

3.2.4. Advanced deployment options

The helm charts provide a set of further customization options for advanced use cases.

global:
  imagePullPolicy: IfNotPresent # empty pull policy means k8s default is used ("always" if tag == ":latest", "ifnotpresent" else) (1)
  setSecurityContext: true # Force non-privileged containers to run as non-root users
# Dependency charts
etcd:
  persistentVolume:
    enabled: true
    storage: 1Gi
  replicas: 1 # How many instances of etcd will be added to the initial cluster. (2)
  resources: {} # resource requirements for etcd containers (3)
  image:
    repository: gcr.io/etcd-development/etcd
    tag: v3.4.15
csi-snapshotter:
  enabled: true # <- enable to add k8s snapshotting CRDs and controller. Needed for CSI snapshotting
  image: k8s.gcr.io/sig-storage/snapshot-controller:v3.0.3
  replicas: 1 (2)
  resources: {} # resource requirements for the cluster snapshot controller. (3)
stork:
  enabled: true
  storkImage: docker.io/openstorage/stork:2.6.2
  schedulerImage: k8s.gcr.io/kube-scheduler-amd64
  schedulerTag: ""
  replicas: 1 (2)
  storkResources: {} # resources requirements for the stork plugin containers (3)
  schedulerResources: {} # resource requirements for the kube-scheduler containers (3)
  podsecuritycontext: {}
csi:
  enabled: true
  pluginImage: "drbd.io/linstor-csi:v0.13.0"
  csiAttacherImage: k8s.gcr.io/sig-storage/csi-attacher:v3.1.0
  csiLivenessProbeImage: k8s.gcr.io/sig-storage/livenessprobe:v2.2.0
  csiNodeDriverRegistrarImage: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.1.0
  csiProvisionerImage: k8s.gcr.io/sig-storage/csi-provisioner:v2.1.1
  csiSnapshotterImage: k8s.gcr.io/sig-storage/csi-snapshotter:v3.0.3
  csiResizerImage: k8s.gcr.io/sig-storage/csi-resizer:v1.1.0
  controllerReplicas: 1 (2)
  nodeAffinity: {} (4)
  nodeTolerations: [] (4)
  controllerAffinity: {} (4)
  controllerTolerations: [] (4)
  enableTopology: false
  resources: {} (3)
priorityClassName: ""
drbdRepoCred: drbdiocred
linstorHttpsControllerSecret: "" # <- name of secret containing linstor server certificates+key.
linstorHttpsClientSecret: "" # <- name of secret containing linstor client certificates+key.
controllerEndpoint: "" # <- override to the generated controller endpoint. use if controller is not deployed via operator
psp:
  privilegedRole: ""
  unprivilegedRole: ""
operator:
  replicas: 1 # <- number of replicas for the operator deployment (2)
  image: "drbd.io/linstor-operator:v1.5.0"
  affinity: {} (4)
  tolerations: [] (4)
  resources: {} (3)
  podsecuritycontext: {}
  controller:
    enabled: true
    controllerImage: "drbd.io/linstor-controller:v1.12.3"
    luksSecret: ""
    dbCertSecret: ""
    dbUseClientCert: false
    sslSecret: ""
    affinity: {} (4)
    tolerations: (4)
      - key: node-role.kubernetes.io/master
        operator: "Exists"
        effect: "NoSchedule"
    resources: {} (3)
    replicas: 1 (2)
    additionalEnv: [] (5)
    additionalProperties: {} (6)
  satelliteSet:
    enabled: true
    satelliteImage: "drbd.io/linstor-satellite:v1.12.3"
    storagePools: {}
    sslSecret: ""
    automaticStorageType: None
    affinity: {} (4)
    tolerations: [] (4)
    resources: {} (3)
    monitoringImage: "drbd.io/drbd-reactor:v0.3.0"
    kernelModuleInjectionImage: "drbd.io/drbd9-rhel7:v9.0.29"
    kernelModuleInjectionMode: ShippedModules
    kernelModuleInjectionResources: {} (3)
    additionalEnv: [] (5)
haController:
  enabled: true
  image: drbd.io/linstor-k8s-ha-controller:v0.1.3
  affinity: {} (4)
  tolerations: [] (4)
  resources: {} (3)
  replicas: 1 (2)
1 Sets the pull policy for all images.
2 Controls the number of replicas for each component.
3 Set container resource requests and limits. See the kubernetes docs. Most containers need a minimal amount of resources, except for:
  • etcd.resources See the etcd docs

  • operator.controller.resources Around 700MiB memory is required

  • operator.satelliteSet.resources Around 700MiB memory is required

  • operator.satelliteSet.kernelModuleInjectionResources If kernel modules are compiled, 1GiB of memory is required.

4 Affinity and toleration determine where pods are scheduled on the cluster. See the kubernetes docs on affinity and toleration. This may be especially important for the operator.satelliteSet and csi.node* values. To schedule a pod using a LINSTOR persistent volume, the node requires a running LINSTOR satellite and LINSTOR CSI pod.
5 Sets additional environment variables to pass to the Linstor Controller and Satellites. Uses the same format as the env value of a container.
6 Sets additional properties on the Linstor Controller. Expects a simple mapping of <property-key>: <value>.
High Availability Deployment

To create a High Availability deployment of all components, take a look at the upstream guide. The default values are chosen so that scaling the components to multiple replicas ensures that the replicas are placed on different nodes. This ensures that a single node failure will not interrupt the service.
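For example, assuming the Helm release is named linstor-op as elsewhere in this chapter, several replica counts documented in the chart values above could be raised in a single command. This is only a sketch; choose counts that fit your cluster and pass any other customizations you normally use:

helm upgrade linstor-op linstor/linstor \
  --set operator.replicas=2 \
  --set operator.controller.replicas=2 \
  --set csi.controllerReplicas=2 \
  --set haController.replicas=3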

Monitoring with Prometheus

Starting with Linstor Operator v1.5.0, you can use Prometheus to monitor Linstor components. The operator will set up monitoring containers alongside the existing components and make them available as a Service.

If you use the Prometheus Operator, the Linstor Operator will also set up ServiceMonitor instances. The metrics will automatically be collected by the Prometheus instance associated with the operator, assuming it is configured to watch the Piraeus namespace.

To disable exporting of metrics, set operator.satelliteSet.monitoringImage to an empty value.
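With Helm, for example, this can be done by overriding the corresponding chart value (release name linstor-op assumed):

helm upgrade linstor-op linstor/linstor --set operator.satelliteSet.monitoringImage=""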

Linstor Controller Monitoring

The Linstor Controller exports cluster-wide metrics. Metrics are exported on the existing controller service, using the path /metrics.
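For a quick manual check, you can port-forward the controller service and fetch the metrics endpoint. The service name below is an assumption based on a Helm release named linstor-op; adjust it to the service names in your cluster:

kubectl port-forward svc/linstor-op-cs 3370:3370
curl http://localhost:3370/metrics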

DRBD Resource Monitoring

All satellites are bundled with a secondary container that uses drbd-reactor to export metrics directly from DRBD. The metrics are available on port 9942. For convenience, a headless service named <linstorsatelliteset-name>-monitoring is provided.

If you want to disable the monitoring container, set monitoringImage to "" in your LinstorSatelliteSet resource.

3.2.5. Deploying with an external LINSTOR controller

The operator can configure the satellites and CSI plugin to use an existing LINSTOR setup. This can be useful in cases where the storage infrastructure is separate from the Kubernetes cluster. Volumes can be provisioned in diskless mode on the Kubernetes nodes while the storage nodes will provide the backing disk storage.

To skip the creation of a LINSTOR Controller deployment and configure the other components to use your existing LINSTOR Controller, use the following options when running helm install (a combined example follows the list):

  • operator.controller.enabled=false This disables creation of the LinstorController resource

  • operator.etcd.enabled=false Since no LINSTOR Controller will run on Kubernetes, no database is required.

  • controllerEndpoint=<url-of-linstor-controller> The HTTP endpoint of the existing LINSTOR Controller. For example: http://linstor.storage.cluster:3370/
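Put together, such an installation might look like the following sketch, where the endpoint URL is only an example:

helm install linstor-op linstor/linstor \
  --set operator.controller.enabled=false \
  --set operator.etcd.enabled=false \
  --set controllerEndpoint=http://linstor.storage.cluster:3370/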

After all pods are ready, you should see the Kubernetes cluster nodes as satellites in your LINSTOR setup.

Your Kubernetes nodes must be reachable by the controller and the storage nodes via their IP addresses.

Create a storage class referencing an existing storage pool on your storage nodes.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-on-k8s
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "3"
  storagePool: existing-storage-pool
  resourceGroup: linstor-on-k8s

You can provision new volumes by creating PVCs using your storage class. The volumes will first be placed only on nodes with the given storage pool, i.e. your storage infrastructure. Once you want to use the volume in a pod, LINSTOR CSI will create a diskless resource on the Kubernetes node and attach over the network to the diskful resource.

3.2.6. Deploying with the Piraeus Operator

The community supported edition of the LINSTOR deployment in Kubernetes is called Piraeus. The Piraeus project provides an operator for deployment.

3.3. Interacting with LINSTOR in Kubernetes

The Controller pod includes a LINSTOR Client, making it easy to interact directly with LINSTOR. For instance:

kubectl exec deployment/linstor-op-cs-controller -- linstor storage-pool list

For a convenient shortcut to the above command, download kubectl-linstor and install it alongside kubectl. Then you can use kubectl linstor to get access to the complete Linstor CLI.

$ kubectl linstor node list
╭───────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node                                      ┊ NodeType   ┊ Addresses                   ┊ State  ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════╡
┊ kube-node-01.test                         ┊ SATELLITE  ┊ 10.43.224.26:3366 (PLAIN)   ┊ Online ┊
┊ kube-node-02.test                         ┊ SATELLITE  ┊ 10.43.224.27:3366 (PLAIN)   ┊ Online ┊
┊ kube-node-03.test                         ┊ SATELLITE  ┊ 10.43.224.28:3366 (PLAIN)   ┊ Online ┊
┊ linstor-op-cs-controller-85b4f757f5-kxdvn ┊ CONTROLLER ┊ 172.24.116.114:3366 (PLAIN) ┊ Online ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────╯

It also expands references to PVCs to the matching Linstor resource.

$ kubectl linstor resource list -r pvc:my-namespace/demo-pvc-1 --all
pvc:my-namespace/demo-pvc-1 -> pvc-2f982fb4-bc05-4ee5-b15b-688b696c8526
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName                             ┊ Node              ┊ Port ┊ Usage  ┊ Conns ┊    State   ┊ CreatedOn           ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ pvc-2f982fb4-bc05-4ee5-b15b-688b696c8526 ┊ kube-node-01.test ┊ 7000 ┊ Unused ┊ Ok    ┊   UpToDate ┊ 2021-02-05 09:16:09 ┊
┊ pvc-2f982fb4-bc05-4ee5-b15b-688b696c8526 ┊ kube-node-02.test ┊ 7000 ┊ Unused ┊ Ok    ┊ TieBreaker ┊ 2021-02-05 09:16:08 ┊
┊ pvc-2f982fb4-bc05-4ee5-b15b-688b696c8526 ┊ kube-node-03.test ┊ 7000 ┊ InUse  ┊ Ok    ┊   UpToDate ┊ 2021-02-05 09:16:09 ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

It also expands references of the form pod:[<namespace>/]<podname> into a list of resources in use by the pod.

This should only be necessary for investigating problems and accessing advanced functionality. Regular operation such as creating volumes should be achieved via the Kubernetes integration.

3.4. Basic Configuration and Deployment

Once all linstor-csi Pods are up and running, we can provision volumes using the usual Kubernetes workflows.

Configuring the behavior and properties of LINSTOR volumes deployed via Kubernetes is accomplished via the use of StorageClasses.

The “resourceGroup” parameter is mandatory. Usually you want it to be unique and the same as the storage class name.

Below is the simplest practical StorageClass that can be used to deploy volumes:

Listing 1. linstor-basic-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # The name used to identify this StorageClass.
  name: linstor-basic-storage-class
  # The name used to match this StorageClass with a provisioner.
  # linstor.csi.linbit.com is the name that the LINSTOR CSI plugin uses to identify itself
provisioner: linstor.csi.linbit.com
parameters:
  # LINSTOR will provision volumes from the drbdpool storage pool configured
  # on the satellite nodes in the LINSTOR cluster specified in the plugin's deployment.
  storagePool: "drbdpool"
  resourceGroup: "linstor-basic-storage-class"
  # Setting a fstype is required for "fsGroup" permissions to work correctly.
  # Currently supported: xfs/ext4
  csi.storage.k8s.io/fstype: xfs

DRBD options can be set as well in the parameters section. Valid keys are defined in the LINSTOR REST-API (e.g., DrbdOptions/Net/allow-two-primaries: "yes").

We can create the StorageClass with the following command:

kubectl create -f linstor-basic-sc.yaml

Now that our StorageClass is created, we can now create a PersistentVolumeClaim which can be used to provision volumes known both to Kubernetes and LINSTOR:

Listing 2. my-first-linstor-volume-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-first-linstor-volume
spec:
  storageClassName: linstor-basic-storage-class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi

We can create the PersistentVolumeClaim with the following command:

kubectl create -f my-first-linstor-volume-pvc.yaml

This will create a PersistentVolumeClaim known to Kubernetes, which will have a PersistentVolume bound to it. Additionally, LINSTOR will now create this volume according to the configuration defined in the linstor-basic-storage-class StorageClass. The LINSTOR volume’s name will be a UUID prefixed with csi-. This volume can be observed with the usual linstor resource list. Once that volume is created, we can attach it to a Pod. The following Pod spec will spawn a Fedora container with our volume attached that busy waits so it is not unscheduled before we can interact with it:

Listing 3. my-first-linstor-volume-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: fedora
  namespace: default
spec:
  containers:
  - name: fedora
    image: fedora
    command: [/bin/bash]
    args: ["-c", "while true; do sleep 10; done"]
    volumeMounts:
    - name: my-first-linstor-volume
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: my-first-linstor-volume
    persistentVolumeClaim:
      claimName: "my-first-linstor-volume"

We can create the Pod with the following command:

kubectl create -f my-first-linstor-volume-pod.yaml

Running kubectl describe pod fedora can be used to confirm that Pod scheduling and volume attachment succeeded.

To remove a volume, please ensure that no pod is using it and then delete the PersistentVolumeClaim via kubectl. For example, to remove the volume that we just made, run the following commands, noting that the Pod must be unscheduled before the PersistentVolumeClaim will be removed:

kubectl delete pod fedora # unschedule the pod.

kubectl get pod -w # wait for pod to be unscheduled

kubectl delete pvc my-first-linstor-volume # remove the PersistentVolumeClaim, the PersistentVolume, and the LINSTOR Volume.

3.4.1. Available parameters in a StorageClass

The following storage class contains all currently available parameters to configure the provisioned storage:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: full-example
provisioner: linstor.csi.linbit.com
parameters:
  # CSI related parameters
  csi.storage.k8s.io/fstype: xfs
  # LINSTOR parameters
  autoPlace: "2"
  placementCount: "2"
  resourceGroup: "full-example"
  storagePool: "my-storage-pool"
  disklessStoragePool: "DfltDisklessStorPool"
  layerList: "drbd storage"
  placementPolicy: "AutoPlace"
  allowRemoteVolumeAccess: "true"
  encryption: "true"
  nodeList: "diskful-a diskful-b"
  clientList: "diskless-a diskless-b"
  replicasOnSame: "zone=a"
  replicasOnDifferent: "rack"
  disklessOnRemaining: "false"
  doNotPlaceWithRegex: "tainted.*"
  fsOpts: "nodiscard"
  mountOpts: "noatime"
  postMountXfsOpts: "extsize 2m"
  # Linstor properties
  property.linstor.csi.linbit.com/*: <x>
  # DRBD parameters
  DrbdOptions/*: <x>

3.4.2. csi.storage.k8s.io/fstype

Sets the file system type to create for volumeMode: FileSystem PVCs. Currently supported are:

  • ext4 (default)

  • xfs

3.4.3. autoPlace

autoPlace is an integer that determines the number of replicas a volume of this StorageClass will have. For instance, autoPlace: "3" will produce volumes with three-way replication. If neither autoPlace nor nodeList is set, volumes will be automatically placed on one node.

If you use this option, you must not use nodeList.
You have to use quotes, otherwise Kubernetes will complain about a malformed StorageClass.
This option (and all options which affect autoplacement behavior) modifies the number of LINSTOR nodes on which the underlying storage for volumes will be provisioned and is orthogonal to which kubelets those volumes will be accessible from.

3.4.4. placementCount

placementCount is an alias for autoPlace.

3.4.5. resourceGroup

The LINSTOR Resource Group (RG) to associate with this StorageClass. If not set, a new RG will be created for each new PVC.

3.4.6. storagePool

storagePool is the name of the LINSTOR storage pool that will be used to provide storage to the newly-created volumes.

Only nodes configured with this same storage pool will be considered for autoplacement. Likewise, for StorageClasses using nodeList, all nodes specified in that list must have this storage pool configured on them.

3.4.7. disklessStoragePool

disklessStoragePool is an optional parameter that only affects LINSTOR volumes assigned disklessly to kubelets, i.e., as clients. If you have a custom diskless storage pool defined in LINSTOR, you’ll specify that here.

3.4.8. layerList

A comma-separated list of layers to use for the created volumes. The available layers and their order are described towards the end of this section. Defaults to drbd,storage.

3.4.9. placementPolicy

Select from one of the available volume schedulers:

  • AutoPlace, the default: Use LINSTOR autoplace, influenced by replicasOnSame and replicasOnDifferent

  • FollowTopology: Use CSI Topology information to place at least one volume in each “preferred” zone. Only useable if CSI Topology is enabled.

  • Manual: Use only the nodes listed in nodeList and clientList.

  • Balanced: EXPERIMENTAL Place volumes across failure domains, using the least used storage pool on each selected node.

3.4.10. allowRemoteVolumeAccess

When set to "false", this disables remote access to volumes, which implies that volumes can only be accessed from the initial set of nodes selected on creation. In that case, CSI Topology processing is required to place pods on the correct nodes.

3.4.11. encryption

encryption is an optional parameter that determines whether to encrypt volumes. LINSTOR must be configured for encryption for this to work properly.

3.4.12. nodeList

nodeList is a list of nodes for volumes to be assigned to. This will assign the volume to each node and it will be replicated among all of them. This can also be used to select a single node by hostname, but it’s more flexible to use replicasOnSame to select a single node.

If you use this option, you must not use autoPlace.
This option determines on which LINSTOR nodes the underlying storage for volumes will be provisioned and is orthogonal to which kubelets these volumes will be accessible from.

3.4.13. clientList

clientList is a list of nodes for diskless volumes to be assigned to. Use in conjunction with nodeList.
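As a hedged sketch, a StorageClass combining these parameters with manual placement could look like the following; the class and node names are hypothetical:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-manual-placement
provisioner: linstor.csi.linbit.com
parameters:
  resourceGroup: "linstor-manual-placement"
  storagePool: "drbdpool"
  placementPolicy: "Manual"
  nodeList: "diskful-a diskful-b"
  clientList: "diskless-a diskless-b"
  csi.storage.k8s.io/fstype: xfs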

3.4.14. replicasOnSame

replicasOnSame is a list of key or key=value items used as autoplacement selection labels when autoplace is used to determine where to provision storage. These labels correspond to LINSTOR node properties.

LINSTOR node properties are different from kubernetes node labels. You can see the properties of a node by running linstor node list-properties <nodename>. You can also set additional properties (“auxiliary properties”): linstor node set-property <nodename> --aux <key> <value>.

Let’s explore this behavior with examples, assuming a LINSTOR cluster in which node-a is configured with the auxiliary properties zone=z1 and role=backups, while node-b is configured with only zone=z1.
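The auxiliary properties assumed in this scenario could be set with the set-property command shown above:

linstor node set-property node-a --aux zone z1
linstor node set-property node-a --aux role backups
linstor node set-property node-b --aux zone z1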

If we configure a StorageClass with autoPlace: "1" and replicasOnSame: "zone=z1 role=backups", then all volumes created from that StorageClass will be provisioned on node-a, since that is the only node with all of the correct key=value pairs in the LINSTOR cluster. This is the most flexible way to select a single node for provisioning.

This guide assumes LINSTOR CSI version 0.10.0 or newer. All properties referenced in replicasOnSame and replicasOnDifferent are interpreted as auxiliary properties. If you are using an older version of LINSTOR CSI, you need to add the Aux/ prefix to all property names. So replicasOnSame: "zone=z1" would be replicasOnSame: "Aux/zone=z1". Using Aux/ manually will continue to work on newer LINSTOR CSI versions.

If we configure a StorageClass with autoPlace: "1" and replicasOnSame: "zone=z1", then volumes will be provisioned on either node-a or node-b as they both have the zone=z1 aux prop.

If we configure a StorageClass with autoPlace: "2" and replicasOnSame: "zone=z1 role=backups", then provisioning will fail, as there are not two or more nodes that have the appropriate auxiliary properties.

If we configure a StorageClass with autoPlace: "2" and replicasOnSame: "zone=z1", then volumes will be provisioned on both node-a and node-b as they both have the zone=z1 aux prop.

You can also use a property key without providing a value to ensure all replicas are placed on nodes with the same property value, without caring about the particular value. Assume there are four nodes: node-a1 and node-a2 are configured with zone=a, while node-b1 and node-b2 are configured with zone=b. Using autoPlace: "2" and replicasOnSame: "zone" will place the replicas either on node-a1 and node-a2, or on node-b1 and node-b2.

3.4.15. replicasOnDifferent

replicasOnDifferent takes a list of properties to consider, same as replicasOnSame. There are two modes of using replicasOnDifferent:

  • Preventing volume placement on specific nodes:

    If a value is given for the property, the nodes which have that property-value pair assigned will be considered last.

    Example: replicasOnDifferent: "no-csi-volumes=true" will place no volume on any node with property no-csi-volumes=true unless there are not enough other nodes to fulfill the autoPlace setting.

  • Distribute volumes across nodes with different values for the same key:

    If no property value is given, LINSTOR will place the volumes across nodes with different values for that property if possible.

    Example: Assuming there are 4 nodes, node-a1 and node-a2 are configured with zone=a. node-b1 and node-b2 are configured with zone=b. Using a StorageClass with autoPlace: "2" and replicasOnDifferent: "zone", LINSTOR will create one replica on either node-a1 or node-a2 and one replica on either node-b1 or node-b2.

3.4.16. disklessOnRemaining

Create a diskless resource on all nodes that were not assigned a diskful resource.

3.4.17. doNotPlaceWithRegex

Do not place the resource on a node which has a resource with a name matching the regex.

3.4.18. fsOpts

fsOpts is an optional parameter that passes options to the volume’s filesystem at creation time.

Please note these values are specific to your chosen filesystem.

3.4.19. mountOpts

mountOpts is an optional parameter that passes options to the volume’s filesystem at mount time.

3.4.20. postMountXfsOpts

Extra arguments to pass to xfs_io, which is called right before the first use of the volume.

3.4.21. property.linstor.csi.linbit.com/*

Parameters starting with property.linstor.csi.linbit.com/ are translated to Linstor properties that are set on the Resource Group associated with the StorageClass.

For example, to set DrbdOptions/auto-quorum to disabled, use:

property.linstor.csi.linbit.com/DrbdOptions/auto-quorum: disabled

The full list of options is available here.
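For illustration, here is a sketch of a StorageClass that sets this property on its Resource Group; the class name is hypothetical, the rest mirrors the basic example from earlier:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-no-auto-quorum
provisioner: linstor.csi.linbit.com
parameters:
  storagePool: "drbdpool"
  resourceGroup: "linstor-no-auto-quorum"
  csi.storage.k8s.io/fstype: xfs
  property.linstor.csi.linbit.com/DrbdOptions/auto-quorum: disabled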

3.4.22. DrbdOptions/*: <x>

This option is deprecated; use the more general property.linstor.csi.linbit.com/* form.

Advanced DRBD options to pass to LINSTOR. For example, to change the replication protocol, use DrbdOptions/Net/protocol: "A".

3.5. Snapshots

Creating snapshots and creating new volumes from snapshots is done via the use of VolumeSnapshots, VolumeSnapshotClasses, and PVCs.

3.5.1. Adding snapshot support

LINSTOR supports the volume snapshot feature, which is currently in beta. To use it, you need to install a cluster-wide snapshot controller. This is done either by the cluster provider, or you can use the LINSTOR chart.

By default, the LINSTOR chart will install its own snapshot controller. This can lead to conflicts in some cases:

  • the cluster already has a snapshot controller

  • the cluster does not meet the minimal version requirements (>= 1.17)

In such a case, installation of the snapshot controller can be disabled:

--set csi-snapshotter.enabled=false

3.5.2. Using volume snapshots

First, we create a VolumeSnapshotClass:

Listing 4. my-first-linstor-snapshot-class.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: my-first-linstor-snapshot-class
driver: linstor.csi.linbit.com
deletionPolicy: Delete

Create the VolumeSnapshotClass with kubectl:

kubectl create -f my-first-linstor-snapshot-class.yaml

Now we will create a volume snapshot for the volume that we created above. This is done with a VolumeSnapshot:

Listing 5. my-first-linstor-snapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: my-first-linstor-snapshot
spec:
  volumeSnapshotClassName: my-first-linstor-snapshot-class
  source:
    persistentVolumeClaimName: my-first-linstor-volume

Create the VolumeSnapshot with kubectl:

kubectl create -f my-first-linstor-snapshot.yaml

You can check that the snapshot creation was successful:

kubectl describe volumesnapshots.snapshot.storage.k8s.io my-first-linstor-snapshot
...
Spec:
  Source:
    Persistent Volume Claim Name:  my-first-linstor-volume
  Volume Snapshot Class Name:      my-first-linstor-snapshot-class
Status:
  Bound Volume Snapshot Content Name:  snapcontent-b6072ab7-6ddf-482b-a4e3-693088136d2c
  Creation Time:                       2020-06-04T13:02:28Z
  Ready To Use:                        true
  Restore Size:                        500Mi

Finally, we’ll create a new volume from the snapshot with a PVC.

Listing 6. my-first-linstor-volume-from-snapshot.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-first-linstor-volume-from-snapshot
spec:
  storageClassName: linstor-basic-storage-class
  dataSource:
    name: my-first-linstor-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi

Create the PVC with kubectl:

kubectl create -f my-first-linstor-volume-from-snapshot.yaml

3.6. Volume Accessibility

LINSTOR volumes are typically accessible both locally and over the network.

By default, the CSI plugin will attach volumes directly if the Pod happens to be scheduled on a kubelet where its underlying storage is present. However, Pod scheduling does not currently take volume locality into account. The replicasOnSame parameter can be used to restrict where the underlying storage may be provisioned, if locally attached volumes are desired.

See placementPolicy to see how this default behavior can be modified.

3.7. Volume Locality Optimization using Stork

Stork is a scheduler extender plugin for Kubernetes which allows a storage driver to give the Kubernetes scheduler hints about where to place a new pod so that it is optimally located for storage performance. You can learn more about the project on its GitHub page.

The next Stork release will include the LINSTOR driver by default. In the meantime, you can use a custom-built Stork container by LINBIT which includes a LINSTOR driver, available on Docker Hub.

3.7.1. Using Stork

By default, the operator will install the components required for Stork, and register a new scheduler called stork with Kubernetes. This new scheduler can be used to place pods near to their volumes.

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  schedulerName: stork (1)
  containers:
  - name: busybox
    image: busybox
    command: ["tail", "-f", "/dev/null"]
    volumeMounts:
    - name: my-first-linstor-volume
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: my-first-linstor-volume
    persistentVolumeClaim:
      claimName: "test-volume"
1 Add the name of the scheduler to your pod.

Deployment of the scheduler can be disabled using:

--set stork.enabled=false

3.8. Fast workload fail over using the High Availability Controller

The LINSTOR High Availability Controller (HA Controller) will speed up the fail over process for stateful workloads using LINSTOR for storage. It is deployed by default, and can be scaled to multiple replicas:

$ kubectl get pods -l app.kubernetes.io/name=linstor-op-ha-controller
NAME                                       READY   STATUS    RESTARTS   AGE
linstor-op-ha-controller-f496c5f77-fr76m   1/1     Running   0          89s
linstor-op-ha-controller-f496c5f77-jnqtc   1/1     Running   0          89s
linstor-op-ha-controller-f496c5f77-zcrqg   1/1     Running   0          89s
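The number of replicas corresponds to the haController.replicas value documented in the Helm chart options above. For example, to change it:

helm upgrade linstor-op linstor/linstor --set haController.replicas=3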

In the event of node failures, Kubernetes is very conservative in rescheduling stateful workloads. This means it can take more than 15 minutes for Pods to be moved from unreachable nodes. With the information available to DRBD and LINSTOR, this process can be sped up significantly.

The HA Controller enables fast fail over for workloads that meet the following criteria:

  • The Pods use DRBD-backed PersistentVolumes. The DRBD resources must make use of the quorum functionality; LINSTOR will configure this automatically for volumes with 2 or more replicas in clusters with at least 3 nodes.

  • The workload does not use any external resources in a way that could lead to a conflicting state if two instances try to use the external resource at the same time. While DRBD can ensure that only one instance can have write access to the storage, it cannot provide the same guarantee for external resources.

  • The Pod is marked with the linstor.csi.linbit.com/on-storage-lost: remove label.

3.8.1. Example

The following StatefulSet uses the HA Controller to manage fail over of a pod.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app
spec:
  serviceName: my-stateful-app
  selector:
    matchLabels:
      app.kubernetes.io/name: my-stateful-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: my-stateful-app
        linstor.csi.linbit.com/on-storage-lost: remove (1)
    ...
1 The label is applied to the Pod template, not the StatefulSet itself. The label was applied correctly if your Pod appears in the output of kubectl get pods -l linstor.csi.linbit.com/on-storage-lost=remove.

Deploy the set and wait for the pod to start:

$ kubectl get pod -o wide
NAME                                        READY   STATUS              RESTARTS   AGE     IP                NODE                    NOMINATED NODE   READINESS GATES
my-stateful-app-0                           1/1     Running             0          5m      172.31.0.1        node01.ha.cluster       <none>           <none>

Now assume the node running the Pod becomes unreachable. Shortly after, Kubernetes will mark the node as NotReady:

$ kubectl get nodes
NAME                    STATUS     ROLES     AGE    VERSION
master01.ha.cluster     Ready      master    12d    v1.19.4
master02.ha.cluster     Ready      master    12d    v1.19.4
master03.ha.cluster     Ready      master    12d    v1.19.4
node01.ha.cluster       NotReady   compute   12d    v1.19.4
node02.ha.cluster       Ready      compute   12d    v1.19.4
node03.ha.cluster       Ready      compute   12d    v1.19.4

After about 45 seconds, the Pod will be removed by the HA Controller and re-created by the StatefulSet:

$ kubectl get pod -o wide
NAME                                        READY   STATUS              RESTARTS   AGE     IP                NODE                    NOMINATED NODE   READINESS GATES
my-stateful-app-0                           0/1     ContainerCreating   0          3s      172.31.0.1        node02.ha.cluster       <none>           <none>
$ kubectl get events --sort-by=.metadata.creationTimestamp -w
...
0s          Warning   ForceDeleted              pod/my-stateful-app-0                                                                   pod deleted because a used volume is marked as failing
0s          Warning   ForceDetached             volumeattachment/csi-d2b994ff19d526ace7059a2d8dea45146552ed078d00ed843ac8a8433c1b5f6f   volume detached because it is marked as failing
...

3.9. Upgrading a LINSTOR Deployment on Kubernetes

A LINSTOR Deployment on Kubernetes can be upgraded to a new release using Helm.

Before upgrading to a new release, you should ensure you have an up-to-date backup of the LINSTOR database. If you are using the Etcd database packaged in the LINSTOR Chart, see here.

Upgrades using the LINSTOR Etcd deployment require etcd to use persistent storage. Only follow these steps if Etcd was deployed using etcd.persistentVolume.enabled=true.

Upgrades will update to new versions of the following components:

  • LINSTOR operator deployment

  • LINSTOR Controller

  • LINSTOR Satellite

  • LINSTOR CSI Driver

  • Etcd

  • Stork

Some versions require special steps; please take a look here. The main command to upgrade to a new LINSTOR operator version is:

helm repo update
helm upgrade linstor-op linstor/linstor

If you used any customizations on the initial install, pass the same options to helm upgrade. The options currently in use can be retrieved from Helm.

# Retrieve the currently set options
$ helm get values linstor-op
USER-SUPPLIED VALUES:
drbdRepoCred: drbdiocred
operator:
  satelliteSet:
    kernelModuleInjectionImage: drbd.io/drbd9-rhel8:v9.0.28
    storagePools:
      lvmThinPools:
      - devicePaths:
        - /dev/vdb
        name: thinpool
        thinVolume: thinpool
        volumeGroup: ""
# Save current options
$ helm get values linstor-op > orig.yaml
# modify values here as needed. for example selecting a newer DRBD image
$ vim orig.yaml
# Start the upgrade
$ helm upgrade linstor-op linstor/linstor -f orig.yaml

This triggers the rollout of new pods. After a short wait, all pods should be running and ready. Check that no errors are listed in the status section of LinstorControllers, LinstorSatelliteSets and LinstorCSIDrivers.

During the upgrade process, provisioning of volumes and attach/detach operations might not work. Existing volumes and volumes already in use by a pod will continue to work without interruption.

3.9.1. Upgrade instructions for specific versions

Some versions require special steps, see below.

Upgrade to v1.5

This version introduces a monitoring component for DRBD resources. This requires a new image and a replacement of the existing LinstorSatelliteSet CRD. Helm does not upgrade the CRDs on a chart upgrade; instead, run:

$ helm repo update
$ helm pull linstor/linstor --untar
$ kubectl replace -f linstor/crds/
customresourcedefinition.apiextensions.k8s.io/linstorcontrollers.linstor.linbit.com replaced
customresourcedefinition.apiextensions.k8s.io/linstorcsidrivers.linstor.linbit.com replaced
customresourcedefinition.apiextensions.k8s.io/linstorsatellitesets.linstor.linbit.com replaced

If you do not plan to use the provided monitoring, you still need to apply the above steps, otherwise you will get errors like the following:

Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(LinstorSatelliteSet.spec): unknown field "monitoringImage" in com.linbit.linstor.v1.LinstorSatelliteSet.spec
Some Helm versions fail to set the monitoring image even after replacing the CRDs. In that case, the in-cluster LinstorSatelliteSet will show an empty monitoringImage value. Edit the resource using kubectl edit linstorsatellitesets and set the value to drbd.io/drbd-reactor:v0.3.0 to enable monitoring.
Upgrade to v1.4

This version introduces a new default version for the Etcd image, so take extra care that Etcd is using persistent storage. Upgrading the Etcd image without persistent storage will corrupt the cluster.

If you are upgrading an existing cluster without making use of new Helm options, no additional steps are necessary.

If you plan to use the newly introduced additionalProperties and additionalEnv settings, you have to replace the installed CustomResourceDefinitions with newer versions, as Helm does not upgrade the CRDs on a chart upgrade:

$ helm pull linstor/linstor --untar
$ kubectl replace -f linstor/crds/
customresourcedefinition.apiextensions.k8s.io/linstorcontrollers.linstor.linbit.com replaced
customresourcedefinition.apiextensions.k8s.io/linstorcsidrivers.linstor.linbit.com replaced
customresourcedefinition.apiextensions.k8s.io/linstorsatellitesets.linstor.linbit.com replaced
Upgrade to v1.3

No additional steps necessary.

Upgrade to v1.2

LINSTOR operator v1.2 is supported on Kubernetes 1.17+. If you are using an older Kubernetes distribution, you may need to change the default settings, for example [the CSI provisioner](https://kubernetes-csi.github.io/docs/external-provisioner.html).

There is a known issue when updating the CSI components: the pods will not be updated to the newest image and the errors section of the LinstorCSIDrivers resource shows an error updating the DaemonSet. In this case, manually delete deployment/linstor-op-csi-controller and daemonset/linstor-op-csi-node. They will be re-created by the operator.

3.9.2. Creating Etcd Backups

To create a backup of the Etcd database and store it on your control host, run:

kubectl exec linstor-op-etcd-0 -- etcdctl snapshot save /tmp/save.db
kubectl cp linstor-op-etcd-0:/tmp/save.db save.db

These commands will create a file save.db on the machine you are running kubectl from.

4. LINSTOR Volumes in Openshift

This chapter describes the usage of LINSTOR in Openshift as managed by the operator and with volumes provisioned using the LINSTOR CSI plugin.

4.1. Openshift Overview

OpenShift is the official Red Hat developed and supported distribution of Kubernetes. As such, you can easily deploy Piraeus or the LINSTOR operator using Helm or via example yamls as mentioned in the previous chapter, LINSTOR Volumes in Kubernetes.

Some of the value of Red Hat’s Openshift is that it includes its own registry of supported and certified images and operators, in addition to a default and standard web console. This chapter describes how to install the Certified LINSTOR operator via these tools.

4.2. Deploying LINSTOR on Openshift

4.2.1. Before you Begin

LINBIT provides a certified LINSTOR operator via the RedHat marketplace. The operator eases deployment of LINSTOR on Kubernetes by installing DRBD, managing Satellite and Controller pods, and other related functions.

The operator itself is available from the Red Hat Marketplace.

Unlike deployment via the helm chart, the certified Openshift operator does not deploy the needed etcd cluster. You must deploy this yourself ahead of time. We do this via the etcd operator available on operatorhub.io.

It is advised that the etcd deployment uses persistent storage of some type. Either use an existing storage provisioner with a default StorageClass or simply use hostPath volumes.

Read the storage guide and configure a basic storage setup for LINSTOR.

Read the section on securing the deployment and configure as needed.

4.2.2. Deploying the operator pod

Once etcd and storage have been configured, we are ready to install the LINSTOR operator. You can find the LINSTOR operator via the left-hand control pane of the Openshift Web Console. Expand the “Operators” section and select “OperatorHub”. From here you need to find the LINSTOR operator. Either search for the term “LINSTOR” or filter only by “Marketplace” operators.

The LINSTOR operator can only watch for events and manage custom resources that are within the same namespace it is deployed within (OwnNamespace). This means the LINSTOR Controller, LINSTOR Satellites, and LINSTOR CSI Driver pods all need to be deployed in the same namespace as the LINSTOR Operator pod.

Once you have located the LINSTOR operator in the Marketplace, click the “Install” button and install it as you would any other operator.

At this point you should have just one pod, the operator pod, running.

Next we need to configure the remaining provided APIs.

4.2.3. Deploying the LINSTOR Controller

Again, navigate to the left-hand control pane of the Openshift Web Console. Expand the “Operators” section, but this time select “Installed Operators”. Find the entry for the “Linstor Operator”, then select the “LinstorController” from the “Provided APIs” column on the right.

From here you should see a page that says “No Operands Found” and will feature a large button on the right which says “Create LinstorController”. Click the “Create LinstorController” button.

Here you will be presented with options to configure the LINSTOR Controller, either via the web-form view or the YAML view. Regardless of which view you select, make sure that the dbConnectionURL matches the endpoint provided by your etcd deployment. Otherwise, the defaults are usually fine for most purposes.

Lastly, hit “Create”. You should now see a linstor-controller pod running.

4.2.4. Deploying the LINSTOR Satellites

Next we need to deploy the Satellites Set. Just as before navigate to the left-hand control pane of the Openshift Web Console. Expand the “Operators” section, but this time select “Installed Operators”. Find the entry for the “Linstor Operator”, then select the “LinstorSatelliteSet” from the “Provided APIs” column on the right.

From here you should see a page that says “No Operands Found” and will feature a large button on the right which says “Create LinstorSatelliteSet”. Click the “Create LinstorSatelliteSet” button.

Here you will be presented with the options to configure the LINSTOR Satellites, either via the web-form view or the YAML view. One of the first options you’ll notice is automaticStorageType. If it is set to “NONE”, you will need to configure the storage pools yourself at a later step.

Another option you’ll notice is kernelModuleInjectionMode. I usually select “Compile” for portability’s sake, but selecting “ShippedModules” will be faster as it will install pre-compiled kernel modules on all the worker nodes.

Make sure the controllerEndpoint matches what is available in the Kubernetes endpoints. The default is usually correct here.

Below is an example manifest:

apiVersion: linstor.linbit.com/v1
kind: LinstorSatelliteSet
metadata:
  name: linstor
  namespace: default
spec:
  satelliteImage: ''
  automaticStorageType: LVMTHIN
  drbdRepoCred: ''
  kernelModuleInjectionMode: Compile
  controllerEndpoint: 'http://linstor:3370'
  priorityClassName: ''
status:
  errors: []

Lastly, hit “Create”. You should now see a linstor-node pod running on every worker node.

4.2.5. Deploying the LINSTOR CSI driver

The last bit left is the CSI pods, which bridge the layer between CSI and LINSTOR. Just as before, navigate to the left-hand control pane of the OpenShift Web Console. Expand the “Operators” section, but this time select “Installed Operators”. Find the entry for the “Linstor Operator”, then select the “LinstorCSIDriver” from the “Provided APIs” column on the right.

From here you should see a page that says “No Operands Found” and will feature a large button on the right which says “Create LinstorCSIDriver”. Click the “Create LinstorCSIDriver” button.

Again, you will be presented with the options. Make sure that the controllerEndpoint is correct. Otherwise the defaults are fine for most use cases.

Lastly hit “Create”. You will now see a single “linstor-csi-controller” pod, as well as a “linstor-csi-node” pod on all worker nodes.

4.3. Interacting with LINSTOR in OpenShift

The Controller pod includes a LINSTOR Client, making it easy to interact directly with LINSTOR. For instance:

oc exec deployment/linstor-controller -- linstor storage-pool list

This should only be necessary for investigating problems and accessing advanced functionality. Regular operation such as creating volumes should be achieved via the Kubernetes integration.

4.4. Configuration and deployment

Once the operator and all the needed pods are deployed, provisioning volumes simply follows the usual Kubernetes workflows.

As such, please see the previous chapter’s section on Basic Configuration and Deployment.

4.5. Deploying additional components

Some additional components are not included in the OperatorHub version of the LINSTOR Operator when compared to the Helm deployment. Most notably, this includes setting up Etcd and deploying the STORK integration.

Etcd can be deployed by using the Etcd Operator available in the OperatorHub.

4.5.1. Stork

To deploy STORK, you can use the single YAML deployment available at https://charts.linstor.io/deploy/stork.yaml. Download the YAML and replace every instance of MY-STORK-NAMESPACE with your desired namespace for STORK. You also need to replace MY-LINSTOR-URL with the URL of your controller. This value depends on the name you chose when creating the LinstorController resource. By default, this would be http://linstor.<operator-namespace>.svc:3370.
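As a sketch, the placeholder replacement could be scripted as follows; the namespace and controller URL shown here are examples and must match your deployment:

curl -O https://charts.linstor.io/deploy/stork.yaml
sed -i 's/MY-STORK-NAMESPACE/kube-system/g' stork.yaml
sed -i 's|MY-LINSTOR-URL|http://linstor.default.svc:3370|g' stork.yaml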

To apply the YAML to Openshift, either use oc apply -f <filename> from the command line or find the “Import YAML” option in the top right of the Openshift Web Console.

4.5.2. High Availability Controller

To deploy our High Availability Controller, you can use the single YAML deployment available at: https://charts.linstor.io/deploy/ha-controller.yaml

Download the YAML and replace the placeholder values to match your environment.

To apply the YAML to Openshift, either use oc apply -f <filename> from the command line or find the “Import YAML” option in the top right of the Openshift Web Console.

4.5.3. Deploying via Helm on Openshift

Alternatively, you can deploy the LINSTOR Operator using Helm instead. Take a look at the Kubernetes guide. Openshift requires changing some of the default values in our Helm chart.

If you chose to use Etcd with hostPath volumes for persistence (see here), you need to enable SELinux relabelling. To do this, pass --set selinux=true to the pv-hostpath install command.
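For example, if you deployed the hostPath persistent volumes with the pv-hostpath chart, the flag is simply added to the install command you used; the chart and release names below are assumptions:

helm install linstor-etcd linstor/pv-hostpath --set selinux=true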

For the LINSTOR Operator chart itself, you should change the following values:

global:
  setSecurityContext: false (1)
csi-snapshotter:
  enabled: false (2)
stork:
  schedulerTag: v1.18.6 (3)
etcd:
  podsecuritycontext:
    supplementalGroups: [1000] (4)
operator:
  satelliteSet:
    kernelModuleInjectionImage: drbd.io/drbd9-rhel8:v9.0.25 (5)
1 Openshift uses SCCs to manage security contexts.
2 The cluster wide CSI Snapshot Controller is already installed by Openshift.
3 Automatic detection of the Kubernetes Scheduler version fails in Openshift, so you need to set it manually. Note: the tag does not have to match Openshift’s Kubernetes release.
4 If you choose to use Etcd deployed via Helm and use the pv-hostpath chart, Etcd needs to run as member of group 1000 to access the persistent volume.
5 The RHEL8 kernel injector also supports RHCOS.

Other overrides, such as storage pool configuration, HA deployments and more, are available and documented in the Kubernetes guide.

5. LINSTOR Volumes in Nomad

This chapter describes using LINSTOR and DRBD to provision volumes in Nomad.

5.1. Nomad Overview

Nomad is a simple and flexible workload orchestrator to deploy and manage containers and non-containerized applications across on-prem and clouds.

Nomad supports provisioning storage volumes via plugins conforming to the Container Storage Interface (CSI).

LINBIT distributes a CSI plugin in the form of container images from drbd.io. The plugin can be configured to work with a LINSTOR cluster that is deployed alongside or inside a Nomad cluster.

5.2. Deploying LINSTOR on Nomad

This section describes how you can deploy and configure a new LINSTOR cluster in Nomad.

If you want to install LINSTOR directly on your nodes, check out the guide on installing LINSTOR. You can skip this section and jump directly to deploying the CSI driver.

5.2.1. Preparing Nomad

In order to run LINSTOR, every Nomad agent needs to be configured to:

  • Support the docker driver and allow executing privileged containers

    To allow running privileged containers, add the following snippet to your Nomad agent configuration and restart Nomad:

    Listing 7. /etc/nomad.d/docker-privileged.hcl
    plugin "docker" {
      config {
        allow_privileged = true
      }
    }
  • Support container networking. If you don’t have the Container Network Interface plugins installed, you will only be able to use mode = "host" in your job networks. For most production setups, we recommend installing the default plugins:

    Head to the plugin release page, select the release archive appropriate for your distribution and unpack them in /opt/cni/bin. You might need to create the directory before unpacking.

  • Provide a host volume, allowing a container access to the host's /dev directory

    To create a host volume, add the following snippet to your Nomad agent configuration and restart Nomad:

    Listing 8. /etc/nomad.d/host-volume-dev.hcl
    client {
      host_volume "dev" {
        path = "/dev"
      }
    }

5.2.2. Create a LINSTOR Controller Job

The LINSTOR Controller is deployed as a service with no replicas. At any point in time, there can only be one LINSTOR Controller running in a cluster. It is possible to restart the controller on a new node, provided it still has access to the database. See LINSTOR high availability for more information.

The following example will create a Nomad job starting a single LINSTOR Controller in datacenter dc1 and connect to an external database.

Listing 9. linstor-controller.hcl
job "linstor-controller" {
  datacenters = ["dc1"] (1)
  type = "service"

  group "linstor-controller" {
    network {
      mode = "bridge"
      # port "linstor-api" { (2)
      #   static = 3370
      #   to = 3370
      # }
    }

    service { (3)
      name = "linstor-api"
      port = "3370"

      connect {
        sidecar_service {}
      }

      check {
        expose = true
        type = "http"
        name = "api-health"
        path = "/health"
        interval = "30s"
        timeout = "5s"
      }
    }

    task "linstor-controller" {
      driver = "docker"
      config {
        image = "drbd.io/linstor-controller:v1.13.0" (4)

        auth { (5)
          username = "example"
          password = "example"
          server_address = "drbd.io"
        }

        mount {
          type = "bind"
          source = "local"
          target = "/etc/linstor"
        }
      }

      # template { (6)
      #  destination = "local/linstor.toml"
      #  data = <<EOH
      #    [db]
      #    user = "example"
      #    password = "example"
      #    connection_url = "jdbc:postgresql://postgres.internal.example.com/linstor"
      #  EOH
      # }

      resources {
        cpu    = 500 # 500 MHz
        memory = 700 # 700MB
      }
    }
  }
}
1 Replace dc1 with your own datacenter name
2 This exposes the LINSTOR API on the host on port 3370.
Uncomment this section if your cluster is not configured with Consul Connect.
3 The service block is used to expose the LINSTOR API to other jobs via the service mesh.
If your cluster is not configured for Consul Connect you can remove this section.
4 This sets the LINSTOR Controller image to run. The latest images are available from drbd.io.
The use of the :latest tag is strongly discouraged, as it can quickly lead to version mismatches and unintended upgrades.
5 Sets the authentication to use when pulling the image. If pulling from drbd.io, you need to use your LINBIT customer login here. Read more about pulling from a private repo here.
6 This template can be used to set arbitrary configuration options for LINSTOR. This example configures an external database for LINSTOR. You can find a more detailed explanation of LINSTOR's database options here and more on Nomad templates here.

Apply the job by running:

$ nomad job run linstor-controller.hcl
==> Monitoring evaluation "7d8911a7"
    Evaluation triggered by job "linstor-controller"
==> Monitoring evaluation "7d8911a7"
    Evaluation within deployment: "436f4b2d"
    Allocation "0b564c73" created: node "07754224", group "controller"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "7d8911a7" finished with status "complete"
Using a host volume for LINSTOR's database

If you want to try LINSTOR without setting up an external database, you can make use of LINSTOR's built-in, filesystem-based database. To make the database persistent, you need to ensure it is placed on a host volume.

Using a host volume means that only a single node is able to run the LINSTOR Controller. If the node is unavailable, the LINSTOR Cluster will also be unavailable. For alternatives, use an external (highly available) database or deploy the LINSTOR cluster directly on the hosts.

To create a host volume for the LINSTOR database, first create the directory on the host with the expected permissions:

$ mkdir -p /var/lib/linstor
$ chown -R 1000:0 /var/lib/linstor

Then add the following snippet to your Nomad agent configuration and restart Nomad:

Listing 10. /etc/nomad.d/host-volume-linstor-db.hcl
client {
  host_volume "linstor-db" {
    path = "/var/lib/linstor"
  }
}

Then, add the following snippets to the linstor-controller.hcl example from above and adapt the connection_url option from the configuration template.

Listing 11. job > group
volume "linstor-db" {
  type = "host"
  source = "linstor-db"
}
Listing 12. job > group > task
volume_mount {
  volume = "linstor-db"
  destination = "/var/lib/linstor"
}

template {
  destination = "local/linstor.toml"
  data = <<EOH
    [db]
    user = "linstor"
    password = "linstor"
    connection_url = "jdbc:h2:/var/lib/linstor/linstordb"
  EOH
}

5.2.3. Create a LINSTOR Satellite job

The LINSTOR Satellites are deployed as a system job in Nomad, running in a privileged container. In addition to the satellites, the job will also load the DRBD module along with other kernel modules used by LINSTOR.

The following example will create a Nomad job starting a LINSTOR satellite on every node in datacenter dc1.

Listing 13. linstor-satellite.hcl
job "linstor-satellite" {
  datacenters = ["dc1"] (1)
  type = "system"

  group "satellite" {
    network {
      mode = "host"
    }

    volume "dev" { (2)
      type = "host"
      source = "dev"
    }

    task "linstor-satellite" {
      driver = "docker"

      config {
        image = "drbd.io/linstor-satellite:v1.13.0" (3)

        auth { (4)
          username = "example"
          password = "example"
          server_address = "drbd.io"
        }

        privileged = true (5)
        network_mode = "host" (6)
      }

      volume_mount { (2)
        volume = "dev"
        destination = "/dev"
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 500 # 500MB
      }
    }

    task "drbd-loader" {
      driver = "docker"
      lifecycle {
        hook = "prestart" (7)
      }

      config {
        image = "drbd.io/drbd9-rhel8:v9.0.29" (8)

        privileged = true (5)
        auth { (4)
          username = "example"
          password = "example"
          server_address = "drbd.io"
        }
      }

      env {
        LB_HOW = "shipped_modules" (9)
      }

      volume_mount { (10)
        volume = "kernel-src"
        destination = "/usr/src"
      }
      volume_mount { (10)
        volume = "modules"
        destination = "/lib/modules"
      }
    }

    volume "modules" { (10)
      type = "host"
      source = "modules"
      read_only = true
    }

    volume "kernel-src" { (10)
      type = "host"
      source = "kernel-src"
      read_only = true
    }
  }
}
1 Replace dc1 with your own datacenter name
2 The dev host volume is the volume created in Preparing Nomad, which allows the satellite to manage the host's block devices.
3 This sets the LINSTOR Satellite image to run. The latest images are available from drbd.io. The satellite image version has to match the version of the controller image.
The use of the :latest tag is strongly discouraged, as it can quickly lead to version mismatches and unintended upgrades.
4 Sets the authentication to use when pulling the image. If pulling from drbd.io, you need to use your LINBIT customer login here. Read more about pulling from a private repo here.
5 In order to configure storage devices, DRBD and load kernel modules, the containers need to be running in privileged mode.
6 The satellite needs to communicate with DRBD, which requires access to the netlink interface running in the host's network.
7 The drbd-loader task will be executed once at the start of the satellite and load DRBD and other useful kernel modules.
8 The drbd-loader image is specific to the distribution you are using. Available options are:
  • drbd.io/drbd9-bionic for Ubuntu 18.04 (Bionic Beaver)

  • drbd.io/drbd9-focal for Ubuntu 20.04 (Focal Fossa)

  • drbd.io/drbd9-rhel8 for RHEL 8

  • drbd.io/drbd9-rhel7 for RHEL 7

9 The drbd-loader container can be configured via environment variables. LB_HOW tells the container how to insert the DRBD kernel module. Available options are:
shipped_modules

Uses the prepackaged RPMs or DEBs delivered with the container.

compile

Compile DRBD from source. Requires access to the kernel headers (see below).

deps_only

Only try to load existing modules used by the LINSTOR satellite (for example dm_thin_pool and dm_cache).

10 In order for the drbd-loader container to build DRBD or load existing modules, it needs access to the host's /usr/src and /lib/modules directories, respectively.

This requires setting up additional host volumes on every node. The following snippet needs to be added to every Nomad agent configuration; then Nomad needs to be restarted.

Listing 14. /etc/nomad.d/drbd-loader-volumes.hcl
client {
  host_volume "modules" {
    path = "/lib/modules"
    read_only = true
  }
  host_volume "kernel-src" {
    path = "/usr/src"
    read_only = true
  }
}

Apply the job by running:

$ nomad job run linstor-satellite.hcl
==> Monitoring evaluation "0c07469d"
    Evaluation triggered by job "linstor-satellite"
==> Monitoring evaluation "0c07469d"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "0c07469d" finished with status "complete"

5.2.4. Configuring LINSTOR in Nomad

Once the linstor-controller and linstor-satellite jobs are running, you can start configuring the cluster using the linstor command line tool.

This can be done in one of the following ways (an example of the first approach follows the list):

  • directly by nomad exec-ing into the linstor-controller container

  • using the drbd.io/linstor-client container on the host running the linstor-controller

    docker run -it --rm --net=host drbd.io/linstor-client node create
  • by installing the linstor-client package on the host running the linstor-controller.
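For example, exec-ing into the controller could look like the following; the allocation ID differs in every cluster and must be looked up first:

$ nomad job status linstor-controller   # note the ID of the running allocation
$ nomad alloc exec -task linstor-controller <alloc-id> linstor node list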

In all cases, you need to add the satellites to your cluster and create some storage pools. For example, to add the node nomad-01.example.com and configure an LVM thin storage pool, you would run:

$ linstor node create nomad-01.example.com
$ linstor storage-pool create lvmthin nomad-01.example.com thinpool linstor_vg/thinpool
The CSI driver requires your satellites to be named after their hostname. To be precise, the satellite name needs to match Nomad's attr.unique.hostname attribute on the node.

5.3. Deploying the LINSTOR CSI Driver on Nomad

The CSI driver is deployed as a system job, meaning it runs on every node in the cluster.

The following example will create a Nomad job starting a LINSTOR CSI Driver on every node in datacenter dc1.

Listing 15. linstor-csi.hcl
job "linstor-csi" {
  datacenters = ["dc1"] (1)
  type = "system"

  group "csi" {
    network {
      mode = "bridge"
    }

    service {
      connect {
        sidecar_service { (2)
          proxy {
            upstreams {
              destination_name = "linstor-api"
              local_bind_port  = 8080
            }
          }
        }
      }
    }

    task "csi-plugin" {
      driver = "docker"
      config {
        image = "drbd.io/linstor-csi:v0.13.1" (3)

        auth { (4)
          username = "example"
          password = "example"
          server_address = "drbd.io"
        }

        args = [
          "--csi-endpoint=unix://csi/csi.sock",
          "--node=${attr.unique.hostname}", (5)
          "--linstor-endpoint=http://${NOMAD_UPSTREAM_ADDR_linstor_api}", (6)
          "--log-level=info"
        ]

        privileged = true (7)
      }

      csi_plugin { (8)
        id = "linstor.csi.linbit.com"
        type = "monolith"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 100 # 100 MHz
        memory = 200 # 200MB
      }
    }
  }
}
1 Replace dc1 with your own datacenter name
2 The sidecar_service stanza enables use of the service mesh generated by using Consul Connect. If you have not configured this feature in Nomad, or you are using an external LINSTOR Controller, you can skip this configuration.
3 This sets the LINSTOR CSI Driver image to run. The latest images are available from drbd.io.
The use of the :latest tag is strongly discouraged, as it can quickly lead to version mismatches and unintended upgrades.
4 Sets the authentication to use when pulling the image. If pulling from drbd.io, you need to use your LINBIT customer login here. Read more about pulling from a private repo here.
5 This argument sets the node name used by the CSI driver to identify itself in the LINSTOR API. By default, this is set to the node's hostname.
6 This argument sets the LINSTOR API endpoint. If you are not using the Consul service mesh (see Nr. 2 above), this needs to be set to the Controller's API endpoint. The endpoint needs to be reachable from every node this is deployed on.
7 The CSI driver needs to execute mount commands, requiring privileged containers.
8 The csi_plugin stanza informs Nomad that this task is a CSI plugin. The Nomad agent will forward requests for volumes to one of the job's containers.

Apply the job by running:

$ nomad job run linstor-csi.hcl
==> Monitoring evaluation "0119f19c"
    Evaluation triggered by job "linstor-csi"
==> Monitoring evaluation "0119f19c"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "0119f19c" finished with status "complete"

5.4. Using LINSTOR volumes in Nomad

Volumes in Nomad are created using a volume-specification.

As an example, the following specification requests a 1GiB volume with 2 replicas from the LINSTOR storage pool thinpool.

Listing 16. vol1.hcl
id = "vol1" (1)
name = "vol1" (2)

type = "csi"
plugin_id = "linstor.csi.linbit.com"

capacity_min = "1GiB"
capacity_max = "1GiB"

capability {
  access_mode = "single-node-writer" (3)
  attachment_mode = "file-system" (4)
}

mount_options {
  fs_type = "ext4" (5)
}

parameters { (6)
  "resourceGroup" = "default-resource",
  "storagePool" = "thinpool",
  "autoPlace" = "2"
}
1 The id is used to reference this volume in Nomad. Used in the volume.source field of a job specification.
2 The name is used when creating the volume in the backend (i.e. LINSTOR). Ideally this matches the id and is a valid LINSTOR resource name. If the name would not be valid, LINSTOR CSI will generate a random compatible name.
3 What kind of access the volume should support. LINSTOR CSI supports:
single-node-reader-only

Allow read only access on one node at a time
single-node-writer

Allow read and write access on one node at a time

multi-node-reader-only

Allow read only access from multiple nodes.

4 Can be file-system or block-device.
5 Specify the file system to use. LINSTOR CSI supports ext4 and xfs.
6 Additional parameters to pass to LINSTOR CSI. The example above requests that the resource be part of the default-resource resource group and that 2 replicas be deployed.

For a complete list of available parameters, you can check out the guide on Kubernetes storage classes. Kubernetes, like Nomad, makes use of the CSI plugin.

To create the volume, run the following command:

$ nomad volume create vol1.hcl
Created external volume vol1 with ID vol1
$ nomad volume status
Container Storage Interface
ID    Name  Plugin ID               Schedulable  Access Mode
vol1  vol1  linstor.csi.linbit.com  true         <none>
$ linstor resource list
╭──────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node                 ┊ Port ┊ Usage  ┊ Conns ┊    State ┊ CreatedOn           ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════╡
┊ vol1         ┊ nomad-01.example.com ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2021-06-15 14:56:32 ┊
┊ vol1         ┊ nomad-02.example.com ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2021-06-15 14:56:32 ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────╯

5.4.1. Using volumes in jobs

To use the volume in a job, add the volume and volume_mount stanzas to the job specification:

job "example" {
  ...

  group "example" {
    volume "example-vol" {
      type = "csi"
      source = "vol1"
      attachment_mode = "file-system"
      access_mode = "single-node-writer"
    }

    task "mount-example" {
      volume_mount {
        volume = "example-vol"
        destination = "/data"
      }

      ...
    }
  }
}

5.4.2. Creating snapshots of volumes

LINSTOR can create snapshots of existing volumes, provided the underlying storage pool driver supports snapshots.

The following command creates a snapshot named snap1 of the volume vol1.

$ nomad volume snapshot create vol1 snap1
Snapshot ID  Volume ID  Size     Create Time  Ready?
snap1        vol1       1.0 GiB  None         true
$ linstor s l
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ SnapshotName ┊ NodeNames                                  ┊ Volumes  ┊ CreatedOn           ┊ State      ┊
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ vol1         ┊ snap1        ┊ nomad-01.example.com, nomad-02.example.com ┊ 0: 1 GiB ┊ 2021-06-15 15:04:10 ┊ Successful ┊
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

You can use a snapshot to pre-populate a new volume with data from the snapshot:

$ cat vol2.hcl
id = "vol2"
name = "vol2"
snapshot_id = "snap1"

type = "csi"
plugin_id = "linstor.csi.linbit.com"
...

$ nomad volume create vol2.hcl
Created external volume vol2 with ID vol2

6. LINSTOR Volumes in Proxmox VE

This chapter describes DRBD in Proxmox VE via the LINSTOR Proxmox Plugin.

6.1. Proxmox VE Overview

Proxmox VE is an easy to use, complete server virtualization environment with KVM, Linux Containers and HA.

‘linstor-proxmox’ is a Perl plugin for Proxmox that, in combination with LINSTOR, allows you to replicate VM disks on several Proxmox VE nodes. This allows live-migrating active VMs within a few seconds and with no downtime, without needing a central SAN, as the data is already replicated to multiple nodes.

6.2. Upgrades

If this is a fresh installation, skip this section and continue with Proxmox Plugin Installation.

6.2.1. From 4.x to 5.x

Version 5 of the plugin drops compatibility with the legacy configuration options “storagepool” and “redundancy”. Version 5 requires a “resourcegroup” option, and therefore an existing LINSTOR resource group. The old options should be removed from the configuration.

Configuring LINSTOR is described in Section LINSTOR Configuration. A typical example follows: let's assume the pool was set to “mypool”, and the redundancy to 3.

# linstor resource-group create --storage-pool=mypool --place-count=3 drbdMypoolThree
# linstor volume-group create drbdMypoolThree
# vi /etc/pve/storage.cfg
drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13
   resourcegroup drbdMypoolThree

6.2.2. From 5.x to 6.x

With version 6, PVE added additional parameters to some functions and rightfully reset their “APIAGE”. This means that old plugins, while technically still usable because they do not use any of the changed functions, no longer work. Please upgrade to at least plugin version 5.2.1.

6.3. Proxmox Plugin Installation

LINBIT provides a dedicated public repository for Proxmox VE users. This repository not only contains the Proxmox plugin, but the whole DRBD SDS stack including a DRBD SDS kernel module and user space utilities.

The DRBD9 kernel module is installed as a dkms package (i.e., drbd-dkms), therefore you will have to install the pve-headers package before you set up/install the software packages from LINBIT's repositories. Following that order ensures that the kernel module builds properly for your kernel. If you don't plan to install the latest Proxmox kernel, you have to install kernel headers matching your currently running kernel (e.g., pve-headers-$(uname -r)). If you missed this step, you can still rebuild the dkms package against your current kernel (kernel headers have to be installed in advance) by issuing the apt install --reinstall drbd-dkms command.

LINBIT’s repository can be enabled as follows, where “$PVERS” should be set to your Proxmox VE major version (e.g., “7”, not “7.1”):

# wget -O- https://packages.linbit.com/package-signing-pubkey.asc | apt-key add -
# PVERS=7 && echo "deb http://packages.linbit.com/proxmox/ proxmox-$PVERS drbd-9" > \
	/etc/apt/sources.list.d/linbit.list
# apt update && apt install linstor-proxmox

6.4. LINSTOR Configuration

For the rest of this guide we assume that you have a LINSTOR cluster configured as described in Initializing your cluster. Start the “linstor-controller” on one node, and the “linstor-satellite” on all nodes. The preferred way to use the plugin, starting from version 4.1.0, is via LINSTOR resource groups and a single volume group within every resource group. LINSTOR resource groups are described in Resource groups. All the required LINSTOR configuration (e.g., redundancy count) has to be set on the resource group.

6.5. Proxmox Plugin Configuration

The final step is to provide a configuration for Proxmox itself. This can be done by adding an entry in the /etc/pve/storage.cfg file, with a content similar to the following.

drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13
   resourcegroup defaultpool

The “drbd” entry is fixed and you are not allowed to modify it, as it tells Proxmox to use DRBD as the storage backend. The “drbdstorage” entry can be modified and is used as a friendly name that will be shown in the PVE web GUI to locate the DRBD storage. The “content” entry is also fixed, so do not change it. The redundancy (specified in the resource group) specifies how many replicas of the data will be stored in the cluster. The recommendation is to set it to 2 or 3, depending on your setup. The data is accessible from all nodes, even if some of them do not have local copies of the data. For example, in a 5 node cluster, all nodes will be able to access 3 copies of the data, no matter where they are stored. The “controller” parameter must be set to the IP of the node that runs the LINSTOR controller service. Only one node can run as the LINSTOR controller at any given time. If that node fails, start the LINSTOR controller on another node and change that value to its IP address.

A configuration using different storage pools in different resource groups would look like this:

drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13
   resourcegroup defaultpool

drbd: fastdrbd
   content images,rootdir
   controller 10.11.12.13
   resourcegroup ssd

drbd: slowdrbd
   content images,rootdir
   controller 10.11.12.13
   resourcegroup backup

By now, you should be able to create VMs via Proxmox’s web GUI by selecting “drbdstorage“, or any other of the defined pools as storage location.

Starting with version 5 of the plugin, you can set the option “preferlocal yes”. If it is set, the plugin tries to create a diskful assignment on the node that issued the storage create command. With this option you can make sure the VM gets local storage if possible. Without that option LINSTOR might place the storage on nodes ‘B’ and ‘C’, while the VM is initially started on node ‘A’. This would still work, as node ‘A’ would then get a diskless assignment, but having local storage might be preferred.
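
As a sketch, extending the earlier storage.cfg entry, the option would simply be added as another line:

drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13
   resourcegroup defaultpool
   preferlocal yes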

NOTE: DRBD supports only the raw disk format at the moment.

At this point you can try to live migrate the VM – as all data is accessible on all nodes (even on Diskless nodes) – it will take just a few seconds. The overall process might take a bit longer if the VM is under load and if there is a lot of RAM being dirtied all the time. But in any case, the downtime should be minimal and you will see no interruption at all.

Table 1. Configuration Options

Option          Meaning
controller      The IP of the LINSTOR controller (',' separated list allowed)
resourcegroup   The name of a LINSTOR resource group which defines the deployment of new VMs, as described above
preferlocal     Prefer to create local storage (yes/no), as described above
statuscache     Time in seconds status information is cached; 0 means no extra cache. Relevant on huge clusters with hundreds of resources. Has to be set on all drbd storages in /etc/pve/storage.cfg to take effect.
apicrt          Path to the client certificate
apikey          Path to the client private key
apica           Path to the CA certificate

6.6. Making the Controller Highly-Available (optional)

Making LINSTOR highly-available is a matter of making the LINSTOR controller highly-available. This step is described in Section LINSTOR high availability.

The last — but crucial — step is to configure the Proxmox plugin to be able to connect to multiple LINSTOR controllers. It will use the first one it receives an answer from. This is done by adding a comma-separated list of controllers in the controller section of the plugin like this:

drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13,10.11.12.14,10.11.12.15
   resourcegroup defaultpool

7. LINSTOR Volumes in OpenNebula

This chapter describes DRBD in OpenNebula via the usage of the LINSTOR storage driver addon.

Detailed installation and configuration instructions can be found in the README.md file of the driver's source.

7.1. OpenNebula Overview

OpenNebula is a flexible and open source cloud management platform which allows its functionality to be extended via the use of addons.

The LINSTOR addon allows the deployment of virtual machines with highly available images backed by DRBD and attached across the network via DRBD’s own transport protocol.

7.2. OpenNebula addon Installation

Installation of the LINSTOR storage addon for OpenNebula requires a working OpenNebula cluster as well as a working LINSTOR cluster.

With access to LINBIT’s customer repositories you can install the linstor-opennebula with

# apt install linstor-opennebula

or

# yum install linstor-opennebula

Without access to LINBIT's prepared packages you need to fall back to the instructions on its GitHub page.

A DRBD cluster with LINSTOR can be installed and configured by following the instructions in this guide, see Initializing your cluster.

The OpenNebula and DRBD clusters can be somewhat independent of one another with the following exception: OpenNebula’s Front-End and Host nodes must be included in both clusters.

Host nodes do not need a local LINSTOR storage pool, as virtual machine images are attached to them across the network [1].

7.3. Deployment Options

It is recommended to use LINSTOR resource groups to configure the deployment as you like it, see OpenNebula resource group. The previous auto-place and deployment nodes modes are deprecated.

7.4. Configuration

7.4.1. Adding the driver to OpenNebula

Modify the following sections of /etc/one/oned.conf

Add linstor to the list of drivers in the TM_MAD and DATASTORE_MAD sections:

TM_MAD = [
  executable = "one_tm",
  arguments = "-t 15 -d dummy,lvm,shared,fs_lvm,qcow2,ssh,vmfs,ceph,linstor"
]
DATASTORE_MAD = [
    EXECUTABLE = "one_datastore",
    ARGUMENTS  = "-t 15 -d dummy,fs,lvm,ceph,dev,iscsi_libvirt,vcenter,linstor -s shared,ssh,ceph,fs_lvm,qcow2,linstor"

Add new TM_MAD_CONF and DS_MAD_CONF sections:

TM_MAD_CONF = [
    NAME = "linstor", LN_TARGET = "NONE", CLONE_TARGET = "SELF", SHARED = "yes", ALLOW_ORPHANS="yes",
    TM_MAD_SYSTEM = "ssh,shared", LN_TARGET_SSH = "NONE", CLONE_TARGET_SSH = "SELF", DISK_TYPE_SSH = "BLOCK",
    LN_TARGET_SHARED = "NONE", CLONE_TARGET_SHARED = "SELF", DISK_TYPE_SHARED = "BLOCK"
]
DS_MAD_CONF = [
    NAME = "linstor", REQUIRED_ATTRS = "BRIDGE_LIST", PERSISTENT_ONLY = "NO",
    MARKETPLACE_ACTIONS = "export"
]

After making these changes, restart the opennebula service.
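
For example, on a systemd-based Front-End this could be done with:

systemctl restart opennebula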

7.4.2. Configuring the Nodes

The Front-End node issues commands to the Storage and Host nodes via Linstor.

Storage nodes hold disk images of VMs locally.

Host nodes are responsible for running instantiated VMs and typically have the storage for the images they need attached across the network via Linstor diskless mode.

All nodes must have DRBD9 and Linstor installed. This process is detailed in the User's Guide for DRBD9.

It is possible to have Front-End and Host nodes act as storage nodes in addition to their primary role, as long as they meet all the requirements for both roles.

Front-End Configuration

Please verify that the control node(s) that you hope to communicate with are reachable from the Front-End node. Running linstor node list for locally running Linstor controllers, or linstor --controllers "<IP:PORT>" node list for remotely running Linstor controllers, is a handy way to test this.

Host Configuration

Host nodes must have Linstor satellite processes running on them and be members of the same Linstor cluster as the Front-End and Storage nodes, and may optionally have storage locally. If the oneadmin user is able to passwordlessly ssh between hosts, then live migration may be used even with the ssh system datastore.

Storage Node Configuration

Only the Front-End and Host nodes require OpenNebula to be installed, but the oneadmin user must be able to passwordlessly access storage nodes. Refer to the OpenNebula install guide for your distribution on how to manually configure the oneadmin user account.

The Storage nodes must use storage pools created with a driver that’s capable of making snapshots, such as the thin LVM plugin.

To prepare thinly-provisioned LVM storage for Linstor, as in this example, you must create a volume group and a thin LV using LVM on each storage node.

The following example shows this process using two physical volumes (/dev/sdX and /dev/sdY) and generic names for the volume group and thin pool. Make sure to set the thin LV's metadata volume to a reasonable size; once it becomes full, it can be difficult to resize:

pvcreate /dev/sdX /dev/sdY
vgcreate drbdpool /dev/sdX /dev/sdY
lvcreate -l 95%VG --poolmetadatasize 8g -T /dev/drbdpool/drbdthinpool

Then you’ll create storage pool(s) on Linstor using this as the backing storage.
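
As a sketch, reusing the generic names from above together with the node and storage pool names used later in this chapter (alice, bob, charlie and opennebula-storagepool are example names), the storage pools could be created like this:

linstor storage-pool create lvmthin alice opennebula-storagepool drbdpool/drbdthinpool
linstor storage-pool create lvmthin bob opennebula-storagepool drbdpool/drbdthinpool
linstor storage-pool create lvmthin charlie opennebula-storagepool drbdpool/drbdthinpool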

If you are using ZFS storage pools or thick LVM, please use LINSTOR_CLONE_MODE copy, otherwise you will have problems deleting Linstor resources because of ZFS parent-child snapshot relationships.

7.4.3. Permissions for Oneadmin

The oneadmin user must have passwordless sudo access to the mkfs command on the Storage nodes:

oneadmin ALL=(root) NOPASSWD: /sbin/mkfs
Groups

Be sure to consider the groups that oneadmin should be added to in order to gain access to the devices and programs needed to access storage and instantiate VMs. For this addon, the oneadmin user must belong to the disk group on all nodes in order to access the DRBD devices where images are held.

usermod -a -G disk oneadmin

7.4.4. Creating a New Linstor Datastore

Create a datastore configuration file named ds.conf and use the onedatastore tool to create a new datastore based on that configuration. There are two mutually exclusive deployment options: LINSTOR_AUTO_PLACE and LINSTOR_DEPLOYMENT_NODES. If both are configured, LINSTOR_AUTO_PLACE is ignored. For both of these options, BRIDGE_LIST must be a space separated list of all storage nodes in the Linstor cluster.

7.4.5. OpenNebula resource group

Since version 1.0.0 LINSTOR supports resource groups. A resource group is a centralized point for settings that all resources linked to that resource group share.

Create a resource group and volume group for your datastore. It is mandatory to specify a storage pool within the resource group, otherwise space monitoring for OpenNebula will not work. Here we create one with 2 node redundancy, using the previously created opennebula-storagepool:

linstor resource-group create OneRscGrp --place-count 2 --storage-pool opennebula-storagepool
linstor volume-group create OneRscGrp

Now add an OpenNebula datastore using the LINSTOR plugin:

cat >ds.conf <<EOI
NAME = linstor_datastore
DS_MAD = linstor
TM_MAD = linstor
TYPE = IMAGE_DS
DISK_TYPE = BLOCK
LINSTOR_RESOURCE_GROUP = "OneRscGrp"
COMPATIBLE_SYS_DS = 0
BRIDGE_LIST = "alice bob charlie"  #node names
EOI

onedatastore create ds.conf

7.4.6. Plugin attributes

LINSTOR_CONTROLLERS

LINSTOR_CONTROLLERS can be used to pass a comma-separated list of controller IPs and ports to the Linstor client in the case where a Linstor controller process is not running locally on the Front-End, e.g.:

LINSTOR_CONTROLLERS = "192.168.1.10:8080,192.168.1.11:6000"

LINSTOR_CLONE_MODE

Linstor supports two different clone modes, set via the LINSTOR_CLONE_MODE attribute:

  • snapshot

The default mode is snapshot. It uses a Linstor snapshot and restores a new resource from this snapshot, which is then a clone of the image. This mode is usually faster than using the copy mode, as snapshots are cheap copies.

  • copy

The second mode is copy. It creates a new resource with the same size as the original and copies the data to the new resource with dd. This mode will be slower than snapshot, but it is more robust as it does not rely on any snapshot mechanism. It is also used if you are cloning an image into a different Linstor datastore.
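
The clone mode is set as an additional attribute in the datastore template, for example (a sketch extending the ds.conf example above):

LINSTOR_CLONE_MODE = "copy"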

7.4.7. Deprecated attributes

The following attributes are deprecated and will be removed in a version after the 1.0.0 release.

LINSTOR_STORAGE_POOL

The LINSTOR_STORAGE_POOL attribute is used to select the LINSTOR storage pool your datastore should use. If resource groups are used, this attribute isn't needed, as the storage pool can be selected by the auto-select filter options. If LINSTOR_AUTO_PLACE or LINSTOR_DEPLOYMENT_NODES is used and LINSTOR_STORAGE_POOL is not set, it will fall back to the DfltStorPool in LINSTOR.

LINSTOR_AUTO_PLACE

The LINSTOR_AUTO_PLACE option takes a level of redundancy which is a number between one and the total number of storage nodes. Resources are assigned to storage nodes automatically based on the level of redundancy.

LINSTOR_DEPLOYMENT_NODES

Using LINSTOR_DEPLOYMENT_NODES allows you to select a group of nodes that resources will always be assigned to. Please note that the bridge list still contains all of the storage nodes in the Linstor cluster.

7.4.8. LINSTOR as system datastore

The Linstor driver can also be used as a system datastore. The configuration is quite similar to that of normal datastores, with a few changes:

cat >system_ds.conf <<EOI
NAME = linstor_system_datastore
TM_MAD = linstor
TYPE = SYSTEM_DS
LINSTOR_RESOURCE_GROUP = "OneSysRscGrp"
BRIDGE_LIST = "alice bob charlie"  # node names
EOI

onedatastore create system_ds.conf

Also add the new system datastore ID to the COMPATIBLE_SYS_DS attribute of your image datastores (comma separated), otherwise the scheduler will ignore them.

If you want live migration with volatile disks you need to enable the --unsafe option for KVM, see: opennebula-doc

7.5. Live Migration

Live migration is supported even with the use of the ssh system datastore, as well as the nfs shared system datastore.

7.6. Free Space Reporting

Free space is calculated differently depending on whether resources are deployed automatically or on a per node basis.

For datastores which place per node, free space is reported based on the most restrictive storage pools from all nodes where resources are being deployed. For example, the capacity of the node with the smallest amount of total storage space is used to determine the total size of the datastore and the node with the least free space is used to determine the remaining space in the datastore.

For a datastore which uses automatic placement, size and remaining space are determined based on the aggregate storage pool used by the datastore as reported by LINSTOR.

8. LINSTOR Volumes in Openstack

This chapter describes using LINSTOR to provision persistent, replicated, and high-performance block storage for Openstack.

8.1. Openstack Overview

Openstack consists of a wide range of individual services; the service responsible for provisioning and managing block storage is called Cinder. Other OpenStack services, such as the compute instance service Nova, can request volumes from Cinder. Cinder will then make a volume accessible to the requesting service.

LINSTOR can integrate with Cinder using a volume driver. The volume driver translates calls to the Cinder API to LINSTOR commands. For example: requesting a volume from Cinder will create new resources in LINSTOR, Cinder Volume snapshots translate to snapshots in LINSTOR and so on.

8.2. LINSTOR for Openstack Installation

An initial installation and configuration of DRBD and LINSTOR must be completed prior to using the OpenStack driver.

At this point you should be able to list your storage cluster nodes using the LINSTOR client:

$ linstor node info
╭────────────────────────────────────────────────────────────────────────────╮
┊ Node                      ┊ NodeType  ┊ Addresses                 ┊ State  ┊
╞════════════════════════════════════════════════════════════════════════════╡
┊ cinder-01.openstack.test  ┊ COMBINED  ┊ 10.43.224.21:3366 (PLAIN) ┊ Online ┊
┊ cinder-02.openstack.test  ┊ COMBINED  ┊ 10.43.224.22:3366 (PLAIN) ┊ Online ┊
┊ storage-01.openstack.test ┊ SATELLITE ┊ 10.43.224.11:3366 (PLAIN) ┊ Online ┊
┊ storage-02.openstack.test ┊ SATELLITE ┊ 10.43.224.12:3366 (PLAIN) ┊ Online ┊
┊ storage-03.openstack.test ┊ SATELLITE ┊ 10.43.224.13:3366 (PLAIN) ┊ Online ┊
╰────────────────────────────────────────────────────────────────────────────╯

You should configure one or more storage pools per node. This guide assumes the storage pool is named cinderpool. LINSTOR should list the storage pool for each node, including the diskless storage pool created by default.

$ linstor storage-pool list
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node                      ┊ Driver   ┊ PoolName        ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ cinder-01.openstack.test  ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ cinder-02.openstack.test  ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ storage-01.openstack.test ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ storage-02.openstack.test ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ storage-03.openstack.test ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ cinderpool           ┊ storage-01.openstack.test ┊ LVM_THIN ┊ ssds/cinderpool ┊      100 GiB ┊       100 GiB ┊ True         ┊ Ok    ┊
┊ cinderpool           ┊ storage-02.openstack.test ┊ LVM_THIN ┊ ssds/cinderpool ┊      100 GiB ┊       100 GiB ┊ True         ┊ Ok    ┊
┊ cinderpool           ┊ storage-03.openstack.test ┊ LVM_THIN ┊ ssds/cinderpool ┊      100 GiB ┊       100 GiB ┊ True         ┊ Ok    ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
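
If the cinderpool storage pool does not exist yet, it could be created like this (a sketch assuming an LVM thin pool ssds/cinderpool already exists on each storage node, as shown in the listing above):

$ linstor storage-pool create lvmthin storage-01.openstack.test cinderpool ssds/cinderpool
$ linstor storage-pool create lvmthin storage-02.openstack.test cinderpool ssds/cinderpool
$ linstor storage-pool create lvmthin storage-03.openstack.test cinderpool ssds/cinderpool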

8.2.1. Upgrades of LINSTOR driver

If this is a fresh installation, skip this section and continue with Install the LINSTOR driver.

From 1.x to 2.x

Driver version 2 dropped some static configuration options in favour of managing these options at runtime via volume types.

The affected options in cinder.conf, their status, and what to replace them with:

linstor_autoplace_count (removed)

Use the linstor:redundancy property on the volume type. Using a value of 0 for full cluster replication is not supported, use the advanced options of the LINSTOR autoplacer.

linstor_controller_diskless (removed)

No replacement needed, the driver will create a diskless resource on the cinder host when required.

linstor_default_blocksize (removed)

This setting had no effect.

linstor_default_storage_pool_name (deprecated)

This setting is deprecated for removal in a future version. Use the linstor:storage_pool property on the volume type instead.

linstor_default_uri (deprecated)

Replaced by the more aptly named linstor_uris.

linstor_default_volume_group_name (removed)

Creating nodes and storage pools was completely removed in Driver version 2. See LINSTOR for Openstack Installation.

linstor_volume_downsize_factor (removed)

This setting served no purpose, it was removed without replacement.

8.2.2. Install the LINSTOR driver

Starting with OpenStack Stein, the LINSTOR driver is part of the Cinder project. While the driver can be used as is, it might be missing features or fixes available in newer versions. Due to OpenStack's update policy for stable versions, most improvements to the driver will not get back-ported to older stable releases.

LINBIT maintains a fork of the Cinder repository with all improvements to the LINSTOR driver backported to the supported stable versions. Currently, these are:

OpenStack Release   Included Version   LINBIT Version   LINBIT Branch
wallaby             1.0.1              2.0.0            linstor/stable/wallaby
victoria            1.0.1              2.0.0            linstor/stable/victoria
ussuri              1.0.1              2.0.0            linstor/stable/ussuri
train               1.0.0              2.0.0            linstor/stable/train
stein               1.0.0              2.0.0            linstor/stable/stein
rocky               n/a                n/a              n/a
queens              n/a                n/a              n/a
pike                n/a                n/a              n/a

The exact steps to enable the Linstor Driver depend on your OpenStack distribution. In general, the python-linstor package needs to be installed on all hosts running the Cinder volume service. The next section will cover the installation process for common OpenStack distributions.

DevStack

DevStack is a great way to try out OpenStack in a lab environment. To use the most recent driver use the following DevStack configuration:

Listing 17. local.conf
# This ensures the Linstor Driver has access to the 'python-linstor' package.
#
# This is needed even if using the included driver!
USE_VENV=True
ADDITIONAL_VENV_PACKAGES=python-linstor

# This is required to select the LINBIT version of the driver
CINDER_REPO=https://github.com/LINBIT/openstack-cinder.git
# Replace linstor/stable/victoria with the reference matching your Openstack release.
CINDER_BRANCH=linstor/stable/victoria
Kolla

Kolla packages OpenStack components in containers. They can then be deployed, for example using Kolla Ansible. You can take advantage of the available customisation options for Kolla containers to set up the Linstor driver.

To ensure that the required python-linstor package is installed, use the following override file:

Listing 18. template-override.j2
{% extends parent_template %}

# Cinder
{% set cinder_base_pip_packages_append = ['python-linstor'] %}

To install the LINBIT version of the driver, update your kolla-build.conf:

Listing 19. /etc/kolla/kolla-build.conf
[cinder-base]
type = git
location = https://github.com/LINBIT/openstack-cinder.git
# Replace linstor/stable/victoria with the reference matching your Openstack release.
reference = linstor/stable/victoria

To rebuild the Cinder containers, run:

# A private registry used to store the kolla container images
REGISTRY=deployment-registry.example.com
# The image namespace in the registry
NAMESPACE=kolla
# The tag to apply to all images. Use the release name for compatibility with kolla-ansible
TAG=victoria
kolla-build -t source --template-override template-override.j2 cinder --registry $REGISTRY --namespace $NAMESPACE --tag $TAG
Kolla Ansible

When deploying OpenStack using Kolla Ansible, you need to make sure that:

  • the custom Cinder images, created in the section above, are used

  • deployment of Cinder services is enabled

Listing 20. /etc/kolla/globals.yml
# use "source" images
kolla_install_type: source
# use the same registry as for running kolla-build above
docker_registry: deployment-registry.example.com
# use the same namespace as for running kolla-build above
docker_namespace: kolla
# deploy cinder block storage service
enable_cinder: "yes"
# disable verification of cinder backends, kolla-ansible only supports a small subset of available backends for this
skip_cinder_backend_check: True
# add the LINSTOR backend to the enabled backends. For backend configuration see below
cinder_enabled_backends:
  - name: linstor-drbd

You can place the Linstor driver configuration in one of the override directories for kolla-ansible. For more details on the available configuration options, see the section below.

Listing 21. /etc/kolla/config/cinder/cinder-volume.conf
[linstor-drbd]
volume_backend_name = linstor-drbd
volume_driver = cinder.volume.drivers.linstordrv.LinstorDrbdDriver
linstor_uris = linstor://cinder-01.openstack.test,linstor://cinder-02.openstack.test
OpenStack Ansible

OpenStack Ansible provides Ansible playbooks to configure and deploy OpenStack environments. It allows for fine-grained customization of the deployment, letting you set up the Linstor driver directly.

Listing 22. /etc/openstack_ansible/user_variables.yml
cinder_git_repo: https://github.com/LINBIT/openstack-cinder.git
cinder_git_install_branch: linstor/stable/victoria

cinder_user_pip_packages:
  - python-linstor

cinder_backends: (1)
  linstor-drbd:
   volume_backend_name: linstor-drbd
   volume_driver: cinder.volume.drivers.linstordrv.LinstorDrbdDriver
   linstor_uris: linstor://cinder-01.openstack.test,linstor://cinder-02.openstack.test
1 A detailed description of the available backend parameters can be found in the section below
Generic Cinder deployment

For other forms of OpenStack deployments, this guide can only provide non-specific hints.

To update the Linstor driver version, find your cinder installation. Some likely paths are:

/usr/lib/python3.6/dist-packages/cinder/
/usr/lib/python3.6/site-packages/cinder/
/usr/lib/python2.7/dist-packages/cinder/
/usr/lib/python2.7/site-packages/cinder/

The Linstor driver consists of a single file called linstordrv.py, located in the Cinder directory:

$CINDER_PATH/volume/drivers/linstordrv.py

To update the driver, replace the file with one from the LINBIT repository:

RELEASE=linstor/stable/victoria
curl -fL "https://raw.githubusercontent.com/LINBIT/openstack-cinder/$RELEASE/cinder/volume/drivers/linstordrv.py" > $CINDER_PATH/volume/drivers/linstordrv.py

You might also need to remove the Python cache for the update to be registered:

rm -rf $CINDER_PATH/volume/drivers/__pycache__

8.3. Configure a Linstor Backend for Cinder

To use the Linstor driver, configure the Cinder volume service. This is done by editing the Cinder configuration file and then restarting the Cinder Volume service.

Most of the time, the Cinder configuration file is located at /etc/cinder/cinder.conf. Some deployment options allow manipulating this file in advance. See the section above for specifics.

To configure a new volume backend using Linstor, add the following section to cinder.conf

[linstor-drbd]
volume_backend_name = linstor-drbd (1)
volume_driver = cinder.volume.drivers.linstordrv.LinstorDrbdDriver (2)
linstor_uris = linstor://cinder-01.openstack.test,linstor://cinder-02.openstack.test (3)
linstor_trusted_ca = /path/to/trusted/ca.cert (4)
linstor_client_key = /path/to/client.key (5)
linstor_client_cert = /path/to/client.cert (5)
# Deprecated or removed in 2.0.0
linstor_default_storage_pool_name = cinderpool (6)
linstor_autoplace_count = 2 (7)
linstor_controller_diskless = true (8)
# non-linstor-specific options
... (9)
The parameters described here are based on the latest release provided by Linbit. The driver included in OpenStack might not support all of these parameters. Take a look at the OpenStack driver documentation to find out more.
1 The name of the volume backend. Needs to be unique in the Cinder configuration. The whole section should share the same name. This name is referenced again in cinder.conf in the enabled_backends setting and when creating a new volume type.
2 The version of the Linstor driver to use. There are two options:
  • cinder.volume.drivers.linstordrv.LinstorDrbdDriver

  • cinder.volume.drivers.linstordrv.LinstorIscsiDriver

    Which driver you should use depends on your Linstor set up and requirements. Details on each choice are documented in the section below.

3 The URL(s) of the Linstor Controller(s). Multiple Controllers can be specified to make use of Linstor High Availability. If not set, defaults to linstor://localhost.
In driver versions prior to 2.0.0, this option is called linstor_default_uri
4 If HTTPS is enabled the referenced certificate is used to verify the Linstor Controller authenticity.
5 If HTTPS is enabled the referenced key and certificate will be presented to the Linstor Controller for authentication.
6 Deprecated in 2.0.0, use volume types instead. The storage pools to use when placing resources. Applies to all diskful resources created. Defaults to DfltStorPool.
7 Removed in 2.0.0, use volume types instead. The number of replicas to create for the given volume. A value of 0 will create a replica on all nodes. Defaults to 0.
8 Removed in 2.0.0, volumes are created on demand by the driver. If set to true, ensures that at least one (diskless) replica is deployed on the Cinder Controller host. This is useful for iSCSI transports. Defaults to true.
9 You can specify more generic Cinder options here, for example target_helper = tgtadm for the ISCSI connector.
You can also configure multiple Linstor backends, choosing a different name and configuration options for each.
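
For example, a second backend using the iSCSI driver variant might look like this (a sketch; the backend name linstor-iscsi is chosen for illustration):

[linstor-iscsi]
volume_backend_name = linstor-iscsi
volume_driver = cinder.volume.drivers.linstordrv.LinstorIscsiDriver
linstor_uris = linstor://cinder-01.openstack.test,linstor://cinder-02.openstack.test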

After configuring the Linstor backend, it should also be enabled. Add it to the list of enabled backends in cinder.conf, and optionally set it as the default backend:

[DEFAULT]
...
default_volume_type = linstor-drbd-volume
enabled_backends = lvm,linstor-drbd
...

As a last step, if you changed the Cinder configuration or updated the driver itself, make sure to restart the Cinder service(s). Please check the documentation for your OpenStack Distribution on how to restart services.

8.3.1. Choice of Transport Protocol

The Transport Protocol in Cinder is how clients (for example nova-compute) access the actual volumes. With Linstor, you can choose between two different drivers that use different transports.

  • cinder.volume.drivers.linstordrv.LinstorDrbdDriver, which uses DRBD as transport

  • cinder.volume.drivers.linstordrv.LinstorIscsiDriver, which uses ISCSI as transport

Using DRBD as Transport Protocol

The LinstorDrbdDriver works by ensuring a replica of the volume is available locally on the node where a client (i.e. nova-compute) issued a request. This only works if all compute nodes are also running Linstor Satellites that are part of the same Linstor cluster.

The advantages of this option are:

  • Once set up, the Cinder host is no longer involved in the data path. All reads and writes to the volume are handled by the local DRBD module, which will handle replication across its configured peers.

  • Since the Cinder host is not involved in the data path, any disruptions to the Cinder service do not affect volumes that are already attached.

Known limitations:
  • Not all hosts and hypervisors support using DRBD volumes. This restricts deployment to Linux hosts and kvm hypervisors.

  • Resizing of attached and in-use volumes does not fully work. While the resize itself is successful, the compute service will not propagate it to the VM until after a restart.

  • Multi-attach (attaching the same volume on multiple VMs) is not supported.

  • Encrypted volumes only work if udev rules for DRBD devices are in place.

    udev rules are either part of the drbd-utils package or have their own drbd-udev package.
Using ISCSI as Transport Protocol

The default way to export Cinder volumes is via iSCSI. This brings the advantage of maximum compatibility – iSCSI can be used with every hypervisor, be it VMWare, Xen, HyperV, or KVM.

The drawback is that all data has to be sent to a Cinder node, to be processed by a (userspace) iSCSI daemon; that means that the data needs to pass the kernel/userspace border, and these transitions will cost some performance.

Another drawback is the introduction of a single point of failure. If a Cinder node running the iSCSI daemon crashes, other nodes lose access to their volumes. There are ways to configure Cinder for automatic fail-over to mitigate this, but it requires considerable effort.

In driver versions prior to 2.0.0, the Cinder host needs access to a local replica of every volume. This can be achieved by either setting linstor_controller_diskless=True or using linstor_autoplace_count=0. Newer driver versions will create such a volume on demand.

8.3.2. Verify status of Linstor backends

To verify that all backends are up and running, you can use the OpenStack command line client:

$ openstack volume service list
+------------------+----------------------------------------+------+---------+-------+----------------------------+
| Binary           | Host                                   | Zone | Status  | State | Updated At                 |
+------------------+----------------------------------------+------+---------+-------+----------------------------+
| cinder-scheduler | cinder-01.openstack.test               | nova | enabled | up    | 2021-03-10T12:24:37.000000 |
| cinder-volume    | cinder-01.openstack.test@linstor-drbd  | nova | enabled | up    | 2021-03-10T12:24:34.000000 |
| cinder-volume    | cinder-01.openstack.test@linstor-iscsi | nova | enabled | up    | 2021-03-10T12:24:35.000000 |
+------------------+----------------------------------------+------+---------+-------+----------------------------+

If you have the Horizon GUI deployed, check Admin > System Information > Block Storage Service instead.

In the above example all configured services are enabled and up. If there are any issues, please check the logs of the Cinder Volume service.

8.4. Create a new volume type for Linstor

Before creating volumes using Cinder, you have to create a volume type. This can either be done via the command line:

# Create a volume using the default backend
$ openstack volume type create default
+-------------+--------------------------------------+
| Field       | Value                                |
+-------------+--------------------------------------+
| description | None                                 |
| id          | 58365ffb-959a-4d91-8821-5d72e5c39c26 |
| is_public   | True                                 |
| name        | default                              |
+-------------+--------------------------------------+
# Create a volume using a specific backend
$ openstack volume type create --property volume_backend_name=linstor-drbd linstor-drbd-volume
+-------------+--------------------------------------+
| Field       | Value                                |
+-------------+--------------------------------------+
| description | None                                 |
| id          | 08562ea8-e90b-4f95-87c8-821ac64630a5 |
| is_public   | True                                 |
| name        | linstor-drbd-volume                  |
| properties  | volume_backend_name='linstor-drbd'   |
+-------------+--------------------------------------+

Alternatively, you can create volume types via the Horizon GUI. Navigate to Admin > Volume > Volume Types and click “Create Volume Type”. You can assign it a backend by adding the volume_backend_name as “Extra Specs” to it.

8.4.1. Advanced Configuration of volume types

Each volume type can be customized by adding properties or “Extra Specs” as they are called in the Horizon GUI.

To add a property to a volume type on the command line use:

openstack volume type set linstor_drbd_b --property linstor:redundancy=5

Alternatively, you can set the property via the GUI by navigating to Admin > Volume > Volume Types. In the Actions column, open the dropdown menu and click the View Extra Specs button. This opens a dialog you can use to create, edit and delete properties.

Available volume type properties
linstor:diskless_on_remaining

Create diskless replicas on non-selected nodes after auto-placing.

linstor:do_not_place_with_regex

Do not place the resource on a node which has a resource with a name matching the regex.

linstor:layer_list

Comma-separated list of layers to apply for resources. If empty, defaults to DRBD,Storage.

linstor:provider_list

Comma-separated list of providers to use. If empty, Linstor will automatically choose a suitable provider.

linstor:redundancy

Number of replicas to create. Defaults to 2.

linstor:replicas_on_different

A comma-separated list of key or key=value items used as autoplacement selection labels when autoplace is used to determine where to provision storage.

linstor:replicas_on_same

A comma-separated list of key or key=value items used as autoplacement selection labels when autoplace is used to determine where to provision storage.

linstor:storage_pool

Comma-separated list of storage pools to use when auto-placing.

linstor:property:*

If a key is prefixed by linstor:property:, it is interpreted as a LINSTOR property. The property gets set on the Resource Group created for the volume type.

For example: To change the quorum policy DrbdOptions/auto-quorum needs to be set. This can be done by setting the linstor:property:DrbdOptions/auto-quorum property.
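
As a sketch, using the linstor-drbd-volume type created above, turning off the automatic quorum policy could look like this (disabled is one of the values LINSTOR accepts for this property):

openstack volume type set linstor-drbd-volume --property linstor:property:DrbdOptions/auto-quorum=disabled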

8.5. Using volumes

Once you have a volume type configured, you can start using it to provision new volumes.

For example, to create a simple 1GiB volume on the command line you can use:

openstack volume create --type linstor-drbd-volume --size 1 --availability-zone nova linstor-test-vol
openstack volume list
If you set default_volume_type = linstor-drbd-volume in your /etc/cinder/cinder.conf, you may omit the --type linstor-drbd-volume from the openstack volume create command above.
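
To make the new volume available to a compute instance, it can then be attached with the standard OpenStack client command (my-instance is a hypothetical server name used for illustration):

openstack server add volume my-instance linstor-test-vol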

8.6. Troubleshooting

This section describes what to do in case you encounter problems with using Linstor volumes and snapshots.

8.6.1. Checking for error messages in Horizon

Every volume and snapshot has a Messages tab in the Horizon dashboard. In case of errors, the list of messages can be used as a starting point for further investigation. Some common messages in case of errors:

create volume from backend storage:Driver failed to create the volume.

There was an error creating a new volume. Check the Cinder Volume service logs for more details

schedule allocate volume:Could not find any available weighted backend.

If this is the only error message, this means the Cinder Scheduler could not find a volume backend suitable for creating the volume. This is most likely because:

  • The volume backend is offline. See Verify status of Linstor backends

  • The volume backend does not have enough free capacity to fulfil the request. Check the output of cinder get-pools --detail and linstor storage-pool list to ensure that the requested capacity is available.

8.6.2. Checking the Cinder Volume Service

The Linstor driver is called as part of the Cinder Volume service.

Distribution   Log location or command
DevStack       journalctl -u devstack@c-vol

8.6.3. Checking the compute service logs

Some issues will not be logged in the Cinder Service but in the actual consumer of the volumes, most likely the compute service (Nova). As with the volume service, the exact host and location to check depends on your Openstack distribution:

Distribution   Log location or command
DevStack       journalctl -u devstack@n-cpu

9. LINSTOR Volumes in Docker

This chapter describes LINSTOR volumes in Docker as managed by the LINSTOR Docker Volume Plugin.

9.1. Docker Overview

Docker is a platform for developing, shipping, and running applications in the form of Linux containers. For stateful applications that require data persistence, Docker supports the use of persistent volumes and volume_drivers.

The LINSTOR Docker Volume Plugin is a volume driver that provisions persistent volumes from a LINSTOR cluster for Docker containers.

9.2. LINSTOR Plugin for Docker Installation

To install the linstor-docker-volume plugin provided by LINBIT, you’ll need to have a working LINSTOR cluster. After that the plugin can be installed from the public docker hub.

# docker plugin install linbit/linstor-docker-volume --grant-all-permissions
The --grant-all-permissions flag will automatically grant all permissions needed to successfully install the plugin. If you’d like to manually accept these, omit the flag from the command above.

9.3. LINSTOR Plugin for Docker Configuration

As the plugin has to communicate with the LINSTOR controller via the LINSTOR python library, we must tell the plugin where to find the LINSTOR Controller node in its configuration file:

# cat /etc/linstor/docker-volume.conf
[global]
controllers = linstor://hostnameofcontroller

A more extensive example could look like this:

# cat /etc/linstor/docker-volume.conf
[global]
storagepool = thin-lvm
fs = ext4
fsopts = -E discard
size = 100MB
replicas = 2

9.4. Example Usage

The following are some examples of how you might use the LINSTOR Docker Volume Plugin. In the following we expect a cluster consisting of three nodes (alpha, bravo, and charlie).

9.4.1. Example 1 – typical docker pattern

On node alpha:

$ docker volume create -d linbit/linstor-docker-volume \
        --opt fs=xfs --opt size=200 lsvol
$ docker run -it --rm --name=cont \
        -v lsvol:/data --volume-driver=linbit/linstor-docker-volume busybox sh
$ root@cont: echo "foo" > /data/test.txt
$ root@cont: exit

On node bravo:

$ docker run -it --rm --name=cont \
        -v lsvol:/data --volume-driver=linbit/linstor-docker-volume busybox sh
$ root@cont: cat /data/test.txt
  foo
$ root@cont: exit
$ docker volume rm lsvol

9.4.2. Example 2 – one diskful assignment by name, two nodes diskless

$ docker volume create -d linbit/linstor-docker-volume --opt nodes=bravo lsvol

9.4.3. Example 3 – one diskful assignment, no matter where, two nodes diskless

$ docker volume create -d linbit/linstor-docker-volume --opt replicas=1 lsvol

9.4.4. Example 4 – two diskful assignments by name, charlie diskless

$ docker volume create -d linbit/linstor-docker-volume --opt nodes=alpha,bravo lsvol

9.4.5. Example 5 – two diskful assignments, no matter where, one node diskless

$ docker volume create -d linbit/linstor-docker-volume --opt replicas=2 lsvol

9.4.6. Example 6 – using LINSTOR volumes with services from Docker swarm manager node

$ docker service create \
        --mount type=volume,src=lsvol,dst=/data,volume-driver=linbit/linstor-docker-volume \
        --name swarmsrvc busybox sh -c "while true; do sleep 1000s; done"
Docker services do not accept the -v or --volume syntax; you must use the --mount syntax. Docker run will accept either syntax.

10. Exporting Highly Available Storage using LINSTOR Gateway

LINSTOR Gateway manages highly available iSCSI targets and NFS exports by leveraging LINSTOR and Pacemaker. Setting up LINSTOR – including storage pools and resource groups – as well as Corosync and Pacemaker's properties is a prerequisite for using this tool.

10.1. Requirements

10.1.1. LINSTOR

A LINSTOR cluster is required to operate LINSTOR Gateway. It is highly recommended, although optional, to run the LINSTOR controller as a Pacemaker resource. This needs to be configured manually. Such a resource could look like the following:

primitive p_linstor-controller systemd:linstor-controller \
        op start interval=0 timeout=100s \
        op stop interval=0 timeout=100s \
        op monitor interval=30s timeout=100s

For both iSCSI and NFS, a storage pool, a resource group and a volume group need to be present for LINSTOR Gateway. Let's create them.

Create the storage pool on the three nodes using the physical device /dev/sdb:

linstor physical-storage create-device-pool --pool-name lvpool LVM LINSTOR1 /dev/sdb --storage-pool lvmpool
linstor physical-storage create-device-pool --pool-name lvpool LVM LINSTOR2 /dev/sdb --storage-pool lvmpool
linstor physical-storage create-device-pool --pool-name lvpool LVM LINSTOR3 /dev/sdb --storage-pool lvmpool

We also need resource groups and volume groups:

linstor rg c iSCSI_group --storage-pool lvmpool --place-count 2
linstor rg c nfs_group --storage-pool lvmpool --place-count 3
linstor vg c iSCSI_group
linstor vg c nfs_group

For a more detailed explanation of the storage pool, resource group and volume group creation, check the LINSTOR user guide

10.1.2. Pacemaker

A working Corosync/Pacemaker cluster is expected on the machine where LINSTOR Gateway is running. The drbd-attr resource agent is required to run LINSTOR Gateway. This is included in LINBIT’s drbd-utils package for Ubuntu based distributions, or the drbd-pacemaker package on RHEL/CentOS. LINSTOR Gateway sets up all required Pacemaker resource and constraints by itself, except for the LINSTOR controller resource.
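
For example, depending on your distribution, the resource agent could be installed via the packages mentioned above:

apt install drbd-utils     # Debian/Ubuntu: drbd-attr is part of drbd-utils
yum install drbd-pacemaker # RHEL/CentOS: drbd-attr is part of drbd-pacemaker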

10.1.3. iSCSI & NFS

LINSTOR Gateway uses Pacemaker’s ocf::heartbeat:iSCSITarget resource agent for its iSCSI integration, which requires an iSCSI implementation to be installed. Using targetcli is recommended.

For iSCSI, please install targetcli.

yum install targetcli

For NFS, the nfs-server service needs to be enabled and ready:

systemctl enable --now nfs-server

10.2. Preparation

First, let’s check that all the components are available. This guide assumes you already installed and configured a LINSTOR cluster. Volume Group, Storage Pool and Resource Group should be defined before using linstor-iscsi or linstor-nfs.

The following tools need to be present on the server:

  • linstor-client (managing the LINSTOR cluster)

  • drbd-attr resource agent (part of drbd-utils in Debian/Ubuntu, and part of drbd-pacemaker for other Linux distributions)

  • targetcli (for iSCSI)

  • nfs-utils, nfs-server

  • pcs or crmsh as the Pacemaker client (for checking the status of the iSCSI or NFS targets)

10.3. Checking the Cluster

Check the LINSTOR cluster status with:

[root@LINSTOR1 ~]# linstor n l
╭────────────────────────────────────────────────────────────╮
┊ Node     ┊ NodeType  ┊ Addresses                  ┊ State  ┊
╞════════════════════════════════════════════════════════════╡
┊ LINSTOR1 ┊ COMBINED  ┊ 172.16.16.111:3366 (PLAIN) ┊ Online ┊
┊ LINSTOR2 ┊ SATELLITE ┊ 172.16.16.112:3366 (PLAIN) ┊ Online ┊
┊ LINSTOR3 ┊ SATELLITE ┊ 172.16.16.113:3366 (PLAIN) ┊ Online ┊
╰────────────────────────────────────────────────────────────╯

Check the LINSTOR storage pool list with:

[root@LINSTOR1 ~]# linstor sp l
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node     ┊ Driver   ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ LINSTOR1 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ LINSTOR2 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ LINSTOR3 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊
┊ lvmpool              ┊ LINSTOR1 ┊ LVM      ┊ lvpool   ┊    10.00 GiB ┊     10.00 GiB ┊ False        ┊ Ok    ┊
┊ lvmpool              ┊ LINSTOR2 ┊ LVM      ┊ lvpool   ┊    10.00 GiB ┊     10.00 GiB ┊ False        ┊ Ok    ┊
┊ lvmpool              ┊ LINSTOR3 ┊ LVM      ┊ lvpool   ┊    10.00 GiB ┊     10.00 GiB ┊ False        ┊ Ok    ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

List the LINSTOR resource groups (please do not forget to create a volume group for the resource group with: linstor vg c iscsi_group):

[root@LINSTOR1 ~]# linstor rg l
╭────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter            ┊ VlmNrs ┊ Description ┊
╞════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2           ┊        ┊             ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ iscsi_group   ┊ PlaceCount: 2           ┊ 0      ┊             ┊
┊               ┊ StoragePool(s): lvmpool ┊        ┊             ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ nfs_group     ┊ PlaceCount: 3           ┊ 0      ┊             ┊
┊               ┊ StoragePool(s): lvmpool ┊        ┊             ┊
╰────────────────────────────────────────────────────────────────╯
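
If these resource groups do not exist yet, they can be created roughly as follows; the names, storage pool, and placement counts simply mirror the example output above:

linstor resource-group create iscsi_group --storage-pool lvmpool --place-count 2
linstor volume-group create iscsi_group
linstor resource-group create nfs_group --storage-pool lvmpool --place-count 3
linstor volume-group create nfs_group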

Check and disable STONITH. STONITH is a technique for fencing nodes in clusters. Because there are three nodes here, quorum will be used instead of fencing.

pcs property set stonith-enabled=false
pcs property show
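
Because quorum is relied upon here, you can also verify it explicitly, for example:

pcs quorum status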

Check the Pacemaker cluster health:

[root@LINSTOR1 ~]# pcs status
Cluster name: LINSTOR
Cluster Summary:
  * Stack: corosync
  * Current DC: LINSTOR1 (version 2.0.5.linbit-1.0.el7-ba59be712) - partition with quorum
  * Last updated: Wed Mar 24 21:24:10 2021
  * Last change:  Wed Mar 24 21:24:01 2021 by root via cibadmin on LINSTOR1
  * 3 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ LINSTOR1 LINSTOR2 LINSTOR3 ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

10.4. Setting up iSCSI target

Now that everything looks good, you can start creating the first iSCSI LUN. The linstor-iscsi tool will be used for all iSCSI-related actions. Check “linstor-iscsi help” for detailed usage. The command first creates a new resource within the LINSTOR system under the specified name, using the specified resource group. It then creates resource primitives in the Pacemaker cluster, including all necessary order and location constraints. The Pacemaker primitives are prefixed with p_ and contain the resource name and a resource type postfix.

linstor-iscsi create --iqn=iqn.2021-04.com.linbit:lun4 --ip=172.16.16.101/24 --username=foo --lun=4 --password=bar --resource-group=iscsi_group --size=1G

This command will create a 1G iSCSI disk with the provided username and password in the resource group iscsi_group. The DRBD and Pacemaker resources will be created automatically by linstor-iscsi. You can check the Pacemaker resources with the pcs status command.

[root@LINSTOR1 ~]# linstor-iscsi list
+-----------------------------+-----+---------------+-----------+--------------+---------+
|             IQN             | LUN | Pacemaker LUN | Pacemaker | Pacemaker IP | LINSTOR |
+-----------------------------+-----+---------------+-----------+--------------+---------+
| iqn.2021-04.com.linbit:lun4 |   4 |       ✓       |     ✓     |      ✓       |    ✓    |
+-----------------------------+-----+---------------+-----------+--------------+---------+

10.5. Deleting iSCSI target

The following command will delete the iSCSI target from Pacemaker as well as from the LINSTOR cluster:

linstor-iscsi delete -i iqn.2021-04.com.linbit:lun4 -l 4
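
You can run linstor-iscsi list again afterwards to verify that the target has been removed:

linstor-iscsi list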

10.6. Setting up NFS export

Before creating NFS exports, you need to tell LINSTOR that the file system used for the NFS exports will be ext4. To do that, apply a property to the resource group of the NFS resources:

linstor rg set-property nfs_group FileSystem/Type ext4
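
To verify that the property has been applied, you can list the resource group’s properties, for example:

linstor resource-group list-properties nfs_group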

The following command will create an NFS export in the cluster. It first creates a new resource within the LINSTOR system under the specified name, using the specified resource group. It then creates resource primitives in the Pacemaker cluster, including all necessary order and location constraints. The Pacemaker primitives are prefixed with p_ and contain the resource name and a resource type postfix.

linstor-nfs create --resource=nfstest --service-ip=172.16.16.102/32 --allowed-ips=172.16.16.0/24 --resource-group=nfs_group --size=1G

You can list the NFS exports with the command below:

[root@LINSTOR1 ~]# linstor-nfs list
+---------------+------------------+-----------------------+------------+------------+
| Resource name | LINSTOR resource | Filesystem mountpoint | NFS export | Service IP |
+---------------+------------------+-----------------------+------------+------------+
| nfstest       |        ✓         |           ✓           |     ✓      |     ✓      |
+---------------+------------------+-----------------------+------------+------------+
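
A client in the allowed 172.16.16.0/24 network can then mount the export. The export path below is only a placeholder; use the mount point reported by linstor-nfs list:

mount -t nfs 172.16.16.102:/<export_path> /mnt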

10.7. Deleting NFS Export

The following command will delete the NFS export from Pacemaker as well as from the LINSTOR cluster:

[root@LINSTOR1 ~]# linstor-nfs delete -r nfstest

11. LINSTOR EXOS Integration

The EXOS storage manager from Seagate could be configured as one large block device managed by LINSTOR like a local drive, but this would prevent concurrent sharing of LINSTOR resources between multiple servers out of the same pool.

LINSTOR integration with EXOS enables multiple server nodes to allocate and connect to LINSTOR resources serviced by the same EXOS pool. Thus all of the EXOS storage management features such as SSD/HDD tiering, SSD caching, snapshots, and thin provisioning are available for LINSTOR resources and Kubernetes Storage Classes.

After configuration, LINSTOR will dynamically map Resource replicas as LUNs presented to server nodes through one of the two EXOS controllers.

Since the EXOS controllers are managed by a secure network API, LINSTOR must be configured with the proper networking and username/password combination. The diagram below shows the relationship between the LINSTOR cluster and the EXOS enclosures.

[Figure: EXOS integration with a LINSTOR cluster]
A multi-host setup allows up to eight LINSTOR nodes to be directly connected with 48 Gbit SAS links for low latency and high throughput.

Load balancing and server failover are managed and enabled by LINSTOR, while volume creation is handled by the EXOS hardware RAID engine.

The EXOS storage provider in LINSTOR offers native integration with EXOS’ REST-API.

This section will describe how to enable EXOS integration and configure LINSTOR to manage storage backed by an EXOS enclosure.

EXOS storage systems offer a feature rich set of configuration options to match any enterprise storage demand. To maximize ease of use, this guide is based on the following defaults and assumptions:

  1. Dual Controllers – EXOS system controllers are Active/Active with automatic failover. Both controllers’ IP addresses must also be configured in the LINSTOR properties for full support.

  2. Dual EXOS Pools – Optimal performance is achieved when data from pool A is accessed through Controller A. If a node is connected to both Controller A and Controller B of the same enclosure, LINSTOR will configure Linux multipath, which will detect the best route.

  3. EXOS Pool Serial Numbers – When an EXOS pool is created, it receives a unique serial number. Each one has to be configured as backing storage in LINSTOR to create a link between the EXOS enclosure and LINSTOR. With that information, LINSTOR can tell whether you are referring to EXOS Pool A or Pool B.

  4. Creating EXOS Pools – The administrator is required to create EXOS Pools A and B prior to configuring LINSTOR. EXOS features such as thin provisioning, auto tiering, and snapshot options are selected at this time.

  5. Replicas Within Enclosures – EXOS systems have redundant controllers, power supplies, and communication paths to the drives. Some administrators may want resource replicas not to be stored in the same enclosure. In this case, the administrator must create multiple LINSTOR pools configured with only one EXOS pool member from each enclosure.

11.1. EXOS Properties as a LINSTOR Storage Provider

LINSTOR’s native integration with EXOS is configured by setting a few properties on the LINSTOR Controller and creating the appropriate LINSTOR objects specific to your EXOS enclosures, as described in the sections below.

The information in the table below is needed from your EXOS enclosures. This information will be used to populate the appropriate LINSTOR Controller properties and LINSTOR objects in the sub-sections that follow.

Table 2. Required Information

EXOS Information        | Description                                                                  | Placeholder in Command Examples
------------------------+------------------------------------------------------------------------------+--------------------------------
EXOS Enclosure Name     | Uniquely selected by the administrator for a given EXOS enclosure           | <exos_encl_name>
Controller Hostname     | The DNS-resolvable hostname for one of the controllers                      | <exos_ctrlr_name>
Controller IP           | IP address of the controller                                                | <exos_ctrlr_ip>
REST-API Username       | Username for the REST-API of all EXOS controllers under the given enclosure | <exos_rest_user>
REST-API Password       | Password for the REST-API of all EXOS controllers under the given enclosure | <exos_rest_pass>
EXOS Pool Serial Number | The serial number of an EXOS pool to become a member of a LINSTOR pool      | <exos_pool_sn>

11.2. Configuration Steps

Configuring a topology of LINSTOR server nodes and multiple EXOS storage systems involves these steps:

  1. Set global or unique EXOS Controller usernames and passwords.

  2. Define EXOS enclosures and Controller network identities.

  3. Create node-to-enclosure-to-pool mappings matching the physical SAS cabling.

11.2.1. Step 1: EXOS Usernames and Passwords

Usernames and passwords can be unique for each EXOS enclosure or common to all enclosures, depending on how the system administrator has deployed the EXOS systems. The default EXOS username and password will be used if none is set for a given EXOS controller. The defaults are set as follows:

# linstor exos set-defaults --username <exos_rest_name>
# linstor exos set-defaults --password <exos_rest_pass>
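
The currently configured defaults can be reviewed with get-defaults (note the remark below about plaintext passwords):

# linstor exos get-defaults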

Unique usernames and passwords for EXOS controllers are set by:

# linstor controller set-property
    StorDriver/Exos/<exos_encl_name>/username <exos_rest_name>
# linstor controller set-property
    StorDriver/Exos/<exos_encl_name>/Password <exos_rest_pass>

Passwords entered in this fashion will show up as plaintext when using get-defaults.

With the above commands, LINSTOR will store your password in cleartext in the LINSTOR properties, where it is visible through a simple linstor controller list-properties command. You can instead hide it behind an environment variable by using the UsernameEnv and/or PasswordEnv properties. These tell LINSTOR to look in an environment variable for the actual username or password, as shown in the following example:

LINSTOR will not modify environment variables, only read from them. The storage administrator has to make sure that the environment variables are set correctly.

# echo $EXOS_PW
mySecretPassword
# linstor controller set-property \
    StorDriver/Exos/<exos_encl_name>/PasswordEnv EXOS_PW

If both property versions (that is, Password and PasswordEnv) are set, the non-environment version is preferred.

If the satellite is started before the environment variable is set, the satellite needs to be restarted in order to see the new environment variable.
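
One way to make such an environment variable available to the satellite service is a systemd drop-in file; the path and variable name below are only an illustration:

# /etc/systemd/system/linstor-satellite.service.d/exos.conf (illustrative)
[Service]
Environment=EXOS_PW=mySecretPassword

After creating the drop-in, run systemctl daemon-reload and restart the linstor-satellite service so that it picks up the variable.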

11.2.2. Step 2: Define EXOS enclosures and controller identities.

Registering an EXOS enclosure in LINSTOR can be done with the create command:

# linstor exos create <exos_encl_name> <exos_ctrl_a_ip> [<exos_ctrl_b_ip>]

If no special --username or --password is given, the above mentioned defaults are used.

The Controller’s DNS name and IP address may be used interchangeably.

If you wish to use a hostname that is not DNS resolvable to reference your EXOS enclosure within LINSTOR, you may use any name in place of <exos_hostname>, but you will also have to supply the enclosure’s IP address: linstor node create <desired_name> <enclosure_ip>

Use the following example to create and inspect the current controller settings:

# linstor exos create Alpha 172.16.16.12 172.16.16.13
# linstor exos list
+------------------------------------------------------------------+
| Enclosure | Ctrl A IP    | Ctrl B IP    | Health | Health Reason |
|==================================================================|
| Alpha     | 172.16.16.12 | 172.16.16.13 | OK     |               |
+------------------------------------------------------------------+

For a more in-depth view, you can always ask the LINSTOR controller and/or the LINSTOR nodes for the Exos-related properties:

# linstor controller list-properties | grep Exos
| StorDriver/Exos/Alpha/A/IP                | 172.16.16.12         |
| StorDriver/Exos/Alpha/B/IP                | 172.16.16.13         |

11.2.3. Step 3: Create Node to Enclosure to Pool mapping.

A LINSTOR satellite node can be created as usual:

# linstor node create <satellite_hostname>

The storage pool can also be created as usual in LINSTOR. Only the name of the previously registered EXOS enclosure as well as the serial number of the EXOS pool needs to be specified:

# linstor storage-pool create exos \
  <satellite_hostname> <linstor_pool_name> <exos_encl_name> <exos_pool_sn>

The <linstor_pool_name> can be set to (almost) any unique string for the LINSTOR deployment.

Here is an example of mapping an EXOS Pool in EXOS enclosure Alpha to two Satellite nodes:

# linstor storage-pool create exos \
   node1 poolA Alpha 00c0ff29a5f5000095a2075d01000000
# linstor storage-pool create exos \
   node2 poolA Alpha 00c0ff29a5f5000095a2075d01000000

After creating an EXOS storage pool, the LINSTOR satellite will scan the given EXOS enclosure for connected ports. If cabled, these ports will be listed in the output of the following command:

# linstor exos map -p
+----------------------------------------------+
| Node Name | Enclosure Name | Connected Ports |
|==============================================|
| node1     | Alpha          | A0, B0          |
| node2     | Alpha          | A1, B1          |
+----------------------------------------------+

The pool configuration is shown by:

hr01u09:~ # linstor sp list -s poolA -p
+----------------------------------------------------------------------------------------------+
| StoragePool | Node  | Driver   | PoolName                               | FreeCapacity | ... |
|==============================================================================================|
| poolA       | node1 | EXOS     | Alpha_00c0ff29a5f5000095a2075d01000000 |      581 TiB | ... |
| poolA       | node2 | EXOS     | Alpha_00c0ff29a5f5000095a2075d01000000 |      581 TiB | ... |
+----------------------------------------------------------------------------------------------+

A detailed description of all the available EXOS commands can be found in the built-in help:

# linstor exos -h

11.3. Creating Resources Backed by EXOS Storage Pools

Creating LINSTOR resources from EXOS-backed storage pools follows normal LINSTOR usage patterns as described in other sections of the LINSTOR User’s Guide, such as the sections describing LINSTOR resource groups or the more granular resource definition, volume definition, and resource creation workflow.
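
For example, a resource backed by the EXOS pool created above could be spawned through a resource group; the resource group and resource names here are only illustrative:

# linstor resource-group create exos_rg --storage-pool poolA --place-count 2
# linstor volume-group create exos_rg
# linstor resource-group spawn-resources exos_rg exos_res1 20G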

