LINSTOR用户指南

请先看这个

本指南旨在为软件定义存储解决方案LINSTOR的用户提供最终参考指南和手册。

本指南始终假设您正在使用最新版本的LINSTOR和相关工具。

本指南的组织如下:

LINSTOR

1. 基本管理任务/设置

LINSTOR是一个用于Linux系统上存储的配置管理系统。它管理节点集群上的LVM逻辑卷和/或ZFS ZVOL,利用DRBD在不同节点之间进行复制,并为用户和应用程序提供块存储设备。它还管理快照、加密,以及通过bcache将HDD后端数据缓存到SSD。

1.1. 概念和术语

本节将介绍一些核心概念和术语,您需要熟悉这些概念和术语才能理解LINSTOR是如何工作和部署存储的。本节按自底向上( “ground up” )的方式组织。

1.1.1. 可安装组件

linstor-controller

LINSTOR设置至少需要一个活动的controller和一个或多个satellites。

The linstor-controller relies on a database that holds all configuration information for the whole cluster. It makes all decisions that need to have a view of the whole cluster. Multiple controllers can be used for LINSTOR but only one can be active.

linstor-satellite

linstor-satellite 运行在每个LINSTOR使用本地存储或为服务提供存储的节点上。它是无状态的;它从控制器接收所需的所有信息。它运行 lvcreate 和 drbdadm 等程序,其行为方式类似于一个节点代理。

linstor-client

linstor-client 是一个命令行实用程序,用于向系统发送命令并检查系统的状态。

1.1.2. 对象

对象是LINSTOR呈现给最终用户或应用程序的最终结果,例如Kubernetes/OpenShift、复制块设备(DRBD)、NVMeOF目标等。

节点

节点是参与LINSTOR集群的服务器或容器。 Node 属性定义如下:

  • 确定节点参与哪个LINSTOR集群

  • 设置节点的角色:Controller, Satellite, Auxiliary

  • 网络接口 对象定义了节点的网络连接

网络接口

顾名思义,这就是定义节点网络接口的接口/地址的方式。

Definitions

Definitions定义了一个对象的属性,它们可以被看作是概要文件或模板。创建的对象将继承definitions中定义的配置。必须在创建关联对象之前定义definitions。例如,在创建 Resource 之前,必须先创建 ResourceDefinition 。

StoragePoolDefinition
  • 定义存储池的名称

ResourceDefinition

资源 definitions定义资源的以下属性:

  • DRBD资源的名称

  • 用于资源连接的DRBD的TCP端口

VolumeDefinition

Volume definitions 定义如下:

  • DRBD资源的卷

  • 卷的大小

  • DRBD资源卷的卷号

  • 卷的元数据属性

  • 用于与DRBD卷关联的DRBD设备的次要编号

StoragePool

StoragePool 用于标识LINSTOR上下文中的存储。它定义了:

  • 特定节点上存储池的配置

  • 用于群集节点上的存储池的存储后端驱动程序(LVM、ZFS等)

  • 要传递给存储后端驱动程序的参数和配置

Resource

LINSTOR现在已经扩展了它的能力,在DRBD之外管理更广泛的存储技术。 Resource 包括:

  • 表示在 ResourceDefinition 中定义的DRBD资源的位置

  • 在群集中的节点上放置资源

  • 定义节点上 ResourceDefinition 的位置

Volume

Volumes是 Resource 的子集。一个 Resource 可能有多个volumes,例如,您可能希望将数据库存储在比MySQL集群中的日志慢的存储上。通过将 volumes 保持在单个 resource 下,实际上就是在创建一致性组。 Volume 属性还可以定义更细粒度级别的属性。

1.2. 更广泛的背景

虽然LINSTOR可以用来使DRBD的管理更加方便,但它通常与更高层次的软件栈集成。这种集成已经存在于Kubernetes、OpenStack、OpenNebula和Proxmox中。本指南中包含了在这些环境中部署LINSTOR的特定章节。

LINSTOR使用的南向驱动程序是LVM、thinLVM和ZFS。

1.3. 包

LINSTOR以.rpm和.deb两种包分发:

  1. linstor-client 包含命令行客户端程序。它依赖于python,而python通常已经安装。在RHEL 8系统上,需要为python创建符号链接。

  2. linstor-controller 和 linstor-satellite 都包含服务的systemd单元文件。它们依赖于Java runtime environment(JRE)1.8版(headless)或更高版本。

有关这些软件包的更多详细信息,请参见上面的Installable Components部分。

如果您订阅了LINBIT的支持,那么您将可以通过我们的官方仓库访问我们经过认证的二进制文件。

1.4. 安装

如果要在容器中使用LINSTOR,请跳过此主题并使用下面的 “容器” 部分进行安装。

1.4.1. Ubuntu Linux

如果您想选择使用DRBD创建复制存储,则需要安装 drbd-dkmsdrbd-utils 。这些包需要安装在所有节点上。您还需要选择一个卷管理器,ZFS或LVM,在本例中我们使用的是LVM。

# apt install -y drbd-dkms drbd-utils lvm2

节点是LINSTOR controller、satellite还是两者兼任(组合),决定了该节点上需要安装哪些包。对于组合型节点,我们需要controller和satellite两者的LINSTOR包。

组合节点:

# apt install linstor-controller linstor-satellite  linstor-client

这将使我们剩余的节点成为我们的Satellites,因此我们需要在它们上安装以下软件包:

# apt install linstor-satellite  linstor-client

1.4.2. SUSE Linux企业服务器

SLES高可用性扩展(HAE)包括DRBD。

On SLES, DRBD is normally installed via the software installation component of YaST2. It comes bundled with the High Availability package selection.

当我们下载DRBD的最新模块时,我们可以检查LVM工具是否也是最新的。喜欢命令行安装的用户可用以下命令获得最新的DRBD和LVM版本:

# zypper install drbd lvm2

节点是LINSTOR controller、satellite还是两者兼任(组合),决定了该节点上需要安装哪些包。对于组合型节点,我们需要controller和satellite两者的LINSTOR包。

组合节点:

# zypper install linstor-controller linstor-satellite  linstor-client

这将使我们剩余的节点成为我们的Satellites,因此我们需要在它们上安装以下软件包:

# zypper install linstor-satellite  linstor-client

1.4.3. CentOS

CentOS从第5版开始就提供DRBD 8。对于DRBD 9,您需要使用EPEL或类似的第三方源。或者,如果您与LINBIT签订了支持合同,则可以使用我们的RHEL 8仓库。可以使用 yum 安装DRBD。同时也可以检查LVM工具是否为最新版本。

如果要复制存储,LINSTOR 需要 DRBD 9。这需要配置外部存储库,可以是LINBIT的,也可以是第三方的。
# yum install drbd kmod-drbd lvm2

节点是LINSTOR controller、satellite还是两者兼任(组合),决定了该节点上需要安装哪些包。对于组合型节点,我们需要controller和satellite两者的LINSTOR包。

在RHEL8系统上,需要安装python2才能让linstor客户端工作。

组合节点:

# yum install linstor-controller linstor-satellite  linstor-client

这将使我们剩余的节点成为我们的Satellites,因此我们需要在它们上安装以下软件包:

# yum install linstor-satellite  linstor-client

1.5. 升级

LINSTOR不支持滚动升级,controller和satellites必须具有相同的版本,否则controller会以 VERSION_MISMATCH 丢弃该satellite的连接。但这不是问题:satellite在未连接到controller时不会执行任何动作,DRBD也不会受到任何方式的中断。

If you are using the embedded default H2 database and the linstor-controller package is upgraded, an automatic backup file of the database will be created in the default /var/lib/linstor directory. This file is a good restore point if a linstor-controller database migration should fail for any reason. In that case it is recommended to report the error to LINBIT, restore the old database file and downgrade to your previous controller version.

如果使用任何外部数据库或etcd,建议手动备份当前数据库以获得还原点。
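
例如,如果controller使用PostgreSQL作为外部数据库,一个最简化的备份示意如下(数据库名与用户名 linstor 仅为沿用本指南后文示例的假设,请以您的 linstor.toml 为准):

# pg_dump -U linstor -h localhost linstor > linstor_db_backup.sql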

因此,请首先升级controller主机上的 linstor-controller 和 linstor-client 包,然后重新启动 linstor-controller 。controller应能正常启动,其所有satellite节点应显示 OFFLINE(VERSION_MISMATCH) 。之后,您可以继续升级所有satellite节点上的 linstor-satellite 并重新启动它们;在短暂的重新连接之后,它们都应再次显示 ONLINE ,至此升级完成。
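
下面是一个升级顺序的示意(假设是基于apt的系统,且服务名与本指南前文一致;rpm系发行版请相应换用 yum/zypper):

# apt update
# apt install linstor-controller linstor-client
# systemctl restart linstor-controller

然后在每个satellite节点上:

# apt install linstor-satellite
# systemctl restart linstor-satellite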

1.6. 容器

LINSTOR也可用容器运行。基础镜像可以在LINBIT的容器registry仓库 drbd.io 中找到。

要访问这些镜像,首先必须登录该registry(请联系 sales@linbit.com 获取凭据):

# docker login drbd.io

The containers available in this repository are:

  • drbd.io/drbd9-rhel8

  • drbd.io/drbd9-rhel7

  • drbd.io/drbd9-sles15sp1

  • drbd.io/drbd9-bionic

  • drbd.io/drbd9-focal

  • drbd.io/linstor-csi

  • drbd.io/linstor-controller

  • drbd.io/linstor-satellite

  • drbd.io/linstor-client

通过在浏览器中打开 http://drbd.io ,可以查看可用image的最新列表。请确保通过 “http” 访问该主机,因为registry的image本身是通过 “https” 提供的。

To load the kernel module, needed only for LINSTOR satellites, you’ll need to run a drbd9-$dist container in privileged mode. The kernel module containers either retrieve an official LINBIT package from a customer repository, use shipped packages, or they try to build the kernel modules from source. If you intend to build from source, you need to have the according kernel headers (e.g., kernel-devel) installed on the host. There are 4 ways to execute such a module load container:

  • Building from shipped source

  • Using a shipped/pre-built kernel module

  • 指定LINBIT节点哈希和发行版。

  • 绑定装入现有仓库配置。

从随容器分发的源码构建的示例(基于RHEL):

# docker run -it --rm --privileged -v /lib/modules:/lib/modules \
  -v /usr/src:/usr/src:ro \
  drbd.io/drbd9-rhel7

Example using a module shipped with the container, which is enabled by not bind-mounting /usr/src:

# docker run -it --rm --privileged -v /lib/modules:/lib/modules \
  drbd.io/drbd9-rhel8

Example using a hash and a distribution (rarely used):

# docker run -it --rm --privileged -v /lib/modules:/lib/modules \
  -e LB_DIST=rhel7.7 -e LB_HASH=ThisIsMyNodeHash \
  drbd.io/drbd9-rhel7

Example using an existing repo config (rarely used):

# docker run -it --rm --privileged -v /lib/modules:/lib/modules \
  -v /etc/yum.repos.d/linbit.repo:/etc/yum.repos.d/linbit.repo:ro \
  drbd.io/drbd9-rhel7
In both cases (hash + distribution, as well as bind-mounting a repo) the hash or config has to be from a node that has a special property set. Feel free to contact our support, and we set this property.
在DRBD 9的 9.0.17 版本之前,您必须使用容器化的DRBD内核模块,而不能将内核模块加载到主机系统上;如果要使用容器,则不应在主机系统上安装DRBD内核模块。对于DRBD 9.0.17 或更高版本,可以像往常一样在主机系统上安装内核模块,但需要确保使用 usermode_helper=disabled 参数加载模块(例如 modprobe drbd usermode_helper=disabled )。

然后以守护进程的身份运行LINSTOR satellite容器, 也需具有特权:

# docker run -d --name=linstor-satellite --net=host -v /dev:/dev --privileged drbd.io/linstor-satellite
--net=host 是容器化的 drbd-utils 通过netlink与主机内核通信所必需的。

要将LINSTOR控制器容器作为守护进程运行,请将主机上的端口 337033763377 映射到该容器:

# docker run -d --name=linstor-controller -p 3370:3370 -p 3376:3376 -p 3377:3377 drbd.io/linstor-controller

要与容器化LINSTOR集群交互,可以使用通过包安装在系统上的LINSTOR客户端,也可以通过容器化LINSTOR客户端。要使用LINSTOR客户端容器,请执行以下操作:

# docker run -it --rm -e LS_CONTROLLERS=<controller-host-IP-address> drbd.io/linstor-client node list

从这里开始,您可以使用LINSTOR客户端初始化集群,并开始使用典型的LINSTOR模式创建资源。

要停止并删除以守护进程方式运行的容器,请执行以下操作:

# docker stop linstor-controller
# docker rm linstor-controller

1.7. 初始化集群

我们假设在 所有 集群节点上已完成以下步骤:

  1. DRBD9内核模块已安装并加载

  2. 已安装 drbd-utils

  3. LVM 工具已安装

  4. linstor-controller 和/或 linstor-satellite 的依赖项已安装

  5. linstor-client 已安装在 linstor-controller 节点上

在已安装linstor-controller的主机上启动并启用该服务:

# systemctl enable --now linstor-controller

如果您确定linstor控制器服务在安装时自动启用,则还可以使用以下命令:

# systemctl start linstor-controller

1.8. 使用LINSTOR客户端

无论何时运行LINSTOR命令行客户端,它都需要知道linstor-controller的运行位置。如果不指定,它将尝试访问本地运行的linstor-controller,后者侦听IP 127.0.0.1 的 3376 端口。这样,我们就可以在与 linstor-controller 相同的主机上直接使用 linstor-client 。

linstor-satellite 需要端口3366和3367。 linstor-controller 需要端口3376和3377。确保防火墙上允许这些端口。
# linstor node list

应该输出一个空列表,而不是一条错误信息。

您可以在任何其他机器上使用 linstor 命令,但随后您需要告诉客户端如何找到linstor-controller。如下所示,可以将其指定为命令行选项、环境变量或全局文件:

# linstor --controllers=alice node list
# LS_CONTROLLERS=alice linstor node list

或者,您可以创建 /etc/linstor/linstor-client.conf 文件,并按如下方式填充它。

[global]
controllers=alice

如果配置了多个linstor控制器,只需在逗号分隔的列表中指定它们即可。linstor客户机会按照列出的顺序进行尝试。
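
例如,假设有名为 alice 和 bravo 的两台controller主机(主机名仅为示例),配置文件可以写成:

[global]
controllers=alice,bravo

同样也可以使用环境变量的形式,例如 LS_CONTROLLERS=alice,bravo linstor node list 。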

linstor-client命令也可以通过只写参数的起始字母来快速方便地使用,例如: linstor node listlinstor n l

1.9. 向集群添加节点

The next step is to add nodes to your LINSTOR cluster.

# linstor node create bravo 10.43.70.3

If the IP is omitted, the client will try to resolve the given node-name as host-name by itself.

Linstor will automatically detect the node’s local uname -n which is later used for the DRBD-resource.

使用 `linstor node list`时,将看到新节点标记为脱机。现在启动并启用该节点上的 linstor-satellite,以便服务在重新启动时也启动:

# systemctl enable --now  linstor-satellite

如果您确定该服务已默认启用并在重新启动时启动,则还可以使用 systemctl start linstor-satellite

大约10秒后,您将看到 linstor node list 中的状态变为联机。如果satellite进程在controller知道该节点存在之前就已经在运行,状态变化可能会更快。

如果承载controller的节点也应该为LINSTOR集群提供存储空间,则必须将其添加为节点并启动linstor-satellite。
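
下面是一个示意(节点名 alpha 与IP地址仅为假设的示例; --node-type Combined 会将该节点同时标记为controller和satellite):

# linstor node create alpha 10.43.70.2 --node-type Combined
# systemctl enable --now linstor-satellite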

If you want to have other services wait until the linstor-satellite had a chance to create the necessary devices (i.e. after a boot), you can update the corresponding .service file and change Type=simple to Type=notify.

This will cause the satellite to delay sending the READY=1 message to systemd until the controller connects, sends all required data to the satellite and the satellite at least tried once to get the devices up and running.
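
A minimal sketch of such an override created with systemctl edit (content for illustration only):

# systemctl edit linstor-satellite
[Service]
Type=notify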

1.10. 存储池

StoragePools在LINSTOR上下文中用于标识存储。要对多个节点的存储池进行分组,只需在每个节点上使用相同的名称。例如,一种有效的方法是给所有的ssd取一个名字,给所有的hdd取另一个名字。

在每个提供存储的主机上,您需要创建LVM VG或ZFS zPool。对应同一个LINSTOR存储池名称的VG和zPool在不同主机上可以有不同的名称,但建议(为了便于您自己管理)在所有节点上使用相同的VG或zPool名称。

# vgcreate vg_ssd /dev/nvme0n1 /dev/nvme1n1 [...]

然后需要向LINSTOR注册:

# linstor storage-pool create lvm alpha pool_ssd vg_ssd
# linstor storage-pool create lvm bravo pool_ssd vg_ssd
存储池名称和公共元数据称为 存储池定义 。上面列出的命令隐式创建了存储池定义。使用 linstor storage-pool-definition list 可以看到这一点。也可以显式创建存储池定义,但不是必需的。
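
如果后端使用的是ZFS,流程与此类似(zpool名称 zfs_ssd 仅为示例):

# zpool create zfs_ssd /dev/nvme0n1
# linstor storage-pool create zfs alpha pool_ssd zfs_ssd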

要列出您可以使用的存储池,请执行以下操作:

# linstor storage-pool list

或者使用短版本

# linstor sp l

如果由于附加的资源或快照(其中一些卷位于另一个仍在正常运行的存储池中)而无法删除存储池,则会在相应list命令(例如 linstor resource list )的 status 列中给出提示。手动删除丢失的存储池中的LINSTOR对象后,可以再次执行lost命令,以确保完全删除该存储池及其剩余对象。

1.10.1. 每个后端设备的存储池

在只有一种存储类型、且存储设备支持热插拔的集群中,可以选择为每个物理后备设备创建一个存储池的模型。此模型的优点是将故障域限制在单个存储设备上。

1.10.2. Physical storage command

Since linstor-server 1.5.2 and a recent linstor-client, LINSTOR can create LVM/ZFS pools on a satellite for you. The linstor-client has the following commands to list possible disks and create storage pools, but such LVM/ZFS pools are not managed by LINSTOR and there is no delete command, so such action must be done manually on the nodes.

# linstor physical-storage list

Will give you a list of available disks grouped by size and rotational(SSD/Magnetic Disk).

It will only show disks that pass the following filters:

  • The device size must be greater than 1GiB

  • The device is a root device (not having children) e.g.: /dev/vda, /dev/sda

  • The device does not have any file-system or other blkid marker (wipefs -a might be needed)

  • The device is no DRBD device

With the create-device-pool command you can create a LVM pool on a disk and also directly add it as a storage-pool in LINSTOR.

# linstor physical-storage create-device-pool --pool-name lv_my_pool LVMTHIN node_alpha /dev/vdc --storage-pool newpool

If the --storage-pool option was provided, LINSTOR will create a storage-pool with the given name.

For more options and exact command usage please check the linstor-client help.

1.11. 资源组

资源组是资源定义的父对象,其中对资源组所做的所有属性更改都将由其资源定义的子级继承。资源组还存储自动放置规则的设置,并可以根据存储的规则生成资源定义。

简单地说,资源组就像模板,定义从它们创建的资源的特性。对这些伪模板的更改将应用于从资源组中创建的所有资源,并具有追溯性。

使用资源组定义资源配置方式应被视为部署由LINSTOR配置的卷的典型方法。后面描述从 资源定义卷定义 创建每个 资源 的章节应仅在特殊情况下使用。
即使您选择不在LINSTOR集群中创建和使用 资源组 ,从 资源定义卷定义 创建的所有资源都将存在于 DfltRscGrp 资源组 中。

使用资源组部署资源的简单模式如下:

# linstor resource-group create my_ssd_group --storage-pool pool_ssd --place-count 2
# linstor volume-group create my_ssd_group
# linstor resource-group spawn-resources my_ssd_group my_ssd_res 20G

上述命令会创建一个名为 my_ssd_res 的资源,其中一个20GB的卷会在参与名为 pool_ssd 的存储池的节点上以两副本的方式自动部署。

一个更有用的模式是创建一个资源组,其设置是您认为最适合您用例的。也许您需要每晚对卷进行在线一致性校验,在这种情况下,您可以创建一个已经设置了所选 verify-alg 的资源组,这样从该组生成的资源就预先配置好了 verify-alg :

# linstor resource-group create my_verify_group --storage-pool pool_ssd --place-count 2
# linstor resource-group drbd-options --verify-alg crc32c my_verify_group
# linstor volume-group create my_verify_group
# for i in {00..19}; do
    linstor resource-group spawn-resources my_verify_group res$i 10G
  done

上述命令创建出20个10GiB的资源,每个资源都预先配置了 crc32c 作为 verify-alg 。

您可以通过在相应的 资源定义卷定义 上设置选项来优化从资源组派生的单个资源或卷的设置。例如,如果上面示例中的 res11 被一个接收大量小的随机写入的、非常活跃的数据库使用,则可能需要增加该特定资源的 al-extents :

# linstor resource-definition drbd-options --al-extents 6007 res11

If you configure a setting in a resource-definition that is already configured on the resource-group it was spawned from, the value set in the resource-definition will override the value set on the parent resource-group. For example, if the same ‘res11’ was required to use the slower but more secure ‘sha256’ hash algorithm in its verifications, setting the ‘verify-alg’ on the resource-definition for ‘res11’ would override the value set on the resource-group:

# linstor resource-definition drbd-options --verify-alg sha256 res11
关于继承设置层次结构的一个经验法则是: “更接近” 资源或卷的值优先。 卷定义 设置优先于 卷组 设置, 资源定义 设置优先于 资源组 设置。

1.12. 集群配置

1.12.1. 可用的存储插件

LINSTOR支持如下的存储插件(创建对应存储池的命令示例见列表之后):

  • Thick LVM

  • Thin LVM with a single thin pool

  • Thick ZFS

  • Thin ZFS
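
下面按上述四种插件各给出一条创建存储池的命令示意(节点名 node1 以及VG/thin pool/zPool名称均为假设的示例):

# linstor storage-pool create lvm node1 pool_thick_lvm vg_linstor
# linstor storage-pool create lvmthin node1 pool_thin_lvm vg_linstor/thinpool
# linstor storage-pool create zfs node1 pool_thick_zfs zpool_linstor
# linstor storage-pool create zfsthin node1 pool_thin_zfs zpool_linstor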

1.13. 创建和部署资源/卷

在下面的场景中,我们假设目标是创建一个名为 backups 、大小为500GB的资源,并在三个集群节点之间复制。

首先,我们创建一个新的资源定义:

# linstor resource-definition create backups

其次,我们在该资源定义中创建一个新的卷定义:

# linstor volume-definition create backups 500G

如果要更改卷定义的大小,只需执行以下操作:

# linstor volume-definition set-size backups 0 100G

参数 0 是资源 backups 中的卷号。必须提供此参数,因为一个资源可以有多个卷,它们由所谓的卷号标识。这个编号可以通过列出卷定义来找到。

只有在没有资源的情况下,才能减小卷定义的大小。尽管如此,即使部署了资源,也可以增加大小。

到目前为止,我们只在LINSTOR的数据库中创建了对象,没有在存储节点上创建一个LV。现在你可以选择将安置任务派发给LINSTOR,或者自己做。

1.13.1. 手动放置

使用 resource create 命令,可以将资源定义显式分配给命名节点。

# linstor resource create alpha backups --storage-pool pool_hdd
# linstor resource create bravo backups --storage-pool pool_hdd
# linstor resource create charlie backups --storage-pool pool_hdd

1.13.2. 自动放置

--auto-place 之后的值告诉LINSTOR您想要多少个副本。 --storage-pool 选项的含义应该很明显。

# linstor resource create backups --auto-place 3 --storage-pool pool_hdd

可能不太明显的是,您可以省略 --storage-pool 选项,然后LINSTOR可以自己选择一个存储池。选择遵循以下规则:

  • 忽略当前用户无权访问的所有节点和存储池

  • 忽略所有无磁盘存储池

  • 忽略没有足够可用空间的所有存储池

The remaining storage pools will be rated by different strategies. LINSTOR currently has the following strategies:

  • MaxFreeSpace: This strategy maps the rating 1:1 to the remaining free space of the storage pool. However, this strategy only considers the actually allocated space (in case of a thinly provisioned storage pool this might grow with time without creating new resources).

  • MinReservedSpace: Unlike "MaxFreeSpace", this strategy considers the reserved space. That is the space that a thin volume can grow to before reaching its limit. The sum of reserved spaces might exceed the storage pool's capacity, which is known as overprovisioning.

  • MinRscCount: Simply the count of resources already deployed in a given storage pool.

  • MaxThroughput: For this strategy, the storage pool's Autoplacer/MaxThroughput property is the base of the score, or 0 if the property is not present. Every volume deployed in the given storage pool will subtract its defined sys/fs/blkio_throttle_read and sys/fs/blkio_throttle_write property values from the storage pool's max throughput. The resulting score might be negative.

The scores of the strategies will be normalized, weighted and summed up, where the scores of minimizing strategies will be converted first to allow an overall maximization of the resulting score.

The weights of the strategies can be configured with

linstor controller set-property Autoplacer/Weights/$name_of_the_strategy $weight

whereas the strategy-names are listed above and the weight can be an arbitrary decimal.

To keep the behaviour of the autoplacer similar to the old one (due to compatibility), all strategies have a default-weight of 0, except the MaxFreeSpace which has a weight of 1.
Neither a score of 0 nor a negative score will prevent a storage pool from being selected; it only means the pool will be considered later.
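
For example, to give the MinReservedSpace strategy a non-zero influence on the score (the weight value is just an example):

# linstor controller set-property Autoplacer/Weights/MinReservedSpace 1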

Finally LINSTOR tries to find the best matching group of storage pools meeting all requirements. This step also considers other autoplacement restrictions as --replicas-on-same, --replicas-on-different and others.

These two arguments, --replicas-on-same and --replicas-on-different expect the name of a property within the Aux/ namespace. The following example shows that the client automatically prefixes the testProperty with the Aux/ namespace.

linstor resource-group create testRscGrp --replicas-on-same testProperty
SUCCESS:
Description:
    New resource group 'testRscGrp' created.
Details:
    Resource group 'testRscGrp' UUID is: 35e043cb-65ab-49ac-9920-ccbf48f7e27d

linstor resource-group list
+-----------------------------------------------------------------------------+
| ResourceGroup | SelectFilter                         | VlmNrs | Description |
|-============================================================================|
| DfltRscGrp    | PlaceCount: 2                        |        |             |
|-----------------------------------------------------------------------------|
| testRscGrp    | PlaceCount: 2                        |        |             |
|               | ReplicasOnSame: ['Aux/testProperty'] |        |             |
+-----------------------------------------------------------------------------+
如果一切顺利,DRBD资源现在已经由LINSTOR创建。这时可以通过使用 lsblk 命令查找DRBD块设备来检查,设备名应该类似于 drbd0000

现在我们应该可以挂载该资源的块设备,并开始使用LINSTOR了。

2. 高阶LINSTOR任务

2.1. LINSTOR high availability

By default a LINSTOR cluster consists of exactly one LINSTOR controller. Making LINSTOR highly-available involves providing replicated storage for the controller database, multiple LINSTOR controllers where only one is active at a time, and a service manager that takes care of mounting/unmounting the highly-available storage and starting/stopping LINSTOR controllers.

2.1.1. Highly-available Storage

For configuring the highly-available storage we use LINSTOR itself. This has the advantage that the storage is under LINSTOR control and can, for example, be easily extended to new cluster nodes. Just create a new resource 200MB in size. It could look like this; you certainly need to adapt the storage pool name:

# linstor resource-definition create linstor_db
# linstor volume-definition create linstor_db 200M
# linstor resource create linstor_db -s pool1 --auto-place 3

From now on we assume the resource’s name is “linstor_db”. It is crucial that your cluster qualifies for auto-quorum (see Section 自动仲裁策略).

After the resource is created, it is time to move the LINSTOR DB to the new storage and to create a systemd mount service. First we stop the current controller and disable it, as it will be managed by drbdd later.

# systemctl disable --now linstor-controller
# cat /etc/systemd/system/var-lib-linstor.mount
[Unit]
Description=Filesystem for the LINSTOR controller

[Mount]
# you can use the minor like /dev/drbdX or the udev symlink
What=/dev/drbd/by-res/linstor_db/0
Where=/var/lib/linstor
# mv /var/lib/linstor{,.orig}
# mkfs.ext4 /dev/drbd/by-res/linstor_db/0
# systemctl start var-lib-linstor.mount
# cp -r /var/lib/linstor.orig/* /var/lib/linstor
# systemctl start linstor-controller

Copy the /etc/systemd/system/var-lib-linstor.mount mount file to all the standby nodes for the LINSTOR controller. Again, do not systemctl enable any of these services; they get managed by drbdd.

2.1.2. Multiple LINSTOR controllers

The next step is to install LINSTOR controllers on all nodes that have access to the “linstor” DRBD resource (as they need to mount the DRBD volume) and which you want to become a possible LINSTOR controller. It is important that the controllers are managed by drbdd, so make sure the linstor-controller.service is disabled on all nodes! To be sure, execute systemctl disable linstor-controller on all cluster nodes and systemctl stop linstor-controller on all nodes except the one it is currently running on from the previous step.

2.1.3. Managing the services

For starting and stopping the mount service and the linstor-controller service we use drbdd. Install this component on all nodes that could become a LINSTOR controller and edit their /etc/drbdd.toml configuration file. It should contain an enabled promoter plugin section like this:

[[promoter]]
[promoter.resources.linstor_db]
start = ["var-lib-linstor.mount", "linstor-controller.service"]

Depending on your requirements you might also want to set an on-stop-failure action.

After that, restart drbdd and enable it on all the nodes you configured it on.

# systemctl restart drbdd
# systemctl enable drbdd

Check that there are no warnings from drbdd in the logs by running systemctl status drbdd. As there is already an active LINSTOR controller things will just stay the way they are.

The last, but nevertheless important, step is to configure the LINSTOR satellite services not to delete (and then regenerate) the resource file for the LINSTOR controller DB at startup. The satellite also needs to start after drbdd. Do not edit the service files directly; use systemctl edit instead. Edit the service file on all nodes that could become a LINSTOR controller and that are also LINSTOR satellites.

# systemctl edit linstor-satellite
[Service]
Environment=LS_KEEP_RES=linstor_db
[Unit]
After=drbdd.service

After this change you should execute systemctl restart linstor-satellite on all satellite nodes.

Be sure to configure your LINSTOR client for use with multiple controllers as described in the section titled, 使用LINSTOR客户端 and make sure you also configured your integration plugins (e.g., the Proxmox plugin) to be ready for multiple LINSTOR controllers.

2.2. DRBD客户端

通过使用 --drbd-diskless 选项而不是 --storage-pool ,可以在节点上拥有永久无盘drbd设备。这意味着资源将显示为块设备,并且可以在没有现有存储设备的情况下装载到文件系统。资源的数据通过网络在具有相同资源的其他节点上访问。

# linstor resource create delta backups --drbd-diskless
选项 --diskless 已被弃用,不推荐使用。请改用 --drbd-diskless 或 --nvme-initiator 。

2.3. LINSTOR – DRBD一致性组/多个卷

所谓的一致性组是DRBD的一个特性。由于LINSTOR的主要功能之一是使用DRBD管理存储集群,因此本用户指南中提到了这一点。 单个资源中的多个卷是一个一致性组。

这意味着一个资源的不同卷上的更改在其他Satellites上以相同的时间顺序复制。

因此,如果资源中的不同卷上有相互依赖的数据,也不必担心时间问题。

要在一个LINSTOR资源中部署多个卷,只需为同一个资源创建多个卷定义。

# linstor volume-definition create backups 500G
# linstor volume-definition create backups 100G

2.4. 一个资源到不同存储池的卷

这可以通过在将资源部署到节点之前,在卷定义上设置 StorPoolName 属性来实现:

# linstor resource-definition create backups
# linstor volume-definition create backups 500G
# linstor volume-definition create backups 100G
# linstor volume-definition set-property backups 0 StorPoolName pool_hdd
# linstor volume-definition set-property backups 1 StorPoolName pool_ssd
# linstor resource create alpha backups
# linstor resource create bravo backups
# linstor resource create charlie backups
由于使用 volume-definition create 命令时没有使用 --vlmnr 选项,因此LINSTOR将从0开始分配卷号。在以下两行中,0和1表示这些自动分配的卷号。

这里的 resource create 命令不需要 --storage-pool 选项。在这种情况下,LINSTOR会使用 fallback 存储池。为了确定这个存储池,LINSTOR会按以下顺序查询下列对象的属性:

  • 卷定义

  • Resource

  • 资源定义

  • 节点

如果这些对象都不包含 StorPoolName 属性,则controller将返回硬编码的 DfltStorPool 字符串作为存储池。

这还意味着,如果在部署资源之前忘记定义存储池,则会收到一条错误消息,即LINSTOR找不到名为 DfltStorPool 的存储池。

2.5. 无DRBD的LINSTOR

LINSTOR也可以在没有DRBD的情况下使用。没有DRBD,LINSTOR能够从LVM和ZFS支持的存储池中配置卷,并在LINSTOR集群中的各个节点上创建这些卷。

目前,LINSTOR支持创建LVM和ZFS卷,并可以选择在这些卷之上叠加LUKS、DRBD和/或NVMe-oF/NVMe-TCP等层的某些组合。

例如,我们在LINSTOR集群中定义了一个Thin LVM支持的存储池,名为 thin-lvm:

# linstor --no-utf8 storage-pool list
+--------------------------------------------------------------+
| StoragePool | Node      | Driver   | PoolName          | ... |
|--------------------------------------------------------------|
| thin-lvm    | linstor-a | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm    | linstor-b | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm    | linstor-c | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm    | linstor-d | LVM_THIN | drbdpool/thinpool | ... |
+--------------------------------------------------------------+

我们可以使用LINSTOR在 linstor-d 上创建一个100GiB大小的精简LVM,使用以下命令:

# linstor resource-definition create rsc-1
# linstor volume-definition create rsc-1 100GiB
# linstor resource create --layer-list storage \
          --storage-pool thin-lvm linstor-d rsc-1

您应该会看到 linstor-d 上有一个新的精简LVM卷。通过使用 --machine-readable 标志列出LINSTOR资源,可以从LINSTOR提取设备路径:

# linstor --machine-readable resource list | grep device_path
            "device_path": "/dev/drbdpool/rsc-1_00000",

如果要将DRBD放置在此卷上(这是LINSTOR中ZFS或LVM支持的卷的默认 --layer-list 选项),可以使用以下资源创建模式:

# linstor resource-definition create rsc-1
# linstor volume-definition create rsc-1 100GiB
# linstor resource create --layer-list drbd,storage \
          --storage-pool thin-lvm linstor-d rsc-1

然后,您将看到有一个新的 Thin LVM被创建出来用于支撑 linstor-d 上的DRBD卷:

# linstor --machine-readable resource list | grep -e device_path -e backing_disk
            "device_path": "/dev/drbd1000",
            "backing_disk": "/dev/drbdpool/rsc-1_00000",

下表显示了哪个层可以后跟哪个子层:

Layer        Child layer

DRBD         CACHE, WRITECACHE, NVME, LUKS, STORAGE
CACHE        WRITECACHE, NVME, LUKS, STORAGE
WRITECACHE   CACHE, NVME, LUKS, STORAGE
NVME         CACHE, WRITECACHE, LUKS, STORAGE
LUKS         STORAGE
STORAGE      -

一个层只能在层列表中出现一次
For information about the prerequisites for the LUKS layer, refer to the Encrypted Volumes section of this User’s Guide.

2.5.1. NVME-OF/NVME-TCP Linstor Layer

NVMe-oF/NVMe-TCP允许LINSTOR通过NVMe fabric把无盘资源连接到另一个实际存放数据的、拥有同一资源的节点。这样就带来了一个优势:无需使用本地存储,即可通过网络访问数据并挂载资源。在这种情况下LINSTOR不使用DRBD,因此LINSTOR提供的NVMe资源不会被复制,数据只存储在一个节点上。

NVMe-oF仅在支持RDMA的网络上工作,而NVMe-TCP可在任何能承载IP流量的网络上工作。如果您想了解有关NVMe-oF/NVMe-TCP的更多信息,请访问 https://www.linbit.com/en/nvme-linstor-swordfish/ 。

要将NVMe-oF/NVMe-TCP与LINSTOR一起使用,需要在每个充当satellite、并将为资源使用NVMe-oF/NVMe-TCP的节点上安装 nvme-cli 包:

如果不使用Ubuntu,请使用合适的命令在OS安装软件包 – SLES: zypper – CentOS: yum
# apt install nvme-cli

要使资源使用NVMe-oF/NVMe-TCP,必须在创建资源定义时提供一个附加参数:

# linstor resource-definition create nvmedata  -l nvme,storage
默认情况下,使用DRBD时-l(layer-stack)参数设置为 drbd, storage。如果要创建既不使用NVMe也不使用DRBD的LINSTOR资源,则必须将 -l 参数设置为仅使用 storage

为我们的资源创建卷定义:

# linstor volume-definition create nvmedata 500G

在节点上创建资源之前,必须知道数据将在本地存储在哪里,以及哪个节点通过网络访问它。

首先,我们在存储数据的节点上创建资源:

# linstor resource create alpha nvmedata --storage-pool pool_ssd

在将通过网络访问资源数据的节点上,必须将资源定义为无盘:

# linstor resource create beta nvmedata -d

-d 参数将此节点上的资源创建为无盘。

现在,您可以在一个节点上装载资源 nvmedata

If your nodes have more than one NIC you should force the route between them for NVMe-of/NVME-TCP, otherwise multiple NICs could cause troubles.

2.5.2. OpenFlex™ Layer

Since version 1.5.0 the additional Layer openflex can be used in LINSTOR. From LINSTOR’s perspective, the OpenFlex Composable Infrastructure takes the role of a combined layer acting as a storage layer (like LVM) and also providing the allocated space as an NVMe target. OpenFlex has a REST API which is also used by LINSTOR to operate with.

As OpenFlex combines concepts of LINSTOR's storage as well as NVMe layers, LINSTOR was extended with both a new storage driver for storage pools and a dedicated openflex layer which uses the mentioned REST API.

In order for LINSTOR to communicate with the OpenFlex-API, LINSTOR needs some additional properties, which can be set once on controller level to take LINSTOR-cluster wide effect:

  • StorDriver/Openflex/ApiHost specifies the host or IP of the API entry-point

  • StorDriver/Openflex/ApiPort this property is glued with a colon to the previous to form the basic http://ip:port part used by the REST calls

  • StorDriver/Openflex/UserName the REST username

  • StorDriver/Openflex/UserPassword the password for the REST user

Once that is configured, we can now create LINSTOR objects to represent the OpenFlex architecture. The theoretical mapping of LINSTOR objects to OpenFlex objects are as follows: Obviously an OpenFlex storage pool is represented by a LINSTOR storage pool. As the next thing above a LINSTOR storage pool is already the node, a LINSTOR node represents an OpenFlex storage device. The OpenFlex objects above storage device are not mapped by LINSTOR.

When using NVMe, LINSTOR was designed to run on both sides, the NVMe target as well as the NVMe initiator side. In the case of OpenFlex, LINSTOR cannot (or even should not) run on the NVMe target side as that is completely managed by OpenFlex. As LINSTOR still needs nodes and storage pools to represent the OpenFlex counterparts, the LINSTOR client was extended with special node create commands since 1.0.14. These commands not only accept the additionally needed configuration data, but also start a “special satellite” besides the already running controller instance. These special satellites are completely LINSTOR managed; they will shut down when the controller shuts down and will be started again when the controller starts. The new client command for creating a “special satellite” representing an OpenFlex storage device is:

$ linstor node create-openflex-target ofNode1 192.168.166.7 000af795789d

The arguments are as follows:

  • ofNode1 is the node name which is also used by the standard linstor node create command

  • 192.168.166.7 is the address on which the provided NVMe devices can be accessed. As the NVMe devices are accessed by a dedicated network interface, this address differs from the address specified with the property StorDriver/Openflex/ApiHost. The latter is used for the management / REST API.

  • 000af795789d is the identifier for the OpenFlex storage device.

The last step of the configuration is the creation of LINSTOR storage pools:

$ linstor storage-pool create openflex ofNode1 sp0 0
  • ofNode1 and sp0 are the node name and storage pool name, respectively, just as usual for the LINSTOR’s create storage pool command

  • The last 0 is the identifier of the OpenFlex storage pool within the previously defined storage device

Once all necessary storage pools are created in LINSTOR, the next steps are similar to the usage of using an NVMe resource with LINSTOR. Here is a complete example:

# set the properties once
linstor controller set-property StorDriver/Openflex/ApiHost 10.43.7.185
linstor controller set-property StorDriver/Openflex/ApiPort 80
linstor controller set-property StorDriver/Openflex/UserName myusername
linstor controller set-property StorDriver/Openflex/UserPassword mypassword

# create a node for openflex storage device "000af795789d"
linstor node create-openflex-target ofNode1 192.168.166.7 000af795789d

# create a usual linstor satellite. later used as nvme initiator
linstor node create bravo

# create a storage pool for openflex storage pool "0" within storage device "000af795789d"
linstor storage-pool create openflex ofNode1 sp0 0

# create resource- and volume-definition
linstor resource-definition create backupRsc
linstor volume-definition create backupRsc 10G

# create openflex-based nvme target
linstor resource create ofNode1 backupRsc --storage-pool sp0 --layer-list openflex

# create openflex-based nvme initiator
linstor resource create bravo backupRsc --nvme-initiator --layer-list openflex
In case a node should access the OpenFlex REST API through a different host than specified with linstor controller set-property StorDriver/Openflex/ApiHost 10.43.7.185 you can always use LINSTOR's inheritance mechanism for properties. That means simply define the same property on the node level where you need it, i.e. linstor node set-property ofNode1 StorDriver/Openflex/ApiHost 10.43.8.185

2.5.3. 写入缓存层

一个 DM writecache 设备由两个设备、即一个存储设备和一个缓存设备组成。LINSTOR可以设置这样一个writecache设备,但是需要一些额外的信息,比如存储池和缓存设备的大小。

# linstor storage-pool create lvm node1 lvmpool drbdpool
# linstor storage-pool create lvm node1 pmempool pmempool

# linstor resource-definition create r1
# linstor volume-definition create r1 100G

# linstor volume-definition set-property r1 0 Writecache/PoolName pmempool
# linstor volume-definition set-property r1 0 Writecache/Size 1%

# linstor resource create node1 r1 --storage-pool lvmpool --layer-list WRITECACHE,STORAGE

The two properties set in the examples are mandatory, but can also be set on controller level which would act as a default for all resources with WRITECACHE in their --layer-list. However, please note that the Writecache/PoolName refers to the corresponding node. If the node does not have a storage-pool named pmempool you will get an error message.

DM writecache 所需的4个必需参数要么通过属性配置,要么由LINSTOR计算得出。上述链接中列出的可选属性同样可以通过属性设置。有关 Writecache/* 属性键的列表,请参见 linstor controller set-property --help 。

使用 --layer-list DRBD,WRITECACHE,STORAGE 并将DRBD配置为使用外部元数据时,只有后备设备会使用writecache,保存外部元数据的设备则不会。

2.5.4. Cache Layer

LINSTOR can also setup a DM-Cache device, which is very similar to the DM-Writecache from the previous section. The major difference is that a cache device is composed by three devices: one storage device, one cache device and one meta device. The LINSTOR properties are quite similar to those of the writecache but are located in the Cache namespace:

# linstor storage-pool create lvm node1 lvmpool drbdpool
# linstor storage-pool create lvm node1 pmempool pmempool

# linstor resource-definition create r1
# linstor volume-definition create r1 100G

# linstor volume-definition set-property r1 0 Cache/CachePool pmempool
# linstor volume-definition set-property r1 0 Cache/Size 1%

# linstor resource create node1 r1 --storage-pool lvmpool --layer-list CACHE,STORAGE
Instead of Writecache/PoolName (as when configuring the Writecache layer) the Cache layer’s only required property is called Cache/CachePool. The reason for this is that the Cache layer also has a Cache/MetaPool which can be configured separately or it defaults to the value of Cache/CachePool.

Please see linstor controller set-property --help for a list of Cache/* property-keys and default values for omitted properties.

Using --layer-list DRBD,CACHE,STORAGE while having DRBD configured to use external metadata, only the backing device will use a cache, not the device holding the external metadata.

2.5.5. Storage Layer

For some storage providers LINSTOR has special properties (see the example after this list):

  • StorDriver/LvcreateOptions: The value of this property is appended to every lvcreate …​ call LINSTOR executes.

  • StorDriver/ZfscreateOptions: The value of this property is appended to every zfs create …​ call LINSTOR executes.

  • StorDriver/WaitTimeoutAfterCreate: If LINSTOR expects a device to appear after creation (for example after calls of lvcreate, zfs create,…​), LINSTOR waits per default 500ms for the device to appear. These 500ms can be overridden by this property.

  • StorDriver/dm_stats: If set to true LINSTOR calls dmstats create $device after creation and dmstats delete $device --allregions after deletion of a volume. Currently only enabled for LVM and LVM_THIN storage providers.
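
As a sketch (node name, storage pool name and the value are placeholders only), such a property can for example be set on a storage pool like this:

# linstor storage-pool set-property node1 lvmpool StorDriver/WaitTimeoutAfterCreate 1000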

2.6. 管理网络接口卡

LINSTOR可以在一台机器上处理多个网络接口卡(NICs),在LINSTOR中称为 netif

创建satellite节点时,会隐式创建第一个名为 default 的 netif 。使用 node create 命令的 --interface-name 选项,可以为其指定一个不同的名称。

其他NIC的创建方式如下:

# linstor node interface create alpha 100G_nic 192.168.43.221
# linstor node interface create alpha 10G_nic 192.168.43.231

NIC仅由IP地址标识,名称是任意的,与Linux使用的接口名称 无关 。可以将NIC分配给存储池,以便每当在这样的存储池中创建资源时,DRBD通信通过其指定的NIC路由。

# linstor storage-pool set-property alpha pool_hdd PrefNic 10G_nic
# linstor storage-pool set-property alpha pool_ssd PrefNic 100G_nic

FIXME 描述了如何通过特定的 netif 路由controller <-> 客户端通信。

2.7. 加密卷

LINSTOR可以处理drbd卷的透明加密。dm-crypt用于对存储设备中提供的存储进行加密。

In order to use dm-crypt please make sure to have cryptsetup installed before you start the satellite

使用加密的基本步骤:

  1. 在控制器上禁用用户安全性(一旦身份验证生效,这将被废弃)

  2. 创建主密码

  3. Add luks to the layer-list. Note that all plugins (e.g., Proxmox) require a DRBD layer as the top most layer if they do not explicitly state otherwise.

  4. 不要忘记在controller重新启动后重新输入主密码。

2.7.1. 禁用用户安全

在LINSTOR controller上禁用用户安全性是一次性操作,该设置之后会被持久化。

  1. 通过 systemctl stop linstor-controller 停止正在运行的linstor控制器

  2. 在调试模式下启动linstor控制器: /usr/share/linstor-server/bin/Controller -c /etc/linstor -d

  3. 在调试控制台中输入:setSecLvl secLvl(NO_SECURITY)

  4. 使用调试关闭命令停止linstor控制器: shutdown

  5. 使用systemd重新启动控制器:systemctl start linstor-controller

2.7.2. 加密命令

下面是有关命令的详细信息。

在LINSTOR可以加密任何卷之前,需要创建主密码。这可以通过linstor客户端完成。

# linstor encryption create-passphrase

crypt create-passphrase 将等待用户输入初始主密码(与所有其他crypt命令一样,它不带参数)。

如果您想更改主密码,可以使用以下方法:

# linstor encryption modify-passphrase

luks 层可以在创建资源定义或资源本身时添加。建议使用前一种方法,因为它会自动应用于从该资源定义创建的所有资源。

# linstor resource-definition create crypt_rsc --layer-list luks,storage
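
之后创建卷定义和部署资源的方式与普通资源相同,例如(节点名 alpha 与存储池名 pool_ssd 仅为示例):

# linstor volume-definition create crypt_rsc 500M
# linstor resource create alpha crypt_rsc --storage-pool pool_ssd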

要输入主密码(在controller重新启动后),请使用以下命令:

# linstor encryption enter-passphrase
无论何时重新启动linstor controller,用户都必须向控制器发送主密码,否则linstor无法重新打开或创建加密卷。

2.7.3. Automatic Passphrase

It is possible to automate the process of creating and re-entering the master passphrase.

To use this, either an environment variable called MASTER_PASSPHRASE or an entry in /etc/linstor/linstor.toml containing the master passphrase has to be created.

The required linstor.toml looks like this:

[encrypt]
passphrase="example"

If either one of these is set, then every time the controller starts it will check whether a master passphrase already exists. If there is none, it will create a new master passphrase as specified. Otherwise, the controller enters the passphrase.

If a master passphrase is already configured, and it is not the same one as specified in the environment variable or linstor.toml, the controller will be unable to re-enter the master passphrase and react as if the user had entered a wrong passphrase. This can only be resolved through manual input from the user, using the same commands as if the controller was started without the automatic passphrase.
In case the master passphrase is set in both an environment variable and the linstor.toml, only the master passphrase from the linstor.toml will be used.

2.8. 检查集群状态

LINSTOR提供各种命令来检查集群的状态。这些命令以 list- 前缀开头,并提供各种筛选和排序选项。--groupby 选项可用于按多个维度对输出进行分组和排序。

# linstor node list
# linstor storage-pool list --groupby Size

2.9. 管理快照

精简LVM和ZFS存储池支持快照。

2.9.1. 创建快照

假设在某些节点上放置了名为 resource1 的资源定义,则可以按如下方式创建快照:

# linstor snapshot create resource1 snap1

这将在资源所在的所有节点上创建快照。LINSTOR将确保即使在资源处于活动使用状态时也创建一致的快照。

Setting the resource-definition property AutoSnapshot/RunEvery will make LINSTOR automatically create snapshots every X minutes. The optional property AutoSnapshot/Keep can be used to clean up old snapshots which were created automatically. No manually created snapshot will be cleaned up / deleted. If AutoSnapshot/Keep is omitted (or <= 0), LINSTOR will keep the last 10 snapshots by default.

# linstor resource-definition set-property AutoSnapshot/RunEvery 15
# linstor resource-definition set-property AutoSnapshot/Keep 5

2.9.2. 还原快照

以下步骤将快照还原到一个新资源。即使原始资源已经从创建快照的节点上删除,也可以这样做。

首先定义一个新资源,其卷定义与快照中的卷相匹配:

# linstor resource-definition create resource2
# linstor snapshot volume-definition restore --from-resource resource1 --from-snapshot snap1 --to-resource resource2

此时,如果需要,可以应用其他配置。然后,准备好后,根据快照创建资源:

# linstor snapshot resource restore --from-resource resource1 --from-snapshot snap1 --to-resource resource2

这将在快照所在的所有节点上放置新资源。也可以显式选择放置资源的节点;请参阅帮助( linstor snapshot resource restore -h ).

2.9.3. 回滚快照

LINSTOR可以将资源回滚到某个快照状态。回滚时资源不能处于使用中,也就是说,它不能挂载在任何节点上。如果资源正在使用中,请考虑是否可以通过还原快照(restoring the snapshot)来实现您的目标。

回滚执行如下:

# linstor snapshot rollback resource1 snap1

资源只能回滚到最近的快照。要回滚到旧快照,请首先删除中间快照。

2.9.4. 删除快照

可以按如下方式删除现有快照:

# linstor snapshot delete resource1 snap1

2.9.5. Shipping a snapshot

Both the source and the target node have to have the resource for snapshot shipping deployed. Additionally, the target resource has to be deactivated.

# linstor resource deactivate nodeTarget resource1
A resource with DRBD in its layer-list that has been deactivated can NOT be reactivated again. However, a successfully shipped snapshot of a DRBD resource can still be restored into a new resource.

To manually start the snapshot-shipping, use:

# linstor snapshot ship --from-node nodeSource --to-node nodeTarget --resource resource1

By default, the snapshot-shipping uses tcp ports from the range 12000-12999. To change this range, the property SnapshotShipping/TcpPortRange, which accepts a to-from range, can be set on the controller:

# linstor controller set-property SnapshotShipping/TcpPortRange 10000-12000

A resource can also be periodically shipped. To accomplish this, it is mandatory to set the properties SnapshotShipping/TargetNode as well as SnapshotShipping/RunEvery on the resource-definition. SnapshotShipping/SourceNode can also be set, but if omitted LINSTOR will choose an active resource of the same resource-definition.

To allow incremental snapshot-shipping, LINSTOR has to keep at least the last shipped snapshot on the target node. The property SnapshotShipping/Keep can be used to specify how many snapshots LINSTOR should keep. If the property is not set (or <= 0) LINSTOR will keep the last 10 shipped snapshots by default.

# linstor resource-definition set-property resource1 SnapshotShipping/TargetNode nodeTarget
# linstor resource-definition set-property resource1 SnapshotShipping/SourceNode nodeSource
# linstor resource-definition set-property resource1 SnapshotShipping/RunEvery 15
# linstor resource-definition set-property resource1 SnapshotShipping/Keep 5

2.10. 设置资源选项

DRBD选项是使用LINSTOR命令设置的。将忽略诸如 /etc/drbd.d/global_common.conf 文件中未由LINSTOR管理的配置。以下命令显示用法和可用选项:

# linstor controller drbd-options -h
# linstor resource-definition drbd-options -h
# linstor volume-definition drbd-options -h
# linstor resource drbd-peer-options -h

例如,很容易为名为 backups 的资源设置DRBD协议:

# linstor resource-definition drbd-options --protocol C backups

2.11. 添加和删除磁盘

LINSTOR可以在无盘和有盘状态之间转换资源。这是通过 resource toggle-disk 命令实现的,该命令的语法类似于 resource create 。

例如,将磁盘添加到 alpha 上的无盘资源 backups

# linstor resource toggle-disk alpha backups --storage-pool pool_ssd

再次删除此磁盘:

# linstor resource toggle-disk alpha backups --diskless

2.11.1. 迁移磁盘

为了在节点之间移动资源而不在任何时间点降低冗余,可以使用LINSTOR的磁盘迁移功能。首先在目标节点上创建一个无盘资源,然后使用 --migrate-from 选项添加磁盘。该操作会等待数据同步到新磁盘,然后删除源磁盘。

例如,要将资源 backupsalpha 迁移到 bravo

# linstor resource create bravo backups --drbd-diskless
# linstor resource toggle-disk bravo backups --storage-pool pool_ssd --migrate-from alpha

2.12. LINSTOR的DRBD代理

LINSTOR希望DRBD代理在相关连接所涉及的节点上运行。它目前不支持通过单独节点上的DRBD代理进行连接。

假设我们的集群由本地网络中的节点 alphabravo 以及远程站点上的节点 charlie 组成,每个节点都部署了名为 backups 的资源定义。然后,可以为 charlie 的连接启用DRBD Proxy,如下所示:

# linstor drbd-proxy enable alpha charlie backups
# linstor drbd-proxy enable bravo charlie backups

DRBD代理配置可以通过以下命令定制:

# linstor drbd-proxy options backups --memlimit 100000000
# linstor drbd-proxy compression zlib backups --level 9

LINSTOR不会自动优化远程复制的DRBD配置,因此您可能需要设置一些配置选项,例如协议:

# linstor resource-connection drbd-options alpha charlie backups --protocol A
# linstor resource-connection drbd-options bravo charlie backups --protocol A

请与LINBIT联系以获得优化配置的帮助。

2.12.1. Automatically enable DRBD Proxy

LINSTOR can also be configured to automatically enable the above mentioned Proxy connection between two nodes. For this automation, LINSTOR first needs to know on which site each node is.

# linstor node set-property alpha Site A
# linstor node set-property bravo Site A
# linstor node set-property charlie Site B

As the Site property might also be used for other site-based decisions in future features, the DrbdProxy/AutoEnable also has to be set to true:

# linstor controller set-property DrbdProxy/AutoEnable true

This property can also be set on node, resource-definition, resource and resource-connection level (from left to right in increasing priority, whereas the controller is the left-most, i.e. least prioritized level)

Once these initialization steps are completed, every newly created resource will automatically check whether it has to enable DRBD Proxy to any of its peer resources.

2.13. 外部数据库

It is possible to have LINSTOR working with an external database provider like PostgreSQL, MariaDB and since version 1.1.0 even ETCD key value store is supported.

To use an external database there are a few additional steps to configure. You have to create a DB/Schema and user to use for linstor, and configure this in the /etc/linstor/linstor.toml.

2.13.1. PostgreSQL

A sample PostgreSQL linstor.toml looks like this:

[db]
user = "linstor"
password = "linstor"
connection_url = "jdbc:postgresql://localhost/linstor"

2.13.2. MariaDB/MySQL

MariaDB linstor.toml 示例如下:

[db]
user = "linstor"
password = "linstor"
connection_url = "jdbc:mariadb://localhost/LINSTOR?createDatabaseIfNotExist=true"
The LINSTOR schema/database is created as LINSTOR so make sure the MariaDB connection string refers to the LINSTOR schema, as in the example above.
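
A minimal sketch for creating the database user and granting it access to the LINSTOR schema (the credentials follow the example above; adjust them to your setup):

# mysql -u root -p -e "CREATE USER 'linstor'@'%' IDENTIFIED BY 'linstor';"
# mysql -u root -p -e "GRANT ALL PRIVILEGES ON LINSTOR.* TO 'linstor'@'%';"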

2.13.3. ETCD

ETCD是一个分布式键值存储,它使得在HA设置中保持LINSTOR数据库的分布式变得容易。ETCD驱动程序已经包含在LINSTOR控制器包中,只需要在 LINSTOR.toml 中进行配置。

有关如何安装和配置ETCD的更多信息,请参见ETCD文档: https://etcd.io/docs

下面是 linstor.toml 中的[db]部分示例:

[db]
## only set user/password if you want to use authentication, only since LINSTOR 1.2.1
# user = "linstor"
# password = "linstor"

## for etcd
## do not set user field if no authentication required
connection_url = "etcd://etcdhost1:2379,etcdhost2:2379,etcdhost3:2379"

## if you want to use TLS, only since LINSTOR 1.2.1
# ca_certificate = "ca.pem"
# client_certificate = "client.pem"

## if you want to use client TLS authentication too, only since LINSTOR 1.2.1
# client_key_pkcs8_pem = "client-key.pkcs8"
## set client_key_password if private key has a password
# client_key_password = "mysecret"

2.14. LINSTOR REST-API

To make LINSTOR’s administrative tasks more accessible and also available for web-frontends a REST-API has been created. The REST-API is embedded in the linstor-controller and since LINSTOR 0.9.13 configured via the linstor.toml configuration file.

[http]
  enabled = true
  port = 3370
  listen_addr = "127.0.0.1"  # to disable remote access

如果您想使用REST-API,可以在以下链接中找到当前文档:https://app.swaggerhub.com/apis-docs/Linstor/Linstor/
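
例如,可以用 curl 快速确认REST-API是否可达( /v1/nodes 会返回节点列表;地址和端口取决于上面的 [http] 配置):

# curl http://localhost:3370/v1/nodes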

2.14.1. LINSTOR REST-API HTTPS

HTTP REST-API也可以通过HTTPS运行,如果您使用任何需要授权的功能,强烈建议使用HTTPS。为此,您必须创建一个包含有效证书的java密钥库文件,该证书将用于加密所有HTTPS通信。

下面是一个简单的示例,说明如何使用java运行时中包含的 keytool 创建自签名证书:

keytool -keyalg rsa -keysize 2048 -genkey -keystore ./keystore_linstor.jks\
 -alias linstor_controller\
 -dname "CN=localhost, OU=SecureUnit, O=ExampleOrg, L=Vienna, ST=Austria, C=AT"

keytool 将要求密码来保护生成的密钥库文件,这是LINSTOR controller配置所需的。在 linstor.toml 文件中,必须添加以下部分:

[https]
  keystore = "/path/to/keystore_linstor.jks"
  keystore_password = "linstor"

现在(重新)启动 linstor controller ,端口3371上应该可以使用HTTPS REST-API。

有关如何导入其他证书的更多信息,请访问:https://docs.oracle.com/javase/8/docs/technotes/tools/unix/keytool.html

当启用HTTPS时,对HTTP /v1/ REST-API的所有请求都将被重定向到HTTPS。

LINSTOR REST-API HTTPS受限客户端访问

可以通过在控制器上使用SSL信任库来限制客户端访问。基本上,您为您的客户机创建一个证书并将其添加到信任库中,然后客户机使用该证书进行身份验证。

首先创建客户端证书:

keytool -keyalg rsa -keysize 2048 -genkey -keystore client.jks\
 -storepass linstor -keypass linstor\
 -alias client1\
 -dname "CN=Client Cert, OU=client, O=Example, L=Vienna, ST=Austria, C=AT"

然后我们将此证书导入controller信任库:

keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore client.jks -destkeystore trustore_client.jks

并在 linstor.toml 配置文件中启用信任库:

[https]
  keystore = "/path/to/keystore_linstor.jks"
  keystore_password = "linstor"
  truststore = "/path/to/trustore_client.jks"
  truststore_password = "linstor"

现在重新启动controller,如果没有正确的证书,将无法再访问控制器API。

LINSTOR客户机需要PEM格式的证书,因此在使用之前,我们必须将java密钥库证书转换为PEM格式。

# Convert to pkcs12
keytool -importkeystore -srckeystore client.jks -destkeystore client.p12\
 -storepass linstor -keypass linstor\
 -srcalias client1 -srcstoretype jks -deststoretype pkcs12

# use openssl to convert to PEM
openssl pkcs12 -in client.p12 -out client_with_pass.pem

为了避免一直输入PEM文件密码,可以方便地删除密码。

openssl rsa -in client_with_pass.pem -out client1.pem
openssl x509 -in client_with_pass.pem >> client1.pem

现在,这个PEM文件可以很容易地在客户端使用:

linstor --certfile client1.pem node list

--certfile 参数也可以添加到客户端配置文件中,有关详细信息,请参见使用LINSTOR客户端

2.15. Logging

Linstor uses SLF4J with Logback as binding. This gives Linstor the possibility to distinguish between the log levels ERROR, WARN, INFO, DEBUG and TRACE (in order of increasing verbosity). In the current linstor version (1.1.2) the user has the following four methods to control the logging level, ordered by priority (first has highest priority):

  1. TRACE模式可以通过调试控制台启用( enabled )或禁用( disabled ):

    Command ==> SetTrcMode MODE(enabled)
    SetTrcMode           Set TRACE level logging mode
    New TRACE level logging mode: ENABLED
  2. 启动controller或satellite时,可以传递命令行参数:

    java ... com.linbit.linstor.core.Controller ... --log-level INFO
    java ... com.linbit.linstor.core.Satellite  ... --log-level INFO
  3. The recommended place is the logging section in the configuration file. The default configuration file location is /etc/linstor/linstor.toml for the controller and /etc/linstor/linstor_satellite.toml for the satellite. Configure the logging level as follows:

    [logging]
       level="INFO"
  4. 由于Linstor使用Logback作为实现,还可以使用 /usr/share/linstor-server/lib/logback.xml 。目前只有这种方法支持为不同组件设置不同的日志级别,如下例所示:

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration scan="false" scanPeriod="60 seconds">
    <!--
     Values for scanPeriod can be specified in units of milliseconds, seconds, minutes or hours
     https://logback.qos.ch/manual/configuration.html
    -->
     <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
       <!-- encoders are assigned the type
            ch.qos.logback.classic.encoder.PatternLayoutEncoder by default -->
       <encoder>
         <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</pattern>
       </encoder>
     </appender>
     <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
       <file>${log.directory}/linstor-${log.module}.log</file>
       <append>true</append>
       <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
         <Pattern>%d{yyyy_MM_dd HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</Pattern>
       </encoder>
       <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
         <FileNamePattern>logs/linstor-${log.module}.%i.log.zip</FileNamePattern>
         <MinIndex>1</MinIndex>
         <MaxIndex>10</MaxIndex>
       </rollingPolicy>
       <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
         <MaxFileSize>2MB</MaxFileSize>
       </triggeringPolicy>
     </appender>
     <logger name="LINSTOR/Controller" level="INFO" additivity="false">
       <appender-ref ref="STDOUT" />
       <!-- <appender-ref ref="FILE" /> -->
     </logger>
     <logger name="LINSTOR/Satellite" level="INFO" additivity="false">
       <appender-ref ref="STDOUT" />
       <!-- <appender-ref ref="FILE" /> -->
     </logger>
     <root level="WARN">
       <appender-ref ref="STDOUT" />
       <!-- <appender-ref ref="FILE" /> -->
     </root>
    </configuration>

有关 logback.xml 的更多详细信息,请参见https://logback.qos.ch/manual/index.html[Logback Manual]。

如果没有使用上述任何配置方法,Linstor将默认为 INFO 日志级别。

2.16. Monitoring

Since LINSTOR 1.8.0, a Prometheus /metrics HTTP path is provided with LINSTOR and JVM specific exports.

The /metrics path also supports 3 GET arguments to reduce LINSTOR’s reported data:

  • resource

  • storage_pools

  • error_reports

These all default to true. To disable, e.g., error-report data: http://localhost:3370/metrics?error_reports=false

2.16.1. Health check

The LINSTOR-Controller also provides a /health HTTP path that will simply return HTTP-Status 200 if the controller can access its database and all services are up and running. Otherwise it will return HTTP error status code 500 Internal Server Error.
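
As a quick check (assuming the default HTTP port from the REST-API section above), both paths can be queried with curl:

# curl http://localhost:3370/metrics
# curl -i http://localhost:3370/health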

2.17. 安全Satellite连接

可以让LINSTOR在controller和satellites连接之间使用SSL安全TCP连接。在不深入讨论java的SSL引擎如何工作的情况下,我们将向您提供使用java运行时环境中的 keytool 的命令行片段,介绍如何使用安全连接配置3节点设置。节点设置如下所示:

节点 alpha 只充当controller。节点 bravo 和节点 charlie 只充当卫星。

下面是生成这样一个密钥库设置的命令,当然你应该根据您的环境编辑值。

# create directories to hold the key files
mkdir -p /tmp/linstor-ssl
cd /tmp/linstor-ssl
mkdir alpha bravo charlie


# create private keys for all nodes
keytool -keyalg rsa -keysize 2048 -genkey -keystore alpha/keystore.jks\
 -storepass linstor -keypass linstor\
 -alias alpha\
 -dname "CN=Max Mustermann, OU=alpha, O=Example, L=Vienna, ST=Austria, C=AT"

keytool -keyalg rsa -keysize 2048 -genkey -keystore bravo/keystore.jks\
 -storepass linstor -keypass linstor\
 -alias bravo\
 -dname "CN=Max Mustermann, OU=bravo, O=Example, L=Vienna, ST=Austria, C=AT"

keytool -keyalg rsa -keysize 2048 -genkey -keystore charlie/keystore.jks\
 -storepass linstor -keypass linstor\
 -alias charlie\
 -dname "CN=Max Mustermann, OU=charlie, O=Example, L=Vienna, ST=Austria, C=AT"

# import truststore certificates for alpha (needs all satellite certificates)
keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore bravo/keystore.jks -destkeystore alpha/certificates.jks

keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore charlie/keystore.jks -destkeystore alpha/certificates.jks

# import controller certificate into satellite truststores
keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore alpha/keystore.jks -destkeystore bravo/certificates.jks

keytool -importkeystore\
 -srcstorepass linstor -deststorepass linstor -keypass linstor\
 -srckeystore alpha/keystore.jks -destkeystore charlie/certificates.jks

# now copy the keystore files to their host destinations
ssh root@alpha mkdir /etc/linstor/ssl
scp alpha/* root@alpha:/etc/linstor/ssl/
ssh root@bravo mkdir /etc/linstor/ssl
scp bravo/* root@bravo:/etc/linstor/ssl/
ssh root@charlie mkdir /etc/linstor/ssl
scp charlie/* root@charlie:/etc/linstor/ssl/

# generate the satellite ssl config entry
echo '[netcom]
  type="ssl"
  port=3367
  server_certificate="ssl/keystore.jks"
  trusted_certificates="ssl/certificates.jks"
  key_password="linstor"
  keystore_password="linstor"
  truststore_password="linstor"
  ssl_protocol="TLSv1.2"
' | ssh root@bravo "cat > /etc/linstor/linstor_satellite.toml"

echo '[netcom]
  type="ssl"
  port=3367
  server_certificate="ssl/keystore.jks"
  trusted_certificates="ssl/certificates.jks"
  key_password="linstor"
  keystore_password="linstor"
  truststore_password="linstor"
  ssl_protocol="TLSv1.2"
' | ssh root@charlie "cat > /etc/linstor/linstor_satellite.toml"

现在只需启动controller和satellites,并使用 --communication-type SSL 添加节点。

2.18. Automatisms for DRBD-Resources

2.18.1. 自动仲裁策略

LINSTOR会在 可以实现仲裁时 自动为资源配置仲裁策略。这意味着,只要您有至少两个有盘(diskful)加一个或多个无盘(diskless)的资源分配,或者三个及以上有盘的资源分配,LINSTOR就会自动为您的资源启用仲裁策略。

相反,当达到仲裁所需的资源分配少于最小值时,LINSTOR将自动禁用仲裁策略。

这是通过 DrbdOptions/auto-quorum 属性控制的,该属性可应用于 linstor-controller 、 resource-group 和 resource-definition 。 DrbdOptions/auto-quorum 属性可接受的值为 disabled 、 suspend-io 和 io-error 。

DrbdOptions/auto-quorum 属性设置为 disabled 将允许您手动或更细粒度地控制资源的仲裁策略(如果您愿意)。

DrbdOptions/auto-quorum 启用时默认采用的策略是 quorum majority 和 on-no-quorum io-error 。有关DRBD的仲裁功能及其行为的更多信息,请参阅 quorum section of the DRBD user's guide.
如果未禁用 DrbdOptions/auto-quorum ,则 DrbdOptions/auto-quorum 策略将覆盖任何手动配置的属性。

例如,要手动设置名为 my_ssd_groupresource-group 的仲裁策略,可以使用以下命令:

# linstor resource-group set-property my_ssd_group DrbdOptions/auto-quorum disabled
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/quorum majority
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/on-no-quorum suspend-io

您可能希望完全禁用DRBD的仲裁功能。要做到这一点,首先需要在适当的LINSTOR对象上禁用’DrbdOptions/auto-quorum’,然后相应地设置DRBD quorum特性。例如,使用以下命令完全禁用 my_ssd_groupresource-group 上的仲裁:

# linstor resource-group set-property my_ssd_group DrbdOptions/auto-quorum disabled
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/quorum off
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/on-no-quorum
在上面的命令中将 DrbdOptions/Resource/on-no-quorum 设置为空值将完全从对象中删除该属性。

2.18.2. Auto-Evict

If a satellite is offline for a prolonged period of time, LINSTOR can be configured to declare that node as evicted. This triggers an automated reassignment of the affected DRBD-resources to other nodes to ensure a minimum replica count is kept.

This feature uses the following properties to adapt the behaviour.

DrbdOptions/AutoEvictMinReplicaCount sets the number of replicas that should always be present. You can set this property on the controller to change a global default, or on a specific resource-definition or resource-group to change it only for that resource-definition or resource-group. If this property is left empty, the place-count set for the auto-placer of the corresponding resource-group will be used.

DrbdOptions/AutoEvictAfterTime describes how long a node can be offline in minutes before the eviction is triggered. You can set this property on the controller to change a global default, or on a single node to give it a different behavior. The default value for this property is 60 minutes.

DrbdOptions/AutoEvictMaxDisconnectedNodes sets the percentage of nodes that can be not reachable (for whatever reason) at the same time. If more than the given percent of nodes are offline at the same time, the auto-evict will not be triggered for any node , since in this case LINSTOR assumes connection problems from the controller. This property can only be set for the controller, and only accepts a value between 0 and 100. The default value is 34. If you wish to turn the auto-evict-feature off, simply set this property to 0. If you want to always trigger the auto-evict, regardless of how many satellites are unreachable, set it to 100.

DrbdOptions/AutoEvictAllowEviction is an additional property that can stop a node from being evicted. This can be useful for various cases, for example if you need to shut down a node for maintenance. You can set this property on the controller to change a global default, or on a single node to give it a different behavior. It accepts true and false as values and per default is set to true on the controller. You can use this property to turn the auto-evict feature off by setting it to false on the controller, although this might not work completely if you already set different values for individual nodes, since those values take precedence over the global default.
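
As a sketch, these properties could be adjusted with commands along the following lines (the node name and the values shown are placeholders):

# linstor controller set-property DrbdOptions/AutoEvictAfterTime 120
# linstor node set-property alpha DrbdOptions/AutoEvictAllowEviction false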

After the linstor-controller loses the connection to a satellite, aside from trying to reconnect, it starts a timer for that satellite. As soon as that timer exceeds DrbdOptions/AutoEvictAfterTime and all of the DRBD-connections to the DRBD-resources on that satellite are broken, the controller will check whether or not DrbdOptions/AutoEvictMaxDisconnectedNodes has been met. If it hasn’t, and DrbdOptions/AutoEvictAllowEviction is true for the node in question, the satellite will be marked as EVICTED. At the same time, the controller will check for every DRBD-resource whether the number of resources is still above DrbdOptions/AutoEvictMinReplicaCount. If it is, the resource in question will be marked as DELETED. If it isn’t, an auto-place with the settings from the corresponding resource-group will be started. Should the auto-place fail, the controller will try again later when changes that might allow a different result, such as adding a new node, have happened. Resources where an auto-place is necessary will only be marked as DELETED if the corresponding auto-place was successful.

The evicted satellite itself will not be able to reestablish connection with the controller. Even if the node is up and running, a manual reconnect will fail. It is also not possible to delete the satellite, even if it is working as it should be. The satellite can, however, be restored. This will remove the EVICTED-flag from the satellite and allow you to use it again. Previously configured network interfaces, storage pools, properties and similar entities as well as non-DRBD-related resources and resources that could not be autoplaced somewhere else will still be on the satellite. To restore a satellite, use

# linstor node restore [nodename]

Should you wish to instead throw everything that once was on that node, including the node itself, away, you need to use the node lost command instead.
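
That would look like the following, with the node name as a placeholder:

# linstor node lost alpha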

2.19. QoS Settings

2.19.1. Sysfs

LINSTOR is able to set the following Sysfs settings:

SysFs                                                   Linstor property
/sys/fs/cgroup/blkio/blkio.throttle.read_bps_device     sys/fs/blkio_throttle_read
/sys/fs/cgroup/blkio/blkio.throttle.write_bps_device    sys/fs/blkio_throttle_write
/sys/fs/cgroup/blkio/blkio.throttle.read_iops_device    sys/fs/blkio_throttle_read_iops
/sys/fs/cgroup/blkio/blkio.throttle.write_iops_device   sys/fs/blkio_throttle_write_iops

If a LINSTOR volume is composed of multiple “stacked” volumes (for example, DRBD with external metadata will have 3 devices: the backing (storage) device, the metadata device and the resulting DRBD device), then when setting a sys/fs/* property for a Volume, only the bottom-most local “data” device will receive the corresponding /sys/fs/cgroup/… setting. That means, in the case of the example above, only the backing device will receive the setting. In case a resource-definition has an nvme-target as well as an nvme-initiator resource, both bottom-most devices of each node will receive the setting. In the case of the target, the bottom-most device will be the volume of LVM or ZFS, whereas in the case of the initiator the bottom-most device will be the connected nvme-device, regardless of which other layers are stacked on top of that.
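
For example, to throttle reads of a resource's first volume, the matching LINSTOR property from the table above could be set on the volume-definition. This is a sketch; the resource name and the bytes-per-second value are placeholders:

# linstor volume-definition set-property my_resource 0 sys/fs/blkio_throttle_read 10485760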

2.20. Getting Help

2.20.1. From the Command Line

A quick way to list available commands on the command line is to type linstor.

Further information on sub-commands (e.g., list-nodes) can be retrieved in two ways:

# linstor node list -h
# linstor help node list

Using the ‘help’ sub-command is especially helpful when LINSTOR is executed in interactive mode (linstor interactive).

One of LINSTOR's most helpful features is its rich tab-completion, which can basically be used to complete every object LINSTOR knows about (for example, node names, IP addresses, resource names, and so on). In the following examples, we show some possible completions and their results:

# linstor node create alpha 1<tab> # completes the IP address if hostname can be resolved
# linstor resource create b<tab> c<tab> # linstor assign-resource backups charlie

If tab completion does not work, try sourcing the appropriate file:

# source /etc/bash_completion.d/linstor # or
# source /usr/share/bash_completion/completions/linstor

For zsh shell users, the linstor client can generate a zsh compilation file, which has basic support for command and argument completion.

# linstor gen-zsh-completer > /usr/share/zsh/functions/Completion/Linux/_linstor

2.20.2. SOS-Report

If something goes wrong and you need help finding the cause of the issue, you can use

# linstor sos-report create

The command above will create a new sos-report in /var/log/linstor/controller/ on the controller node. Alternatively you can use

# linstor sos-report download

which will create a new sos-report and additionally downloads that report to the local machine into your current working directory.

This sos-report contains logs and useful debug information from several sources (Linstor logs, dmesg, versions of external tools used by Linstor, ip a, database dump and many more). This information is stored for each node in plain text in the resulting .tar.gz file.

2.20.3. From the Community

For help from the community, please subscribe to our mailing list: https://lists.linbit.com/listinfo/drbd-user

2.20.4. Github

To file bugs or feature requests, please check out our GitHub page: https://GitHub.com/linbit

2.20.5. Paid Support and Development

Alternatively, if you wish to purchase remote installation services, 24/7 support, access to certified repositories, or feature development, please contact us: +1-877-454-6248 (1-877-4LINBIT), International: +43-1-8178292-0 | sales@linbit.com

3. LINSTOR Volumes in Kubernetes

This chapter describes the usage of LINSTOR in Kubernetes as managed by the operator, and with volumes provisioned using the LINSTOR CSI plugin.

This chapter goes into great detail regarding all the install-time options and various configurations possible with LINSTOR and Kubernetes. For those more interested in a “quick-start” for testing, or those looking for some examples for reference, we have some complete Helm Install Examples of a few common uses near the end of the chapter.

3.1. Kubernetes Overview

Kubernetes is a container orchestrator. Kubernetes defines the behavior of containers and related services via declarative specifications. In this guide, we’ll focus on using kubectl to manipulate .yaml files that define the specifications of Kubernetes objects.

3.2. Deploying LINSTOR on Kubernetes

3.2.1. Deploying with the LINSTOR Operator

LINBIT provides a LINSTOR Operator to commercial support customers. The operator eases deployment of LINSTOR on Kubernetes by installing DRBD, managing Satellite and Controller pods, and other related functions.

The operator is installed using Helm v3 as follows:

  • Create a Kubernetes secret containing your my.linbit.com credentials:

    kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-email=<YOUR_EMAIL> --docker-password=<YOUR_PASSWORD>

    The name of this secret must match the one specified in the Helm values, by default drbdiocred.

  • Configure storage for the LINSTOR etcd instance. There are various options for configuring the etcd instance for LINSTOR:

    • Use an existing storage provisioner with a default StorageClass.

    • Use hostPath volumes

    • Disable persistence for basic testing. This can be done by adding --set etcd.persistentVolume.enabled=false to the helm install command below.

  • Read the storage guide and configure a basic storage setup for LINSTOR

  • Read the section on securing the deployment and configure as needed.

  • In the final step, select the appropriate kernel module injector using --set with the helm install command.

    • Choose the injector according to the distribution you are using. Select the latest version from one of drbd9-rhel7, drbd9-rhel8,…​ from http://drbd.io/ as appropriate. The drbd9-rhel8 image should also be used for RHCOS (OpenShift). For the SUSE CaaS Platform use the SLES injector that matches the base system of the CaaS Platform you are using (e.g., drbd9-sles15sp1). For example:

      operator.satelliteSet.kernelModuleInjectionImage=drbd.io/drbd9-rhel8:v9.0.24
    • Only inject modules that are already present on the host machine. If a module is not found, it will be skipped.

      operator.satelliteSet.kernelModuleInjectionMode=DepsOnly
    • Disable kernel module injection if you are installing DRBD by other means. Deprecated by DepsOnly

      operator.satelliteSet.kernelModuleInjectionMode=None
  • Finally, create a Helm deployment named linstor-op that will set up everything.

    helm repo add linstor https://charts.linstor.io
    helm install linstor-op linstor/linstor

    Further deployment customization is discussed in the advanced deployment section

LINSTOR etcd hostPath persistence

You can use the pv-hostpath Helm templates to create hostPath persistent volumes. Create as many PVs as needed to satisfy your configured etcd replicas (default 1).

Create the hostPath persistent volumes, substituting cluster node names accordingly in the nodes= option:

helm repo add linstor https://charts.linstor.io
helm install linstor-etcd linstor/pv-hostpath --set "nodes={<NODE0>,<NODE1>,<NODE2>}"

Persistence for etcd is enabled by default.

Using an existing database

LINSTOR can connect to an existing PostgreSQL, MariaDB or etcd database. For instance, for a PostgreSQL instance with the following configuration:

POSTGRES_DB: postgresdb
POSTGRES_USER: postgresadmin
POSTGRES_PASSWORD: admin123

The Helm chart can be configured to use this database instead of deploying an etcd cluster by adding the following to the Helm install command:

--set etcd.enabled=false --set "operator.controller.dbConnectionURL=jdbc:postgresql://postgres/postgresdb?user=postgresadmin&password=admin123"

3.2.2. Configuring storage

The LINSTOR operator can automate some basic storage set up for LINSTOR.

Configuring storage pool creation

The LINSTOR operator can be used to create LINSTOR storage pools. Creation is under control of the LinstorSatelliteSet resource:

$ kubectl get LinstorSatelliteSet.linstor.linbit.com linstor-op-ns -o yaml
kind: LinstorSatelliteSet
metadata:
..
spec:
  ..
  storagePools:
    lvmPools:
    - name: lvm-thick
      volumeGroup: drbdpool
    lvmThinPools:
    - name: lvm-thin
      thinVolume: thinpool
      volumeGroup: ""
    zfsPools:
    - name: my-linstor-zpool
      zPool: for-linstor
      thin: true
At install time

At install time, by setting the value of operator.satelliteSet.storagePools when running helm install.

First create a file with the storage configuration like:

operator:
  satelliteSet:
    storagePools:
      lvmPools:
      - name: lvm-thick
        volumeGroup: drbdpool

This file can be passed to the helm installation like this:

helm install -f <file> linstor-op linstor/linstor
After install

On a cluster with the operator already configured (i.e. after helm install), you can edit the LinstorSatelliteSet configuration like this:

$ kubectl edit LinstorSatelliteSet.linstor.linbit.com <satellitesetname>

The storage pool configuration can be updated like in the example above.

Preparing physical devices

By default, LINSTOR expects the referenced VolumeGroups, ThinPools and so on to be present. You can use the devicePaths: [] option to let LINSTOR automatically prepare devices for the pool. Eligible for automatic configuration are block devices that:

  • Are a root device (no partition)

  • Do not contain partition information

  • Have more than 1 GiB

To enable automatic configuration of devices, set the devicePaths key on storagePools entries:

  storagePools:
    lvmPools:
    - name: lvm-thick
      volumeGroup: drbdpool
      devicePaths:
      - /dev/vdb
    lvmThinPools:
    - name: lvm-thin
      thinVolume: thinpool
      volumeGroup: linstor_thinpool
      devicePaths:
      - /dev/vdc
      - /dev/vdd

Currently, this method supports creation of LVM and LVMTHIN storage pools.

lvmPools configuration
  • name name of the LINSTOR storage pool. Required

  • volumeGroup name of the VG to create. Required

  • devicePaths devices to configure for this pool. Must be empty and >= 1GiB to be recognized. Optional

  • raidLevel LVM raid level. Optional

  • vdo Enable [VDO] (requires VDO tools in the satellite). Optional

  • vdoLogicalSizeKib Size of the created VG (expected to be bigger than the backing devices by using VDO). Optional

  • vdoSlabSizeKib Slab size for VDO. Optional

lvmThinPools configuration
  • name name of the LINSTOR storage pool. Required

  • volumeGroup VG to use for the thin pool. If you want to use devicePaths, you must set this to "". This is required because LINSTOR does not allow configuration of the VG name when preparing devices.

  • thinVolume name of the thinpool. Required

  • devicePaths devices to configure for this pool. Must be empty and >= 1GiB to be recognized. Optional

  • raidLevel LVM raid level. Optional

The volume group created by LINSTOR for LVMTHIN pools will always follow the scheme “linstor_$THINPOOL”.
zfsPools configuration
  • name name of the LINSTOR storage pool. Required

  • zPool name of the zpool to use. Must already be present on all machines. Required

  • thin true to use thin provisioning, false otherwise. Required

Using automaticStorageType (DEPRECATED)

ALL eligible devices will be prepared according to the value of operator.satelliteSet.automaticStorageType, unless they are already prepared using the storagePools section. Devices are added to a storage pool based on the device name (i.e. all /dev/nvme1 devices will be part of the pool autopool-nvme1)

The possible values for operator.satelliteSet.automaticStorageType:

  • None no automatic set up (default)

  • LVM create a LVM (thick) storage pool

  • LVMTHIN create a LVM thin storage pool

  • ZFS create a ZFS based storage pool (UNTESTED)
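
As an example, the value could be passed like any other Helm option at install time, for instance:

--set operator.satelliteSet.automaticStorageType=LVMTHIN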

3.2.3. Securing deployment

This section describes the different options for enabling security features available when using this operator. The following guides assume the operator is installed using Helm

Secure communication with an existing etcd instance

Secure communication to an etcd instance can be enabled by providing a CA certificate to the operator in form of a kubernetes secret. The secret has to contain the key ca.pem with the PEM encoded CA certificate as value.
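
Such a secret might be created like this (the secret name and the path to your CA certificate are placeholders):

kubectl create secret generic <secret name> --from-file=ca.pem=<path-to-ca-certificate>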

The secret can then be passed to the controller by passing the following argument to helm install

--set operator.controller.dbCertSecret=<secret name>
Authentication with etcd using certificates

If you want to use TLS certificates to authenticate with an etcd database, you need to set the following option on helm install:

--set operator.controller.dbUseClientCert=true

If this option is active, the secret specified in the above section must contain two additional keys:

  • client.cert PEM formatted certificate presented to etcd for authentication

  • client.key private key in PKCS8 format, matching the above client certificate

Keys can be converted into PKCS8 format using openssl:

openssl pkcs8 -topk8 -nocrypt -in client-key.pem -out client-key.pkcs8
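
The secret could then be created with all three keys, for example (the secret name is a placeholder; ca.pem and client.cert stand for your etcd CA certificate and client certificate, and client-key.pkcs8 is the key converted above):

kubectl create secret generic <secret name> --from-file=ca.pem=ca.pem --from-file=client.cert=client.cert --from-file=client.key=client-key.pkcs8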
Configuring secure communication between LINSTOR components

The default communication between LINSTOR components is not secured by TLS. If this is needed for your setup, follow these steps:

  • Create private keys in the java keystore format, one for the controller, one for all satellites:

keytool -keyalg rsa -keysize 2048 -genkey -keystore satellite-keys.jks -storepass linstor -alias satellite -dname "CN=XX, OU=satellite, O=Example, L=XX, ST=XX, C=X"
keytool -keyalg rsa -keysize 2048 -genkey -keystore control-keys.jks -storepass linstor -alias control -dname "CN=XX, OU=control, O=Example, L=XX, ST=XX, C=XX"
  • Create a trust store with the public keys that each component needs to trust:

  • Controller needs to trust the satellites

  • Nodes need to trust the controller

    keytool -importkeystore -srcstorepass linstor -deststorepass linstor -srckeystore control-keys.jks -destkeystore satellite-trust.jks
    keytool -importkeystore -srcstorepass linstor -deststorepass linstor -srckeystore satellite-keys.jks -destkeystore control-trust.jks
  • Create kubernetes secrets that can be passed to the controller and satellite pods

    kubectl create secret generic control-secret --from-file=keystore.jks=control-keys.jks --from-file=certificates.jks=control-trust.jks
    kubectl create secret generic satellite-secret --from-file=keystore.jks=satellite-keys.jks --from-file=certificates.jks=satellite-trust.jks
  • Pass the names of the created secrets to helm install

    --set operator.satelliteSet.sslSecret=satellite-secret --set operator.controller.sslSecret=control-secret
It is currently NOT possible to change the keystore password. LINSTOR expects the passwords to be linstor. This is a current limitation of LINSTOR.
Configuring secure communications for the LINSTOR API

Various components need to talk to the LINSTOR controller via its REST interface. This interface can be secured via HTTPS, which automatically includes authentication. For HTTPS+authentication to work, each component needs access to:

  • A private key

  • A certificate based on the key

  • A trusted certificate, used to verify that other components are trustworthy

The next sections will guide you through creating all required components.

Creating the private keys

Private keys can be created using java’s keytool

keytool -keyalg rsa -keysize 2048 -genkey -keystore controller.pkcs12 -storetype pkcs12 -storepass linstor -ext san=dns:linstor-op-cs.default.svc -dname "CN=XX, OU=controller, O=Example, L=XX, ST=XX, C=X" -validity 5000
keytool -keyalg rsa -keysize 2048 -genkey -keystore client.pkcs12 -storetype pkcs12 -storepass linstor -dname "CN=XX, OU=client, O=Example, L=XX, ST=XX, C=XX" -validity 5000

The clients need the private key and certificate in a different format, so we need to convert them:

openssl pkcs12 -in client.pkcs12 -passin pass:linstor -out client.cert -clcerts -nokeys
openssl pkcs12 -in client.pkcs12 -passin pass:linstor -out client.key -nocerts -nodes
The alias specified for the controller key (i.e. -ext san=dns:linstor-op-cs.default.svc) has to exactly match the service name created by the operator. When using helm, this is always of the form <release-name>-cs.<release-namespace>.svc.
It is currently NOT possible to change the keystore password. LINSTOR expects the passwords to be linstor. This is a current limitation of LINSTOR
Create the trusted certificates

For the controller to trust the clients, we can use the following command to create a truststore, importing the client certificate

keytool -importkeystore -srcstorepass linstor -srckeystore client.pkcs12 -deststorepass linstor -deststoretype pkcs12 -destkeystore controller-trust.pkcs12

For the client, we have to convert the controller certificate into a different format

openssl pkcs12 -in controller.pkcs12 -passin pass:linstor -out ca.pem -clcerts -nokeys
Create Kubernetes secrets

Now you can create secrets for the controller and for clients:

kubectl create secret generic http-controller --from-file=keystore.jks=controller.pkcs12 --from-file=truststore.jks=controller-trust.pkcs12
kubectl create secret generic http-client --from-file=ca.pem=ca.pem --from-file=client.cert=client.cert --from-file=client.key=client.key

The names of the secrets can be passed to helm install to configure all clients to use https.

--set linstorHttpsControllerSecret=http-controller  --set linstorHttpsClientSecret=http-client
Automatically set the passphrase for encrypted volumes

Linstor can be used to create encrypted volumes using LUKS. The passphrase used when creating these volumes can be set via a secret:

kubectl create secret generic linstor-pass --from-literal=MASTER_PASSPHRASE=<password>

On install, add the following arguments to the helm command:

--set operator.controller.luksSecret=linstor-pass
Helm Install Examples

All the below examples use the following sp-values.yaml file. Feel free to adjust this for your uses and environment. See [Configuring storage pool creation] for further details.

operator:
  satelliteSet:
    storagePools:
      lvmThinPools:
      - name: lvm-thin
        thinVolume: thinpool
        volumeGroup: ""
        devicePaths:
        - /dev/sdb

Default install. Please note this does not set up any persistence for the backing etcd key-value store.

This is not suggested for any use outside of testing.
kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-password=<YOUR_PASSWORD>
helm repo add linstor https://charts.linstor.io
helm install linstor-op linstor/linstor

Install with LINSTOR storage-pools defined at install via sp-values.yaml, persistent hostPath volumes, 3 etcd replicas, and by compiling the DRBD kernel modules for the host kernels.

This should be adequate for most basic deployments. Please note that this deployment is not using the pre-compiled DRBD kernel modules, just to make this command more portable. Using the pre-compiled binaries will make for a much faster install and deployment. Using the Compile option is not suggested for use in large Kubernetes clusters.

kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-password=<YOUR_PASSWORD>
helm repo add linstor https://charts.linstor.io
helm install linstor-etcd linstor/pv-hostpath --set "nodes={<NODE0>,<NODE1>,<NODE2>}"
helm install -f sp-values.yaml linstor-op linstor/linstor --set etcd.replicas=3 --set operator.satelliteSet.kernelModuleInjectionMode=Compile

Install with LINSTOR storage-pools defined at install via sp-values.yaml, use an already created PostgreSQL DB (preferably clustered), instead of etcd, and use already compiled kernel modules for DRBD. Additionally, we’ll disable the Stork scheduler in this example.

The PostgreSQL database in this particular example is reachable via a service endpoint named postgres. PostgreSQL itself is configured with POSTGRES_DB=postgresdb, POSTGRES_USER=postgresadmin, and POSTGRES_PASSWORD=admin123

kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-email=<YOUR_EMAIL> --docker-password=<YOUR_PASSWORD>
helm repo add linstor https://charts.linstor.io
helm install -f sp-values.yaml linstor-op linstor/linstor --set etcd.enabled=false --set "operator.controller.dbConnectionURL=jdbc:postgresql://postgres/postgresdb?user=postgresadmin&password=admin123" --set stork.enabled=false
Terminating the Helm deployment

To protect the storage infrastructure of the cluster from accidentally deleting vital components, it is necessary to perform some manual steps before deleting a Helm deployment.

  1. Delete all volume claims managed by LINSTOR components. You can use the following command to get a list of volume claims managed by LINSTOR. After checking that none of the listed volumes still hold needed data, you can delete them using the generated kubectl delete command.

    $ kubectl get pvc --all-namespaces -o=jsonpath='{range .items[?(@.metadata.annotations.volume\.beta\.kubernetes\.io/storage-provisioner=="linstor.csi.linbit.com")]}kubectl delete pvc --namespace {.metadata.namespace} {.metadata.name}{"\n"}{end}'
    kubectl delete pvc --namespace default data-mysql-0
    kubectl delete pvc --namespace default data-mysql-1
    kubectl delete pvc --namespace default data-mysql-2
    These volumes, once deleted, cannot be recovered.
  2. Delete the LINSTOR controller and satellite resources.

    Deployment of LINSTOR satellite and controller is controlled by the LinstorSatelliteSet and LinstorController resources. You can delete the resources associated with your deployment using kubectl

    kubectl delete linstorcontroller <helm-deploy-name>-cs
    kubectl delete linstorsatelliteset <helm-deploy-name>-ns

    After a short wait, the controller and satellite pods should terminate. If they continue to run, you can check the above resources for errors (they are only removed after all associated pods terminate)

  3. Delete the Helm deployment.

    If you removed all PVCs and all LINSTOR pods have terminated, you can uninstall the helm deployment

    helm uninstall linstor-op
    Due to Helm’s current policy, the Custom Resource Definitions named LinstorController and LinstorSatelliteSet will not be deleted by the command. More information regarding Helm’s current position on CRDs can be found here.

3.2.4. Advanced deployment options

The helm charts provide a set of further customization options for advanced use cases.

global:
  imagePullPolicy: IfNotPresent # empty pull policy means k8s default is used ("always" if tag == ":latest", "ifnotpresent" else) (1)
  setSecurityContext: true # Force non-privileged containers to run as non-root users
# Dependency charts
etcd:
  persistentVolume:
    enabled: true
    storage: 1Gi
  replicas: 1 # How many instances of etcd will be added to the initial cluster. (2)
  resources: {} # resource requirements for etcd containers (3)
  image:
    repository: gcr.io/etcd-development/etcd
    tag: v3.4.9
csi-snapshotter:
  enabled: true # <- enable to add k8s snapshotting CRDs and controller. Needed for CSI snapshotting
  image: k8s.gcr.io/sig-storage/snapshot-controller:v3.0.2
  replicas: 1 (2)
  resources: {} # resource requirements for the cluster snapshot controller. (3)
stork:
  enabled: true
  storkImage: docker.io/openstorage/stork:2.5.0
  schedulerImage: k8s.gcr.io/kube-scheduler-amd64
  schedulerTag: ""
  replicas: 1 (2)
  storkResources: {} # resources requirements for the stork plugin containers (3)
  schedulerResources: {} # resource requirements for the kube-scheduler containers (3)
  podsecuritycontext: {}
csi:
  enabled: true
  pluginImage: "drbd.io/linstor-csi:v0.11.0"
  csiAttacherImage: k8s.gcr.io/sig-storage/csi-attacher:v3.0.2
  csiLivenessProbeImage: k8s.gcr.io/sig-storage/livenessprobe:v2.1.0
  csiNodeDriverRegistrarImage: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.0.1
  csiProvisionerImage: k8s.gcr.io/sig-storage/csi-provisioner:v2.0.4
  csiSnapshotterImage: k8s.gcr.io/sig-storage/csi-snapshotter:v3.0.2
  csiResizerImage: k8s.gcr.io/sig-storage/csi-resizer:v1.0.1
  controllerReplicas: 1 (2)
  nodeAffinity: {} (4)
  nodeTolerations: [] (4)
  controllerAffinity: {} (4)
  controllerTolerations: [] (4)
  enableTopology: false
  resources: {} (3)
priorityClassName: ""
drbdRepoCred: drbdiocred
linstorHttpsControllerSecret: "" # <- name of secret containing linstor server certificates+key.
linstorHttpsClientSecret: "" # <- name of secret containing linstor client certificates+key.
controllerEndpoint: "" # <- override to the generated controller endpoint. use if controller is not deployed via operator
psp:
  privilegedRole: ""
  unprivilegedRole: ""
operator:
  replicas: 1 # <- number of replicas for the operator deployment (2)
  image: "drbd.io/linstor-operator:v1.3.1"
  affinity: {} (4)
  tolerations: [] (4)
  resources: {} (3)
  podsecuritycontext: {}
  controller:
    enabled: true
    controllerImage: "drbd.io/linstor-controller:v1.11.1"
    luksSecret: ""
    dbCertSecret: ""
    dbUseClientCert: false
    sslSecret: ""
    affinity: {} (4)
    tolerations: (4)
      - key: node-role.kubernetes.io/master
        operator: "Exists"
        effect: "NoSchedule"
    resources: {} (3)
    replicas: 1 (2)
    additionalEnv: [] (5)
    additionalProperties: {} (6)
  satelliteSet:
    enabled: true
    satelliteImage: "drbd.io/linstor-satellite:v1.11.1"
    storagePools: {}
    sslSecret: ""
    automaticStorageType: None
    affinity: {} (4)
    tolerations: [] (4)
    resources: {} (3)
    kernelModuleInjectionImage: "drbd.io/drbd9-rhel7:v9.0.27"
    kernelModuleInjectionMode: ShippedModules
    kernelModuleInjectionResources: {} (3)
    additionalEnv: [] (5)
haController:
  enabled: true
  image: drbd.io/linstor-k8s-ha-controller:v0.1.3
  affinity: {} (4)
  tolerations: [] (4)
  resources: {} (3)
  replicas: 1 (2)
1 Sets the pull policy for all images.
2 Controls the number of replicas for each component.
3 Set container resource requests and limits. See the kubernetes docs. Most containers need a minimal amount of resources, except for:
  • etcd.resources See the etcd docs

  • operator.controller.resources Around 700MiB memory is required

  • operator.satelliteSet.resources Around 700MiB memory is required

  • operator.satelliteSet.kernelModuleInjectionResources If kernel modules are compiled, 1GiB of memory is required.

4 Affinity and toleration determine where pods are scheduled on the cluster. See the kubernetes docs on affinity and toleration. This may be especially important for the operator.satelliteSet and csi.node* values. To schedule a pod using a LINSTOR persistent volume, the node requires a running LINSTOR satellite and LINSTOR CSI pod.
5 Sets additional environment variables to pass to the Linstor Controller and Satellites. Uses the same format as the env value of a container.
6 Sets additional properties on the Linstor Controller. Expects a simple mapping of <property-key>: <value>.
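
As an illustration, a values file could set an additional controller property like this; the property shown is only an example, reusing the auto-quorum key from earlier in this guide:

operator:
  controller:
    additionalProperties:
      DrbdOptions/auto-quorum: suspend-io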
High Availability Deployment

To create a High Availability deployment of all components, take a look at the upstream guide. The default values are chosen so that scaling the components to multiple replicas ensures that the replicas are placed on different nodes. This ensures that a single node failure will not interrupt the service.

3.2.5. Deploying with an external LINSTOR controller

The operator can configure the satellites and CSI plugin to use an existing LINSTOR setup. This can be useful in cases where the storage infrastructure is separate from the Kubernetes cluster. Volumes can be provisioned in diskless mode on the Kubernetes nodes while the storage nodes will provide the backing disk storage.

To skip the creation of a LINSTOR Controller deployment and configure the other components to use your existing LINSTOR Controller, use the following options when running helm install:

  • operator.controller.enabled=false This disables creation of the LinstorController resource

  • operator.etcd.enabled=false Since no LINSTOR Controller will run on Kubernetes, no database is required.

  • controllerEndpoint=<url-of-linstor-controller> The HTTP endpoint of the existing LINSTOR Controller. For example: http://linstor.storage.cluster:3370/
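
Put together, such an install might look like the following sketch (the controller URL is a placeholder; add the etcd option from the list above and any other values you need):

helm install linstor-op linstor/linstor \
  --set operator.controller.enabled=false \
  --set controllerEndpoint=http://linstor.storage.cluster:3370/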

After all pods are ready, you should see the Kubernetes cluster nodes as satellites in your LINSTOR setup.

Your kubernetes nodes must be reachable using their IP by the controller and storage nodes.

Create a storage class referencing an existing storage pool on your storage nodes.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-on-k8s
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "3"
  storagePool: existing-storage-pool
  resourceGroup: linstor-on-k8s

You can provision new volumes by creating PVCs using your storage class. The volumes will first be placed only on nodes with the given storage pool, i.e. your storage infrastructure. Once you want to use the volume in a pod, LINSTOR CSI will create a diskless resource on the Kubernetes node and attach over the network to the diskful resource.

3.2.6. Deploying with the Piraeus Operator

The community supported edition of LINSTOR deployment in Kubernetes is called Piraeus. The Piraeus project provides an operator for deployment.

3.3. Interacting with LINSTOR in Kubernetes

The Controller pod includes a LINSTOR client, which makes it easy to interact with LINSTOR directly. For example:

kubectl exec deployment/linstor-op-cs-controller -- linstor storage-pool list

For a convenient shortcut to the above command, download kubectl-linstor and install it alongside kubectl. Then you can use kubectl linstor to get access to the complete Linstor CLI

$ kubectl linstor node list
╭───────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node                                      ┊ NodeType   ┊ Addresses                   ┊ State  ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════╡
┊ kube-node-01.test                         ┊ SATELLITE  ┊ 10.43.224.26:3366 (PLAIN)   ┊ Online ┊
┊ kube-node-02.test                         ┊ SATELLITE  ┊ 10.43.224.27:3366 (PLAIN)   ┊ Online ┊
┊ kube-node-03.test                         ┊ SATELLITE  ┊ 10.43.224.28:3366 (PLAIN)   ┊ Online ┊
┊ linstor-op-cs-controller-85b4f757f5-kxdvn ┊ CONTROLLER ┊ 172.24.116.114:3366 (PLAIN) ┊ Online ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────╯

It also expands references to PVCs to the matching Linstor resource

$ kubectl linstor resource list -r pvc:my-namespace/demo-pvc-1 --all
pvc:my-namespace/demo-pvc-1 -> pvc-2f982fb4-bc05-4ee5-b15b-688b696c8526
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName                             ┊ Node              ┊ Port ┊ Usage  ┊ Conns ┊    State   ┊ CreatedOn           ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ pvc-2f982fb4-bc05-4ee5-b15b-688b696c8526 ┊ kube-node-01.test ┊ 7000 ┊ Unused ┊ Ok    ┊   UpToDate ┊ 2021-02-05 09:16:09 ┊
┊ pvc-2f982fb4-bc05-4ee5-b15b-688b696c8526 ┊ kube-node-02.test ┊ 7000 ┊ Unused ┊ Ok    ┊ TieBreaker ┊ 2021-02-05 09:16:08 ┊
┊ pvc-2f982fb4-bc05-4ee5-b15b-688b696c8526 ┊ kube-node-03.test ┊ 7000 ┊ InUse  ┊ Ok    ┊   UpToDate ┊ 2021-02-05 09:16:09 ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

It also expands references of the form pod:[<namespace>/]<podname> into a list of resources in use by the pod.

This should only be necessary for investigating problems and accessing advanced functionality. Regular operation, such as creating volumes, should be achieved via the Kubernetes integration.

3.4. Basic Configuration and Deployment

Once all linstor-csi Pods are up and running, we can provision volumes using the usual Kubernetes workflows.

Configuring the behavior and properties of LINSTOR volumes deployed via Kubernetes is accomplished via the use of StorageClasses.

The “resourceGroup” parameter is mandatory. Usually you want it to be unique and the same as the storage class name.

Here below is the simplest practical StorageClass that can be used to deploy volumes:

Listing 1. linstor-basic-sc.yaml
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  # The name used to identify this StorageClass.
  name: linstor-basic-storage-class
  # The name used to match this StorageClass with a provisioner.
  # linstor.csi.linbit.com is the name that the LINSTOR CSI plugin uses to identify itself
provisioner: linstor.csi.linbit.com
parameters:
  # LINSTOR will provision volumes from the drbdpool storage pool configured
  # On the satellite nodes in the LINSTOR cluster specified in the plugin's deployment
  storagePool: "drbdpool"
  resourceGroup: "linstor-basic-storage-class"
  # Setting a fstype is required for "fsGroup" permissions to work correctly.
  # Currently supported: xfs/ext4
  csi.storage.k8s.io/fstype: xfs

DRBD options can be set as well in the parameters section. Valid keys are defined in the LINSTOR REST-API (e.g., DrbdOptions/Net/allow-two-primaries: "yes").

We can create the StorageClass with the following command:

kubectl create -f linstor-basic-sc.yaml

Now that our StorageClass is created, we can create a PersistentVolumeClaim which can be used to provision volumes known both to Kubernetes and LINSTOR:

Listing 2. my-first-linstor-volume-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-first-linstor-volume
spec:
  storageClassName: linstor-basic-storage-class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi

We can create the PersistentVolumeClaim with the following command:

kubectl create -f my-first-linstor-volume-pvc.yaml

This will create a PersistentVolumeClaim known to Kubernetes, which will have a PersistentVolume bound to it. Additionally, LINSTOR will now create this volume according to the configuration defined in the linstor-basic-storage-class StorageClass. The LINSTOR volume's name will be a UUID prefixed with csi-. This volume can be observed with the usual linstor resource list. Once that volume is created, we can attach it to a pod. The following Pod spec will spawn a Fedora container with our volume attached that busy-waits, so it is not unscheduled before we can interact with it:

Listing 3. my-first-linstor-volume-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: fedora
  namespace: default
spec:
  containers:
  - name: fedora
    image: fedora
    command: [/bin/bash]
    args: ["-c", "while true; do sleep 10; done"]
    volumeMounts:
    - name: my-first-linstor-volume
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: my-first-linstor-volume
    persistentVolumeClaim:
      claimName: "my-first-linstor-volume"

We can create the Pod with the following command:

kubectl create -f my-first-linstor-volume-pod.yaml

Running kubectl describe pod fedora can be used to confirm that Pod scheduling and volume attachment succeeded.

To remove a volume, please ensure that no pod is using it and then delete the PersistentVolumeClaim via kubectl. For example, to remove the volume we just made, run the following commands, noting that the Pod must be unscheduled before the PersistentVolumeClaim will be removed:

kubectl delete pod fedora # unschedule the pod.

kubectl get pod -w # wait for pod to be unscheduled

kubectl delete pvc my-first-linstor-volume # remove the PersistentVolumeClaim, the PersistentVolume, and the LINSTOR Volume.

3.4.1. Available parameters in a StorageClass

The following storage class contains all currently available parameters to configure the provisioned storage

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: full-example
provisioner: linstor.csi.linbit.com
parameters:
  # CSI related parameters
  csi.storage.k8s.io/fstype: xfs
  # LINSTOR parameters
  autoPlace: "2"
  placementCount: "2"
  resourceGroup: "full-example"
  storagePool: "my-storage-pool"
  disklessStoragePool: "DfltDisklessStorPool"
  layerList: "drbd,storage"
  placementPolicy: "AutoPlace"
  allowRemoteVolumeAccess: "true"
  encryption: "true"
  nodeList: "diskful-a,diskful-b"
  clientList: "diskless-a,diskless-b"
  replicasOnSame: "zone=a"
  replicasOnDifferent: "rack"
  disklessOnRemaining: "false"
  doNotPlaceWithRegex: "tainted.*"
  fsOpts: "nodiscard"
  mountOpts: "noatime"
  postMountXfsOpts: "extsize 2m"
  # DRBD parameters
  DrbdOptions/*: <x>

3.4.2. csi.storage.k8s.io/fstype

Sets the file system type to create for volumeMode: FileSystem PVCs. Currently supported are:

  • ext4 (default)

  • xfs

3.4.3. autoPlace

autoPlace is an integer that determines the number of replicas a volume of this StorageClass will have. For instance, autoPlace: "3" will produce volumes with three-way replication. If neither autoPlace nor nodeList is set, volumes will be automatically placed on one node.

You can't use this option together with nodeList.
You have to use quotes, otherwise Kubernetes will complain about a malformed StorageClass.
This option (and all options affecting autoplacement behavior) modifies the number of LINSTOR nodes on which the underlying storage for volumes will be provisioned, and is orthogonal to which kubelets those volumes will be accessible from.

3.4.4. placementCount

placementCount is an alias for autoPlace

3.4.5. resourceGroup

The LINSTOR Resource Group (RG) to associate with this StorageClass. If not set, a new RG will be created for each new PVC.

3.4.6. storagePool

storagePool is the name of the LINSTOR storage pool that will be used to provide storage to the newly created volumes.

Only nodes configured with this same storage pool are considered for autoplacement. Likewise, for StorageClasses using nodeList, all nodes specified in that list must have this storage pool configured on them.

3.4.7. disklessStoragePool

disklessStoragePool is an optional parameter that only affects LINSTOR volumes assigned disklessly to kubelets, i.e., as clients. If you have a custom diskless storage pool defined in LINSTOR, specify it here.

3.4.8. layerList

A comma-separated list of layers to use for the created volumes. The available layers and their order are described towards the end of this section. Defaults to drbd,storage

3.4.9. placementPolicy

Select from one of the available volume schedulers:

  • AutoPlace, the default: Use LINSTOR autoplace, influenced by replicasOnSame and replicasOnDifferent

  • FollowTopology: Use CSI Topology information to place at least one volume in each “preferred” zone. Only useable if CSI Topology is enabled.

  • Manual: Use only the nodes listed in nodeList and clientList.

  • Balanced: EXPERIMENTAL Place volumes across failure domains, using the least used storage pool on each selected node.

3.4.10. allowRemoteVolumeAccess

Disable remote access to volumes. This implies that volumes can only be accessed from the initial set of nodes selected on creation. CSI Topology processing is required to place pods on the correct nodes.

3.4.11. encryption

encryption is an optional parameter that determines whether to encrypt volumes. LINSTOR must be configured for encryption for this to work properly.

3.4.12. nodeList

nodeList is a list of nodes for volumes to be assigned to. This will assign the volume to each node and it will be replicated among all of them. This can also be used to select a single node by hostname, but it's more flexible to use replicasOnSame to select a single node.

You can't use this option together with autoPlace.
This option determines on which LINSTOR nodes the underlying storage for volumes will be provisioned, and is orthogonal to which kubelets these volumes will be accessible from.

3.4.13. clientList

clientList is a list of nodes for diskless volumes to be assigned to. Use in conjunction with nodeList.

3.4.14. replicasOnSame

replicasOnSame is a list of key or key=value items used as autoplacement selection labels when autoplace is used to determine where to provision storage. These labels correspond to LINSTOR node properties.

LINSTOR node properties are different from kubernetes node labels. You can see the properties of a node by running linstor node list-properties <nodename>. You can also set additional properties (“auxiliary properties”): linstor node set-property <nodename> --aux <key> <value>.

Let’s explore this behavior with examples, assuming a LINSTOR cluster where node-a is configured with the auxiliary properties zone=z1 and role=backups, while node-b is configured with only zone=z1.

If we configure a StorageClass with autoPlace: "1" and replicasOnSame: "zone=z1 role=backups", then all volumes created from that StorageClass will be provisioned on node-a, since that is the only node in the LINSTOR cluster that has all of the correct key=value pairs. This is the most flexible way to select a single node for provisioning.

This guide assumes LINSTOR CSI version 0.10.0 or newer. All properties referenced in replicasOnSame and replicasOnDifferent are interpreted as auxiliary properties. If you are using an older version of LINSTOR CSI, you need to add the Aux/ prefix to all property names. So replicasOnSame: "zone=z1" would be replicasOnSame: "Aux/zone=z1" Using Aux/ manually will continue to work on newer LINSTOR CSI versions.

If we configure a StorageClass with autoPlace: "1" and replicasOnSame: "zone=z1", then volumes will be provisioned on either node-a or node-b, as they both have the zone=z1 auxiliary property.

If we configure a StorageClass with autoPlace: "2" and replicasOnSame: "zone=z1 role=backups", then provisioning will fail, as there are not two or more nodes that have the appropriate auxiliary properties.

If we configure a StorageClass with autoPlace: "2" and replicasOnSame: "zone=z1", then volumes will be provisioned on both node-a and node-b, as they both have the zone=z1 auxiliary property.

You can also use a property key without providing a value to ensure all replicas are placed on nodes with the same property value, without caring about the particular value. Assuming there are 4 nodes, node-a1 and node-a2 are configured with zone=a. node-b1 and node-b2 are configured with zone=b. Using autoPlace: "2" and replicasOnSame: "zone" will place the volumes either on node-a1 and node-a2 OR on node-b1 and node-b2.

3.4.15. replicasOnDifferent

replicasOnDifferent takes a list of properties to consider, same as replicasOnSame. There are two modes of using replicasOnDifferent:

  • Preventing volume placement on specific nodes:

    If a value is given for the property, the nodes which have that property-value pair assigned will be considered last.

    Example: replicasOnDifferent: "no-csi-volumes=true" will place no volume on any node with property no-csi-volumes=true unless there are not enough other nodes to fulfill the autoPlace setting.

  • Distribute volumes across nodes with different values for the same key:

    If no property value is given, LINSTOR will place the volumes across nodes with different values for that property if possible.

    Example: Assuming there are 4 nodes, node-a1 and node-a2 are configured with zone=a. node-b1 and node-b2 are configured with zone=b. Using a StorageClass with autoPlace: "2" and replicasOnDifferent: "zone", LINSTOR will create one replica on either node-a1 or node-a2 and one replica on either node-b1 or node-b2.

3.4.16. disklessOnRemaining

Create a diskless resource on all nodes that were not assigned a diskful resource.

3.4.17. doNotPlaceWithRegex

Do not place the resource on a node which has a resource with a name matching the regex.

3.4.18. fsOpts

fsOpts is an optional parameter that passes options to the volume's filesystem at creation time.

Please note these values are specific to your chosen filesystem.

3.4.19. mountOpts

mountOpts is an optional parameter that passes options to the volume's filesystem at mount time.

3.4.20. postMountXfsOpts

Extra arguments to pass to xfs_io, which gets called right before the first use of the volume.

3.4.21. DrbdOptions/*: <x>

Advanced DRBD options to pass to LINSTOR. For example, to change the replication protocol, use DrbdOptions/Net/protocol: "A".

The full list of options is available here

3.5. Snapshots

Creating snapshots and creating new volumes from snapshots is done via the use of VolumeSnapshots, VolumeSnapshotClasses, and PVCs.

3.5.1. Adding snapshot support

LINSTOR supports the volume snapshot feature, which is currently in beta. To use it, you need to install a cluster wide snapshot controller. This is done either by the cluster provider, or you can use the LINSTOR chart.

By default, the LINSTOR chart will install its own snapshot controller. This can lead to conflict in some cases:

  • the cluster already has a snapshot controller

  • the cluster does not meet the minimal version requirements (>= 1.17)

In such a case, installation of the snapshot controller can be disabled:

--set csi-snapshotter.enabled=false

3.5.2. Using volume snapshots

Then we can create our VolumeSnapshotClass:

Listing 4. my-first-linstor-snapshot-class.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: my-first-linstor-snapshot-class
driver: linstor.csi.linbit.com
deletionPolicy: Delete

Create the VolumeSnapshotClass with kubectl:

kubectl create -f my-first-linstor-snapshot-class.yaml

Now we will create a volume snapshot for the volume that we created above. This is done with a VolumeSnapshot:

Listing 5. my-first-linstor-snapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: my-first-linstor-snapshot
spec:
  volumeSnapshotClassName: my-first-linstor-snapshot-class
  source:
    persistentVolumeClaimName: my-first-linstor-volume

Create the VolumeSnapshot with kubectl:

kubectl create -f my-first-linstor-snapshot.yaml

You can check that the snapshot creation was successful

kubectl describe volumesnapshots.snapshot.storage.k8s.io my-first-linstor-snapshot
...
Spec:
  Source:
    Persistent Volume Claim Name:  my-first-linstor-snapshot
  Volume Snapshot Class Name:      my-first-linstor-snapshot-class
Status:
  Bound Volume Snapshot Content Name:  snapcontent-b6072ab7-6ddf-482b-a4e3-693088136d2c
  Creation Time:                       2020-06-04T13:02:28Z
  Ready To Use:                        true
  Restore Size:                        500Mi

Finally, we'll create a new volume from the snapshot with a PVC.

Listing 6. my-first-linstor-volume-from-snapshot.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-first-linstor-volume-from-snapshot
spec:
  storageClassName: linstor-basic-storage-class
  dataSource:
    name: my-first-linstor-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi

kubectl 创建 PVC

kubectl create -f my-first-linstor-volume-from-snapshot.yaml

3.6. Volume Accessibility

LINSTOR volumes are typically accessible both locally and over the network.

By default, the CSI plugin will attach volumes directly if the Pod happens to be scheduled on a kubelet where its underlying storage is present. However, Pod scheduling does not currently take volume locality into account. The replicasOnSame parameter can be used to restrict the provisioning of underlying storage if locally attached volumes are desired.

See placementPolicy to see how this default behavior can be modified.

3.7. Volume Locality Optimization using Stork

Stork is a scheduler extender plugin for Kubernetes which allows a storage driver to give the Kubernetes scheduler hints about where to place a new pod so that it is optimally located for storage performance. You can learn more about the project on its GitHub page.

The next Stork release will include the LINSTOR driver by default. In the meantime, you can use a custom-built Stork container by LINBIT which includes a LINSTOR driver, available on Docker Hub

3.7.1. Using Stork

By default, the operator will install the components required for Stork, and register a new scheduler called stork with Kubernetes. This new scheduler can be used to place pods near to their volumes.

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  schedulerName: stork (1)
  containers:
  - name: busybox
    image: busybox
    command: ["tail", "-f", "/dev/null"]
    volumeMounts:
    - name: my-first-linstor-volume
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: my-first-linstor-volume
    persistentVolumeClaim:
      claimName: "test-volume"
1 Add the name of the scheduler to your pod.

Deployment of the scheduler can be disabled using

--set stork.enabled=false

3.8. Fast workload fail over using the High Availability Controller

The LINSTOR High Availability Controller (HA Controller) will speed up the fail over process for stateful workloads using LINSTOR for storage. It is deployed by default, and can be scaled to multiple replicas:

$ kubectl get pods -l app.kubernetes.io/name=linstor-op-ha-controller
NAME                                       READY   STATUS    RESTARTS   AGE
linstor-op-ha-controller-f496c5f77-fr76m   1/1     Running   0          89s
linstor-op-ha-controller-f496c5f77-jnqtc   1/1     Running   0          89s
linstor-op-ha-controller-f496c5f77-zcrqg   1/1     Running   0          89s

In the event of node failures, Kubernetes is very conservative in rescheduling stateful workloads. This means it can take more than 15 minutes for Pods to be moved from unreachable nodes. With the information available to DRBD and LINSTOR, this process can be sped up significantly.

The HA Controller enables fast fail over for:

  • Pods using DRBD-backed PersistentVolumes. The DRBD resources must make use of the quorum functionality. LINSTOR will configure this automatically for volumes with 2 or more replicas in clusters with at least 3 nodes.

  • The workload does not use any external resources in a way that could lead to a conflicting state if two instances try to use the external resource at the same time. While DRBD can ensure that only one instance can have write access to the storage, it cannot provide the same guarantee for external resources.

  • The Pod is marked with the linstor.csi.linbit.com/on-storage-lost: remove label.

3.8.1. Example

The following StatefulSet uses the HA Controller to manage fail over of a pod.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app
spec:
  serviceName: my-stateful-app
  selector:
    matchLabels:
      app.kubernetes.io/name: my-stateful-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: my-stateful-app
        linstor.csi.linbit.com/on-storage-lost: remove (1)
    ...
1 The label is applied to Pod template, not the StatefulSet. The label was applied correctly, if your Pod appears in the output of kubectl get pods -l linstor.csi.linbit.com/on-storage-lost=remove.

Deploy the set and wait for the pod to start

$ kubectl get pod -o wide
NAME                                        READY   STATUS              RESTARTS   AGE     IP                NODE                    NOMINATED NODE   READINESS GATES
my-stateful-app-0                           1/1     Running             0          5m      172.31.0.1        node01.ha.cluster       <none>           <none>

Then one of the nodes becomes unreachable. Shortly after, Kubernetes will mark the node as NotReady

$ kubectl get nodes
NAME                    STATUS     ROLES     AGE    VERSION
master01.ha.cluster     Ready      master    12d    v1.19.4
master02.ha.cluster     Ready      master    12d    v1.19.4
master03.ha.cluster     Ready      master    12d    v1.19.4
node01.ha.cluster       NotReady   compute   12d    v1.19.4
node02.ha.cluster       Ready      compute   12d    v1.19.4
node03.ha.cluster       Ready      compute   12d    v1.19.4

After about 45 seconds, the Pod will be removed by the HA Controller and re-created by the StatefulSet

$ kubectl get pod -o wide
NAME                                        READY   STATUS              RESTARTS   AGE     IP                NODE                    NOMINATED NODE   READINESS GATES
my-stateful-app-0                           0/1     ContainerCreating   0          3s      172.31.0.1        node02.ha.cluster       <none>           <none>
$ kubectl get events --sort-by=.metadata.creationTimestamp -w
...
0s          Warning   ForceDeleted              pod/my-stateful-app-0                                                                   pod deleted because a used volume is marked as failing
0s          Warning   ForceDetached             volumeattachment/csi-d2b994ff19d526ace7059a2d8dea45146552ed078d00ed843ac8a8433c1b5f6f   volume detached because it is marked as failing
...

3.9. Upgrading a LINSTOR Deployment on Kubernetes

A LINSTOR Deployment on Kubernetes can be upgraded to a new release using Helm.

Before upgrading to a new release, you should ensure you have an up-to-date backup of the LINSTOR database. If you are using the Etcd database packaged in the LINSTOR Chart, see here

Upgrades using the LINSTOR Etcd deployment require etcd to use persistent storage. Only follow these steps if Etcd was deployed using etcd.persistentVolume.enabled=true

Upgrades will update to new versions of the following components:

  • LINSTOR operator deployment

  • LINSTOR Controller

  • LINSTOR Satellite

  • LINSTOR CSI Driver

  • Etcd

  • Stork

Some versions require special steps, please take a look here. The main command to upgrade to a new LINSTOR operator version is:

helm repo update
helm upgrade linstor-op linstor/linstor

If you used any customizations on the initial install, pass the same options to helm upgrade. The options currently in use can be retrieved from Helm.

# Retrieve the currently set options
$ helm get values linstor-op
USER-SUPPLIED VALUES:
USER-SUPPLIED VALUES: null
drbdRepoCred: drbdiocred
operator:
  satelliteSet:
    kernelModuleInjectionImage: drbd.io/drbd9-rhel8:v9.0.28
    storagePools:
      lvmThinPools:
      - devicePaths:
        - /dev/vdb
        name: thinpool
        thinVolume: thinpool
        volumeGroup: ""
# Save current options
$ helm get values linstor-op > orig.yaml
# modify values here as needed. for example selecting a newer DRBD image
$ vim orig.yaml
# Start the upgrade
$ helm upgrade linstor-op linstor/linstor -f orig.yaml

This triggers the rollout of new pods. After a short wait, all pods should be running and ready. Check that no errors are listed in the status section of LinstorControllers, LinstorSatelliteSets and LinstorCSIDrivers.

During the upgrade process, provisioning of volumes and attach/detach operations might not work. Existing volumes and volumes already in use by a pod will continue to work without interruption.

3.9.1. Upgrade instructions for specific versions

Some versions require special steps, see below.

Upgrade to v1.4

This version introduces a new default version for the Etcd image, so take extra care that Etcd is using persistent storage. Upgrading the Etcd image without persistent storage will corrupt the cluster.

If you are upgrading an existing cluster without making use of new Helm options, no additional steps are necessary.

If you plan to use the newly introduced additionalProperties and additionalEnv settings, you have to replace the installed CustomResourceDefinitions with newer versions. Helm does not upgrade the CRDs on a chart upgrade

$ helm pull linstor/linstor --untar
$ kubectl replace -f linstor/crds/
customresourcedefinition.apiextensions.k8s.io/linstorcontrollers.linstor.linbit.com replaced
customresourcedefinition.apiextensions.k8s.io/linstorcsidrivers.linstor.linbit.com replaced
customresourcedefinition.apiextensions.k8s.io/linstorsatellitesets.linstor.linbit.com replaced
Upgrade to v1.3

No additional steps necessary.

Upgrade to v1.2

LINSTOR operator v1.2 is supported on Kubernetes 1.17+. If you are using an older Kubernetes distribution, you may need to change the default settings, for example [the CSI provisioner](https://kubernetes-csi.github.io/docs/external-provisioner.html).

There is a known issue when updating the CSI components: the pods will not be updated to the newest image and the errors section of the LinstorCSIDrivers resource shows an error updating the DaemonSet. In this case, manually delete deployment/linstor-op-csi-controller and daemonset/linstor-op-csi-node. They will be re-created by the operator.

3.9.2. Creating Etcd Backups

To create a backup of the Etcd database and store it on your control host, run:

kubectl exec linstor-op-etcd-0 -- etcdctl snapshot save /tmp/save.db
kubectl cp linstor-op-etcd-0:/tmp/save.db save.db

These commands will create a file save.db on the machine you are running kubectl from.

4. LINSTOR Volumes in Openshift

This chapter describes the usage of LINSTOR in Openshift as managed by the operator and with volumes provisioned using the LINSTOR CSI plugin.

4.1. Openshift Overview

OpenShift is the official Red Hat developed and supported distribution of Kubernetes. As such, you can easily deploy Piraeus or the LINSTOR operator using Helm or via example yamls as mentioned in the previous chapter, LINSTOR Volumes in Kubernetes.

Some of the value of Red Hat’s Openshift is that it includes its own registry of supported and certified images and operators, in addition to a default and standard web console. This chapter describes how to install the Certified LINSTOR operator via these tools.

4.2. Deploying LINSTOR on Openshift

4.2.1. Before you Begin

LINBIT provides a certified LINSTOR operator via the RedHat marketplace. The operator eases deployment of LINSTOR on Kubernetes by installing DRBD, managing Satellite and Controller pods, and other related functions.

The operator itself is available from the Red Hat Marketplace.

Unlike deployment via the helm chart, the certified Openshift operator does not deploy the needed etcd cluster. You must deploy this yourself ahead of time. We do this via the etcd operator available on operatorhub.io.

It is advised that the etcd deployment uses persistent storage of some type. Either use an existing storage provisioner with a default StorageClass or simply use hostPath volumes.

Read the storage guide and configure a basic storage setup for LINSTOR.

Read the section on securing the deployment and configure as needed.

4.2.2. Deploying the operator pod

Once etcd and storage have been configured, we are ready to install the LINSTOR operator. You can find the LINSTOR operator via the left-hand control pane of the Openshift Web Console. Expand the “Operators” section and select “OperatorHub”. From here you need to find the LINSTOR operator. Either search for the term “LINSTOR” or filter only by “Marketplace” operators.

The LINSTOR operator can only watch for events and manage custom resources that are within the same namespace it is deployed within (OwnNamespace). This means the LINSTOR Controller, LINSTOR Satellites, and LINSTOR CSI Driver pods all need to be deployed in the same namespace as the LINSTOR Operator pod.

Once you have located the LINSTOR operator in the Marketplace, click the “Install” button and install it as you would any other operator.

At this point you should have just one pod, the operator pod, running.

Next we need to configure the remaining provided APIs.

4.2.3. Deploying the LINSTOR Controller

Again, navigate to the left-hand control pane of the Openshift Web Console. Expand the “Operators” section, but this time select “Installed Operators”. Find the entry for the “Linstor Operator”, then select the “LinstorController” from the “Provided APIs” column on the right.

From here you should see a page that says “No Operands Found” and will feature a large button on the right which says “Create LinstorController”. Click the “Create LinstorController” button.

Here you will be presented with options to configure the LINSTOR Controller. Either via the web-form view or the YAML View. Regardless of which view you select, make sure that the dbConnectionURL matches the endpoint provided from your etcd deployment. Otherwise, the defaults are usually fine for most purposes.
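
A minimal manifest might look like the following sketch; the dbConnectionURL value is an assumption and must be replaced with your own etcd endpoint:

apiVersion: linstor.linbit.com/v1
kind: LinstorController
metadata:
  name: linstor
  namespace: default
spec:
  dbConnectionURL: 'etcd://etcd-cluster.default.svc:2379'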

Lastly hit “Create”, you should now see a linstor-controller pod running.

4.2.4. Deploying the LINSTOR Satellites

Next we need to deploy the Satellites Set. Just as before navigate to the left-hand control pane of the Openshift Web Console. Expand the “Operators” section, but this time select “Installed Operators”. Find the entry for the “Linstor Operator”, then select the “LinstorSatelliteSet” from the “Provided APIs” column on the right.

From here you should see a page that says “No Operands Found” and will feature a large button on the right which says “Create LinstorSatelliteSet”. Click the “Create LinstorSatelliteSet” button.

Here you will be presented with the options to configure the LINSTOR Satellites. Either via the web-form view or the YAML View. One of the first options you’ll notice is the automaticStorageType. If set to “NONE” then you’ll need to remember to configure the storage pools yourself at a later step.

Another option you’ll notice is kernelModuleInjectionMode. I usually select “Compile” for portability’s sake, but selecting “ShippedModules” will be faster as it will install pre-compiled kernel modules on all the worker nodes.

Make sure the controllerEndpoint matches what is available in the kubernetes endpoints. The default is usually correct here.

Below is an example manifest:

apiVersion: linstor.linbit.com/v1
kind: LinstorSatelliteSet
metadata:
  name: linstor
  namespace: default
spec:
  satelliteImage: ''
  automaticStorageType: LVMTHIN
  drbdRepoCred: ''
  kernelModuleInjectionMode: Compile
  controllerEndpoint: 'http://linstor:3370'
  priorityClassName: ''
status:
  errors: []

Lastly hit “Create”, you should now see a linstor-node pod running on every worker node.

4.2.5. Deploying the LINSTOR CSI driver

The last bit left is the CSI pods, which bridge the layer between CSI and LINSTOR. Just as before navigate to the left-hand control pane of the OpenShift Web Console. Expand the “Operators” section, but this time select “Installed Operators”. Find the entry for the “Linstor Operator”, then select the “LinstorCSIDriver” from the “Provided APIs” column on the right.

From here you should see a page that says “No Operands Found” and will feature a large button on the right which says “Create LinstorCSIDriver”. Click the “Create LinstorCSIDriver” button.

Again, you will be presented with the options. Make sure that the controllerEndpoint is correct. Otherwise the defaults are fine for most use cases.

Lastly hit “Create”. You will now see a single “linstor-csi-controller” pod, as well as a “linstor-csi-node” pod on all worker nodes.

4.3. Interacting with LINSTOR in OpenShift

The Controller pod includes a LINSTOR Client, making it easy to interact directly with LINSTOR. For instance:

oc exec deployment/linstor-controller -- linstor storage-pool list

This should only be necessary for investigating problems and accessing advanced functionality. Regular operation such as creating volumes should be achieved via the Kubernetes integration.

4.4. Configuration and deployment

Once the operator and all the needed pods are deployed, provisioning volumes simply follows the usual Kubernetes workflows.

As such, please see the previous chapter’s section on Basic Configuration and Deployment.

4.5. Deploying additional components

Some additional components are not included in the OperatorHub version of the LINSTOR Operator when compared to the Helm deployment. Most notably, this includes setting up Etcd and deploying the STORK integration.

Etcd can be deployed by using the Etcd Operator available in the OperatorHub.

4.5.1. Stork

To deploy STORK, you can use the single YAML deployment available at https://charts.linstor.io/deploy/stork.yaml. Download the YAML and replace every instance of MY-STORK-NAMESPACE with your desired namespace for STORK. You also need to replace MY-LINSTOR-URL with the URL of your controller. This value depends on the name you chose when creating the LinstorController resource. By default this would be http://linstor.<operator-namespace>.svc:3370
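
For example, the placeholders could be filled in with sed before applying; the namespace and controller URL below are only assumptions for illustration:

curl -fLO https://charts.linstor.io/deploy/stork.yaml
sed -i 's/MY-STORK-NAMESPACE/stork/g' stork.yaml
sed -i 's|MY-LINSTOR-URL|http://linstor.default.svc:3370|g' stork.yaml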

To apply the YAML to Openshift, either use oc apply -f <filename> from the command line or find the “Import YAML” option in the top right of the Openshift Web Console.

4.5.2. High Availability Controller

To deploy our High Availability Controller, you can use the single YAML deployment available at: https://charts.linstor.io/deploy/ha-controller.yaml

Download the YAML and replace the placeholder values to match your deployment.

To apply the YAML to Openshift, either use oc apply -f <filename> from the command line or find the “Import YAML” option in the top right of the Openshift Web Console.

4.5.3. Deploying via Helm on Openshift

Alternatively, you can deploy the LINSTOR Operator using Helm instead. Take a look at the Kubernetes guide. Openshift requires changing some of the default values in our Helm chart.

If you chose to use Etcd with hostpath volumes for persistence (see here), you need to enable selinux relabelling. To do this pass --set selinux=true to the pv-hostpath install command.
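
Assuming the pv-hostpath chart is installed from the same chart repository as in the Kubernetes guide, this could look like the following sketch (release and chart names are assumptions):

helm install linstor-etcd-pv linstor/pv-hostpath --set selinux=true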

For the LINSTOR Operator chart itself, you should change the following values:

global:
  setSecurityContext: false (1)
csi-snapshotter:
  enabled: false (2)
stork:
  schedulerTag: v1.18.6 (3)
etcd:
  podsecuritycontext:
    supplementalGroups: [1000] (4)
operator:
  satelliteSet:
    kernelModuleInjectionImage: drbd.io/drbd9-rhel8:v9.0.25 (5)
1 Openshift uses SCCs to manage security contexts.
2 The cluster wide CSI Snapshot Controller is already installed by Openshift.
3 Automatic detection of the Kubernetes Scheduler version fails in Openshift; you need to set it manually. Note: the tag does not have to match Openshift’s Kubernetes release.
4 If you choose to use Etcd deployed via Helm and use the pv-hostpath chart, Etcd needs to run as member of group 1000 to access the persistent volume.
5 The RHEL8 kernel injector also supports RHCOS.

Other overrides, such as storage pool configuration, HA deployments and more, are available and documented in the Kubernetes guide.

5. LINSTOR Volumes in Proxmox VE

This chapter describes DRBD in Proxmox VE via the LINSTOR Proxmox Plugin.

5.1. Proxmox VE Overview

Proxmox VE is an easy to use, complete server virtualization environment with KVM, Linux containers and HA.

linstor-proxmox is a Perl plugin for Proxmox that, in combination with LINSTOR, allows replicating VM disks on several Proxmox VE nodes. This allows live-migrating active VMs within a few seconds, without needing a central SAN, as the data is already replicated to multiple nodes.

5.2. Upgrades

If this is a fresh installation, skip this section and continue with Proxmox Plugin Installation.

5.2.1. From 4.x to 5.x

Version 5 of the plugin drops compatibility with the legacy configuration options “storagepool” and “redundancy”. Version 5 requires a “resourcegroup” option, and obviously a LINSTOR resource group. The old options should be removed from the config.

Configuring LINSTOR is described in the section LINSTOR Configuration; a typical example follows. Let’s assume the pool was set to “mypool”, and the redundancy to 3.

# linstor resource-group create --storage-pool=mypool --place-count=3 drbdMypoolThree
# linstor volume-group create drbdMypoolThree
# vi /etc/pve/storage.cfg
drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13
   resourcegroup drbdMypoolThree

5.3. Proxmox Plugin Installation

LINBIT provides a dedicated public repository for Proxmox VE users. This repository not only contains the Proxmox plugin, but the whole DRBD SDS stack including the DRBD SDS kernel module and user space utilities.

The DRBD9 kernel module is installed as a dkms package (i.e., drbd-dkms), therefore you have to install pve-headers before you set up/install packages from LINBIT’s repository. Following that order ensures that the kernel module builds properly for your kernel. If you do not plan to install the latest Proxmox kernel, you have to install kernel headers matching your currently running kernel (e.g., pve-headers-$(uname -r)). If you missed this step, you can still rebuild the dkms package against the currently running kernel (the kernel headers have to be installed in advance) by issuing apt install --reinstall drbd-dkms.
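
For example, to install headers matching the currently running kernel, and, if needed, to rebuild the DRBD module afterwards (a sketch):

# apt install pve-headers-$(uname -r)
# apt install --reinstall drbd-dkms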

LINBIT’s repository can be enabled as follows, where “$PVERS” should be set to your Proxmox VE major version (e.g., “6”, not “6.1”):

# wget -O- https://packages.linbit.com/package-signing-pubkey.asc | apt-key add -
# PVERS=6 && echo "deb http://packages.linbit.com/proxmox/ proxmox-$PVERS drbd-9.0" > \
	/etc/apt/sources.list.d/linbit.list
# apt update && apt install linstor-proxmox

5.4. LINSTOR Configuration

For the rest of this guide we assume that you have a LINSTOR cluster configured as described in Initializing your cluster. Also make sure to set up each node as a “Combined” node: start the “linstor-controller” on one node, and the “linstor-satellite” on all nodes. The preferred way to use the plugin, starting from version 4.1.0, is via LINSTOR resource groups and a single volume group within each resource group. LINSTOR resource groups are described in Resource groups. All the required LINSTOR configuration (such as the redundancy count) has to be set on the resource group.

5.5. Proxmox Plugin Configuration

The final step is to provide a configuration for Proxmox itself. This can be done by adding an entry to the /etc/pve/storage.cfg file, with content similar to the following.

drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13
   resourcegroup defaultpool

The “drbd” entry is fixed and you are not allowed to modify it, as it tells Proxmox to use DRBD as the storage backend. The “drbdstorage” entry can be modified and is used as a friendly name that will be shown in the PVE web GUI to locate the DRBD storage. The “content” entry is also fixed, so do not change it. The redundancy (specified in the resource group) specifies how many replicas of the data will be stored in the cluster. The recommendation is to set it to 2 or 3 depending on your setup. The data is accessible from all nodes, even if some of them do not have local copies of the data. For example, in a 5 node cluster, all nodes will be able to access 3 copies of the data, no matter where they are stored. The “controller” parameter must be set to the IP of the node that runs the LINSTOR controller service. Only one node can be set to run as LINSTOR controller at the same time. If that node fails, start the LINSTOR controller on another node and change that value to its IP address.

Recent versions of the plugin allow defining multiple different storage pools. Such a configuration would look like this:

drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13
   resourcegroup defaultpool

drbd: fastdrbd
   content images,rootdir
   controller 10.11.12.13
   resourcegroup ssd

drbd: slowdrbd
   content images,rootdir
   controller 10.11.12.13
   resourcegroup backup

Now you should be able to create VMs via Proxmox’s web GUI by selecting “drbdstorage”, or any other of the defined pools, as the storage location.

Starting from version 5 of the plugin one can set the option “preferlocal yes”. If it is set, the plugin tries to create a diskful assignment on the node that issued the storage create command. With this option one can make sure the VM gets local storage if possible. Without that option LINSTOR might place the storage on nodes ‘B’ and ‘C’, while the VM is initially started on node ‘A’. This would still work as node ‘A’ then would get a diskless assignment, but having local storage might be preferred.
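
A storage entry using this option might look like the following sketch, based on the earlier example:

drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13
   resourcegroup defaultpool
   preferlocal yes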

NOTE: Currently DRBD supports only the raw disk format.

At this point you can try to live-migrate a VM. Because all nodes (even diskless ones) have access to all the data, it takes only a few seconds. The overall process might take a bit longer if the VM is under load and a lot of RAM is being dirtied all the time. But in any case, the downtime should be minimal and you will not see any interruption at all.

5.6. Making the Controller Highly-Available (optional)

Making LINSTOR highly-available is a matter of making the LINSTOR controller highly-available. This step is described in Section LINSTOR high availability.

The last — but crucial — step is to configure the Proxmox plugin to be able to connect to multiple LINSTOR controllers. It will use the first one it receives an answer from. This is done by adding a comma-separated list of controllers in the controller section of the plugin like this:

drbd: drbdstorage
   content images,rootdir
   controller 10.11.12.13,10.11.12.14,10.11.12.15
   resourcegroup defaultpool

6. LINSTOR Volumes in OpenNebula

This chapter describes DRBD in OpenNebula via the usage of the LINSTOR storage driver addon.

Detailed installation and configuration instructions can be found in the README.md file of the driver’s source.

6.1. OpenNebula Overview

OpenNebula is a flexible and open source cloud management platform which allows its functionality to be extended via the use of addons.

The LINSTOR addon allows the deployment of virtual machines with highly available images backed by DRBD and attached across the network via DRBD’s own transport protocol.

6.2. OpenNebula Addon Installation

Installation of the LINSTOR storage addon for OpenNebula requires a working OpenNebula cluster as well as a working LINSTOR cluster.

With access to LINBIT’s customer repositories you can install it with

# apt install linstor-opennebula

# yum install linstor-opennebula

If you do not have access to LINBIT’s prepared packages, you need to fall back to the instructions on the GitHub page.

A DRBD cluster with LINSTOR can be installed and configured by following the instructions in this guide, see Initializing your cluster.

The OpenNebula and DRBD clusters can be somewhat independent of one another with the following exception: OpenNebula’s Front-End and Host nodes must be included in both clusters.

Host nodes do not need a local LINSTOR storage pool, as virtual machine images are attached to them across the network. [1]

6.3. Deployment Options

It is recommended to use LINSTOR resource groups to configure the deployment you prefer, see OpenNebula Resource Group. The previous auto-place and deployment-nodes modes are deprecated.

6.4. Configuration

6.4.1. Adding the driver to OpenNebula

Modify the following sections of /etc/one/oned.conf

Add linstor to the list of drivers in the TM_MAD and DATASTORE_MAD sections:

TM_MAD = [
  executable = "one_tm",
  arguments = "-t 15 -d dummy,lvm,shared,fs_lvm,qcow2,ssh,vmfs,ceph,linstor"
]
DATASTORE_MAD = [
    EXECUTABLE = "one_datastore",
    ARGUMENTS  = "-t 15 -d dummy,fs,lvm,ceph,dev,iscsi_libvirt,vcenter,linstor -s shared,ssh,ceph,fs_lvm,qcow2,linstor"

Add new TM_MAD_CONF and DS_MAD_CONF sections:

TM_MAD_CONF = [
    NAME = "linstor", LN_TARGET = "NONE", CLONE_TARGET = "SELF", SHARED = "yes", ALLOW_ORPHANS="yes",
    TM_MAD_SYSTEM = "ssh,shared", LN_TARGET_SSH = "NONE", CLONE_TARGET_SSH = "SELF", DISK_TYPE_SSH = "BLOCK",
    LN_TARGET_SHARED = "NONE", CLONE_TARGET_SHARED = "SELF", DISK_TYPE_SHARED = "BLOCK"
]
DS_MAD_CONF = [
    NAME = "linstor", REQUIRED_ATTRS = "BRIDGE_LIST", PERSISTENT_ONLY = "NO",
    MARKETPLACE_ACTIONS = "export"
]

After making these changes, restart the opennebula service.

6.4.2. Configuring the Nodes

The Front-End node issues commands to the Storage and Host nodes via LINSTOR.

Storage nodes hold the disk images of VMs locally.

Host nodes are responsible for running instantiated VMs and typically have the images they need attached across the network via LINSTOR diskless mode.

All nodes must have DRBD9 and LINSTOR installed. This process is detailed in the DRBD9 User’s Guide.

Front-End and Host nodes can serve as storage nodes in addition to their primary role, as long as they meet all the requirements for both roles.

Front-End Configuration

Please verify that the control node(s) that you wish to communicate with are reachable from the Front-End node. linstor node list for locally running LINSTOR controllers and linstor --controllers "<IP:PORT>" node list for remotely running LINSTOR controllers are handy ways to test this.

Host Configuration

Host nodes must have LINSTOR satellite processes running on them and be members of the same LINSTOR cluster as the Front-End and Storage nodes; they can optionally have storage locally. If the oneadmin user is able to passwordlessly ssh between hosts, then live migration may be used even with the ssh system datastore.

Storage Node Configuration

Only the Front-End and Host nodes require OpenNebula to be installed, but the oneadmin user must be able to passwordlessly access the storage nodes. Refer to the OpenNebula install guide for your distribution on how to manually configure the oneadmin user account.

The Storage nodes must use storage pools created with a driver that supports snapshots, such as the thin LVM plugin.

In this example preparation of thinly-provisioned storage using LVM for LINSTOR, a volume group and thinLV must be created with LVM on each storage node.

The following is an example of this process using two physical volumes (/dev/sdX and /dev/sdY) and generic names for the volume group and thinpool. Make sure to set the thinLV’s metadata volume to a reasonable size; once it becomes full it can be difficult to resize:

pvcreate /dev/sdX /dev/sdY
vgcreate drbdpool /dev/sdX /dev/sdY
lvcreate -l 95%VG --poolmetadatasize 8g -T /dev/drbdpool/drbdthinpool

Then create the storage pool(s) on LINSTOR, using this as the backing storage.
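
For example, the thin pool created above could be registered as a LINSTOR storage pool on each storage node; the node name and the storage pool name are placeholders:

linstor storage-pool create lvmthin node1 opennebula-storagepool drbdpool/drbdthinpool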

If you are using ZFS storage pools or thick-LVM, please use LINSTOR_CLONE_MODE copy otherwise you will have problems deleting linstor resources, because of ZFS parent-child snapshot relationships.

6.4.3. Permissions for Oneadmin

The oneadmin user must have passwordless sudo access to the mkfs command on the Storage nodes:

oneadmin ALL=(root) NOPASSWD: /sbin/mkfs

Be sure to consider the groups that oneadmin should be added to in order to gain access to the devices and programs needed to access storage and instantiate VMs. For this addon, the oneadmin user must belong to the disk group on all nodes in order to access the DRBD devices where images are held.

usermod -a -G disk oneadmin

6.4.4. Creating a New LINSTOR Datastore

Create a datastore configuration file named ds.conf and use the onedatastore tool to create a new datastore based on that configuration. There are two mutually exclusive deployment options: LINSTOR_AUTO_PLACE and LINSTOR_DEPLOYMENT_NODES. If both are configured, LINSTOR_AUTO_PLACE is ignored. For both options, BRIDGE_LIST must be a space separated list of all storage nodes in the LINSTOR cluster.

6.4.5. OpenNebula Resource Group

Since version 1.0.0 the addon supports LINSTOR resource groups. A resource group is a centralized point for settings that all resources linked to that resource group share.

Create a resource group and volume group for your datastore. It is mandatory to specify a storage pool within the resource group, otherwise space monitoring for OpenNebula will not work. Here we create one with 2 node redundancy and use the previously created opennebula-storagepool:

linstor resource-group create OneRscGrp --place-count 2 --storage-pool opennebula-storagepool
linstor volume-group create OneRscGrp

Now add the OpenNebula datastore using the LINSTOR plugin:

cat >ds.conf <<EOI
NAME = linstor_datastore
DS_MAD = linstor
TM_MAD = linstor
TYPE = IMAGE_DS
DISK_TYPE = BLOCK
LINSTOR_RESOURCE_GROUP = "OneRscGrp"
COMPATIBLE_SYS_DS = 0
BRIDGE_LIST = "alice bob charlie"  #node names
EOI

onedatastore create ds.conf

6.4.6. Plugin Attributes

LINSTOR_CONTROLLERS

LINSTOR_CONTROLLERS can be used to pass a comma separated list of controller IPs and ports to the LINSTOR client in the case where a LINSTOR controller process is not running locally on the Front-End, for example:

LINSTOR_CONTROLLERS = "192.168.1.10:8080,192.168.1.11:6000"

LINSTOR_CLONE_MODE

LINSTOR supports two different clone modes, set via the LINSTOR_CLONE_MODE attribute:

  • snapshot

The default mode is ‘snapshot’. It uses a LINSTOR snapshot and restores a new resource from that snapshot, which is then a clone of the image. This mode is usually faster than the copy mode, as snapshots are cheap copies.

  • copy

The second mode is copy. It creates a new resource with the same size as the original and copies the data to the new resource with dd. This mode will be slower than snapshot, but is more robust as it does not rely on any snapshot mechanism. It is also used if you clone an image into a different LINSTOR datastore.

6.4.7. Deprecated Attributes

The following attributes are deprecated and will be removed in releases after version 1.0.0.

LINSTOR_STORAGE_POOL

The LINSTOR_STORAGE_POOL attribute is used to select the LINSTOR storage pool your datastore should use. If resource groups are used, this attribute is not needed, as the storage pool can be selected via the auto-select filter options. If LINSTOR_AUTO_PLACE or LINSTOR_DEPLOYMENT_NODES is used and LINSTOR_STORAGE_POOL is not set, it will fall back to the DfltStorPool in LINSTOR.

LINSTOR_AUTO_PLACE

The LINSTOR_AUTO_PLACE option takes a level of redundancy, which is a number between one and the total number of storage nodes. Resources are assigned to storage nodes automatically based on the level of redundancy.

LINSTOR_DEPLOYMENT_NODES

Using LINSTOR_DEPLOYMENT_NODES allows you to select a group of nodes that resources will always be assigned to. Note that the bridge list still contains all of the storage nodes in the LINSTOR cluster.

6.4.8. LINSTOR as a System Datastore

The LINSTOR driver can also be used as a system datastore. The configuration is quite similar to a normal datastore, but with a few changes:

cat >system_ds.conf <<EOI
NAME = linstor_system_datastore
TM_MAD = linstor
TYPE = SYSTEM_DS
LINSTOR_RESOURCE_GROUP = "OneSysRscGrp"
BRIDGE_LIST = "alice bob charlie"  # node names
EOI

onedatastore create system_ds.conf

Also add the ID of the new system datastore to the COMPATIBLE_SYS_DS list (comma separated) of your image datastores, otherwise the scheduler will ignore them.

If you want live migration with volatile disks, you need to enable the --unsafe option for KVM, see: opennebula doc

6.5. Live Migration

Live migration is supported even with the use of the ssh system datastore, as well as the nfs shared system datastore.

6.6. Free Space Reporting

Free space is calculated differently depending on whether resources are deployed automatically or per node.

For datastores which place resources per node, free space is reported based on the most restrictive storage pool among all nodes where resources are deployed. For example, the capacity of the node with the smallest amount of total storage space determines the total size of the datastore, and the node with the least free space determines the remaining space in the datastore.

For datastores using automatic placement, size and remaining space are determined based on the aggregate storage pool used by the datastore as reported by LINSTOR.

7. LINSTOR Volumes in Openstack

This chapter describes using LINSTOR to provision persistent, replicated, and high-performance block storage for Openstack.

7.1. Openstack Overview

Openstack consists of a wide range of individual services; the service responsible for provisioning and managing block storage is called Cinder. Other Openstack services such as the compute instance service Nova can request volumes from Cinder. Cinder will then make a volume accessible to the requesting service.

LINSTOR can integrate with Cinder using a volume driver. The volume driver translates calls to the Cinder API to LINSTOR commands. For example: requesting a volume from Cinder will create new resources in LINSTOR, Cinder Volume snapshots translate to snapshots in LINSTOR and so on.

7.2. Installing LINSTOR on OpenStack

An initial installation and configuration of DRBD and LINSTOR must be completed prior to using the OpenStack driver.

At this point you should be able to list your storage cluster nodes using the LINSTOR client:

$ linstor node info
╭────────────────────────────────────────────────────────────────────────────╮
┊ Node                      ┊ NodeType  ┊ Addresses                 ┊ State  ┊
╞════════════════════════════════════════════════════════════════════════════╡
┊ cinder-01.openstack.test  ┊ COMBINED  ┊ 10.43.224.21:3366 (PLAIN) ┊ Online ┊
┊ cinder-02.openstack.test  ┊ COMBINED  ┊ 10.43.224.22:3366 (PLAIN) ┊ Online ┊
┊ storage-01.openstack.test ┊ SATELLITE ┊ 10.43.224.11:3366 (PLAIN) ┊ Online ┊
┊ storage-02.openstack.test ┊ SATELLITE ┊ 10.43.224.12:3366 (PLAIN) ┊ Online ┊
┊ storage-03.openstack.test ┊ SATELLITE ┊ 10.43.224.13:3366 (PLAIN) ┊ Online ┊
╰────────────────────────────────────────────────────────────────────────────╯

At this point you should configure one or more storage pools per node. This guide assumes the storage pool is named cinderpool. LINSTOR should list the storage pool for each node, including the diskless storage pool created by default.

$ linstor storage-pool list
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node                      ┊ Driver   ┊ PoolName        ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ cinder-01.openstack.test  ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ cinder-02.openstack.test  ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ storage-01.openstack.test ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ storage-02.openstack.test ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ storage-03.openstack.test ┊ DISKLESS ┊                 ┊              ┊               ┊ False        ┊ Ok    ┊
┊ cinderpool           ┊ storage-01.openstack.test ┊ LVM_THIN ┊ ssds/cinderpool ┊      100 GiB ┊       100 GiB ┊ True         ┊ Ok    ┊
┊ cinderpool           ┊ storage-02.openstack.test ┊ LVM_THIN ┊ ssds/cinderpool ┊      100 GiB ┊       100 GiB ┊ True         ┊ Ok    ┊
┊ cinderpool           ┊ storage-03.openstack.test ┊ LVM_THIN ┊ ssds/cinderpool ┊      100 GiB ┊       100 GiB ┊ True         ┊ Ok    ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
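
The cinderpool entries shown above could have been created with commands like the following sketch; the node names and backing thin pool (ssds/cinderpool) are taken from this example setup:

$ linstor storage-pool create lvmthin storage-01.openstack.test cinderpool ssds/cinderpool
$ linstor storage-pool create lvmthin storage-02.openstack.test cinderpool ssds/cinderpool
$ linstor storage-pool create lvmthin storage-03.openstack.test cinderpool ssds/cinderpool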

7.2.1. Install the LINSTOR driver

Starting with OpenStack Stein, the LINSTOR driver is part of the Cinder project. While the driver can be used as is, it might be missing features or fixes available in newer versions. Due to OpenStack’s update policy for stable versions, most improvements to the driver will not get back-ported to older stable releases.

LINBIT maintains a fork of the Cinder repository with all improvements to the LINSTOR driver backported to the supported stable versions. Currently, these are:

OpenStack Release | Included Version | LINBIT Version | LINBIT Branch
------------------+------------------+----------------+-------------------------
victoria          | 1.0.1            | 1.1.0          | linstor/stable/victoria
ussuri            | 1.0.1            | 1.1.0          | linstor/stable/ussuri
train             | 1.0.0            | 1.1.0          | linstor/stable/train
stein             | 1.0.0            | 1.1.0          | linstor/stable/stein
rocky             | n/a              | n/a            | n/a
queens            | n/a              | n/a            | n/a
pike              | n/a              | n/a            | n/a
ocata             | n/a              | n/a            | n/a

The exact steps to enable the Linstor Driver depend on your OpenStack distribution. In general, the python-linstor package needs to be installed on all hosts running the Cinder volume service. The next section will cover the installation process for common OpenStack distributions.
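
If your deployment tooling does not install it for you, the package can usually be added with pip; this is a generic sketch and the target environment depends on your setup:

$ pip install python-linstor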

DevStack

DevStack is a great way to try out OpenStack in a lab environment. To use the most recent driver use the following DevStack configuration:

Listing 7. local.conf
# This ensures the Linstor Driver has access to the 'python-linstor' package.
#
# This is needed even if using the included driver!
USE_VENV=True
ADDITIONAL_VENV_PACKAGES=python-linstor

# This is required to select the LINBIT version of the driver
CINDER_REPO=https://github.com/LINBIT/openstack-cinder.git
# Replace linstor/stable/victoria with the reference matching your Openstack release.
CINDER_BRANCH=linstor/stable/victoria

Kolla

Kolla packages OpenStack components in containers. They can then be deployed, for example using Kolla Ansible. You can take advantage of the available customization options for Kolla containers to set up the Linstor driver.

To ensure that the required python-linstor package is installed, use the following override file:

Listing 8. template-override.j2
{% extends parent_template %}

# Cinder
{% set cinder_base_pip_packages_append = ['python-linstor'] %}

To install the LINBIT version of the driver, update your kolla-build.conf:

Listing 9. /etc/kolla/kolla-build.conf
[cinder-base]
type = git
location = https://github.com/LINBIT/openstack-cinder.git
# Replace linstor/stable/victoria with the reference matching your Openstack release.
reference = linstor/stable/victoria

To rebuild the Cinder containers, run:

# A private registry used to store the kolla container images
REGISTRY=deployment-registry.example.com
# The image namespace in the registry
NAMESPACE=kolla
# The tag to apply to all images. Use the release name for compatibility with kolla-ansible
TAG=victoria
kolla-build -t source --template-override template-override.j2 cinder --registry $REGISTRY --namespace $NAMESPACE --tag $TAG

Kolla Ansible

When deploying OpenStack using Kolla Ansible, you need to make sure that:

  • the custom Cinder images, created in the section above, are used

  • deployment of Cinder services is enabled

Listing 10. /etc/kolla/globals.yml
# use "source" images
kolla_install_type: source
# use the same registry as for running kolla-build above
docker_registry: deployment-registry.example.com
# use the same namespace as for running kolla-build above
docker_namespace: kolla
# deploy cinder block storage service
enable_cinder: "yes"
# disable verification of cinder backends, kolla-ansible only supports a small subset of available backends for this
skip_cinder_backend_check: True
# add the LINSTOR backend to the enabled backends. For backend configuration see below
cinder_enabled_backends:
  - name: linstor-drbd

You can place the Linstor driver configuration in one of the override directories for kolla-ansible. For more details on the available configuration options, see the section below.

Listing 11. /etc/kolla/config/cinder/cinder-volume.conf
[linstor-drbd]
volume_backend_name = linstor-drbd
volume_driver = cinder.volume.drivers.linstordrv.LinstorDrbdDriver
linstor_autoplace_count = 2
linstor_default_storage_pool_name = cinderpool
linstor_default_uri = linstor://cinder-01.openstack.test,linstor://cinder-02.openstack.test

OpenStack Ansible

OpenStack Ansible provides Ansible playbooks to configure and deploy OpenStack environments. It allows for fine-grained customization of the deployment, letting you set up the Linstor driver directly.

Listing 12. /etc/openstack_ansible/user_variables.yml
cinder_git_repo: https://github.com/LINBIT/openstack-cinder.git
cinder_git_install_branch: linstor/stable/victoria

cinder_user_pip_packages:
  - python-linstor

cinder_backends: (1)
  linstor-drbd:
   volume_backend_name: linstor-drbd
   volume_driver: cinder.volume.drivers.linstordrv.LinstorDrbdDriver
   linstor_autoplace_count: 2
   linstor_default_storage_pool_name: cinderpool
   linstor_default_uri: linstor://cinder-01.openstack.test,linstor://cinder-02.openstack.test
1 A detailed description of the available backend parameters can be found in the section below

Generic Cinder deployment

For other forms of OpenStack deployments, this guide can only provide non-specific hints.

To update the Linstor driver version, find your cinder installation. Some likely paths are:

/usr/lib/python3.6/dist-packages/cinder/
/usr/lib/python3.6/site-packages/cinder/
/usr/lib/python2.7/dist-packages/cinder/
/usr/lib/python2.7/site-packages/cinder/

The Linstor driver consists of a single file called linstordrv.py, located in the Cinder directory:

$CINDER_PATH/volume/drivers/linstordrv.py

To update the driver, replace the file with one from the LINBIT repository:

RELEASE=linstor/stable/victoria
curl -fL "https://raw.githubusercontent.com/LINBIT/openstack-cinder/$RELEASE/cinder/volume/drivers/linstordrv.py" > $CINDER_PATH/volume/drivers/linstordrv.py

You might also need to remove the Python cache for the update to be registered:

rm -rf $CINDER_PATH/volume/drivers/__pycache__

7.3. Configure a Linstor Backend for Cinder

To use the Linstor driver, configure the Cinder volume service. This is done by editing the Cinder configuration file and then restarting the Cinder Volume service.

Most of the time, the Cinder configuration file is located at /etc/cinder/cinder.conf. Some deployment options allow manipulating this file in advance. See the section above for specifics.

To configure a new volume backend using Linstor, add the following section to cinder.conf:

[linstor-drbd]
volume_backend_name = linstor-drbd (1)
volume_driver = cinder.volume.drivers.linstordrv.LinstorDrbdDriver (2)
linstor_default_uri = linstor://cinder-01.openstack.test,linstor://cinder-02.openstack.test (3)
linstor_default_storage_pool_name = cinderpool (4)
linstor_autoplace_count = 2 (5)
linstor_controller_diskless = true (6)
... (7)
The parameters described here are based on the latest release provided by Linbit. The driver included in OpenStack might not support all of these parameters. Take a look at the OpenStack driver documentation to find out more.
1 The name of the volume backend. Needs to be unique in the Cinder configuration. The whole section should share the same name. This name is referenced again in cinder.conf in the enabled_backends setting and when creating a new volume type.
2 The version of the Linstor driver to use. There are two options:
  • cinder.volume.drivers.linstordrv.LinstorDrbdDriver

  • cinder.volume.drivers.linstordrv.LinstorIscsiDriver

    Which driver you should use depends on your Linstor set up and requirements. Details on each choice are documented in the section below.

3 The URL(s) of the Linstor Controller(s). Multiple Controllers can be specified to make use of Linstor High Availability. If not set, defaults to linstor://localhost.
4 The storage pools to use when placing resources. Applies to all diskful resources created. Defaults to DfltStorPool.
5 The number of replicas to create for the given volume. A value of 0 will create a replica on all nodes. Defaults to 0.
6 If set to true, ensures that at least one (diskless) replica is deployed on the Cinder Controller host. This is useful for ISCSI transports. Defaults to true.
7 You can specify more generic Cinder options here, for example target_helper = tgtadm for the ISCSI connector.
You can also configure multiple Linstor backends, choosing a different name and configuration options for each.

After configuring the Linstor backend, it should also be enabled. Add it to the list of enabled backends in cinder.conf, and optionally set it as the default backend:

[DEFAULT]
...
default_volume_type = linstor-drbd-volume
enabled_backends = lvm,linstor-drbd
...

As a last step, if you changed the Cinder configuration or updated the driver itself, make sure to restart the Cinder service(s). Please check the documentation for your OpenStack Distribution on how to restart services.
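
As an illustration only, on a systemd based installation restarting the volume service might look like this; the service name is an assumption and differs between distributions (for example openstack-cinder-volume on RHEL derivatives):

$ systemctl restart cinder-volume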

7.3.1. Choice of Transport Protocol

The Transport Protocol in Cinder is how clients (for example nova-compute) access the actual volumes. With Linstor, you can choose between two different drivers that use different transports.

  • cinder.volume.drivers.linstordrv.LinstorDrbdDriver, which uses DRBD as transport

  • cinder.volume.drivers.linstordrv.LinstorIscsiDriver, which uses ISCSI as transport

Using DRBD as Transport Protocol

The LinstorDrbdDriver works by ensuring a replica of the volume is available locally on the node where a client (i.e. nova-compute) issued a request. This only works if all compute nodes are also running Linstor Satellites that are part of the same Linstor cluster.

The advantages of this option are:

  • Once set up, the Cinder host is no longer involved in the data path. All read and write to the volume are handled by the local DRBD module, which will handle replication across its configured peers.

  • Since the Cinder host is not involved in the data path, any disruptions to the Cinder service do not affect volumes that are already attached.

A drawback is the reduced compatibility: Not all hosts and hypervisors support using DRBD volumes. This restricts deployment to Linux hosts and kvm hypervisors.

Using ISCSI as Transport Protocol

The default way to export Cinder volumes is via iSCSI. This brings the advantage of maximum compatibility: iSCSI can be used with every hypervisor, be it VMWare, Xen, HyperV, or KVM.

The drawback is that all data has to be sent to a Cinder node, to be processed by a (userspace) iSCSI daemon; that means the data needs to pass the kernel/userspace boundary, and these transitions will cost some performance.

Another drawback is the introduction of a single point of failure. If a Cinder node running the iSCSI daemon crashes, other nodes lose access to their volumes. There are ways to configure Cinder for automatic fail-over to mitigate this, but it requires considerable effort.

For ISCSI to work, the Cinder host needs access to a local replica of every volume. This can be achieved by either setting linstor_controller_diskless=True or using linstor_autoplace_count=0.
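
As a sketch, a backend section for the ISCSI driver could mirror the DRBD example above; all values are illustrative:

[linstor-iscsi]
volume_backend_name = linstor-iscsi
volume_driver = cinder.volume.drivers.linstordrv.LinstorIscsiDriver
linstor_default_uri = linstor://cinder-01.openstack.test
linstor_default_storage_pool_name = cinderpool
linstor_autoplace_count = 2
linstor_controller_diskless = true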

7.3.2. Verify status of Linstor backends

To verify that all backends are up and running, you can use the OpenStack command line client:

$ openstack volume service list
+------------------+----------------------------------------+------+---------+-------+----------------------------+
| Binary           | Host                                   | Zone | Status  | State | Updated At                 |
+------------------+----------------------------------------+------+---------+-------+----------------------------+
| cinder-scheduler | cinder-01.openstack.test               | nova | enabled | up    | 2021-03-10T12:24:37.000000 |
| cinder-volume    | cinder-01.openstack.test@linstor-drbd  | nova | enabled | up    | 2021-03-10T12:24:34.000000 |
| cinder-volume    | cinder-01.openstack.test@linstor-iscsi | nova | enabled | up    | 2021-03-10T12:24:35.000000 |
+------------------+----------------------------------------+------+---------+-------+----------------------------+

If you have the Horizon GUI deployed, check Admin > System Information > Block Storage Service instead.

In the above example all configured services are enabled and up. If there are any issues, please check the logs of the Cinder Volume service.

7.4. Create a new volume type for Linstor

Before creating volumes using Cinder, you have to create a volume type. This can either be done via the command line:

# Create a volume using the default backend
$ openstack volume type create default
+-------------+--------------------------------------+
| Field       | Value                                |
+-------------+--------------------------------------+
| description | None                                 |
| id          | 58365ffb-959a-4d91-8821-5d72e5c39c26 |
| is_public   | True                                 |
| name        | default                              |
+-------------+--------------------------------------+
# Create a volume using a specific backend
$ openstack volume type create --property volume_backend_name=linstor-drbd linstor-drbd-volume
+-------------+--------------------------------------+
| Field       | Value                                |
+-------------+--------------------------------------+
| description | None                                 |
| id          | 08562ea8-e90b-4f95-87c8-821ac64630a5 |
| is_public   | True                                 |
| name        | linstor-drbd-volume                  |
| properties  | volume_backend_name='linstor-drbd'   |
+-------------+--------------------------------------+

Alternatively, you can create volume types via the Horizon GUI. Navigate to Admin > Volume > Volume Types and click “Create Volume Type”. You can assign it a backend by adding the volume_backend_name as “Extra Specs” to it.

7.4.1. Advanced Configuration of volume types

You can enable advanced configuration options by adding more properties (command line) or “Extra Specs” (Horizon GUI) to the volume type. You can get a list of available options by running:

$ cinder get-capabilities cinder-01.openstack.test@linstor-drbd
+---------------------+------------------------------------------------------------------+
| Volume stats        | Value                                                            |
+---------------------+------------------------------------------------------------------+
| description         | None                                                             |
| display_name        | None                                                             |
| driver_version      | 1.1.0                                                            |
| namespace           | OS::Storage::Capabilities::cinder-01.openstack.test@linstor-drbd |
| pool_name           | None                                                             |
| replication_targets | []                                                               |
| storage_protocol    | DRBD                                                             |
| vendor_name         | LINBIT                                                           |
| visibility          | None                                                             |
| volume_backend_name | linstor-drbd                                                     |
+---------------------+------------------------------------------------------------------+
+---------------------+---------------------------------------+
| Backend properties  | Value                                 |
+---------------------+---------------------------------------+
| compression         | description : Enables compression.    |
|                     | title : Compression                   |
|                     | type : boolean                        |
| qos                 | description : Enables QoS.            |
|                     | title : QoS                           |
|                     | type : boolean                        |
| replication_enabled | description : Enables replication.    |
|                     | title : Replication                   |
|                     | type : boolean                        |
| thin_provisioning   | description : Sets thin provisioning. |
|                     | title : Thin Provisioning             |
|                     | type : boolean                        |
+---------------------+---------------------------------------+

7.5. Using volumes

Once you have a volume type configured, you can start using it to provision new volumes.

For example, to create a simple 1GiB volume on the command line you can use:

openstack volume create --type linstor-drbd-volume --size 1 --availability-zone nova linstor-test-vol
openstack volume list
If you set default_volume_type = linstor-drbd-volume in your /etc/cinder/cinder.conf, you may omit the --type linstor-drbd-volume from the openstack volume create command above.

7.6. Troubleshooting

This section describes what to do in case you encounter problems with using Linstor volumes and snapshots.

7.6.1. Checking for error messages in Horizon

Every volume and snapshot has a Messages tab in the Horizon dashboard. In case of errors, the list of messages can be used as a starting point for further investigation. Some common messages in case of errors:

create volume from backend storage:Driver failed to create the volume.

There was an error creating a new volume. Check the Cinder Volume service logs for more details

schedule allocate volume:Could not find any available weighted backend.

If this is the only error message, this means the Cinder Scheduler could not find a volume backend suitable for creating the volume. This is most likely because:

  • The volume backend is offline. See Verify status of Linstor backends

  • The volume backend has not enough free capacity to fulfil the request. Check the output of cinder get-pools --detail and linstor storage-pool list to ensure that the requested capacity is available.

7.6.2. Checking the Cinder Volume Service

The Linstor driver is called as part of the Cinder Volume service.

Distribution | Log location or command
-------------+------------------------------
DevStack     | journalctl -u devstack@c-vol
others       | TODO

7.6.3. Checking the Compute Service Logs

Some issues will not be logged in the Cinder Service but in the actual consumer of the volumes, most likely the compute service (Nova). As with the volume service, the exact host and location to check depends on your Openstack distribution:

Distribution | Log location or command
-------------+------------------------------
DevStack     | journalctl -u devstack@n-cpu
others       | TODO

8. LINSTOR Volumes in Docker

This chapter describes LINSTOR volumes in Docker as managed by the LINSTOR Docker Volume Plugin (https://github.com/LINBIT/linstor-docker-volume-go).

8.1. Docker Overview

Docker is a platform for developing, shipping, and running applications in the form of Linux containers. For stateful applications that require data persistence, Docker supports the use of persistent volume drivers.

The LINSTOR Docker Volume Plugin is a volume driver that provisions persistent volumes from a LINSTOR cluster for Docker containers.

8.2. LINSTOR Plugin for Docker Installation

To install the linstor-docker-volume plugin provided by LINBIT, you’ll need to have a working LINSTOR cluster. After that the plugin can be installed from the public docker hub.

# docker plugin install linbit/linstor-docker-volume

8.3. LINSTOR Plugin for Docker Configuration

As the plugin has to communicate to the LINSTOR controller via the LINSTOR python library, we must tell the plugin where to find the LINSTOR Controller node in its configuration file:

# cat /etc/linstor/docker-volume.conf
[global]
controllers = linstor://hostnameofcontroller

A more extensive example could look like this:

# cat /etc/linstor/docker-volume.conf
[global]
storagepool = thin-lvm
fs = ext4
fsopts = -E discard
size = 100MB
replicas = 2

8.4. Example Usage

The following are some examples of how the LINSTOR Docker Volume Plugin can be used. In the following we expect a cluster consisting of three nodes (alpha, bravo, and charlie).

8.4.1. Example 1 - typical docker pattern

On node alpha:

$ docker volume create -d linstor \
             --opt fs=xfs --opt size=200 lsvol
$ docker run -it --rm --name=cont \
             -v lsvol:/data --volume-driver=linstor busybox sh
$ root@cont: echo "foo" > /data/test.txt
$ root@cont: exit

On node bravo:

$ docker run -it --rm --name=cont \
             -v lsvol:/data --volume-driver=linstor busybox sh
$ root@cont: cat /data/test.txt
  foo
$ root@cont: exit
$ docker volume rm lsvol

8.4.2. Example 2 - one diskful assignment by name, two nodes diskless

$ docker volume create -d linstor --opt nodes=bravo lsvol

8.4.3. Example 3 - one diskful assignment, no matter where, two nodes diskless

$ docker volume create -d linstor --opt replicas=1 lsvol

8.4.4. Example 4 - two diskful assignments by name, charlie diskless

$ docker volume create -d linstor --opt nodes=alpha,bravo lsvol

8.4.5. Example 5 - two diskful assignments, no matter where, one node diskless

$ docker volume create -d linstor --opt replicas=2 lsvol

9. Exporting Highly Available Storage using LINSTOR Gateway

LINSTOR Gateway manages highly available iSCSI targets and NFS exports by leveraging LINSTOR and Pacemaker. Setting up LINSTOR – including storage pools and resource groups – as well as Corosync and Pacemaker’s properties is a prerequisite for using this tool.

9.1. Requirements

9.1.1. LINSTOR

A LINSTOR cluster is required to operate LINSTOR Gateway. It is highly recommended, though optional, to run the LINSTOR controller as a Pacemaker resource. This needs to be configured manually. Such a resource could look like the following:

primitive p_linstor-controller systemd:linstor-controller \
        op start interval=0 timeout=100s \
        op stop interval=0 timeout=100s \
        op monitor interval=30s timeout=100s
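
If you manage the cluster with pcs rather than crmsh, an equivalent resource could be created like this (a sketch):

pcs resource create p_linstor-controller systemd:linstor-controller \
        op start interval=0 timeout=100s \
        op stop interval=0 timeout=100s \
        op monitor interval=30s timeout=100s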

For both iSCSI and NFS, a storage pool, a resource group, and a volume group for LINSTOR Gateway need to be present. Let’s create them.

Create the storage pool on the three nodes using the physical device /dev/sdb:

linstor physical-storage create-device-pool --pool-name lvpool LVM LINSTOR1 /dev/sdb --storage-pool lvmpool
linstor physical-storage create-device-pool --pool-name lvpool LVM LINSTOR2 /dev/sdb --storage-pool lvmpool
linstor physical-storage create-device-pool --pool-name lvpool LVM LINSTOR3 /dev/sdb --storage-pool lvmpool

We also need resource groups and volume groups:

linstor rg c iSCSI_group --storage-pool lvmpool --place-count 2
linstor rg c nfs_group --storage-pool lvmpool --place-count 3
linstor vg c iSCSI_group
linstor vg c nfs_group

For a more detailed explanation of the storage pool, resource group and volume group creation, check the LINSTOR user guide

9.1.2. Pacemaker

A working Corosync/Pacemaker cluster is expected on the machine where LINSTOR Gateway is running. The drbd-attr resource agent is required to run LINSTOR Gateway. This is included in LINBIT’s drbd-utils package for Ubuntu based distributions, or the drbd-pacemaker package on RHEL/CentOS. LINSTOR Gateway sets up all required Pacemaker resource and constraints by itself, except for the LINSTOR controller resource.

9.1.3. iSCSI & NFS

LINSTOR Gateway uses Pacemaker’s ocf::heartbeat:iSCSITarget resource agent for its iSCSI integration, which requires an iSCSI implementation to be installed. Using targetcli is recommended.

For iSCSI, please install targetcli.

yum install targetcli

For NFS, the nfs-server needs to be enabled and ready:

systemctl enable --now nfs-server

9.2. Preparation

First, let’s check that all the components are available. This guide assumes you already installed and configured a LINSTOR cluster. Volume Group, Storage Pool and Resource Group should be defined before using linstor-iscsi or linstor-nfs.

The following tools need to be present on the server:

  • linstor-client (managing the LINSTOR cluster)

  • drbd-attr resource agent (part of drbd-utils in Debian/Ubuntu, and part of drbd-pacemaker for other Linux distributions)

  • targetcli (for iSCSI)

  • nfs-utils, nfs-server

  • pcs or crmsh as the Pacemaker client (for checking the status of the iSCSI or NFS targets)

9.3. Checking the Cluster

Check the LINSTOR cluster status with:

[root@LINSTOR1 ~]# linstor n l
╭────────────────────────────────────────────────────────────╮
┊ Node     ┊ NodeType  ┊ Addresses                  ┊ State  ┊
╞════════════════════════════════════════════════════════════╡
┊ LINSTOR1 ┊ COMBINED  ┊ 172.16.16.111:3366 (PLAIN) ┊ Online ┊
┊ LINSTOR2 ┊ SATELLITE ┊ 172.16.16.112:3366 (PLAIN) ┊ Online ┊
┊ LINSTOR3 ┊ SATELLITE ┊ 172.16.16.113:3366 (PLAIN) ┊ Online ┊
╰────────────────────────────────────────────────────────────╯

Check the LINSTOR storage pool list with:

[root@LINSTOR1 ~]# linstor sp l
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node     ┊ Driver   ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ LINSTOR1 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ LINSTOR2 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊
┊ DfltDisklessStorPool ┊ LINSTOR3 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊
┊ lvmpool              ┊ LINSTOR1 ┊ LVM      ┊ lvpool   ┊    10.00 GiB ┊     10.00 GiB ┊ False        ┊ Ok    ┊
┊ lvmpool              ┊ LINSTOR2 ┊ LVM      ┊ lvpool   ┊    10.00 GiB ┊     10.00 GiB ┊ False        ┊ Ok    ┊
┊ lvmpool              ┊ LINSTOR3 ┊ LVM      ┊ lvpool   ┊    10.00 GiB ┊     10.00 GiB ┊ False        ┊ Ok    ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Check the LINSTOR resource group list (do not forget to create a volume group for the resource group with: linstor vg c iscsi_group):

[root@LINSTOR1 ~]# linstor rg l
╭────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter            ┊ VlmNrs ┊ Description ┊
╞════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2           ┊        ┊             ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ iscsi_group   ┊ PlaceCount: 2           ┊ 0      ┊             ┊
┊               ┊ StoragePool(s): lvmpool ┊        ┊             ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ nfs_group     ┊ PlaceCount: 3           ┊ 0      ┊             ┊
┊               ┊ StoragePool(s): lvmpool ┊        ┊             ┊
╰────────────────────────────────────────────────────────────────╯

Check and disable stonith. Stonith is a technique for fencing nodes in clusters. We have 3 nodes here, so quorum will be used instead of fencing.

pcs property set stonith-enabled=false
pcs property show

Check the Pacemaker cluster health:

[root@LINSTOR1 ~]# pcs status
Cluster name: LINSTOR
Cluster Summary:
  * Stack: corosync
  * Current DC: LINSTOR1 (version 2.0.5.linbit-1.0.el7-ba59be712) - partition with quorum
  * Last updated: Wed Mar 24 21:24:10 2021
  * Last change:  Wed Mar 24 21:24:01 2021 by root via cibadmin on LINSTOR1
  * 3 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ LINSTOR1 LINSTOR2 LINSTOR3 ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

9.4. Setting up iSCSI target

Now that everything looks good, let’s start creating our first iSCSI LUN. The linstor-iscsi tool will be used for all iSCSI related actions. Please check linstor-iscsi help for detailed usage. First, it creates a new resource within the LINSTOR system under the specified name and using the specified resource group. After that it creates resource primitives in the Pacemaker cluster including all necessary order and location constraints. The Pacemaker primitives are prefixed with p_, and contain the resource name and a resource type postfix.

linstor-iscsi create --iqn=iqn.2021-04.com.linbit:lun4 --ip=172.16.16.101/24 --username=foo --lun=4 --password=bar --resource-group=iSCSI_group --size=1G

This command will create a 1G iSCSI disk with the provided username and password in the iSCSI_group resource group. The DRBD and Pacemaker resources will be created automatically by linstor-iscsi. You can check the Pacemaker resources with the pcs status command.

[root@LINSTOR1 ~]# linstor-iscsi list
+-----------------------------+-----+---------------+-----------+--------------+---------+
|             IQN             | LUN | Pacemaker LUN | Pacemaker | Pacemaker IP | LINSTOR |
+-----------------------------+-----+---------------+-----------+--------------+---------+
| iqn.2020-06.com.linbit:lun4 |   4 |       ✓       |     ✓     |      ✓       |    ✓    |
+-----------------------------+-----+---------------+-----------+--------------+---------+

9.5. Deleting iSCSI target

The following command will delete the iSCSI target from Pacemaker as well as from the LINSTOR cluster:

linstor-iscsi delete -i iqn.2021-04.com.linbit:lun4 -l 4

9.6. Setting up NFS export

Before creating NFS exports, you need to tell LINSTOR that the filesystem used for the NFS exports will be ext4. To do that, apply a property to the resource group of the NFS resources by typing:

linstor rg set-property nfs_group FileSystem/Type ext4

The following command will create an NFS export in the cluster. First, it creates a new resource within the LINSTOR system under the specified name and using the specified resource group. After that it creates resource primitives in the Pacemaker cluster including all necessary order and location constraints. The Pacemaker primitives are prefixed with p_, and contain the resource name and a resource type postfix.

linstor-nfs create --resource=nfstest --service-ip=172.16.16.102/32 --allowed-ips=172.16.16.0/24 --resource-group=nfs_group --size=1G

You can simply list the NFS exports with the command below:

[root@LINSTOR1 ~]# linstor-nfs list
+---------------+------------------+-----------------------+------------+------------+
| Resource name | LINSTOR resource | Filesystem mountpoint | NFS export | Service IP |
+---------------+------------------+-----------------------+------------+------------+
| nfstest       |        ✓         |           ✓           |     ✓      |     ✓      |
+---------------+------------------+-----------------------+------------+------------+

9.7. Deleting NFS Export

The following command will delete the NFS export from Pacemaker as well as from the LINSTOR cluster:

[root@LINSTOR1 ~]# linstor-nfs delete -r nfstest

1. If the Host is also a storage node, it will use a local copy of an image if one is available.