Understanding LINSTOR Gateway

2024-05-22 16:33:40 UTC

This document outlines some of the basic knowledge that is required to effectively operate and administer a storage cluster that relies on LINSTOR Gateway.

It also provides some insight into the design decisions that were made while implementing LINSTOR Gateway, and gives an overview of how its internals work.

1. Core Concepts and Terms

This section introduces and explains some basic concepts that you will encounter in a LINSTOR Gateway cluster. It also explains the terminology used when discussing or working with LINSTOR Gateway.

1.1. LINSTOR Concepts

Since LINSTOR Gateway is integrated into the LINSTOR ecosystem, you will need to understand the basic parts that make up a LINSTOR cluster.

If you are not yet familiar with them, refer to the “Concepts and Terms” section of the LINSTOR User’s Guide.

1.2. Servers, Clients, and Agents

Generally speaking, within a LINSTOR Gateway cluster, there are three different roles that a node can take: server, client, and agent.

It is also possible for a node to take more than one role. Any combination of the three roles is possible. This section outlines what sets these roles apart, both conceptually and in terms of the software components used.

1.2.1. Server

[Figure: LINSTOR Gateway server diagram]

The server role is LINSTOR Gateway’s main mode of operation. The main task of the server component is to communicate with the LINSTOR controller.

It is important to understand that the LINSTOR Gateway server does not store any state information itself. It does not matter on which machine it runs, provided that it can reach the LINSTOR controller.

1.2.2. Client

[Figure: LINSTOR Gateway client]

This is any piece of software that interacts with LINSTOR Gateway via LINSTOR Gateway’s REST API. In most cases, this will be the command line client that is included with LINSTOR Gateway.

It is the interface between LINSTOR Gateway and the user.

Even if there are multiple LINSTOR Gateway servers in a cluster, the client only ever talks to one of them.

1.2.3. Agent

The agent is a more abstract role for a node, in the sense that it does not necessarily run any part of the LINSTOR Gateway software itself. Instead, it contains the software components that LINSTOR Gateway uses to provide its services, for example:

  • targetcli for iSCSI targets

  • nfs-server to provide NFS exports

  • nvmetcli to create NVMe-oF targets

To be a proper part of the cluster, an agent node also requires the essential components of a LINBIT SDS software stack, such as:

  • The DRBD kernel module

  • A LINSTOR cluster node that runs the linstor-satellite service

  • DRBD Reactor, to be able to run highly available resources

The health check integrated into the LINSTOR Gateway client binary can be used to identify which components are still missing or misconfigured on a given node.
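To make the idea of such a check concrete, here is a minimal sketch that reports which of the command-line tools named above are not installed on a node. It is not LINSTOR Gateway's actual health check: the `REQUIRED_TOOLS` mapping and the `missing_tools` function are hypothetical names invented for this illustration, and `nfs-server` is left out because it is a service rather than a command that can be looked up on the `PATH`.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Hypothetical mapping from an exported service type to the command-line tool
# an agent node needs for it. The tool names come from the list above; the
# mapping itself is an illustration, not LINSTOR Gateway's real check.
REQUIRED_TOOLS: Dict[str, str] = {
    "iscsi": "targetcli",
    "nvme-of": "nvmetcli",
}


def missing_tools(
    services: List[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the required tools for `services` that cannot be found."""
    missing = []
    for service in services:
        tool = REQUIRED_TOOLS[service]
        # shutil.which returns None when the executable is not on the PATH.
        if which(tool) is None:
            missing.append(tool)
    return missing
```

Calling `missing_tools(["iscsi", "nvme-of"])` on an agent node returns an empty list when both tools are installed. The `which` parameter exists only so that the lookup can be replaced in tests.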

2. Architecture of a Cluster

LINSTOR – and by extension LINSTOR Gateway – clusters can come in a variety of sizes and implementations, depending on the circumstances of the surrounding environment and the intended use cases.

This chapter outlines some considerations to take into account when designing and planning a LINSTOR Gateway cluster.

Depending on the architecture of the cluster, there might be multiple instances of the LINSTOR Gateway server software or only one. There are advantages and disadvantages to both approaches. These are the routes you can take:

  1. Only one LINSTOR Gateway server: This is the simplest setup. If the LINSTOR controller software always runs on the same node, it makes sense to place the LINSTOR Gateway server on that same node.
    The advantage of this approach is that the LINSTOR Gateway server will need no further configuration to find the LINSTOR controller.
    The disadvantage is that you will need to configure your LINSTOR Gateway client so that it finds the LINSTOR Gateway server.

  2. Multiple LINSTOR Gateway servers: If your LINSTOR controller can move between nodes (for example, because you made it highly available by using DRBD Reactor), it can be beneficial to run a LINSTOR Gateway server on every node where the LINSTOR controller could potentially run.

2.1. Choosing the Right Transport

LINSTOR Gateway supports different storage transport protocols. This section briefly explains the differences between these options and when you might use one over the other.

For more detailed information about a specific transport, refer to its documentation.

2.1.1. iSCSI Targets

iSCSI is a transport protocol that allows SCSI traffic to be sent via TCP. The standard has seen wide use since its inception in the early 2000s, so it has often been viewed as the “default choice” for network-attached storage.

LINSTOR Gateway uses an iSCSI target implementation that is included in the Linux kernel, LIO.

2.1.2. NVMe-oF Targets

NVMe over Fabrics is a much newer standard than iSCSI. It allows routing NVMe traffic over several different transports, such as RDMA, TCP, or Fibre Channel.

Generally speaking, Linux kernel support for NVMe over Fabrics targets is more actively maintained than support for iSCSI target implementations. Using NVMe-oF might lead to throughput improvements in a storage cluster, especially when using modern high-performance hardware.

LINSTOR Gateway uses the NVMe target implementation bundled with the Linux kernel.

2.1.3. NFS Exports

The Network File System (NFS) serves a different purpose than iSCSI and NVMe-oF.

Rather than transmitting block-level data over the network, NFS is a distributed file system. NFS exports are often used to share directories across a network. One common use case would be providing images of operating system installation media to virtualization hosts.

LINSTOR Gateway currently only supports a relatively limited mode of NFS operation, without user management capabilities. All files on the share are readable and writable by any user.

LINSTOR Gateway uses the NFS server implementation that is included with Linux.

2.2. The LINSTOR Gateway Server

LINSTOR Gateway’s central component is a REST API server that acts as a relay to the LINSTOR API.

The LINSTOR Gateway server does not carry any state information of its own. This means that you can run an arbitrary number of server instances in your cluster. It does not matter which LINSTOR Gateway server you send commands to, provided that server can communicate with the LINSTOR controller.

However, it makes sense to choose the locations of the LINSTOR Gateway servers strategically, so that at least one of them stays reachable if a node fails.

2.2.1. General Guidelines for Deploying a LINSTOR Gateway Server

Here are two guidelines for all LINSTOR Gateway deployments:

  1. On every node where a LINSTOR controller could potentially run, also run a LINSTOR Gateway server.

  2. Configure every LINSTOR Gateway server so that it knows the location of every potential LINSTOR controller node.
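The two guidelines can be expressed as a small planning function. This is only a sketch: `plan_gateway_servers` is a hypothetical helper, not part of LINSTOR Gateway, and the node names passed to it are placeholders.

```python
from typing import Dict, List


def plan_gateway_servers(controller_candidates: List[str]) -> Dict[str, List[str]]:
    """Map each node that should run a LINSTOR Gateway server to the list of
    LINSTOR controller locations that server should be configured with."""
    # Drop duplicate node names while keeping the original order.
    nodes = list(dict.fromkeys(controller_candidates))
    # Guideline 1: one LINSTOR Gateway server per potential controller node.
    # Guideline 2: every server knows every potential controller node.
    return {node: list(nodes) for node in nodes}
```

For a single potential controller node, the plan contains one server that only needs to know about its own node. For three potential controller nodes, it contains three servers, each configured with all three nodes.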

Some example real-world deployment scenarios follow to show how these rules apply.

2.2.2. Fixed LINSTOR Controller

This is the simplest possible setup for a LINSTOR cluster. There is one node that is chosen to be the LINSTOR controller node, and the LINSTOR controller service can only run on this particular node.

[Figure: Fixed controller cluster]
In the diagram above, C stands for “(LINSTOR) Controller” and S stands for “(LINSTOR) Satellite”.

In this case, the rules from above boil down to one trivial instruction:

  1. Run the LINSTOR Gateway server on node-1.

You can omit the second rule because the only possible node that the LINSTOR controller can run on is node-1. From the perspective of node-1, this is equivalent to localhost. As it so happens, the default place the LINSTOR Gateway server searches for the LINSTOR controller is localhost, so the default configuration is sufficient here.

This setup is very simple to implement, but it has an important drawback: the LINSTOR controller on node-1 is the single point of failure for the cluster’s control plane. If node-1 were to go offline, you would lose the ability to control the storage in your cluster.

It is important to distinguish between the control plane and the data plane. Even in this situation, with the sole controller node offline, your data stays available. It is the ability to create, modify, and delete storage resources that is lost.

To address this issue and make the cluster more robust against node failure, a highly available LINSTOR controller can be configured.

2.2.3. Highly Available LINSTOR Controller

[Figure: Multi-controller cluster]

Here, the picture is different: The LINSTOR controller is currently running on node-1. However, if node-1 should fail, one of the other nodes can take over. This configuration is more complex, but it makes sense in environments where it is critical that the LINSTOR controller stays available at all times.

To learn more about how to configure this mode of operation, refer to the “Creating a Highly Available LINSTOR Cluster” section of the LINSTOR User’s Guide.

In a cluster such as this, the general rules outlined above resolve to these instructions:

  1. Run the LINSTOR Gateway server on node-1, node-2, and node-3.

  2. On each of the nodes, configure the LINSTOR Gateway server so that it looks for the LINSTOR controller on node-1, node-2, and node-3.

When the LINSTOR Gateway server tries to contact the LINSTOR controller service, it first searches its list of configured potential LINSTOR controller nodes by sending a dummy request to each of the nodes. The first node that responds correctly is considered the currently active LINSTOR controller.
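The search described above follows a "first responder wins" pattern, which can be sketched as follows. This is not LINSTOR Gateway's actual code. The function name `find_active_controller` and the `probe` callback are assumptions made for this illustration: `probe` stands in for the dummy request and is expected to return `True` when a node answers correctly.

```python
from typing import Callable, Iterable, Optional


def find_active_controller(
    candidates: Iterable[str],
    probe: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate for which the probe succeeds, else None."""
    for node in candidates:
        try:
            if probe(node):
                return node
        except OSError:
            # The node is unreachable (for example, connection refused or a
            # timeout), so move on to the next candidate.
            continue
    return None
```

A result of `None` means that no configured node answered, which corresponds to the control plane being unavailable.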

3. Under the Hood

This section elaborates on the mechanisms that make LINSTOR Gateway work, and discusses some design decisions that were made while building it.

3.1. The Software Stack

This section discusses the underlying software solutions that LINSTOR Gateway builds on.

3.1.1. DRBD

In order for LINSTOR Gateway to be able to export highly available storage, it must be able to create said highly available storage in the first place.

To do this, it relies on the bottom-most part of the LINBIT software stack: DRBD, a “tried and true” solution for data replication and high availability clusters.

If you are not yet familiar with DRBD and want to learn more, you can refer to its User’s Guide. A deep understanding of DRBD is not required to use LINSTOR Gateway effectively, but it helps, especially when troubleshooting.

For a shortened – but also massively oversimplified – explanation, you can think of DRBD as a sort of “software RAID-1 over the network”.

3.1.2. LINSTOR

Highly available storage via DRBD is great, but it can be difficult to manage and manipulate resources at scale. To handle this – and many other orchestration tasks – LINSTOR was created.

As the name might imply, LINSTOR Gateway heavily relies on LINSTOR to create the DRBD resources that can later be exported as highly available storage.

Knowledge of and experience with LINSTOR clusters help immensely with using and understanding LINSTOR Gateway, as most of the required administration tasks are carried out within LINSTOR.

3.1.3. DRBD Reactor

The piece of software that enables LINSTOR Gateway to actually export the LINSTOR resources it created as highly available storage is DRBD Reactor, more specifically its promoter plugin.

DRBD Reactor – again, as the name implies – reacts to DRBD events. DRBD already includes sophisticated state management mechanisms to ensure that only one node in the cluster has write access to a resource at a time. The promoter plugin takes advantage of this to make sure that a particular service is always started on the node that is allowed to write to the resource.
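The following toy model illustrates this coupling between write access and service placement. It is a deliberately simplified sketch and not how DRBD or DRBD Reactor are implemented: the classes `ReplicatedResource` and `ToyPromoter`, their methods, and the node names are all hypothetical.

```python
from typing import Optional, Set


class ReplicatedResource:
    """Toy stand-in for a DRBD resource: at most one node has write access."""

    def __init__(self) -> None:
        self.primary: Optional[str] = None

    def try_promote(self, node: str) -> bool:
        # Promotion only succeeds while no other node has write access.
        if self.primary is None:
            self.primary = node
        return self.primary == node

    def demote(self, node: str) -> None:
        if self.primary == node:
            self.primary = None


class ToyPromoter:
    """Starts the service on a node only after that node gained write access."""

    def __init__(self, resource: ReplicatedResource) -> None:
        self.resource = resource
        self.running_on: Set[str] = set()

    def on_promotable(self, node: str) -> None:
        # React to the event "this resource could be promoted on `node`".
        if self.resource.try_promote(node):
            self.running_on.add(node)  # stands in for starting the service

    def on_node_failure(self, node: str) -> None:
        self.running_on.discard(node)  # the service stops with the node
        self.resource.demote(node)
```

Because promotion succeeds on only one node at a time, the set `running_on` never contains more than one node. When that node fails, another node can promote the resource and start the service in its place.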

This might remind you of other cluster managers, such as Pacemaker. In fact, DRBD Reactor was inspired by such cluster managers and its goals and features overlap with some of Pacemaker’s.

However, administrators often face difficulties when implementing highly available clusters using these cluster managers in combination with DRBD, due to the sheer complexity of such a system.

This is why the DRBD Reactor promoter plugin was intentionally designed as a very simple cluster manager that is useful with minimal configuration needed for existing DRBD deployments.

3.1.4. Resource Agents

The DRBD Reactor promoter plugin needs a service to keep highly available. Fortunately, it was built with Pacemaker compatibility in mind. Because of this, Pacemaker’s OCF Resource Agents are supported in DRBD Reactor.

This is very convenient because it allows LINSTOR Gateway to re-use existing code that makes services such as iSCSI targets and NFS exports highly available.

3.1.5. Putting It All Together

Finally, this is where LINSTOR Gateway comes in.

In a nutshell, LINSTOR Gateway does only two things:

  1. Creates a LINSTOR resource for highly available, replicated storage.

  2. Generates a DRBD Reactor configuration file that starts the highly available service we want to provide (for example, an iSCSI target).

Of course, there are still challenges associated with “doing it right” for every possible use-case and context that highly available storage might be used in.

This is why LINSTOR Gateway aims to automate away as much of the setup and maintenance work as possible, so that the administrator can focus on using their highly available storage effectively.
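As an illustration of the second of the two steps listed in this section, the sketch below renders a configuration snippet for the DRBD Reactor promoter plugin. Treat it as an approximation under stated assumptions: the function `render_promoter_config` is hypothetical, the layout of the generated TOML should be verified against the DRBD Reactor documentation, and the resource name, service name, and IP address in the usage note are placeholders. Creating the LINSTOR resource itself (the first step) is not covered.

```python
import re
from typing import List


def render_promoter_config(resource: str, start: List[str]) -> str:
    """Render a TOML snippet that lists the services to start for a resource.

    The layout (a [[promoter]] table with a per-resource `start` list) is an
    assumption based on the promoter plugin's configuration style.
    """
    # TOML bare keys may only contain letters, digits, '_' and '-'.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", resource):
        raise ValueError(f"unsupported resource name: {resource!r}")
    lines = ["[[promoter]]", f"[promoter.resources.{resource}]", "start = ["]
    for entry in start:
        # Escape backslashes and double quotes for a TOML basic string.
        escaped = entry.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  "{escaped}",')
    lines.append("]")
    return "\n".join(lines) + "\n"
```

For example, `render_promoter_config("target1", ["ocf:heartbeat:IPaddr2 service_ip ip=192.0.2.10 cidr_netmask=24"])` produces a snippet with a single OCF resource agent entry. A real iSCSI target would need further entries for the target and its logical units.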