The Technology Inside LINSTOR (Part II)

In our first look at LINSTOR, we covered its single communication protocol, transaction safety and modularity features. This part goes deeper into how LINSTOR is built: fault tolerance, usability, error reporting and implementation quality.

Fault Tolerance

Keeping the software responsive is one of the more difficult problems that we have to deal with in LINSTOR’s design and implementation. The Controller/Satellite split is one fundamental part of LINSTOR’s design toward fault tolerance. But there are many other design and implementation details that improve the software’s robustness. Many of them are virtually invisible to the user.

On the Controller side, communication and persistence are the two main areas that can lead to the software becoming unresponsive. The following problems could lead to an unusable network communication service on the Controller side:

  • Stopping or reconfiguring a network interface
  • Address conflicts
  • In-use TCP/IP ports

All network I/O in LINSTOR is non-blocking, so unresponsive network peers cannot lock up LINSTOR’s network communication service. The service is designed to recover from many kinds of problems on its own. In addition, it supports multiple independent network connectors, so the system remains accessible even while one connector requires reconfiguration to recover. The connectors can also be stopped and started independently, which allows failed connectors to be reinitialized.
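To illustrate why non-blocking I/O keeps a service responsive, here is a minimal Python sketch. It is not LINSTOR’s implementation (the server components are written in Java), and the names `poll_once` and `demo` are hypothetical. A selector reports only the sockets that have data ready, so a silent peer costs at most the polling timeout instead of stalling the loop.

```python
import selectors
import socket


def poll_once(selector, timeout):
    """Collect data from every peer that is ready; never wait on a silent one."""
    chunks = []
    for key, _events in selector.select(timeout):
        chunks.append(key.fileobj.recv(4096))
    return chunks


def demo():
    # A connected socket pair stands in for the service and one remote peer.
    selector = selectors.DefaultSelector()
    local, peer = socket.socketpair()
    local.setblocking(False)
    selector.register(local, selectors.EVENT_READ)
    try:
        before = poll_once(selector, 0.05)  # silent peer: returns after the timeout
        peer.sendall(b"ping")
        after = poll_once(selector, 1.0)    # data pending: returned to the caller
        return before, after
    finally:
        selector.close()
        local.close()
        peer.close()
```

With a blocking socket, the first read would wait for as long as the peer stays silent; here the loop simply moves on.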

The Controller obviously cannot continue normal operation while the database service is inoperative. This can happen when using an external database, for example, due to downtime of the database server or a network problem. Once the database service becomes available again, the Controller recovers automatically, without requiring any operator intervention.
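The article does not describe how this recovery is implemented. One common pattern for recovering without operator intervention is a retry loop, sketched below in Python. Everything here is an assumption for illustration: `DatabaseUnavailable`, `wait_for_database` and the `connect` callback are hypothetical names, not part of LINSTOR.

```python
import time


class DatabaseUnavailable(Exception):
    """Hypothetical error raised while the database cannot be reached."""


def wait_for_database(connect, retry_delay=1.0, max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, pausing between failed attempts.

    With max_attempts=None the loop retries indefinitely.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except DatabaseUnavailable:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(retry_delay)
```

The `sleep` parameter exists only so that the delay can be replaced in tests.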

Satellites in LINSTOR

The Satellite side of LINSTOR does not run a database, and a single unresponsive Satellite is less critical for the system as a whole than an unresponsive Controller. Nonetheless, if a Satellite encounters a failure while configuring one storage resource, that failure should not prevent it, even temporarily, from servicing configuration requests for other resources.

The biggest challenge regarding fault tolerance on the Satellite side is the fact that the Satellite interacts with lots of external programs and processes that are neither part of LINSTOR nor under the direct control of the Satellite process. These external components include system utilities required for the configuration of backend storage, such as LVM or ZFS commands. They also include processes observing events generated by the DRBD kernel module whenever the state of a resource changes, block device files that appear or disappear when storage devices are reconfigured, and similar kinds of objects.

To achieve fault tolerance on the Satellite side, the software deals with many possible external malfunctions. When executing external processes, it time-boxes them and enforces size limits on the amount of data read back from them. It also has recovery procedures that attempt to abort external processes that have become unresponsive. There’s even a fallback that reports a malfunctioning operating system kernel if the operating system is unable to end an unresponsive process. The LINSTOR code also contains a mechanism that can run critical operations asynchronously, such as the attempt to open a device file, which may block due to faulty operating system drivers. Even if the operation blocks, LINSTOR would normally at least be able to detect and report the problem.
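The time-boxing and output size limit described above can be sketched in a few lines of Python. This is an illustration under assumptions, not LINSTOR’s Java code: `run_limited`, its parameters and its status strings are hypothetical. It also leaves out the kernel-malfunction fallback, which would require noticing that the process is still alive after being killed.

```python
import subprocess
import sys
import threading


def run_limited(argv, timeout_s=5.0, max_output=64 * 1024):
    """Run argv and return (status, output).

    status is "ok", "timeout" (process killed after timeout_s seconds) or
    "output_limit" (process killed after writing more than max_output bytes).
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    timed_out = threading.Event()

    def abort():
        timed_out.set()
        proc.kill()

    timer = threading.Timer(timeout_s, abort)
    timer.start()
    try:
        # Reads at most one byte beyond the limit; returns early at end of output.
        data = proc.stdout.read(max_output + 1)
        if len(data) > max_output:
            proc.kill()
    finally:
        timer.cancel()
        proc.stdout.close()
        proc.wait()
    if timed_out.is_set():
        return "timeout", data[:max_output]
    if len(data) > max_output:
        return "output_limit", data[:max_output]
    return "ok", data


# The Python interpreter itself stands in for an external storage utility.
status, output = run_limited([sys.executable, "-c", "print('hi')"])
```

Reading at most `max_output + 1` bytes means a misbehaving utility cannot exhaust memory, and the timer bounds how long the caller waits.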

Usability of LINSTOR Technology

With feature richness, customizability and flexibility also comes complexity. All that can be done to keep the system as easy to understand and use as possible is to make it intuitive, self-explanatory and unambiguous.

Clarity in the naming scheme of objects turned out to be an important factor in a user’s ability to use the software intuitively. In our previous product, drbdmanage, users would typically look for commands to either create a “resource” or a “volume.” However, the corresponding commands, “new-resource” and “new-volume”, only define a resource and its volumes, but do not actually create storage resources on any of the cluster nodes. Another command, “assign”, was required to assign the resource to cluster nodes, thereby creating the actual storage resource, and users sometimes had a hard time finding this command.

For this reason, the naming of objects was changed in LINSTOR. A user looking for a command to create a resource will find the command that actually creates a storage resource, and one of the required parameters for this command is the so-called resource definition. It is quite obvious that the next step would be to look for a command that creates a resource definition. This kind of naming convention is supposed to make it easier for users to figure out how to intuitively use the application.

LINSTOR is also explicit with replies to user commands, as well as with return codes for API calls. The software typically replies with a message that describes whether or not the command was successful, what the software did, and to which objects the message refers. Error messages that include a description of the problem cause or hints for possible correction measures also follow a uniform structure.

Similar ideas also apply to return codes, which include not only the error code (e.g., Object exists), but also information on what objects the error refers to (e.g., the type of object and the identifier specified by the user).
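As a rough picture of such a structured reply, the following Python sketch bundles a code, a message, optional cause and correction hints, and references to the affected objects. The class name, field names and code strings are assumptions made for this example and do not reproduce LINSTOR’s actual API types.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Reply:
    """Hypothetical reply record: what happened, and to which objects."""
    code: str                       # e.g. "OK" or "OBJECT_EXISTS"
    message: str
    cause: Optional[str] = None
    correction: Optional[str] = None
    obj_refs: Dict[str, str] = field(default_factory=dict)  # object type -> identifier

    def render(self) -> str:
        """Format the reply in one uniform layout for every kind of message."""
        label = "SUCCESS" if self.code == "OK" else "ERROR"
        lines = [f"{label} [{self.code}]: {self.message}"]
        for kind, ident in sorted(self.obj_refs.items()):
            lines.append(f"  {kind}: {ident}")
        if self.cause:
            lines.append(f"  Cause: {self.cause}")
        if self.correction:
            lines.append(f"  Correction: {self.correction}")
        return "\n".join(lines)


reply = Reply(
    code="OBJECT_EXISTS",
    message="Resource definition already exists",
    correction="Choose a different name",
    obj_refs={"resource definition": "backups"},
)
```

Because every reply carries the same fields, a client can present them uniformly or act on the object references programmatically.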

Reporting System

LINSTOR also generates a unique identifier for every logged error to make diagnosing errors easier. The traditional logging and error reporting on Unix/Linux systems basically consist of single text lines logged to one large logfile, sometimes even a single logfile for many different applications. An application could log multiple lines for each error, but support for logging multiple lines atomically (instead of interleaved with log lines for other errors, possibly from other applications) is virtually nonexistent.

For this reason, LINSTOR logs a single-line short description of the error, including the error identifier, to the system log, but also logs the details of the error to a report file that can be found using the error identifier. The detailed log report also contains information such as where the error occurred, the exact version of the software, debug information, nested errors, and many other details that may help with problem mitigation.
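The split between a one-line log entry and a detailed report file can be sketched as follows. This is a Python illustration under assumptions: the function `report_error`, the identifier format and the report file naming are hypothetical and are not taken from LINSTOR.

```python
import logging
import tempfile
import traceback
import uuid
from pathlib import Path


def report_error(exc, report_dir, logger):
    """Log a one-line summary and write the full details to a report file."""
    error_id = uuid.uuid4().hex[:12].upper()  # hypothetical identifier format
    details = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    report_path = Path(report_dir) / f"ErrorReport-{error_id}.log"
    report_path.write_text(f"Error ID: {error_id}\n\n{details}", encoding="utf-8")
    # Only the short description and the identifier go to the shared log.
    logger.error("%s (error report %s)", exc, error_id)
    return error_id


def demo():
    with tempfile.TemporaryDirectory() as report_dir:
        try:
            try:
                {}["vg0"]
            except KeyError as inner:
                raise RuntimeError("storage pool lookup failed") from inner
        except RuntimeError as exc:
            error_id = report_error(exc, report_dir, logging.getLogger("sketch"))
            path = Path(report_dir) / f"ErrorReport-{error_id}.log"
            return error_id, path.read_text(encoding="utf-8")
```

The report file holds the multi-line material, including the nested `KeyError`, so nothing in the shared log can be interleaved with entries from other sources.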

LINSTOR Technology: Implementation Quality

While the various design characteristics are important factors for creating a powerful and robust software system, even the best design cannot produce a reliable application if it is not implemented with high quality.

The first step, before writing code, was to choose a programming language that would be suitable for the task. Our previous product, drbdmanage, and the current LINSTOR client are implemented in Python. However, the LINSTOR server-side components (the Controller and Satellite) are implemented in Java. A server application that manages highly available storage systems takes more consideration than the typical single-user desktop application. Java is a very strict programming language that provides strong static typing and checked exceptions. It allows only a few implicit type conversions. These are all features that also enable IDEs to perform static checking of the code in progress.

While it can make writing high-quality code easier, the choice of programming language doesn’t automatically improve code. To keep LINSTOR’s code clean, readable, self-explanatory and maintainable, we apply many of the best practices that have proven successful in the creation of mission-critical software systems. This includes important things like choosing descriptive variable names and maintaining a clear and logical control flow, and it extends to less technical details like consistent formatting of the source code. The coding standard that we apply to produce high-quality code is based on standards from the aviation industry and is among the strictest coding standards that exist today.

Easy Validity Checks

There is also a strong focus on correctness and strict checking in the implementation of LINSTOR. As an example, the name of an object like a node, resource or storage pool is not simply a String. It is an object that can only be constructed with a name that is valid for that kind of object. It is impossible to create a resource name object that contains invalid characters. It’s also impossible to accidentally use a resource name object as the identifier for a storage pool. As a result, developers cannot forget to perform a validity check on a node name or on a volume number. They also cannot apply the wrong check by accident.
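The idea of validated name types can be sketched in Python as shown below. The class names, the validation pattern and `describe_pool` are assumptions made for this example; LINSTOR’s real rules and classes may differ. Note one difference: in Java the compiler rejects a mixed-up name type, whereas Python needs a static type checker or a runtime check like the one here.

```python
import re


class InvalidNameError(ValueError):
    """Raised when a name is not valid for its kind of object."""


class _ValidatedName:
    # Hypothetical rule: a letter or underscore, then 1-47 letters, digits, "_" or "-".
    _PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{1,47}")

    def __init__(self, value):
        if not isinstance(value, str) or not self._PATTERN.fullmatch(value):
            raise InvalidNameError(f"invalid {type(self).__name__}: {value!r}")
        self._value = value

    @property
    def value(self):
        return self._value

    def __eq__(self, other):
        # Names of different kinds never compare equal, even with the same text.
        return type(self) is type(other) and self._value == other._value

    def __hash__(self):
        return hash((type(self).__name__, self._value))


class ResourceName(_ValidatedName):
    pass


class StoragePoolName(_ValidatedName):
    pass


def describe_pool(pool: StoragePoolName) -> str:
    """Accept only storage pool names; any other name type is rejected."""
    if not isinstance(pool, StoragePoolName):
        raise TypeError("expected a StoragePoolName")
    return f"storage pool {pool.value}"
```

Because validation happens in the constructor, any `ResourceName` that exists is already known to be valid, and no caller has to remember to check it.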

All those considerations, design characteristics, and implementation methods are important factors that helped us create dependable and user-friendly software. It’s software that we hope will prove useful and valuable to its users like you.

Robert Altnoeder

Robert joined the LINBIT development team in 2013. He had worked with DRBD at a startup company in the SaaS field before joining LINBIT. His current primary field of work is the architecture and implementation of LINSTOR, the cluster management component of LINBIT's SDS software.
