If you need to run stateful applications and workloads in Kubernetes, you will need to make use of Kubernetes persistent volumes. Although Kubernetes might be better known for, and first associated with, running stateless applications, infrastructure for persisting data has been part of the Kubernetes code base since its early stages.
When persisting data in Kubernetes, one of the first forks in the road is whether you, as an administrator, will provision storage in your cluster manually or dynamically. Provisioning persistent storage manually in Kubernetes means creating persistent volumes as you go. Alternatively, you can have Kubernetes dynamically provision them by using two other Kubernetes resources: the storage class and the persistent volume claim.
For flexibility and scalability, configuring a storage class and having Kubernetes dynamically provision persistent volumes from the storage class as your workloads demand them is the way to go. This is where LINSTOR® in Kubernetes can help you.
This article will give you an overview of how you can use LINSTOR-backed storage classes and persistent volume claims to dynamically provision persistent volumes that are highly available in your cluster, without dragging down storage I/O performance or compute resources.
For readers who already know this stuff
If you are already familiar with storage topics in Kubernetes, you can skip the conceptual background section in this article. If you want to satisfy a curiosity about using LINSTOR in Kubernetes before diving in, you can skip ahead to the “Features and benefits” section. If you just want to get started with an enterprise-grade, widely deployed, open source solution for managing and dynamically deploying Kubernetes persistent volumes, you can skip to the “Next steps” section and download the official LINBIT “Kubernetes Persistent Storage Using LINSTOR Quick Start Guide”.
Conceptual background
For some background, before you begin a journey with LINSTOR in Kubernetes, it is critical that you understand the Kubernetes concepts of StorageClass, PersistentVolume, and the closely related concept of PersistentVolumeClaim. I do not think I can do a better job explaining those concepts than the Kubernetes documentation does. If you are unfamiliar with them, or if you want to refresh your understanding with words direct from the maintainers, follow the links.
You can use the storage class resource to describe different kinds of storage that you want to make available to your workloads in Kubernetes. These might relate to tiered-storage offerings, for example, slower or faster local backing storage. Different kinds of storage might also mean different back-end storage technologies, such as NFS or iSCSI, or technologies that use the Container Storage Interface (CSI) framework to expose block or file storage systems to containerized workloads through third-party plugins or drivers (enter LINSTOR). The CSI framework ensures that although you might be turning your Kubernetes volume management over to third-party software, that third-party software integrates with Kubernetes according to standards and best practices set by the Kubernetes-endorsed CSI specification.
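To make this concrete, the sketch below shows roughly what a LINSTOR-backed storage class might look like. The provisioner name follows the LINSTOR CSI driver convention, but the storage pool name, replica count, and parameter keys here are illustrative assumptions; verify them against the documentation for your deployment.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-replicated          # illustrative name
provisioner: linstor.csi.linbit.com # the LINSTOR CSI driver
parameters:
  # Both parameter values below are assumptions for this sketch:
  linstor.csi.linbit.com/storagePool: my-thin-pool  # a LINSTOR storage pool
  linstor.csi.linbit.com/placementCount: "2"        # number of data replicas
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```

With a storage class such as this in place, workloads never reference storage nodes or pools directly; they only reference the class by name through a persistent volume claim.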
How persistent volumes enable stateful applications in Kubernetes
The mechanism through which persistent volumes enable stateful applications in Kubernetes, where workloads run in container instances that might come and go, is volume bind mounting. This concept should be familiar if you have used the docker run --volume [...] command before. Volume bind mounting allows a container to read and write data to and from storage “outside” the ephemeral environment of the container itself. That outside storage can either be local storage on the node hosting the container, or else storage that the host node can somehow connect to over a network.
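As a hedged sketch of that mechanism in Kubernetes terms, the pod specification below mounts outside storage, referenced through a persistent volume claim, into a container's file system, much as `docker run --volume` would bind mount a host path. The pod name, image, mount path, and claim name are all illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                  # illustrative name
spec:
  containers:
    - name: app
      image: nginx                # illustrative image
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html  # path inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: demo-data      # an existing PVC; name is illustrative
```

If this pod is rescheduled to another node, Kubernetes mounts the same persistent volume into the replacement container, which is what keeps the application's state intact.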
The reasoning behind using software-defined storage in Kubernetes
If you run only a few applications in Kubernetes that need persistent storage, you might be fine provisioning volumes as you need them. However, that is not a typical enterprise use case. An organization that has chosen Kubernetes in the first place has likely done so because it needs to run many containerized workloads and needs a centralized platform to manage them. Enterprises will also usually plan to scale their operations and deployments as they prepare for future growth.
At that scale, the provisioned-by-an-administrator approach to storage volumes becomes untenable. This is where a software-defined storage (SDS) solution such as LINSTOR can help.
SDS abstracts storage assets and their management life cycle so that applications can consume storage according to preconfigured requirements, as they need to, without administrator intervention. This is truly abstracted storage orchestration to match abstracted container and namespace orchestration.
Features and benefits of LINSTOR in Kubernetes
LINSTOR has many features and benefits which make it a worthy persistent storage management solution for Kubernetes deployments. The following are some highlights. This is not an exhaustive list.
Highly available persistent volumes in Kubernetes
Although it is not a strict requirement, most people use LINSTOR to manage DRBD® replicated storage for high availability use cases. By using DRBD synchronously replicated volumes in Kubernetes, stateful applications get fault-tolerant, identical data replicas in the cluster.
You can completely lose a storage node, for example, and LINSTOR in Kubernetes can seamlessly fail storage over to another healthy node, or nodes, with up-to-date data replicas, as quickly as Kubernetes pod rescheduling timeouts permit.
Data locality in Kubernetes
Another benefit of using LINSTOR-managed DRBD replicated volumes in Kubernetes is data locality. By using LINSTOR auxiliary properties an administrator can label cluster nodes and place constraints on storage resources so that resources are placed on nodes matching an administrator’s criteria. This can ensure that persistent storage resources are provisioned in proximity to their compute resources, while also providing administrators control of where volume replicas are provisioned within the storage cluster.
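As a sketch of what such a placement constraint might look like, the storage class below asks the LINSTOR CSI driver to place replicas on nodes that share a matching auxiliary property. The parameter key follows the LINSTOR CSI driver's conventions, but the property name and value are illustrative assumptions; check the driver documentation for the exact syntax your version supports.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-placement            # illustrative name
provisioner: linstor.csi.linbit.com
parameters:
  # Place replicas on nodes whose LINSTOR auxiliary property matches;
  # "rack=1" is an assumed, administrator-defined property.
  linstor.csi.linbit.com/replicasOnSame: "rack=1"
  linstor.csi.linbit.com/placementCount: "2"
# Delay provisioning until a pod is scheduled, so that a replica can be
# placed near the consuming workload:
volumeBindingMode: WaitForFirstConsumer
```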
DRBD and LINSTOR also allow so-called “diskless” clients, so that a node in the cluster can access storage resources over a network connection. This can help reduce your deployment costs, and can even provide better performance for certain I/O patterns, when compared with other network storage technologies, such as iSCSI.
Volume expansion in Kubernetes
LINSTOR, with its CSI driver for Kubernetes, supports volume expansion if you use LVM or ZFS thin-provisioned volumes with XFS, Ext3, or Ext4 file systems. You can then specify allowVolumeExpansion: true in a LINSTOR-backed storage class configuration and Kubernetes will take care of growing PVCs created from that storage class as needed to support the demands of your workloads.
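A minimal sketch of such a storage class follows. `allowVolumeExpansion` is a standard Kubernetes storage class field, and `csi.storage.k8s.io/fstype` is a standard CSI parameter; the class name and file system choice are illustrative.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-expandable           # illustrative name
provisioner: linstor.csi.linbit.com
allowVolumeExpansion: true           # lets Kubernetes grow PVCs from this class
parameters:
  csi.storage.k8s.io/fstype: xfs     # XFS, Ext3, or Ext4 on thin-provisioned storage
```

After that, growing a volume is a matter of editing the PVC's `spec.resources.requests.storage` value upward; Kubernetes and the CSI driver handle the rest.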
Supporting different Kubernetes storage access modes
Persistent volume claims backed by LINSTOR storage classes in Kubernetes support different access modes. The most commonly used access modes for LINSTOR PVCs are ReadWriteOnce and ReadWriteMany.
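For illustration, a persistent volume claim that requests single-node read-write (`ReadWriteOnce`) storage from a LINSTOR-backed class might look like the sketch below; the claim name, class name, and size are assumptions.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data                     # illustrative name
spec:
  accessModes:
    - ReadWriteOnce                   # single-node read-write; the common case
  storageClassName: linstor-replicated  # illustrative LINSTOR-backed class
  resources:
    requests:
      storage: 5Gi
```

Applying this claim is what triggers dynamic provisioning: Kubernetes asks the class's provisioner, here the LINSTOR CSI driver, to create a matching persistent volume.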
High performance storage in Kubernetes
LINSTOR and DRBD replication under the hood have consistently shown great I/O performance, not only in LINBIT®-run tests but also in independent testing. The LINBIT team achieved its highest recorded IOPS number, over 25.5 million IOPS, from a LINSTOR in Kubernetes cluster. An independent tester also came to favorable performance conclusions about LINSTOR when running benchmark comparisons of different Kubernetes SDS solutions.
DRBD replication is designed with performance in mind and particularly excels when coupled with random write I/O patterns, the kind of I/O patterns typically associated with databases, messaging queues, transactional systems, real-time analytics, and event processing. The intent is to replicate data for high availability use cases while not creating a drag on your applications and workloads.
Avoiding vendor lock-in by choosing an open source solution
LINSTOR in Kubernetes is based on the upstream open source cloud native project, Piraeus Datastore. Piraeus Datastore pulls in open source LINBIT software such as LINSTOR, DRBD, DRBD Reactor, and others, for deployment in Kubernetes. Because of the convenience of access to the official LINBIT container repository and LINBIT's world-class support, LINSTOR in Kubernetes, rather than Piraeus Datastore, is the recommended choice for enterprise deployments. By choosing open source software, you have the freedom to copy, fork, and change the code as you might need to, and you have the peace of mind of knowing that the solution cannot be taken away from you.
Supporting COTS hardware for cost savings
Beyond the freedom of its software licensing model, LINSTOR makes few demands on the hardware you deploy it on. Memory and CPU consumption are particularly modest, and LINBIT software can run on and work with a variety of commercial off-the-shelf (COTS) systems, storage media, and network infrastructure. LINBIT Solutions Architect Matt Kereczman even wrote about running LINSTOR in Kubernetes on a low-power, single-board computer cluster.
You can learn more about a homelabber’s experience with Piraeus Datastore in an article on the LINBIT blog, “Homelabbing LINBIT Storage Solutions”. By using CPU and memory resources frugally, and trying to be agnostic about your storage hardware choices, running LINSTOR in Kubernetes can help reduce your hardware expenses.
Enforcing security when deploying LINSTOR in Kubernetes
LINSTOR in Kubernetes supports encryption at rest and in transit. By using LINSTOR to manage a LUKS layer in your LINSTOR storage back-end stack, you can have encryption for your data at rest. You can encrypt DRBD replication traffic in transit between cluster nodes, and encrypt communication between the LINSTOR control and data planes, by using SSL/TLS. You can also use LDAP or systemd credentials to authenticate LINSTOR administrators.
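As a hedged sketch of the at-rest case, a storage class can ask LINSTOR to insert a LUKS layer into the back-end storage stack. The layer-list parameter key follows the LINSTOR CSI driver's conventions, but the exact layer names and ordering below are assumptions; confirm them against your driver's documentation before use.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-encrypted            # illustrative name
provisioner: linstor.csi.linbit.com
parameters:
  # Assumed layer stack: DRBD replication on top of a LUKS-encrypted
  # storage layer, giving encryption at rest for the backing data.
  linstor.csi.linbit.com/layerList: "drbd luks storage"
```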
For more information about the security topic in LINSTOR, see the “Security Best Practices For LINSTOR Software-Defined Storage Clusters” LINBIT blog article.
Supporting disaster recovery for persistent data in Kubernetes
LINSTOR supports taking and shipping snapshots of LVM or ZFS thin-provisioned volumes to support disaster recovery. An administrator can ship snapshots to other nodes within a LINSTOR cluster, to a different remote LINSTOR cluster, or to S3-compatible object storage, such as AWS S3, MinIO, Storj, CloudCasa, or others. When needed, an administrator can use shipped snapshots to restore data from known good points in time, as part of a disaster recovery plan.
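From inside Kubernetes, taking such a snapshot typically goes through the standard CSI snapshot API, as in the sketch below. The `VolumeSnapshot` kind and `snapshot.storage.k8s.io/v1` API group are standard Kubernetes; the snapshot class name and claim name are illustrative assumptions.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: demo-data-snap                  # illustrative name
spec:
  volumeSnapshotClassName: linstor-snapshots  # assumed LINSTOR-backed class
  source:
    persistentVolumeClaimName: demo-data      # the PVC to snapshot
```

A restore then works in the usual CSI way: you create a new PVC whose `dataSource` references the snapshot.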
Data autonomy
Throughout the life cycle of a LINSTOR in Kubernetes deployment, the data that you entrust to LINSTOR-managed storage remains yours. Data on LINSTOR-provisioned storage is accessible, retrievable, and recoverable by using standard tools familiar to Linux administrators. Provided that the underlying physical storage media is healthy, a Linux administrator can access the data whether or not LINSTOR is running.
Next steps for getting started with LINSTOR in Kubernetes
If the LINSTOR in Kubernetes features and benefits excite you, or if you already knew this stuff and just want to know where to start, you have a few options.
Enterprise users might want to download the “Kubernetes Persistent Storage Using LINBIT SDS Quick Start” how-to guide. This guide has step-by-step instructions for deploying LINSTOR in Kubernetes with enterprise deployments specifically in mind. To follow the guide, you will need access to the official LINBIT container registry. You can request evaluation access by contacting the LINBIT team.
If you have other interests in LINSTOR in Kubernetes, you might want to start by installing the upstream Cloud Native Computing Foundation (CNCF) sandbox project, Piraeus Datastore. The Piraeus Datastore documentation has tutorials and how-to guides to give you more starting points, after deploying.
You can install Piraeus Datastore or LINSTOR on different Kubernetes distributions, including minikube, if you want a quick and easy way to test the waters.
LINSTOR and Piraeus Datastore both use an Operator to ease deploying in Kubernetes. By using an Operator, you are only two kubectl commands away from installing either LINSTOR or Piraeus Datastore in Kubernetes.
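At the time of writing, the Piraeus Datastore documentation shows an install along roughly these lines; the version reference in the URL is illustrative, so check the project documentation for the current release before running anything.

```shell
# 1. Install the Piraeus Operator (version tag "v2" is illustrative):
kubectl apply --server-side \
  -k "https://github.com/piraeusdatastore/piraeus-operator//config/default?ref=v2"

# 2. Create a LinstorCluster resource so the Operator deploys LINSTOR itself:
kubectl apply -f - <<EOF
apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
  name: linstorcluster
spec: {}
EOF
```

The Operator then reconciles the `LinstorCluster` resource into a running LINSTOR controller, satellites, and CSI driver components.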
Conclusion
Using LINSTOR helps you abstract and manage persistent block storage resources in Kubernetes to support running stateful applications at scale. It is a resource-efficient, high-performing solution designed for enterprise environments. Its features include highly available storage, data autonomy, disaster recovery support, open source flexibility, and flexibility in hardware infrastructure. Getting started is straightforward, with comprehensive guides and integration options that are just links away.
If you need help on your journey, reach out to the LINBIT team or join the LINBIT Community Forum.