Filling the Gap: LINBIT SDS in Amazon EKS

Reader Bootstrapping (Introduction):

Amazon Elastic Kubernetes Service (EKS) might be one of the quickest routes an organization can take to run a highly available, fault tolerant, and scalable Kubernetes cluster. EKS handles many of the difficult tasks in managing a Kubernetes cluster for its users, letting them focus more on their applications than their infrastructure. For example, EKS will balance your Kubernetes control plane and worker nodes across AWS availability zones (AZs) in your region, dynamically create Elastic Load Balancers to enable access to your applications, and it even comes with an Elastic Block Store (EBS) storage class for your stateful applications. As great as that all sounds (and is!), there are some gaps that LINBIT SDS can fill.

Where are these gaps?!

The two main storage offerings available from Amazon for use in EKS are their Elastic Block Store (EBS) and Elastic File System (EFS).

EBS provides block-level storage to EC2 instances (EKS workers, in this case) running within the same AZ as the EBS volume. EBS volumes are RWO (ReadWriteOnce), just like LINBIT SDS and other block storage for Kubernetes, meaning a volume can only be mounted read-write by one EKS worker instance at a time. This makes EBS a good fit for stateful applications that require performant (low latency, high throughput) block-level storage in AWS. However, because an EBS volume is both RWO and tied to a single AZ, you’ll have to configure and rely on your application’s built-in replication for fault tolerance and high availability of its stateful data. But what if your application doesn’t have built-in replication?

EFS provides a filesystem to EC2 instances that can be accessed concurrently and across AZs. In Kubernetes terms, it’s an RWX (ReadWriteMany) persistent storage option. It acts much like an NFS-mounted filesystem and, much like NFS, write performance to EFS suffers from the locking overhead necessary to prevent corruption.

Thus, if your application requires highly performant storage and doesn’t have the ability to replicate its state on its own, you’re in the gap between EBS and EFS. If you choose EBS, your application could become unavailable in the event of an AZ service outage in your AWS region. If you choose EFS, your users will likely complain about poor performance. LINBIT SDS fills that gap by providing high performance, synchronously replicated, block storage to Kubernetes clusters. With a little EC2 massaging, LINBIT SDS can fill that gap in Amazon EKS.

LINBIT SDS is open-source, and can run on your development laptop just like it can in the cloud. This means your “throw-away dev cluster” can have the same storage backend as your production cluster in AWS, which is another gap LINBIT SDS can help fill.

Preparing EC2 for LINBIT SDS:

LINBIT SDS for Kubernetes deploys a LINSTOR cluster into your Kubernetes cluster via Helm. LINSTOR then layers Linux block technologies such as LVM, VDO, LUKS, and DRBD on top of one another to enable specific features on a volume. The most important layer LINSTOR manages will almost always be DRBD, for its in-kernel block replication. For LINSTOR to create DRBD devices on our EC2 worker instances, it either needs to ship with a kernel module built for the Amazon Linux 2 kernel (Amazon Linux 2 being the default Linux distribution for EKS), or the EC2 instances need the matching kernel-devel package installed so LINSTOR can compile DRBD for that kernel on deployment. As of the publication of this blog, LINSTOR does not ship with a DRBD kernel module for Amazon Linux 2. Therefore, we must create a launch template in EC2 that installs the kernel-devel package, and have our EKS cluster use it when bootstrapping new EKS worker nodes. LINSTOR also requires an unused block device attached to each EC2 instance that it can use as a storage pool; this can also be done via the EC2 launch template.

Log in to the AWS Management Console and browse to the EC2 Dashboard. In the navigation sidebar you should see a link to “Launch Templates” nested under the “Instances” drop down; follow that link. Then, click the “Create launch template” button in the Launch Template console. Set only the options pictured below, as the rest will be configured elsewhere (depending on how your organization manages EKS).

Name and describe the launch template:

[Screenshot: launch template name and description]

Instance type for launch template:

[Screenshot: launch template instance type]

Set the instance type according to your application’s needs. LINSTOR itself is not resource intensive. Memory utilization for a DRBD resource scales with the size of the volume and the number of replicas. The formula is roughly 32 KiB of memory per 1 GiB of storage, multiplied by the number of peers (_other_ nodes with replicas).
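To get a feel for the numbers, the rough formula above can be sketched as a small shell function. The function name drbd_mem_kib is made up for this example, and the result is only a ballpark estimate, not a sizing guarantee.

```shell
#!/bin/bash
# Rough DRBD memory estimate, following the rule of thumb:
#   32 KiB of memory per 1 GiB of storage, multiplied by the number of peers.
# drbd_mem_kib is a hypothetical helper name, not part of any LINBIT tool.
# Usage: drbd_mem_kib <volume_size_gib> <peers>
drbd_mem_kib() {
  local size_gib="$1" peers="$2"
  echo $(( 32 * size_gib * peers ))
}

# A 500 GiB volume with 3 replicas: each node sees 2 peers.
drbd_mem_kib 500 2   # prints 32000 (KiB)
```

In that example the estimate comes out to 32000 KiB, or a little over 31 MiB per node for that one resource, which is why the instance type is usually dictated by the application rather than by LINSTOR.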

Storage settings for launch template:

[Screenshot: launch template storage settings]

This volume will be used by LINSTOR when provisioning PVs from its storage classes. Set the size and type according to your requirements. Volume type will default to gp2 EBS volumes if not set.

Advanced settings for launch template:

[Screenshot: launch template advanced settings]

The only advanced setting that needs modification is the user data pictured above. This additional user data will be appended to the user data responsible for bootstrapping an EKS worker. For copy and paste-ability, here is the user data pictured above:

Content-Type: multipart/mixed; boundary="==BOUNDARY=="
MIME-Version: 1.0

--==BOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
# Install the kernel headers matching the running kernel so DRBD can be compiled.
echo "Running custom user data script"
sudo yum install -y "kernel-devel-$(uname -r)"

--==BOUNDARY==--
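Once a worker node has booted from the template, it can be worth confirming that the headers really match the running kernel before LINSTOR tries to compile DRBD. Here is a minimal sketch of such a check, assuming an RPM-based host like Amazon Linux 2; the helper names kernel_devel_pkg and has_kernel_devel are made up for this example.

```shell
#!/bin/bash
# Print the name of the kernel-devel package for a kernel release.
# Defaults to the running kernel when no release is given.
# (Hypothetical helper names, for illustration only.)
kernel_devel_pkg() {
  local release="${1:-$(uname -r)}"
  echo "kernel-devel-${release}"
}

# Return 0 if the matching kernel-devel package is installed.
# Only meaningful on RPM-based hosts, where `rpm -q` is available.
has_kernel_devel() {
  rpm -q "$(kernel_devel_pkg "$@")" >/dev/null 2>&1
}
```

On a worker node, something like `has_kernel_devel && echo "headers present" || echo "headers missing"` would tell you whether the user data script did its job.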

Finally, click the “Create template version” button to save the launch template. You should see that your launch template was created successfully. Follow the link to view your launch template and note the “Launch Template ID” which should be formatted like this: lt-0123456789abcdefg. You will need to specify this launch template in the managed node group used by your EKS cluster.
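If you would rather not dig through the console for the ID, the AWS CLI can look it up by name. This is a sketch that assumes the AWS CLI is installed and configured for the right account and region; the function name launch_template_id and the template name in the usage note are placeholders, not values from this post.

```shell
#!/bin/bash
# Print the ID of a launch template, looked up by its name.
# launch_template_id is a hypothetical wrapper around the AWS CLI.
# Usage: launch_template_id <template-name>
launch_template_id() {
  aws ec2 describe-launch-templates \
    --launch-template-names "$1" \
    --query 'LaunchTemplates[0].LaunchTemplateId' \
    --output text
}
```

For example, `launch_template_id my-linstor-workers` would print the ID of a template you named my-linstor-workers.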

If you use eksctl to create EKS clusters, you can specify your launch template right in your eksctl configuration:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: linbit-eks
  region: us-west-2
managedNodeGroups:
- name: lb-mng-0
  launchTemplate:
    id: lt-0123456789abcdefg
    version: "1"
  desiredCapacity: 3
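With a configuration like the one above saved to a file, creating the cluster and checking that the workers joined is a short exercise. The sketch below assumes eksctl and kubectl are installed; the function name create_cluster and the default file name cluster.yaml are made up for this example.

```shell
#!/bin/bash
# Create the EKS cluster from an eksctl config file, then list the
# worker nodes that joined. create_cluster is a hypothetical wrapper.
# Usage: create_cluster [config-file]
create_cluster() {
  local config="${1:-cluster.yaml}"
  eksctl create cluster -f "$config" && kubectl get nodes -o wide
}
```

Running `create_cluster cluster.yaml` should end with a list of three nodes, matching the desiredCapacity set in the node group.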

Options to Set for LINSTOR’s Operator in EKS:

With EC2 prepped for LINSTOR, you should be able to follow the LINSTOR User’s Guide to get LINSTOR installed. For convenience, here are the options I would use when deploying LINSTOR via Helm in your EKS cluster:

operator:
  satelliteSet:
    storagePools:
      lvmThinPools:
      - name: lvm-thin
        thinVolume: thinpool
        volumeGroup: ""
        devicePaths:
        - /dev/nvme1n1
    kernelModuleInjectionMode: Compile
stork:
  enabled: true
  schedulerImage: gcr.io/google_containers/kube-scheduler-amd64
  schedulerTag: v1.18.6
  replicas: 3
etcd:
  replicas: 3
haController:
  replicas: 3
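For orientation, applying those options with Helm might look something like the sketch below. Treat the chart repository URL, the chart name linstor/linstor, and the release name linstor-op as assumptions on my part; the LINSTOR User’s Guide is the authority on these, and it also covers prerequisites (such as image pull credentials) that this sketch leaves out. The function name install_linstor and the default values file name are made up.

```shell
#!/bin/bash
# Install the LINSTOR operator chart with a custom values file.
# ASSUMPTIONS: repo URL, chart name, and release name may differ from what
# the current LINSTOR User's Guide recommends; check there first.
# Usage: install_linstor [values-file]
install_linstor() {
  local values="${1:-linstor-values.yaml}"
  helm repo add linstor https://charts.linstor.io &&
    helm repo update &&
    helm install linstor-op linstor/linstor -f "$values"
}
```

Saving the options above as linstor-values.yaml and running `install_linstor` would then deploy the operator with them.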

The stork scheduler and the haController are key components of LINBIT SDS for Kubernetes. Together, they make sure your applications are scheduled more intelligently with respect to where their storage lives, and they enable Kubernetes to reschedule your StatefulSet-controlled applications in different AZs if there ever is an outage (which I’ve blogged/vlogged about here).

Also, the additional EBS volume attached to each EC2 instance that EKS spins up from our launch template will typically show up as /dev/nvme1n1 (on Nitro-based instance types, where EBS volumes are exposed as NVMe devices), which is the device path we’ve set in the configuration above. This will cause LINSTOR to automatically prepare the volume for use as its storage pool.
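Once the lvm-thin storage pool exists, PVs get provisioned through a StorageClass that points at it. Here is a minimal sketch of one, assuming the LINSTOR CSI provisioner name linstor.csi.linbit.com and its autoPlace and storagePool parameters; the function name, the class name, and the replica count of 2 are made up for illustration, so check them against the User’s Guide for your version.

```shell
#!/bin/bash
# Print a StorageClass manifest backed by a LINSTOR storage pool.
# ASSUMPTIONS: provisioner and parameter names follow the LINSTOR CSI
# driver as I understand it; class name and defaults are illustrative.
# Usage: storage_class_manifest [name] [replicas] [pool]
storage_class_manifest() {
  local name="${1:-linstor-lvm-thin-r2}" replicas="${2:-2}" pool="${3:-lvm-thin}"
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ${name}
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "${replicas}"
  storagePool: "${pool}"
EOF
}
```

Piping the output into kubectl, as in `storage_class_manifest | kubectl apply -f -`, would create the class. With nodes spread across AZs, a replica count of 2 or more is what gives the volume a copy outside the AZ of any single worker.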

If you’ve made it this far and you’re interested in the use case that instigated this blog post, please read the reference architecture I wrote for HA Jenkins deployments in EKS. Or, of course, reach out directly!
