Edge computing is a distributed computing paradigm that brings data processing and computation closer to the data source or “edge” of the network. This reduces latency and removes Internet connectivity as a point of failure for users of edge services.
Since more hardware is involved in an edge computing environment than in a traditional central data center topology, there is a need to keep that hardware relatively inexpensive and replaceable. Generally, that means the hardware running at the edge will have fewer system resources than what you might find in a central data center.
LINBIT SDS, which consists of LINSTOR® and DRBD® from LINBIT®, has a very small footprint on system resources. This leaves more resources available for other edge services and applications and makes LINBIT SDS an ideal candidate for solving persistent storage needs at the edge.
The core function of LINBIT SDS is to provide resilient and feature-rich block storage to the many platforms it integrates with. The resilience comes from DRBD, the block storage replication driver managed by LINSTOR in LINBIT SDS, which allows services to tolerate host-level failures. This is an important feature at the edge, since host-level failures may be more frequent when using less expensive hardware that might not be as fault tolerant as hardware that you would run in a proper data center.
To prove and highlight some of the claims I’ve made about LINBIT SDS above, I used my trusty Libre Computer AML-S905X-CC (Le Potato) ARM-based single board computer (SBC) cluster to run LINBIT SDS and K3s. If you’re not familiar with “Le Potato” SBCs, they are simply 2GB Raspberry Pi 3 Model B clones. I would characterize my “Potato cluster” as severely underpowered compared to the enterprise grade hardware used by some of LINBIT’s users, and would even go as far as saying this is the floor in terms of hardware capability that I would try something like this on. To read about LINBIT SDS on a much more capable ARM-based system, read my blog, Benchmarking on Ampere Altra Max Platform with LINBIT SDS. That said, if LINBIT SDS can run on my budget Raspberry Pi clone cluster, it can run anywhere.
Here is a real photograph* of my Le Potato cluster and cooling system in my home lab:
📝 NOTE: This is not a real photograph.
Small Footprint on System Resources
The cluster I’m using does not have a ton of resources available. Using the kubectl top node command, we can see what each of these nodes has available with LINBIT SDS already deployed.
root@potato-0:~# kubectl top node
NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
potato-0   1111m        27%    1495Mi          77%
potato-1   1277m        31%    1666Mi          86%
potato-2   1096m        27%    1634Mi          84%
A single CPU core in these quad-core Libre Computer SBCs is equal to 1000m, or 1000 millicpus. This output shows us that, with LINBIT SDS and Kubernetes running on them, the nodes still have 69-73% of their CPU resources available. Memory pressure on these 2Gi SBCs is extremely limiting, but we still have a little room to play around.
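As a quick check of that arithmetic, the following sketch recomputes CPU headroom from the millicpu values in the output above. It assumes four cores (4000m) per node and truncates the used percentage, which matches the CPU% column shown; `cpu_headroom` is a helper name made up for this example.

```shell
# Sample `kubectl top node` output copied from above.
top_node_output='NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
potato-0 1111m 27% 1495Mi 77%
potato-1 1277m 31% 1666Mi 86%
potato-2 1096m 27% 1634Mi 84%'

# cpu_headroom CORES: read `kubectl top node` output on stdin and print the
# percentage of CPU still available on each node. One core equals 1000m.
cpu_headroom() {
  awk -v cores="$1" 'NR > 1 {
    used_m = $2 + 0                              # "1111m" converts to 1111
    capacity_m = cores * 1000                    # 4 cores -> 4000m
    used_pct = int(used_m * 100 / capacity_m)    # truncate, like the CPU% column
    printf "%s %d%%\n", $1, 100 - used_pct
  }'
}

printf '%s\n' "$top_node_output" | cpu_headroom 4
```

On a live cluster, the same function could read from `kubectl top node` directly instead of the sample text.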
A typical LINBIT SDS deployment in Kubernetes consists of the following containers:
- LINSTOR operator
- LINSTOR controller
- LINSTOR satellites
- LINSTOR CSI controller
- LINSTOR CSI daemonset
- LINSTOR High Availability (HA) controller daemonset
📝 NOTE: You can verify which containers the latest LINBIT SDS deployment in Kubernetes uses by viewing the image lists that LINBIT maintains at charts.linstor.io.
Using kubectl top pods -n linbit-sds --sum, we can see how much memory and CPU the LINBIT SDS containers are using.
root@potato-0:~# kubectl top -n linbit-sds pods --sum
NAME                                                   CPU(cores)   MEMORY(bytes)
ha-controller-bk2rh                                    4m           20Mi
ha-controller-bxhs7                                    5m           19Mi
ha-controller-knvg8                                    3m           21Mi
linstor-controller-5b84bfc497-wrbdn                    25m          168Mi
linstor-csi-controller-8c9fdd6c-q7rj5                  45m          124Mi
linstor-csi-node-c46bb                                 3m           31Mi
linstor-csi-node-knmv2                                 4m           29Mi
linstor-csi-node-x9bhr                                 3m           33Mi
linstor-operator-controller-manager-6dd5bfbfc8-cfp7t   10m          57Mi
potato-0                                               9m           87Mi
potato-1                                               10m          68Mi
potato-2                                               10m          57Mi
                                                       ________     ________
                                                       127m         719Mi
That’s less than a quarter of a single CPU core and under 1Gi of the 6Gi available in my tiny cluster.
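To put those totals in terms of the whole cluster, here is a small sketch of the arithmetic. It assumes the nominal capacity of three quad-core, 2Gi nodes (12000m and 6144Mi); `cluster_share` is a hypothetical helper, and the memory that Kubernetes actually reports per node is somewhat less than the nominal 2Gi.

```shell
# cluster_share USED TOTAL: print USED as a percentage of TOTAL, one decimal place.
cluster_share() {
  awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", used * 100 / total }'
}

# Totals taken from the `kubectl top pods --sum` output above.
echo "CPU:    $(cluster_share 127 12000)"   # 3 nodes x 4 cores x 1000m
echo "Memory: $(cluster_share 719 6144)"    # 3 nodes x 2Gi x 1024Mi
```

Under those assumptions, the LINBIT SDS pods account for roughly 1% of the cluster's CPU and a little under 12% of its nominal memory while idle.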
If I create a LINBIT SDS provisioned persistent volume claim (PVC) with two replicas for a demo MinIO pod, we can check the utilization again while we’re actually running services. Using the following PVC and pod manifests, LINBIT SDS will provision a LINSTOR volume, replicate it between two nodes (as defined in my storageClass) using DRBD, and schedule a MinIO pod with data persisted on the LINBIT SDS managed storage.
root@potato-0:~# kubectl apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: minio
  labels:
    name: minio
EOF
namespace/minio created
root@potato-0:~# kubectl apply -f - <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: demo-pvc-0
  namespace: minio
spec:
  storageClassName: linstor-csi-lvm-thin-r2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4G
EOF
persistentvolumeclaim/demo-pvc-0 created
root@potato-0:~# kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: minio
  name: minio
  namespace: minio
spec:
  containers:
  - name: minio
    image: quay.io/minio/minio:latest
    command:
    - /bin/bash
    - -c
    args:
    - minio server /data --console-address :9090
    volumeMounts:
    - mountPath: /data
      name: demo-pvc-0
  volumes:
  - name: demo-pvc-0
    persistentVolumeClaim:
      claimName: demo-pvc-0
EOF
pod/minio created
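The storage class that the PVC references is not shown in this post. As a rough sketch, a two-replica class for the LINSTOR CSI driver could look something like the following. The class name comes from the PVC above, but the storage pool name `lvm-thin` and the output file name are assumptions made for illustration, so substitute a pool that exists in your own LINSTOR cluster.

```shell
# Write a hypothetical StorageClass manifest to a file for review.
# Assumption: a LINSTOR storage pool named "lvm-thin" exists on the nodes.
cat > linstor-csi-lvm-thin-r2.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-csi-lvm-thin-r2
provisioner: linstor.csi.linbit.com
allowVolumeExpansion: true
parameters:
  # Number of diskful replicas LINSTOR places for each volume.
  linstor.csi.linbit.com/placementCount: "2"
  # LINSTOR storage pool that backs the volumes (assumed name).
  linstor.csi.linbit.com/storagePool: "lvm-thin"
EOF

# Apply it from a node with cluster access:
#   kubectl apply -f linstor-csi-lvm-thin-r2.yaml
```

With a placement count of two on a three-node cluster, the third node typically ends up with only a diskless tiebreaker assignment rather than a full copy of the data.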
Using the kubectl port-forward pod/minio 9000 9090 -n minio --address 0.0.0.0 command, I forwarded traffic on ports 9000 and 9090 from my potato-0 node to the MinIO pod. I then started an upload of a Debian image (583Mi) to a new MinIO bucket (bucket-0) using the MinIO console accessible at http://potato-0:9090. During the upload, I captured the output of kubectl top again to compare against my previous results.
root@potato-2:~# kubectl top pods -n linbit-sds --sum
NAME                                                   CPU(cores)   MEMORY(bytes)
ha-controller-bk2rh                                    3m           22Mi
ha-controller-bxhs7                                    5m           26Mi
ha-controller-knvg8                                    6m           16Mi
linstor-controller-5b84bfc497-wrbdn                    72m          174Mi
linstor-csi-controller-8c9fdd6c-q7rj5                  30m          127Mi
linstor-csi-node-c46bb                                 8m           33Mi
linstor-csi-node-knmv2                                 3m           25Mi
linstor-csi-node-x9bhr                                 2m           31Mi
linstor-operator-controller-manager-6dd5bfbfc8-cfp7t   174m         58Mi
potato-0                                               4m           118Mi
potato-1                                               8m           131Mi
potato-2                                               7m           81Mi
                                                       ________     ________
                                                       317m         846Mi
root@potato-2:~# kubectl top node
NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
potato-0   2581m        64%    1623Mi          84%
potato-1   1556m        38%    1713Mi          88%
potato-2   1147m        28%    1636Mi          84%
That’s a pretty minimal difference: in total, the LINBIT SDS pods went from 127m to 317m of CPU and from 719Mi to 846Mi of memory. The LINSTOR satellite pods (potato-0, potato-1, and potato-2) are using a little more memory than in the first sample. The extra memory is likely to hold DRBD’s bitmap because there are physical replicas on potato-0 and potato-1, and a diskless “tiebreaker” assignment on potato-2, which does not store a bitmap.
root@potato-2:~# kubectl exec -n linbit-sds deployments/linstor-controller -- linstor resource list
+----------------------------------------------------------------------------------------------------------------+
| ResourceName | Node | Port | Usage | Conns | State | CreatedOn |
|================================================================================================================|
| pvc-e5c40aa3-e9b2-40dc-b096-e96a61a27d47 | potato-0 | 7000 | InUse | Ok | UpToDate | 2023-09-08 17:37:32 |
| pvc-e5c40aa3-e9b2-40dc-b096-e96a61a27d47 | potato-1 | 7000 | Unused | Ok | UpToDate | 2023-09-08 17:37:18 |
| pvc-e5c40aa3-e9b2-40dc-b096-e96a61a27d47 | potato-2 | 7000 | Unused | Ok | TieBreaker | 2023-09-08 17:37:40 |
+----------------------------------------------------------------------------------------------------------------+
Resilience During Host Failures
Now that the storage has data, I can simulate a failure in the cluster and see whether the data persists. I can tell that the MinIO pod is running on potato-0 from the linstor resource list command, which shows the PVC as InUse on potato-0. To simulate the failure, I used the command echo c > /proc/sysrq-trigger on potato-0. This immediately crashes the kernel, and unless you’ve configured your system otherwise, it will not reboot on its own.
While waiting for Kubernetes to catch and react to the failure, I checked DRBD’s state on the remaining nodes and could see that potato-1, the remaining “diskful” peer, reported UpToDate data, so it would be able to take over services:
root@potato-2:~# kubectl exec -it -n linbit-sds potato-1 -- drbdadm status
pvc-e5c40aa3-e9b2-40dc-b096-e96a61a27d47 role:Secondary
  disk:UpToDate
  potato-0 connection:Connecting
  potato-2 role:Secondary
    peer-disk:Diskless

root@potato-2:~# kubectl exec -it -n linbit-sds potato-2 -- drbdadm status
pvc-e5c40aa3-e9b2-40dc-b096-e96a61a27d47 role:Secondary
  disk:Diskless
  potato-0 connection:Connecting
  potato-1 role:Secondary
    peer-disk:UpToDate
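The same check can be scripted. The sketch below looks for an UpToDate local disk in `drbdadm status` text, using the two outputs above as sample input. The helper names `has_uptodate_disk` and `report` are made up for this example, and the check is deliberately naive: with more than one DRBD resource on a node, it would succeed if any of them were UpToDate.

```shell
# has_uptodate_disk: read `drbdadm status` output on stdin and succeed only
# if the node's own disk (not a peer-disk) is reported as UpToDate.
has_uptodate_disk() {
  grep -Eq '^[[:space:]]*disk:UpToDate'
}

# Sample status text from potato-1 and potato-2, as shown above.
potato1_status='pvc-e5c40aa3-e9b2-40dc-b096-e96a61a27d47 role:Secondary
  disk:UpToDate
  potato-0 connection:Connecting
  potato-2 role:Secondary
    peer-disk:Diskless'
potato2_status='pvc-e5c40aa3-e9b2-40dc-b096-e96a61a27d47 role:Secondary
  disk:Diskless
  potato-0 connection:Connecting
  potato-1 role:Secondary
    peer-disk:UpToDate'

# report NODE STATUS_TEXT: print whether NODE holds an UpToDate local replica.
report() {
  if printf '%s\n' "$2" | has_uptodate_disk; then
    echo "$1: has an UpToDate replica"
  else
    echo "$1: no UpToDate local disk"
  fi
}

report potato-1 "$potato1_status"
report potato-2 "$potato2_status"

# On a live cluster, the input would come from the satellite pod instead:
#   kubectl exec -n linbit-sds potato-1 -- drbdadm status | has_uptodate_disk
```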
After roughly five minutes, Kubernetes picked up on the failure and began terminating potato-0’s pods. I didn’t use a deployment, or any other workload resource, for managing this pod, so it will not be rescheduled on its own. To delete a pod from a dead node, I needed to use the force, that is: kubectl delete pod -n minio minio --force. With the pod deleted, I could recreate it by using the same command used earlier:
root@potato-2:~# kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: minio
  name: minio
  namespace: minio
spec:
  containers:
  - name: minio
    image: quay.io/minio/minio:latest
    command:
    - /bin/bash
    - -c
    args:
    - minio server /data --console-address :9090
    volumeMounts:
    - mountPath: /data
      name: demo-pvc-0
  volumes:
  - name: demo-pvc-0
    persistentVolumeClaim:
      claimName: demo-pvc-0
EOF
pod/minio created
After the pod was rescheduled on potato-1, and the port-forward to the MinIO pod restarted from potato-1, I could once again access the console and see the contents of my bucket were intact. This is because the DRBD resource LINBIT SDS created for the MinIO pod’s persistent storage replicates writes synchronously between the cluster peers. This means that by using DRBD, you have a block-for-block copy of your block devices on more than one node in the cluster at all times.
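The manual force-delete and re-apply steps are only needed because the pod was created bare. As a hedged sketch that was not tested on this cluster, wrapping the same pod spec in a Deployment lets Kubernetes create the replacement pod on its own, and the tolerations shorten the default five-minute wait before pods are evicted from an unreachable node. The 30-second value and the file name are arbitrary choices for illustration, and how quickly the replacement actually starts still depends on how fast the old pod and its volume attachment are cleaned up.

```shell
# Write a hypothetical Deployment manifest that wraps the MinIO pod spec used above.
cat > minio-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
  namespace: minio
spec:
  replicas: 1
  strategy:
    type: Recreate        # never run two pods against the ReadWriteOnce volume
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      tolerations:
      # Evict after 30s (assumed value) instead of the default 300s.
      - key: node.kubernetes.io/unreachable
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 30
      - key: node.kubernetes.io/not-ready
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 30
      containers:
      - name: minio
        image: quay.io/minio/minio:latest
        command:
        - /bin/bash
        - -c
        args:
        - minio server /data --console-address :9090
        volumeMounts:
        - mountPath: /data
          name: demo-pvc-0
      volumes:
      - name: demo-pvc-0
        persistentVolumeClaim:
          claimName: demo-pvc-0
EOF

# Apply it from a node with cluster access:
#   kubectl apply -f minio-deployment.yaml
```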
In this scenario, K3s happened to reschedule the MinIO pod on another node with a physical replica of the DRBD device, but this isn’t necessarily always the case. If K3s had rescheduled the MinIO pod on a node without a physical replica of the DRBD device, the LINSTOR CSI driver would have created what we call a “diskless” resource on that node. A “diskless” resource uses DRBD’s replication network to attach the “diskless” peer to a node in the cluster that does contain a physical replica of the volume, allowing reads and writes to occur over the network. You can think of this like NVMe-oF or iSCSI targets and initiators, except that it uses DRBD’s internal protocols. Since this may be undesirable for workloads that are sensitive to latency, such as databases, you can configure LINBIT SDS to enforce volume locality in Kubernetes.
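One way to express that locality requirement is in the storage class itself. The following is a minimal sketch, assuming the LINSTOR CSI driver's `allowRemoteVolumeAccess` parameter combined with delayed volume binding. The class name `linstor-csi-lvm-thin-r2-local`, the pool name `lvm-thin`, and the file name are hypothetical.

```shell
# Write a hypothetical StorageClass manifest that keeps pods on nodes with a local replica.
cat > linstor-csi-lvm-thin-r2-local.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-csi-lvm-thin-r2-local
provisioner: linstor.csi.linbit.com
# Delay provisioning until a pod is scheduled, so placement can follow the pod.
volumeBindingMode: WaitForFirstConsumer
parameters:
  linstor.csi.linbit.com/placementCount: "2"
  # Assumed pool name; use one that exists in your LINSTOR cluster.
  linstor.csi.linbit.com/storagePool: "lvm-thin"
  # Disallow diskless attachment over the replication network.
  linstor.csi.linbit.com/allowRemoteVolumeAccess: "false"
EOF

# Apply it from a node with cluster access:
#   kubectl apply -f linstor-csi-lvm-thin-r2-local.yaml
```

The trade-off is scheduling flexibility: with remote access disabled, a pod using such a volume can only run on the nodes that hold a physical replica.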
Total Cost of Ownership
LINBIT SDS is open source, with LINBIT offering support and services on a subscription basis. This means that the total cost of ownership (TCO) in terms of acquisition can be as low as the price of your hardware. My Potato cluster can be had for less than $100 USD from Amazon at the time I’m writing this blog. Realistically, you’re not going to run anything meaningful on a couple of Raspberry Pi clones, but I think I’ve made my point that you don’t need to spend tens of thousands of dollars for hardware to run a LINBIT SDS cluster.
The other side of TCO is the operating cost. This is where TCO involving open source software gets a bit more abstract. The price of hiring a Linux system administrator familiar with distributed storage can vary widely depending on the region you operate in, and you’ll want to have enough other work to keep an admin busy to make their salary a good investment. If that makes open source sound expensive, you’re not wrong, but LINBIT stands by its software and its users, offering subscriptions at a fraction of the cost of hiring your own distributed storage expert.
Ultimately the actual TCO will come down to the expertise your organization has on staff and how many spare cycles they can put towards maintaining an open source solution like LINBIT SDS. I feel like this is where I can insert one of my favorite quotes regarding open source software, “think free as in free speech, not free beer.”
Concluding Thoughts
I’ve proven, at least to myself and hopefully to you, the reader, that you could run LINBIT SDS and Kubernetes on a cluster that fits in a shoe box, with a price tag that’s probably lower than the shoes that came in said shoe box. The efficiency of LINSTOR, when coupled with the resilient block storage from DRBD, makes running edge services possible using replaceable hardware. The self-healing nature of Kubernetes and LINBIT SDS makes replacing a node as easy as running a single command to add it to the Kubernetes cluster, making the combination an excellent platform for running persistent containers at the edge.
After using this “Potato cluster” for a few days to write this blog, I am happy with it, but I’m also eager to tinker with other ARM-based systems that are a bit more powerful. In the past, I’ve used DRBD and Pacemaker for HA clustering on small form factor Micro ATX boards with Intel processors to great success, but the low power and size requirements of newer ARM-based systems are attractive for edge environments. If you have experience with a specific hardware platform that could fit this bill, consider joining the LINBIT Slack community and dropping me a message.