When it’s time for the satellite to leave the cluster, the controller takes action!
LINBIT® introduced LINSTOR®’s auto-evict feature starting with LINSTOR version 1.10. In simple terms, the feature evicts a satellite node from the cluster when it stops responding. But the possibilities that the feature opens up for you, including creating a self-healing cluster, are more interesting than you might think.
As you may know, LINSTOR uses DRBD® to create replicas of your data. DRBD does not distribute data across nodes; it creates identical, real-time, one-to-one replicas on each of your cluster nodes. This is perfect for high-availability use cases.
Before Auto-evict
You never want a node failure in a cluster, but you should always expect and prepare for one, because a node can fail and stay offline for many reasons.
Before the auto-evict feature existed, if a node failed, the controller would wait until the connection to the node was restored and would prevent you from modifying or deleting resources on that node. LINSTOR also did not automatically create additional copies of your resources to maintain the desired number of replicas within your cluster when a node failed or went offline.
After Auto-evict
Now, with the auto-evict feature configured, you can set a timer after which a node that has stopped communicating with the cluster is evicted. When the timer expires, LINSTOR marks that node as “evicted” and then automatically reassigns the affected DRBD resources to other nodes to maintain the minimum replica count that you configured for your resources. By combining auto-evict with a minimum replica count for your resources, you can make your cluster self-healing after a node failure.
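As a minimal sketch, assuming a working LINSTOR client and a satellite node named node-c (a placeholder name, as is the resource group name my_rg), the configuration could look something like the following. The property names shown here reflect the auto-evict settings described in the LINSTOR User’s Guide; verify them against the guide for your LINSTOR version.

# Evict a satellite after it has been disconnected for 60 minutes
linstor controller set-property DrbdOptions/AutoEvictAfterTime 60

# Only re-place resources that would otherwise drop below two replicas
linstor controller set-property DrbdOptions/AutoEvictMinReplicaCount 2

# Exclude a particular node from ever being auto-evicted
linstor node set-property node-c DrbdOptions/AutoEvictAllowEviction false

# Auto-place new resources with three replicas through a resource group
linstor resource-group create my_rg --place-count 3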
After LINSTOR evicts a node from the cluster, you have several options. Two of these used to be panic and scramble. But not anymore.
Evicting a node allows you to modify or delete its resources freely. LINSTOR otherwise prevents you from modifying a resource on a node that is not connected to the LINSTOR controller. Because an evicted node is treated as removed from the cluster, that constraint no longer applies.
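For instance, a hedged sketch of cleaning up after an eviction, assuming the evicted node is named node-c and it held a replica of a resource called my_res (both placeholder names):

# See which resources still reference the evicted node
linstor resource list

# Remove the resource entry that belonged to the evicted node
linstor resource delete node-c my_res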
You can fix the issue with the node and make it available within your cluster again. Just be aware that the cluster will not accept the returning node automatically. Unless you use a cluster manager such as DRBD Reactor or Pacemaker, you will need to use the node restore command.
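A minimal sketch of restoring a repaired node, again assuming the placeholder node name node-c:

# Bring the repaired satellite back into the cluster
linstor node restore node-c

# Verify that the node is no longer flagged as evicted
linstor node list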
Alternatively, if you used LINSTOR resource groups to configure auto-placement of your resources and you want to give the returning node a fresh start, you can use the node lost command to delete all LINSTOR resources and configurations on that node.
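A hedged sketch of that path, using the same placeholder node name and an example IP address:

# Permanently remove the evicted node and all LINSTOR objects on it
linstor node lost node-c

# Later, re-create the node to give it a fresh start
linstor node create node-c 192.168.0.12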
Conclusion
Auto-evict is a “can’t live without it” feature that gives you better control of your LINSTOR cluster and allows you to manage node failures more efficiently than you could before.
For more configuration details and technical information about auto-evict and the node restore and node lost commands, check out the auto-evict section in the LINSTOR User’s Guide.