Minimize Downtime During System Maintenance

System maintenance, whether planned or in response to failure, is a necessary part of managing infrastructure. Everyone hopes for the former, rather than the latter. We do our system maintenance quarterly here at LINBIT in hopes that the latter is avoided. These maintenance windows are where we install hardware and software updates, test failovers, and give everything a once over to ensure configurations still make sense.

Normally in the case of planned maintenance, users are left waiting for access while IT does whatever they need to do. This leads to a bad user experience. In fact, that is precisely what lead to this blog post. I was looking for a BIOS update for a motherboard in the server room and was presented with this lovely message:

I just had a bad user experience. And to further the experience, I have no indication as to when it will be back up or available. I guess I’m supposed to keep checking back until I get what I was looking for… if I remember to.

Here at LINBIT we use DRBD for all of our systems. This ensures that they are always on and always available for the end users and our customers. If for some reason you landed on this site and aren’t familiar with DRBD, DRBD is an open source project developed by us, LINBIT. In its simplest form you can think of it as network raid 1, however instead of having  independent disks, you have two (or more if you’re using DRBD9) independent systems. You essentially now need to lose twice the hardware to experience downtime of services.

System Maintenance

One commonly ignored or unrealized benefit of using DRBD is that system maintenance and upgrades can be done with minimal to no interruption of services. The length of the interruption is generally tied to the type of deployment – for example if you’re using virtual machines, live migration can be achieved using DRBD resulting in no downtime. If you’re running services on hardware and they need to be stopped and restarted, your downtime will be limited to the failover time.

So how do we do this? Let say you have two servers; Frodo and Sam – Frodo is Primary (running services) and Sam is Secondary. In this example we need to update the BIOS and upgrade the RAM of our servers. Follow these steps

  1. First put the cluster into maintenance mode
  2. Next power off Sam (the secondary server)
    1. We can now install any upgrades or hardware we need to
    2. Power the system up, enter the BIOS and make sure everything is OK
    3. Reboot and update the BIOS
  3. Boot Sam into the OS
    1. At this point you can install any OS updates and reboot again if needed
  4. Once Sam is back up and everything is verified to be in good condition, bring the cluster out of maintenance mode
  5. Now migrate services to Sam – again depending on how things are configured this may or may not cause a few seconds of  unavailability of services
  6. Repeat steps 1-4 for Frodo

 

There you have it, one of the better kept secret benefits of using DRBD.

Share on facebook
Share on twitter
Share on linkedin
Share on reddit
Share on whatsapp
Share on vk
Share on email

Share this post

Brian Hellman

Brian Hellman

Since 2008, Brian has been the Chief Operating Officer of LINBIT and the head of the LINBIT USA team, where he and his team have lead continual double digit growth over his tenure. Brian is committed to bringing High Availability, Disaster Recovery and Software-Defined Storage technologies to the Open Source community. Outside of LINBIT, Brian is a dedicated philanthropist through Oregon Freemasonry and The Shriners Hospital for Children.

Talk to us

LINBIT is committed to protecting and respecting your privacy, and we’ll only use your personal information to administer your account and to provide the products and services you requested from us. From time to time, we would like to contact you about our products and services, as well as other content that may be of interest to you. If you consent to us contacting you for this purpose, please tick above to say how you would like us to contact you.

You can unsubscribe from these communications at any time. For more information on how to unsubscribe, our privacy practices, and how we are committed to protecting and respecting your privacy, please review our Privacy Policy.

By clicking submit below, you consent to allow LINBIT to store and process the personal information submitted above to provide you the content requested.

Talk to us

LINBIT is committed to protecting and respecting your privacy, and we’ll only use your personal information to administer your account and to provide the products and services you requested from us. From time to time, we would like to contact you about our products and services, as well as other content that may be of interest to you. If you consent to us contacting you for this purpose, please tick above to say how you would like us to contact you.

You can unsubscribe from these communications at any time. For more information on how to unsubscribe, our privacy practices, and how we are committed to protecting and respecting your privacy, please review our Privacy Policy.

By clicking submit below, you consent to allow LINBIT to store and process the personal information submitted above to provide you the content requested.