The eight-week cadence brings the new DRBD releases 9.1.15 and 9.2.2. The two releases contain roughly the same bug fixes. What these fixes have in common is that the underlying bugs are triggered only rarely.
We found one while running drbd-9.2 in the internal virtualization cluster for our lab infrastructure, following the ‘eat your own dog food’ strategy. About every three to five weeks, we experienced kernel oopses. At first, no one could find an explanation. Then, one day, talking it through in a dialogue, Joel and I were able to reconstruct the chain of events that caused this type of crash.
Joel was able to build a reproducer script, and we were able to fix the bug. Later, we realized that the bug is not specific to drbd-9.2; to trigger it on drbd-9.1, a custom CPU mask needs to be configured (which most users do not have).
This anecdote is, of course, a personal success story. More importantly, fixes like this one improve the technology and make it more stable for you and the rest of the user base.
It might be time to consider switching to drbd-9.2 as well. It brings several optimizations compared to 9.1 and is ‘drop-in’ compatible. Among the improvements:
- better (lower) write latency, in general
- better (finer-grained) lockout between resync and application IO
- much better handling of thinly provisioned storage during resync
In other news, the team has released a video demo that shows people how to set up a high availability (HA) ownCloud cluster using DRBD and DRBD Reactor. The demo also demonstrates a failover of services during a node failure.
For those interested in exploring the subject in more detail, see the corresponding DRBD and DRBD Reactor documentation and the ownCloud documentation.
You can also join our LINBIT community to interact with our team directly.