
Trust, but verify

DRBD tries to ensure data integrity across different computers, and it’s quite good at it.

But, as the old saying goes, “Trust, but verify” (a Russian proverb, sometimes attributed to Lenin and popularized in English by Ronald Reagan). It is a good idea to periodically test whether the nodes really hold identical data, similar to the checks that are (or at least can be) done for RAID sets.

The verify-alg digest saves bandwidth during online verification. Without this setting, all of the data has to be transferred (unless you opt to verify only a part of it). With a digest, only the checksum of each 4 KiB block has to cross the wire: 16 bytes for md5, or 20 bytes for sha1, resulting in bandwidth savings of about 99.5%.
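To make the arithmetic concrete, here is a small Python sketch. It only uses hashlib to look up the digest sizes; DRBD itself uses the kernel’s crypto API, and the calculation ignores packet headers, so treat the numbers as a rough estimate rather than a measurement.

```python
import hashlib

BLOCK_SIZE = 4096  # bytes per compared block (4 KiB), as in the text above


def savings(algorithm: str, block_size: int = BLOCK_SIZE) -> float:
    """Fraction of payload saved by sending a digest instead of the block itself."""
    digest_size = hashlib.new(algorithm).digest_size
    return 1.0 - digest_size / block_size


for name in ("md5", "sha1"):
    size = hashlib.new(name).digest_size
    print(f"{name}: {size} bytes per block, {savings(name):.2%} saved")
# prints:
# md5: 16 bytes per block, 99.61% saved
# sha1: 20 bytes per block, 99.51% saved
```

Either way, the digest is well under one percent of the block it stands for.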

The DRBD Users’ Guide has a nice chapter describing the configuration and usage, so I won’t get into this topic here.

If the volume you’re checking is actively used, you might see a few false positives in the log messages:

kernel: block drbd0: Out of sync: start=56079768, size=8 (sectors)

This happens because the application may change a data block in RAM after submitting the write request (but before the write is acknowledged!), so the two nodes end up comparing different generations of the data. If you run this check, e.g., every week and get different block numbers every time, you’re fine. If you get the same block number(s) every time, your storage might have stuck bits and be unable to write data to these blocks correctly!
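Comparing block numbers between runs by hand gets tedious, so here is a minimal Python sketch of how it could be automated. It assumes the kernel messages look exactly like the “Out of sync” line above; the function names and the sample logs (apart from that one line) are made up for illustration. It only flags ranges that repeat exactly, and it does not detect ranges that merely overlap.

```python
import re
from collections import Counter

# Matches messages of the form shown above, e.g.
# "kernel: block drbd0: Out of sync: start=56079768, size=8 (sectors)"
OOS_RE = re.compile(
    r"block (drbd\d+): Out of sync: start=(\d+), size=(\d+) \(sectors\)"
)


def out_of_sync_ranges(log_text):
    """Return the set of (device, start_sector, size_in_sectors) found in one log."""
    return {
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in OOS_RE.finditer(log_text)
    }


def recurring(runs):
    """Return the ranges that were reported in more than one verification run."""
    counts = Counter()
    for log_text in runs:
        # A set per run, so each range counts at most once per run.
        counts.update(out_of_sync_ranges(log_text))
    return sorted(r for r, n in counts.items() if n > 1)


# Hypothetical log excerpts from three weekly verification runs.
week1 = "kernel: block drbd0: Out of sync: start=56079768, size=8 (sectors)\n"
week2 = (
    "kernel: block drbd0: Out of sync: start=1024, size=8 (sectors)\n"
    "kernel: block drbd0: Out of sync: start=56079768, size=8 (sectors)\n"
)
week3 = "kernel: block drbd0: Out of sync: start=777, size=16 (sectors)\n"

print(recurring([week1, week2, week3]))
# prints [('drbd0', 56079768, 8)]
```

A range that keeps showing up in this list is the one worth a closer look at the underlying storage.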

Please note that the verify-alg setting needed here sounds similar to the data-integrity-alg option, but serves a different purpose. data-integrity-alg costs additional CPU time for every write and, like verify-alg, it is subject to false positives; see here for details on both of these points.
