Recently it was discovered that the RAID 5/6 implementation in Btrfs is completely broken: it miscalculates parity, which is rather important in RAID 5 and RAID 6.
So what to do with an existing setup that’s running native Btrfs RAID 5/6?
Well, fortunately, this issue doesn’t affect non-parity based RAID levels such as 1 and 0 (and combinations thereof), and it also doesn’t affect a Btrfs filesystem sitting on top of a standard Linux software RAID (md) device.
So if downtime isn’t a problem, we could re-create the RAID 5/6 array using md and put Btrfs back on top and restore our data… or, thanks to Btrfs itself, we can live migrate it to RAID 10!
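If you did want to go the md route instead, a minimal sketch might look something like the following (the device names /dev/sd[b-e] and the md device number are just placeholders for illustration; adjust to suit your hardware):

mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde  # build a RAID 6 md array
mkfs.btrfs /dev/md0   # put a single-device Btrfs filesystem on top
mount /dev/md0 /mnt   # mount it, then restore your data from backup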
A few caveats though. When using RAID 10, space efficiency is reduced to 50% of your drives, no matter how many you have (this is because it’s mirrored). By comparison, with RAID 5 you lose a single drive’s worth of space, with RAID 6 it’s two, no matter how many drives you have.
This is important to note, because a RAID 5 setup with 4 drives that is using more than two thirds of the total space will be too big to fit on RAID 10. Btrfs also needs space for System, Metadata and Reserves, so I can’t say for sure how much free space you will need for the migration, but I expect considerably more than 50%. In such cases, you may need to add more drives to the Btrfs array before the migration begins.
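To get a rough idea of whether your data will fit, it’s worth checking current usage before you start. For example, four 4TB drives give roughly 12TB of usable space under RAID 5 but only 8TB under RAID 10, so anything much over half the raw capacity simply won’t fit. A quick check (assuming the filesystem is mounted at /mnt, as below):

btrfs filesystem usage /mnt   # how much space is allocated and used, per device
btrfs filesystem df /mnt      # summary by block group type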
So, you will need:
- At least 4 drives
- An even number of drives (unless you keep one as a spare)
- Data in use that is much less than 50% of the total provided by all drives (number of disks / 2)

Of course, you’ll have a good, tested, reliable backup or two before you start this. Right? Good.
Plug any new disks in and partition or luksFormat them if necessary. We will assume your new drive is /dev/sdg, you’re using dm-crypt and that Btrfs is mounted at /mnt. Substitute these for your actual settings.
cryptsetup luksFormat /dev/sdg
UUID="$(cryptsetup luksUUID /dev/sdg)"
echo "luks-${UUID} UUID=${UUID} none" >> /etc/crypttab
cryptsetup luksOpen /dev/sdg luks-${UUID}
btrfs device add /dev/mapper/luks-${UUID} /mnt
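Before kicking off the conversion, it’s worth confirming the new device has actually joined the filesystem; both of these are standard btrfs-progs subcommands:

btrfs filesystem show /mnt   # list the devices that make up the filesystem
btrfs device usage /mnt      # show how space is allocated on each device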
The migration is going to take a long time, so best to run this in a tmux or screen session.
screen
time btrfs balance start -dconvert=raid10 -mconvert=raid10 /mnt

After this completes, check that everything has been migrated to RAID 10.
btrfs fi df /mnt
Data, RAID10: total=2.19TiB, used=2.18TiB
System, RAID10: total=96.00MiB, used=240.00KiB
Metadata, RAID10: total=7.22GiB, used=5.40GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
If you still see some RAID 5/6 entries, run the same conversion command again and then check that everything has migrated successfully.
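While a balance is running you can also check on its progress from another shell:

btrfs balance status /mnt   # show progress of a running (or paused) balance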
For good measure, let’s rebalance again without the conversion options (this will take a while).
time btrfs balance start --full-balance /mnt

Now we can defragment everything.
time btrfs filesystem defragment /mnt      # this defrags the metadata
time btrfs filesystem defragment -r /mnt   # this defrags data
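Finally, although it isn’t part of the migration itself, running a scrub is a good way to confirm that everything on the new RAID 10 profile reads back and checksums correctly (both are standard btrfs-progs commands):

btrfs scrub start /mnt    # verify all data and metadata checksums
btrfs scrub status /mnt   # check on progress and results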