Linux Software RAID Mirror In-Place Upgrade
Ran out of space on an old CentOS 6.8 server over the weekend, and so had to upgrade the main data mirror from a pair of Hitachi 2TB HDDs to a pair of 4TB WD Reds I had lying around.
The volume was using mdadm, aka Linux Software RAID, and was a simple mirror (RAID1) with LVM volumes on top of the mirror. The safest upgrade path is to build a new mirror on the new disks and sync the data across, but there weren't any free SATA ports on the motherboard, so instead I opted to do an in-place upgrade. I haven't done this for a good few years, and hit a couple of wrinkles on the way, so here are the notes from the journey.
Below, the physical disk partitions are /dev/sdb1 and /dev/sdd1, the mirror is /dev/md1, and the LVM volume group is extra.
1. Back up your data (or check you have known good rock-solid backups in place), because this is a complex process with plenty that could go wrong. You want an exit strategy.
2. Break the mirror, failing and removing one of the old disks:
mdadm --manage /dev/md1 --fail /dev/sdb1
mdadm --manage /dev/md1 --remove /dev/sdb1
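As a quick sanity check (my addition, not from the original session), a degraded RAID1 shows up in /proc/mdstat with a status like [2/1] [U_], where the underscore marks the missing disk. A small sketch, run here against a sample line rather than the live file:

```shell
# Sketch: detect a degraded mirror from an mdstat status line.
# A sample line is used here; on a real system read /proc/mdstat instead.
line='1953511936 blocks [2/1] [U_]'
case "$line" in
  *'[U_]'* | *'[_U]'*) state=degraded ;;
  *)                   state=healthy ;;
esac
echo "$state"
```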
3. Shut down the server, remove the disk you've just failed, and insert your replacement. Boot up again.
4. Since these are brand new disks, we need to partition them. And since these are 4TB disks, we need to use parted rather than the older fdisk.
parted /dev/sdb
print
mklabel gpt
# Create a partition, skipping the 1st MB at beginning and end
mkpart primary 1 -1
unit s
print
# Not sure if this flag is required, but whatever
set 1 raid on
quit
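For what it's worth, the same partitioning can be scripted non-interactively with parted's -s flag - a sketch rather than a tested recipe, with /dev/sdb assumed as the target (the parted commands are left commented out so you can check the device name first):

```shell
# 1MiB alignment puts the first partition at sector 2048 on a
# 512-byte-sector disk:
start_sector=$(( 1 * 1024 * 1024 / 512 ))
echo "$start_sector"

# Scripted equivalent of the interactive session above (uncomment to run;
# a negative end position counts back from the end of the disk):
# parted -s /dev/sdb mklabel gpt
# parted -s /dev/sdb mkpart primary 1MiB -1MiB
# parted -s /dev/sdb set 1 raid on
```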
5. Then add the new partition back into the mirror. Although this is much bigger, it will just sync up at the old size, which is what we want for now.
mdadm --manage /dev/md1 --add /dev/sdb1
# This will take a few hours to resync, so let's keep an eye on progress
watch -n5 cat /proc/mdstat
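If you'd rather see just the percentage than the whole of /proc/mdstat, the recovery line can be parsed with awk. A sketch against a sample of typical mdstat output - on a real system, substitute `cat /proc/mdstat` for the sample variable:

```shell
# Sample mdstat content; the numbers here are illustrative only
mdstat='md1 : active raid1 sdb1[2] sdd1[1]
      1953511936 blocks [2/1] [_U]
      [===>.................]  recovery = 17.5% (342881920/1953511936) finish=214.0min speed=125419K/sec'

# Pull out just the percent-complete figure from the recovery line
pct=$(printf '%s\n' "$mdstat" | awk -F'recovery = ' '/recovery/ { split($2, a, "%"); print a[1] }')
echo "$pct"
```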
6. Once fully resynced, rinse and repeat with the other disk - fail and remove /dev/sdd1, shut down and swap the new disk in, boot up again, partition the new disk, and add the new partition into the mirror.
7. Once fully resynced again, you'll be back where you started - a nice stable mirror of your original size, but with shiny new hardware underneath. Now we can grow the mirror to take advantage of all this new space we've got.
mdadm --grow /dev/md1 --size=max
mdadm: component size of /dev/md1 has been set to 2147479552K
Oops! That size doesn't look right - that's 2TB, but these are 4TB disks?!
Turns out there's a 2TB limit with mdadm metadata version 0.90, which this mirror is using, as documented on https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#The_version-0.90_Superblock_Format.
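The number mdadm reported makes sense once you do the arithmetic: 2147479552K is just shy of 2^31 KiB, which matches the 2TB ceiling described on that wiki page. A quick check:

```shell
# The component size mdadm settled on, in KiB
limit_kib=2147479552
gib=$(( limit_kib / 1024 / 1024 ))       # integer GiB
shortfall=$(( (1 << 31) - limit_kib ))   # KiB below exactly 2^31
echo "$gib GiB, $shortfall KiB short of 2 TiB"
```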
mdadm --detail /dev/md1
/dev/md1:
        Version : 0.90
  Creation Time : Thu Aug 26 21:03:47 2010
     Raid Level : raid1
     Array Size : 2147483648 (2048.00 GiB 2199.02 GB)
  Used Dev Size : -1
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Nov 27 11:49:44 2017
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : f76c75fb:7506bc25:dab805d9:e8e5d879
         Events : 0.1438

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       49        1      active sync   /dev/sdd1
Unfortunately, mdadm doesn't support upgrading the metadata version. But there is a workaround documented on that wiki page, so let's try that:
mdadm --detail /dev/md1
# (as above)

# Stop/remove the mirror
mdadm --stop /dev/md1
mdadm: Cannot get exclusive access to /dev/md1:Perhaps a running process, mounted filesystem or active volume group?

# Okay, deactivate our volume group first
vgchange --activate n extra

# Retry stop
mdadm --stop /dev/md1
mdadm: stopped /dev/md1

# Recreate the mirror with 1.0 metadata (you can't go to 1.1 or 1.2, because they're located differently)
# Note that you should specify all your parameters in case the defaults have changed
mdadm --create /dev/md1 -l1 -n2 --metadata=1.0 --assume-clean --size=2147483648 /dev/sdb1 /dev/sdd1
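One thing worth scripting (my sketch, not from the original post): the --size passed to --create needs to match the old Array Size exactly, so it's safer to parse it out of the earlier mdadm --detail output than to retype it. Shown here against a sample of the relevant line:

```shell
# Sample of the detail line; on a real system capture it with:
#   mdadm --detail /dev/md1 | grep 'Array Size'
detail='     Array Size : 2147483648 (2048.00 GiB 2199.02 GB)'

# Take everything after the colon, then the first whitespace-separated field
old_size=$(printf '%s\n' "$detail" | awk -F':' '{ print $2 }' | awk '{ print $1 }')
echo "$old_size"
```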
That outputs:
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Aug 26 21:03:47 2010
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Aug 26 21:03:47 2010
mdadm: largest drive (/dev/sdb1) exceeds size (2147483648K) by more than 1%
Continue creating array? y
mdadm: array /dev/md1 started.
Success! Now let's reactivate that volume group again:
vgchange --activate y extra
3 logical volume(s) in volume group "extra" now active
Another wrinkle is that recreating the mirror will have changed the array UUID, so we need to update the old UUID in /etc/mdadm.conf:
# Double-check metadata version, and record volume UUID
mdadm --detail /dev/md1

# Update the /dev/md1 entry UUID in /etc/mdadm.conf
$EDITOR /etc/mdadm.conf
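mdadm --detail --scan will print an up-to-date ARRAY line you can paste into /etc/mdadm.conf. If you'd rather patch the UUID in place, sed works too - a sketch against a sample conf line, with a made-up new UUID standing in for whatever your recreated array reports:

```shell
old_uuid='f76c75fb:7506bc25:dab805d9:e8e5d879'
new_uuid='0b6f42e4:1c0d8f32:a2b9c1d0:3e4f5a6b'   # hypothetical - take yours from mdadm --detail /dev/md1
conf='ARRAY /dev/md1 level=raid1 num-devices=2 UUID=f76c75fb:7506bc25:dab805d9:e8e5d879'

# Swap the old UUID for the new one (on a real system, sed -i the conf file)
updated=$(printf '%s\n' "$conf" | sed "s/UUID=$old_uuid/UUID=$new_uuid/")
echo "$updated"
```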
So now, let's try that mdadm --grow command again:
mdadm --grow /dev/md1 --size=max
mdadm: component size of /dev/md1 has been set to 3907016564K

# Much better! This will take a while to sync up again now
watch -n5 cat /proc/mdstat
8. (You can wait for this to finish resyncing first, but it's optional.) Now we need to let LVM know that the physical volume underneath it has changed size:
# Check our starting point
pvdisplay /dev/md1
  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               extra
  PV Size               1.82 TiB / not usable 14.50 MiB
  Allocatable           yes
  PE Size               64.00 MiB
  Total PE              29808
  Free PE               1072
  Allocated PE          28736
  PV UUID               mzLeMW-USCr-WmkC-552k-FqNk-96N0-bPh8ip

# Resize the LVM physical volume
pvresize /dev/md1
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for system
  Cannot process volume group system
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for extra
  Cannot process volume group extra
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for #orphans_lvm1
  Cannot process standalone physical volumes
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for #orphans_pool
  Cannot process standalone physical volumes
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for #orphans_lvm2
  Cannot process standalone physical volumes
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for system
  Cannot process volume group system
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for extra
  Cannot process volume group extra
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for #orphans_lvm1
  Cannot process standalone physical volumes
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for #orphans_pool
  Cannot process standalone physical volumes
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for #orphans_lvm2
  Cannot process standalone physical volumes
  Failed to find physical volume "/dev/md1".
  0 physical volume(s) resized / 0 physical volume(s) not resized
Oops - that doesn't look good. But it turns out it's just a weird locking type default. If we tell pvresize it can use local filesystem write locks we should be good (cf. /etc/lvm/lvm.conf):
# Let's try that again...
pvresize --config 'global {locking_type=1}' /dev/md1
  Physical volume "/dev/md1" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized

# Double-check the PV Size
pvdisplay /dev/md1
  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               extra
  PV Size               3.64 TiB / not usable 21.68 MiB
  Allocatable           yes
  PE Size               64.00 MiB
  Total PE              59616
  Free PE               30880
  Allocated PE          28736
  PV UUID               mzLeMW-USCr-WmkC-552k-FqNk-96N0-bPh8ip
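As a sanity check on those numbers: Total PE went from 29808 to 59616 at a 64 MiB PE size, so the physical volume has exactly doubled:

```shell
# Physical extent counts from the pvdisplay output, converted to GiB
pe_size_mib=64
old_gib=$(( 29808 * pe_size_mib / 1024 ))
new_gib=$(( 59616 * pe_size_mib / 1024 ))
echo "old: $old_gib GiB, new: $new_gib GiB"
```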
Success!
Finally, you can now resize your logical volumes using lvresize as you usually would.
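For completeness, a sketch of that final step - the VG name extra is real, but the LV name data and what to do with the free space are assumptions, so adjust to taste. lvresize's --resizefs flag grows the filesystem in the same step:

```shell
# Claim all free extents (30880 in the pvdisplay output above) for one LV,
# growing its filesystem at the same time. Left commented out since the LV
# name is a placeholder:
#   lvresize --resizefs -l +100%FREE /dev/extra/data
# The equivalent with an explicit extent count:
free_pe=30880
cmd="lvresize --resizefs -l +${free_pe} /dev/extra/data"
echo "$cmd"
```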