Ubuntu - RAID - Software RAID - Rebuilding a Broken RAID 1

The _ in the output of cat /proc/mdstat tells me the second disk (/dev/sdb) has failed:

Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdb1[1](F)
      129596288 blocks [2/1] [U_]

U means up, _ means down [https://raid.wiki.kernel.org/index.php/Mdstat#.2Fproc.2Fmdstat]
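
If you want to check for this condition from a script, the "_" in the status field is enough to flag a degraded array. A minimal sketch, run here against a hard-coded sample snippet (on a real system you would read /proc/mdstat instead):

```shell
# Sample mdstat snippet; replace with: cat /proc/mdstat
mdstat_sample='md0 : active raid1 sda1[0]
      129596288 blocks [2/1] [U_]'

# Any "_" inside the [..] status field means a member is missing or failed.
status=healthy
if printf '%s\n' "$mdstat_sample" | grep -q '_]'; then
  status=degraded
fi
echo "$status"
```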

First we mark the partition as failed (if the kernel has not already done so) and remove it from the RAID array:

mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1

Make sure the server can boot from a degraded RAID array:

grep BOOT_DEGRADED /etc/initramfs-tools/conf.d/mdadm

If it says BOOT_DEGRADED=true, continue on. If not, set it to true in that file and rebuild the initramfs using the following command:

update-initramfs -u
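
The check-and-fix can be scripted. A sketch that works on a temporary copy of the file; on the server the real path is /etc/initramfs-tools/conf.d/mdadm, and update-initramfs -u must still be run afterwards for the change to take effect:

```shell
# Work on a temp file here so the sketch is safe to run anywhere;
# substitute /etc/initramfs-tools/conf.d/mdadm on the server.
conf=$(mktemp)
echo 'BOOT_DEGRADED=false' > "$conf"

# Set BOOT_DEGRADED=true if it is not already set.
if ! grep -q '^BOOT_DEGRADED=true' "$conf"; then
  sed -i 's/^BOOT_DEGRADED=.*/BOOT_DEGRADED=true/' "$conf"
fi

result=$(cat "$conf")
echo "$result"
rm -f "$conf"
```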

We can now safely shut down the server (the 10 schedules the halt in ten minutes; use now to halt immediately):

shutdown -h 10

Replace the failed disk. For hot-swap disks this can be done while the server is running, but a server without hot-swap disks must be shut down first.

After that, boot the server from the first disk (via the BIOS/UEFI). Make sure you boot into recovery mode. Select the root shell and remount the root filesystem read/write:

mount -o remount,rw /

Now copy the partition table to the new (in my case, empty) disk:

sfdisk -d /dev/sda | sfdisk /dev/sdb

This will overwrite the partition table and erase any data on the new disk, so double-check the device names before running it.
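
After cloning you can sanity-check that both disks now have the same layout: the sfdisk -d dumps should match once the device names are normalized away. A sketch using assumed, simplified dump lines (on a real system, capture each side with sfdisk -d):

```shell
# Assumed one-line dumps standing in for: sfdisk -d /dev/sda and /dev/sdb
dump_a='/dev/sda1 : start=2048, size=259192576, type=fd'
dump_b='/dev/sdb1 : start=2048, size=259192576, type=fd'

# Replace the device names so only the geometry is compared.
norm_a=$(printf '%s\n' "$dump_a" | sed 's|/dev/sd[a-z]|DISK|g')
norm_b=$(printf '%s\n' "$dump_b" | sed 's|/dev/sd[a-z]|DISK|g')

if [ "$norm_a" = "$norm_b" ]; then
  echo "partition tables match"
fi
```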

Add the disk to the RAID array and wait for the rebuilding to be complete:

mdadm --manage /dev/md0 --add /dev/sdb1
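
If you want a script to block until the rebuild is done rather than watching by eye, polling /proc/mdstat for the recovery marker is enough. A sketch, fed a finished sample here so the loop exits immediately; on a real system replace the variable with the output of cat /proc/mdstat:

```shell
# Sample of a finished array; substitute: mdstat=$(cat /proc/mdstat)
mdstat='md0 : active raid1 sdb1[1] sda1[0]
      129596288 blocks [2/2] [UU]'

# While any array is still recovering or resyncing, wait and re-check.
while printf '%s\n' "$mdstat" | grep -qE 'recovery|resync'; do
  sleep 10
done
echo "rebuild complete"
```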

This is a nice progress command:

watch cat /proc/mdstat

It will take a while on large disks:

Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdb1[1]
      129596288 blocks [2/1] [U_]
      [=>...................]  recovery = 2.6% (343392/129596288) finish=67min speed=98840K/sec

unused devices: <none>
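
For monitoring, the percentage can be pulled out of that recovery line with sed. A minimal sketch using a hard-coded sample line like the one above (point it at /proc/mdstat on a live system):

```shell
# Sample progress line; on a real system: line=$(grep recovery /proc/mdstat)
line='[=>...................]  recovery = 2.6% (343392/129596288) finish=67min speed=98840K/sec'

# Capture the number between "recovery = " and "%".
pct=$(printf '%s\n' "$line" | sed -n 's/.*recovery = *\([0-9.]*\)%.*/\1/p')
echo "$pct"
```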