Linux: Replace Disk in MD

On my lab machines I use a pair of physical disks configured as RAID 1 with MD, with LVM on top of that.  This gives me a lot of flexibility with full redundancy.  Recently, however, one of the drives failed.

Identify the Failure

Here is what MD was telling me after I identified the failed drive and removed it from the chassis to send it off for an RMA.  Notice that each array has only one member (md0 has sda1, md1 has sda2) and that the status reads [2/1] [U_]: two members expected, one active.  This does not make a good RAID 1, but it is easily resolved.

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0]
524224 blocks super 1.0 [2/1] [U_]

md1 : active raid1 sda2[0]
976105280 blocks super 1.1 [2/1] [U_]
bitmap: 3/8 pages [12KB], 65536KB chunk

unused devices: <none>
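
If you would rather not read the status by eye, the same check can be scripted.  This is only a sketch: degraded_arrays is a name I made up, and it assumes the status line ends with the member map ([U_] or [UU]) as it does in the output above.

```shell
# Hypothetical helper: print the name of every md array whose member map
# contains "_" (a missing member).
# Reads /proc/mdstat unless another file is given as $1.
degraded_arrays() {
    awk '
        /^md/ { name = $1 }                         # remember the current array
        /\[[0-9]+\/[0-9]+\] \[[U_]+\]/ {            # status line, e.g. [2/1] [U_]
            if ($NF ~ /_/ && name != "") print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Run with no arguments it reads /proc/mdstat; against the output above it would print md0 and md1.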

Install the New Drive

Once the new drive is installed we will see a bare drive, in my case /dev/sdb.  Note that the two drives do not yet have matching partition tables: /dev/sda has two RAID partitions, while /dev/sdb has none.

# fdisk -l | more

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x000750ce

Device Boot Start End Blocks Id System
/dev/sda1 * 1 66 524288 fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2 66 121602 976236544 fd Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000
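
The replacement has to be at least as large as the surviving disk, otherwise the copied partition table will not fit.  Here both report 1000204886016 bytes.  As a rough sketch, the byte counts can be pulled out of the fdisk listing like this; disk_bytes is a hypothetical helper, and it assumes the "Disk /dev/...: ..., N bytes" line format shown above.

```shell
# Hypothetical helper: read `fdisk -l` output on stdin and print
# "<device> <size in bytes>" for each disk header line.
disk_bytes() {
    awk '$1 == "Disk" && $2 ~ /^\/dev\// {
        for (i = 3; i < NF; i++)
            if ($(i + 1) ~ /^bytes/) {      # the number just before "bytes"
                dev = $2
                sub(/:$/, "", dev)          # "/dev/sda:" -> "/dev/sda"
                print dev, $i
            }
    }'
}
```

Usage would be something like fdisk -l | disk_bytes, then compare the two numbers.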

Copy the Partition Table

We can use sfdisk to copy the partition table from /dev/sda to /dev/sdb.

# sfdisk -d /dev/sda | sfdisk /dev/sdb
Checking that no-one is using this disk right now ...
OK

Disk /dev/sdb: 121601 cylinders, 255 heads, 63 sectors/track
/dev/sdb: unrecognized partition table type
Old situation:
No partitions found
New situation:
Units = sectors of 512 bytes, counting from 0

Device Boot Start End #sectors Id System
/dev/sdb1 * 2048 1050623 1048576 fd Linux raid autodetect
/dev/sdb2 1050624 1953523711 1952473088 fd Linux raid autodetect
/dev/sdb3 0 - 0 0 Empty
/dev/sdb4 0 - 0 0 Empty
Warning: partition 1 does not end at a cylinder boundary

sfdisk: I don't like these partitions - nothing changed.
(If you really want this, use the --force option.)

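Notice that this actually gave us an error: sfdisk doesn’t like my partitions because partition 1 does not end on a cylinder boundary.  That is a legacy CHS geometry check; the partitions themselves start at sectors 2048 and 1050624, which are 1 MiB aligned.  We can use the --force option to override the check and write the table out anyway.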

# sfdisk -d /dev/sda | sfdisk /dev/sdb --force
Checking that no-one is using this disk right now ...
OK

Disk /dev/sdb: 121601 cylinders, 255 heads, 63 sectors/track
/dev/sdb: unrecognized partition table type
Old situation:
No partitions found
New situation:
Units = sectors of 512 bytes, counting from 0

Device Boot Start End #sectors Id System
/dev/sdb1 * 2048 1050623 1048576 fd Linux raid autodetect
/dev/sdb2 1050624 1953523711 1952473088 fd Linux raid autodetect
/dev/sdb3 0 - 0 0 Empty
/dev/sdb4 0 - 0 0 Empty
Warning: partition 1 does not end at a cylinder boundary
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes: dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
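
What matters on this drive (512-byte logical, 4096-byte physical sectors) is whether each partition starts on a 4096-byte boundary, meaning a start sector that is a multiple of 8.  A minimal sketch of that arithmetic, using a helper name of my own invention:

```shell
# Hypothetical helper: succeed if a start sector, given in 512-byte units,
# falls on a 4096-byte boundary (4096 / 512 = 8 sectors).
is_4k_aligned() {
    [ $(( $1 % 8 )) -eq 0 ]
}
```

For example, is_4k_aligned 2048 && echo aligned.  Both start sectors from the table above (2048 and 1050624) pass, while the old DOS default of 63 would not.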

Now let's see if that one took.  The fdisk output for /dev/sda and /dev/sdb should now match.

# fdisk -l | more

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x000750ce

Device Boot Start End Blocks Id System
/dev/sda1 * 1 66 524288 fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2 66 121602 976236544 fd Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdb1 * 1 66 524288 fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2 66 121602 976236544 fd Linux raid autodetect
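
Comparing the two listings by eye works, but the comparison can also be scripted from the sfdisk dumps.  The sketch below assumes dump lines of the form "/dev/sda1 : start= ..., size= ..., Id=fd" and simple sdX style device names; layout_only is a hypothetical helper.

```shell
# Hypothetical helper: reduce an `sfdisk -d` dump (stdin) to one
# "N:layout" line per partition, dropping the disk name and all spaces
# so that dumps taken from two different disks can be compared.
layout_only() {
    sed -n 's|^/dev/[a-z]*\([0-9][0-9]*\) *: *|\1:|p' | tr -d ' '
}
```

A possible use is [ "$(sfdisk -d /dev/sda | layout_only)" = "$(sfdisk -d /dev/sdb | layout_only)" ] && echo match.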

Add the New Partitions to the MD Devices

Now we just need to add the partitions from the new drive to the correct md devices and allow md to rebuild the arrays.

# mdadm --manage /dev/md0 --add /dev/sdb1
mdadm: added /dev/sdb1
# mdadm --manage /dev/md1 --add /dev/sdb2
mdadm: added /dev/sdb2
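
With only two arrays, typing the commands is fine.  With more, the commands could be derived from /proc/mdstat.  The sketch below only prints them and never runs anything, and it assumes the new disk uses the same partition numbers as the surviving member, which holds here because the table was copied.  suggest_adds is a hypothetical name, and it is meant to be used before the partitions are added, since an array that is still rebuilding also shows [U_].

```shell
# Hypothetical helper: for each degraded array, print (do not run) the mdadm
# command that would add the same-numbered partition of the new disk.
# $1 = new disk, e.g. /dev/sdb    $2 = mdstat file (default /proc/mdstat)
suggest_adds() {
    awk -v disk="$1" '
        /^md/ {
            name = $1
            part = $NF                      # last member, e.g. "sda1[0]"
            sub(/\[.*$/, "", part)          # -> "sda1"
            sub(/^[a-z]+/, "", part)        # -> "1"
        }
        /\[[0-9]+\/[0-9]+\] \[[U_]+\]/ && $NF ~ /_/ {
            print "mdadm --manage /dev/" name " --add " disk part
        }
    ' "${2:-/proc/mdstat}"
}
```

Against the degraded status shown earlier, suggest_adds /dev/sdb prints the same two commands used above; review them before running anything.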

Monitor Rebuild

Now we can watch the progress of the rebuild.  Notice that each md device now has two members.  The small md0 has already finished ([2/2] [UU]), while md1 is still recovering; its progress line shows the percentage complete, the estimated time to finish, and the speed.

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[2] sda1[0]
524224 blocks super 1.0 [2/2] [UU]

md1 : active raid1 sdb2[2] sda2[0]
976105280 blocks super 1.1 [2/1] [U_]
[>....................] recovery = 0.1% (1551488/976105280) finish=94.2min speed=172387K/sec
bitmap: 3/8 pages [12KB], 65536KB chunk

unused devices: <none>
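
If you only want the numbers, the progress line can be picked apart with awk.  This is a sketch that assumes the "recovery = N% ... finish=..." layout shown above and ignores other activity such as resync; recovery_progress is a hypothetical helper.

```shell
# Hypothetical helper: print "<array> <percent> <time to finish>" for every
# array that has a "recovery =" progress line.
# Reads /proc/mdstat unless another file is given as $1.
recovery_progress() {
    awk '
        /^md/ { name = $1 }
        /recovery =/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "=") pct = $(i + 1)           # field after "="
                if ($i ~ /^finish=/) {
                    eta = $i
                    sub(/^finish=/, "", eta)            # keep "94.2min"
                }
            }
            print name, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}
```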

Here is another one a little further along.

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[2] sda1[0]
524224 blocks super 1.0 [2/2] [UU]

md1 : active raid1 sdb2[2] sda2[0]
976105280 blocks super 1.1 [2/1] [U_]
[=====>...............] recovery = 27.5% (268450496/976105280) finish=64.0min speed=184204K/sec
bitmap: 2/8 pages [8KB], 65536KB chunk

unused devices: <none>

And finally this is what it looks like when it is completed.

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[2] sda1[0]
524224 blocks super 1.0 [2/2] [UU]

md1 : active raid1 sdb2[2] sda2[0]
976105280 blocks super 1.1 [2/2] [UU]
bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>
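
Rather than re-running cat by hand, a script can wait for the rebuild to finish.  mdadm has a --wait option for this (mdadm --wait /dev/md1).  The sketch below is a plain polling alternative built on a hypothetical rebuild_active helper, and it assumes progress lines contain "recovery =" or "resync =".

```shell
# Hypothetical helper: succeed while any array reports recovery or resync
# progress.  Reads /proc/mdstat unless another file is given as $1.
rebuild_active() {
    grep -Eq '(recovery|resync) *=' "${1:-/proc/mdstat}"
}
```

A loop such as while rebuild_active; do sleep 60; done then returns once no progress line is left.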