Replacing a disk in a RAID array
If a disk in a BareMetal server’s RAID array fails, take the failed disk out of service: remove it from the array, request a replacement drive from support, and then add the new disk to the array.
Note
This guide does not apply to disk failures in RAID 0 arrays. Such arrays are not fault-tolerant; a single disk failure results in complete data loss and requires full array reconstruction.
This guide covers a standard RAID10 configuration with four HDDs under Ubuntu 24.04. If your setup differs from this standard configuration, adjust the following steps accordingly.
Remove the failed disk from the RAID array
-
Connect to the server over SSH:
ssh root@<server_public_IP_address>

You can also access the server via the KVM console using your username and password.
-
Check the current disk and partition layout in the RAID array:
cat /proc/mdstat

Result:

Personalities : [raid10] [raid0] [raid1] [raid6] [raid5] [raid4]
md3 : active raid10 sdb4[1] sdc4[2] sdd4[3] sda4[0]
      3893569536 blocks super 1.2 256K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 0/30 pages [0KB], 65536KB chunk
md2 : active raid10 sdc3[2] sdb3[1] sdd3[3] sda3[0]
      2095104 blocks super 1.2 256K chunks 2 near-copies [4/4] [UUUU]
md1 : active raid10 sdc2[2] sdb2[1](F) sda2[0] sdd2[3]
      8380416 blocks super 1.2 256K chunks 2 near-copies [4/3] [U_UU]

As shown, the server has three RAID devices:

md1: composed of the sda2, sdb2, sdc2, and sdd2 partitions (one partition on each physical disk).
md2: composed of the sda3, sdb3, sdc3, and sdd3 partitions.
md3: composed of the sda4, sdb4, sdc4, and sdd4 partitions.

The (F) flag next to sdb2 in the md1 line shows that the sdb disk has failed.

Additionally, you can check the role of each partition in the RAID array:
lsblk

Result:

NAME      MAJ:MIN RM  SIZE RO TYPE   MOUNTPOINTS
sda         8:0    0  1.8T  0 disk
├─sda1      8:1    0  299M  0 part
├─sda2      8:2    0    4G  0 part
│ └─md1     9:1    0    8G  0 raid10 /boot
├─sda3      8:3    0    1G  0 part
│ └─md2     9:2    0    2G  0 raid10 [SWAP]
└─sda4      8:4    0  1.8T  0 part
  └─md3     9:3    0  3.6T  0 raid10 /
sdb         8:16   0  1.8T  0 disk
├─sdb1      8:17   0  299M  0 part
├─sdb2      8:18   0    4G  0 part
│ └─md1     9:1    0    8G  0 raid10 /boot
├─sdb3      8:19   0    1G  0 part
│ └─md2     9:2    0    2G  0 raid10 [SWAP]
└─sdb4      8:20   0  1.8T  0 part
  └─md3     9:3    0  3.6T  0 raid10 /
sdc         8:32   0  1.8T  0 disk
├─sdc1      8:33   0  299M  0 part
├─sdc2      8:34   0    4G  0 part
│ └─md1     9:1    0    8G  0 raid10 /boot
├─sdc3      8:35   0    1G  0 part
│ └─md2     9:2    0    2G  0 raid10 [SWAP]
└─sdc4      8:36   0  1.8T  0 part
  └─md3     9:3    0  3.6T  0 raid10 /
sdd         8:48   0  1.8T  0 disk
├─sdd1      8:49   0  299M  0 part
├─sdd2      8:50   0    4G  0 part
│ └─md1     9:1    0    8G  0 raid10 /boot
├─sdd3      8:51   0    1G  0 part
│ └─md2     9:2    0    2G  0 raid10 [SWAP]
└─sdd4      8:52   0  1.8T  0 part
  └─md3     9:3    0  3.6T  0 raid10 /

In our example:

md1: the /boot partition.
md2: the SWAP partition.
md3: the / (root) partition.
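The (F) marker can also be picked out of /proc/mdstat with a quick grep. A minimal sketch; the sample line below copies the example output in this guide, so on a live server you would read the real file instead:

```shell
# Sketch: spot a failed RAID member in /proc/mdstat by its "(F)" flag.
# On a live server, replace the sample line with:
#   grep '(F)' /proc/mdstat
mdstat_line='md1 : active raid10 sdc2[2] sdb2[1](F) sda2[0] sdd2[3]'
# Match a member like "sdb2[1](F)" and keep only the partition name.
failed=$(echo "$mdstat_line" | grep -o 'sd[a-z][0-9]\[[0-9]*\](F)' | cut -d'[' -f1)
echo "failed partition: $failed"
```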
-
Assume the /dev/sdb disk has failed. Detach the /dev/sdb disk's partitions from the RAID array:

mdadm /dev/md1 --remove /dev/sdb2
mdadm /dev/md2 --remove /dev/sdb3
mdadm /dev/md3 --remove /dev/sdb4

The mdadm utility refuses to remove a partition that the array still considers active, since removing an active member could degrade the array; in that case it returns a Device busy error:

mdadm: hot remove failed for /dev/sdb2: Device or resource busy

If this happens, first mark each partition as failed, then retry the removal:

mdadm /dev/md1 --fail /dev/sdb2
mdadm /dev/md1 --remove /dev/sdb2
mdadm /dev/md2 --fail /dev/sdb3
mdadm /dev/md2 --remove /dev/sdb3
mdadm /dev/md3 --fail /dev/sdb4
mdadm /dev/md3 --remove /dev/sdb4

-
Get the failed disk's ID:
fdisk -l

Result:

...
Disk /dev/sdb: 838.36 GiB, 900185481216 bytes, 1758174768 sectors
Disk model: SAMSUNG MZ7GE900
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: CD2ACB4C-1618-4BAF-A6BB-D2B9********
...

Save the Disk identifier value for your support ticket.
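The fail-and-remove sequence above can also be generated with a short loop. A minimal sketch, assuming the example layout (md1, md2, and md3 on partitions 2, 3, and 4 of each disk); it only prints the commands so you can review them before running anything:

```shell
# Sketch: print the mdadm fail/remove commands for every RAID member
# partition of a failed disk. The md-to-partition mapping follows the
# example layout in this guide; review the output before executing it.
failed_disk=sdb
for pair in md1:2 md2:3 md3:4; do
  md=${pair%%:*}      # RAID device, e.g. md1
  part=${pair##*:}    # partition number on the failed disk, e.g. 2
  echo "mdadm /dev/$md --fail /dev/${failed_disk}${part}"
  echo "mdadm /dev/$md --remove /dev/${failed_disk}${part}"
done
```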
Request physical disk replacement
Submit a disk replacement ticket to technical support, including your BareMetal server ID and failed disk ID.
Wait for data center engineers to replace the failed disk.
Add the new disk to your RAID array
After the physical drive replacement, partition the new disk and add it to the existing RAID array.
-
Use the gdisk utility to determine the partition table type: GPT or MBR. If needed, install gdisk on your server’s operating system.

Run the following command, specifying the ID of the remaining operational disk in the RAID array:

gdisk -l /dev/sda

Depending on the partition table type, the result will be one of the following.

GPT:

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present
...

MBR:

Partition table scan:
  MBR: MBR only
  BSD: not present
  APM: not present
  GPT: not present
...

-
Copy the partition table layout from the remaining operational disk in the RAID array to the new disk:
If the source disk uses a GPT partition table:
-
Create a copy of the source disk partition table:
sgdisk --backup=table /dev/sda

Result:

The operation has completed successfully.

-
Restore the partition table from the backup copy to the new disk:
sgdisk --load-backup=table /dev/sdb

Result:

The operation has completed successfully.

-
Assign a new random UUID to the new disk:
sgdisk -G /dev/sdb

Result:

The operation has completed successfully.
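The three GPT steps above can be collected into one sequence. A minimal dry-run sketch; SRC and DST follow the example (sda is the healthy disk, sdb is the new one) and must be verified on your server before use. Remove the echo prefixes only once you are sure of the device names, since loading a backup overwrites the destination partition table:

```shell
# Sketch: the three sgdisk steps as one dry run. Each command is only
# printed; remove the leading "echo" to actually execute the copy.
SRC=/dev/sda   # remaining operational disk (assumption from the example)
DST=/dev/sdb   # newly installed disk (assumption from the example)
echo "sgdisk --backup=table $SRC"
echo "sgdisk --load-backup=table $DST"
echo "sgdisk -G $DST"
```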
If the source disk uses an MBR partition table:
-
Copy the partition table:
sfdisk -d /dev/sda | sfdisk /dev/sdb

Where:

/dev/sda: the remaining operational disk in the RAID array, used as the partition table template.
/dev/sdb: the new disk that receives the partition table copy from the source disk.
-
If the new partitions are not visible after copying, reload the partition table:

sfdisk -R /dev/sdb

On newer util-linux releases, where the -R option has been removed from sfdisk, reload the partition table with partprobe /dev/sdb or blockdev --rereadpt /dev/sdb instead.
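You can sanity-check the copy by counting partition entries in an sfdisk dump. A minimal sketch; the sample dump below is an abridged, illustrative version of the example layout (the start/size values are placeholders), so on a live server you would pipe in the real dump instead:

```shell
# Sketch: count partition entries in an `sfdisk -d` dump to confirm
# that all four partitions were copied. On a live server use:
#   sfdisk -d /dev/sdb
dump='label: dos
/dev/sdb1 : start=2048, size=612352, type=83
/dev/sdb2 : start=614400, size=8388608, type=fd
/dev/sdb3 : start=9003008, size=2097152, type=fd
/dev/sdb4 : start=11100160, size=3887253504, type=fd'
# Each partition entry in the dump starts with its device path.
nparts=$(echo "$dump" | grep -c '^/dev/')
echo "partitions on sdb: $nparts"
```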
-
-
Add the disk to the RAID array by sequentially adding each of its partitions to their corresponding RAID components. The correspondence between disk partitions and RAID components was described earlier in Remove the failed disk from the RAID array.
Run the following commands:
mdadm /dev/md1 --add /dev/sdb2
mdadm /dev/md2 --add /dev/sdb3
mdadm /dev/md3 --add /dev/sdb4

Result:

mdadm: added /dev/sdb2
mdadm: added /dev/sdb3
mdadm: added /dev/sdb4

Once added to the array, the disk begins synchronizing. The sync speed depends on the disk capacity and type (SSD or HDD).

-
Make sure the new disk has been successfully added to the RAID array:

cat /proc/mdstat

Result:

Personalities : [raid10] [raid0] [raid1] [raid6] [raid5] [raid4]
md3 : active raid10 sdb4[1] sdc4[2] sdd4[3] sda4[0]
      3893569536 blocks super 1.2 256K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 0/30 pages [0KB], 65536KB chunk
md2 : active raid10 sdc3[2] sdb3[1] sdd3[3] sda3[0]
      2095104 blocks super 1.2 256K chunks 2 near-copies [4/4] [UUUU]
md1 : active raid10 sdc2[2] sdb2[1] sda2[0] sdd2[3]
      8380416 blocks super 1.2 256K chunks 2 near-copies [4/4] [UUUU]

unused devices: <none>

Once synchronization completes, all three arrays again report [4/4] and [UUUU].

-
Install the Linux bootloader on the new disk:
grub-install /dev/sdb

Result:

Installing for i386-pc platform.
Installation finished. No error reported.
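While the array rebuilds, you can follow progress with watch cat /proc/mdstat, or extract the percentage with a short pipeline. A minimal sketch; the recovery line below is an illustrative sample in the standard /proc/mdstat format, not output from a real rebuild:

```shell
# Sketch: extract resync progress from a /proc/mdstat recovery line.
# On a live server, feed in the real file instead:
#   grep recovery /proc/mdstat
recovery_line='[=>...................]  recovery =  8.4% (163415936/1946784768) finish=212.3min speed=139968K/sec'
# Isolate the "recovery = N.N%" field, then keep only the percentage.
pct=$(echo "$recovery_line" | grep -o 'recovery = *[0-9.]*%' | grep -o '[0-9.]*%')
echo "resync progress: $pct"
```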