This is internal documentation. There is a good chance you’re looking for something else. See Disclaimer.

Software RAID

Work through this document from top to bottom. A RAID is set up first, then various features are shown and, finally, RAID is removed again.

Create RAID

For more information about software RAID on Linux, check out the official Linux Raid wiki page.

Tip

mdadm needs to be installed:

apt install mdadm

Create storage for virtual disks:

truncate -s 500M disk1 disk2 disk3 disk4 disk5 disk6 disk7
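
The files created by truncate are sparse, so they take up almost no space until the RAID writes to them. A minimal sketch to confirm this, run in a scratch directory (workdir is an assumption and not part of the setup above; stat -c is the GNU coreutils form):

```shell
#!/bin/sh
# Create one sparse backing file in a scratch directory and compare its
# apparent size with the space it actually occupies on disk.
workdir=$(mktemp -d)                            # scratch directory (assumption)
truncate -s 500M "$workdir/disk1"

size_bytes=$(stat -c %s "$workdir/disk1")       # apparent size in bytes
used_kib=$(du -k "$workdir/disk1" | cut -f1)    # space actually allocated

echo "apparent size: $size_bytes bytes"
echo "allocated:     $used_kib KiB"

rm -r "$workdir"
```

The apparent size should be 500 × 1024 × 1024 bytes, while the allocated size depends on the filesystem and is typically close to zero.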

Create virtual disks:

l1=$(sudo losetup --show -f disk1)
l2=$(sudo losetup --show -f disk2)
l3=$(sudo losetup --show -f disk3)
l4=$(sudo losetup --show -f disk4)
l5=$(sudo losetup --show -f disk5)
l6=$(sudo losetup --show -f disk6)
l7=$(sudo losetup --show -f disk7)

Create RAID:

mdadm /dev/md12 --create --bitmap=internal --level=raid10 --raid-devices=4 $l1 $l2 $l3 $l4

Tip

By default, a full sync runs after an array is initialized, which can impact performance greatly. As an alternative, the full sync may be delayed:

  1. Use --assume-clean with --create.

  2. Create FS and copy data onto RAID.

  3. Force resync:

    echo repair >/sys/block/md12/md/sync_action
    

You can also limit the resync speed artificially as described in Performance During Resync/Reshape.
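
The delayed-sync steps above can be sketched as two helpers. This is only an illustration: run, create_unsynced and force_resync are hypothetical names, ext4 is an assumed filesystem, and the loop device paths are placeholders. With DRY_RUN=1 (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# With DRY_RUN=1 commands are only printed; set DRY_RUN=0 and run as root
# to execute them for real.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1 and 2: create the array without the initial sync, then a filesystem.
create_unsynced() {
    md=$1; shift                      # array device, e.g. /dev/md12
    run mdadm "$md" --create --assume-clean --bitmap=internal \
        --level=raid10 --raid-devices="$#" "$@"
    run mkfs.ext4 "$md"               # ext4 is an assumption
}

# Step 3: call this once the data has been copied onto the RAID.
force_resync() {
    md=$1
    run sh -c "echo repair > /sys/block/${md##*/}/md/sync_action"
}

create_unsynced /dev/md12 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
force_resync /dev/md12
```

Keeping the resync in its own function makes it easy to trigger it later, for example from a maintenance window.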

Show status of RAID:

mdadm -D /dev/md12

Show status of a single device in RAID:

mdadm -E $l1

Show status of all RAIDs:

cat /proc/mdstat
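
In /proc/mdstat, each array has a member map such as [UUUU], where an underscore marks a missing member. A minimal sketch that lists degraded arrays, assuming that layout; degraded_arrays is a hypothetical helper and the sample text is illustrative, not captured output:

```shell
#!/bin/sh
# Print the name of every array whose member map contains "_".
# Reads /proc/mdstat-style text on stdin.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }          # remember the current array
        match($0, /\[[U_]+\]/) {            # member map such as [UU_U]
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    '
}

# Illustrative input; the real data lives in /proc/mdstat.
sample='Personalities : [raid10]
md12 : active raid10 loop3[3] loop1[1] loop0[0]
      1019904 blocks super 1.2 512K chunks 2 near-copies [4/3] [UU_U]

unused devices: <none>'

printf '%s\n' "$sample" | degraded_arrays
```

On a live system: degraded_arrays < /proc/mdstat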

Add drives to config:

mdadm --examine --scan --config=mdadm.conf >> /etc/mdadm/mdadm.conf

Tip

Declaring RAIDs in mdadm.conf is recommended to ensure proper assembly during boot.

Replace Working Disk

Add a spare disk:

mdadm /dev/md12 --add $l7

Mark an active drive for replacement:

mdadm /dev/md12 --replace $l3

When a drive is marked for replacement, it is automatically replaced with one of the spare drives.

Check the status of replacing the drive:

mdadm -D /dev/md12

Or:

cat /proc/mdstat
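
While a replacement or rebuild is running, /proc/mdstat carries a progress line for the affected array. A minimal sketch that extracts it, assuming the usual "recovery = 12.6%" style; sync_progress is a hypothetical helper and the sample text is illustrative:

```shell
#!/bin/sh
# Print "<array> <activity> <percent>" for every array with a running
# resync, recovery, reshape or check. Reads /proc/mdstat-style text on stdin.
sync_progress() {
    awk '
        /^md[^ ]* :/ { name = $1 }          # remember the current array
        match($0, /(resync|recovery|reshape|check) *= *[0-9.]+%/) {
            s = substr($0, RSTART, RLENGTH)
            gsub(/ *= */, " ", s)           # "recovery = 12.6%" becomes "recovery 12.6%"
            print name, s
        }
    '
}

# Illustrative input; the real data lives in /proc/mdstat.
sample='md12 : active raid10 loop6[4] loop3[3] loop2[2] loop1[1] loop0[0]
      1019904 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [==>..................]  recovery = 12.6% (64512/509952) finish=0.1min speed=64512K/sec'

printf '%s\n' "$sample" | sync_progress
```

On a live system: sync_progress < /proc/mdstat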

Tip

If you don’t have enough slots for an additional disk, use mdadm /dev/md12 --fail $l3 and then proceed as described in Replace Failed Disk.

Replace Failed Disk

Artificially fail a disk:

mdadm /dev/md12 --fail $l7

Check status:

mdadm -D /dev/md12

Or:

cat /proc/mdstat

Add new disk:

mdadm /dev/md12 --add $l3

Remove faulty disk:

mdadm /dev/md12 --remove $l7

Check rebuild status:

mdadm -D /dev/md12

Or:

cat /proc/mdstat

Grow RAID / Add Disks

Check for existing spare disks:

$ mdadm -D /dev/md12
…
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

       2       8       33        -      spare   /dev/sdc1
       3       8       49        -      spare   /dev/sdd1
       4       8       65        -      spare   /dev/sde1
       5       8       81        -      spare   /dev/sdf1

If spare disks are available, you can just grow the RAID.

Otherwise, add spare disks:

mdadm /dev/md12 --add $l6 $l7

You now see the disks as spare drives:

mdadm -D /dev/md12

The disks are promoted to active automatically should any of the active disks fail.

Now you can grow the RAID:

mdadm /dev/md12 --grow --raid-devices=6

Now you see 6 active members:

mdadm -D /dev/md12

You may have to wait a moment for the reshape to complete.
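
Instead of checking by hand, you can poll the array's sync_action file in sysfs, which reads idle once no resync, recovery or reshape is running. A minimal sketch; wait_until_idle is a hypothetical helper and the 5 second default interval is an arbitrary choice:

```shell
#!/bin/sh
# Block until the given sync_action file reads "idle".
# $1: path to the file, e.g. /sys/block/md12/md/sync_action
# $2: polling interval in seconds (default 5)
wait_until_idle() {
    action_file=$1
    interval=${2:-5}
    while [ "$(cat "$action_file")" != "idle" ]; do
        sleep "$interval"
    done
}
```

Usage: wait_until_idle /sys/block/md12/md/sync_action && mdadm -D /dev/md12

mdadm --wait /dev/md12 should achieve the same without a helper.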

Use the current RAID configuration to generate a fresh config:

mdadm --examine --scan --config=mdadm.conf

And patch /etc/mdadm/mdadm.conf as needed.

Tip

If you grow the RAID, you may have to grow LVM Physical Volumes and/or filesystems also.

Performance During Resync/Reshape

Replacing a disk, resizing an array, changing the RAID layout, and adding disks can have a non-negligible performance impact.

An easy way to limit the performance impact is to limit the resync/reshape speed:

sysctl dev.raid.speed_limit_max=20000  # KiB/s

Rather than changing the global limit, you can adjust the limit for a particular array:

echo 20000 >/sys/block/md12/md/sync_speed_max
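
To get a feeling for what a limit means, divide the amount of data to sync by the limit. A rough sketch for one full pass over a single member; min_sync_seconds is a hypothetical helper, and real runs can be slower under load or shorter when the bitmap allows regions to be skipped:

```shell
#!/bin/sh
# Seconds needed to sync size_kib KiB at limit_kib_s KiB/s, rounded up.
min_sync_seconds() {
    size_kib=$1
    limit_kib_s=$2
    echo $(( (size_kib + limit_kib_s - 1) / limit_kib_s ))
}

# One 500 MiB member at the 20000 KiB/s limit used above.
min_sync_seconds $((500 * 1024)) 20000
```

For the 500 MiB test disks this gives well under a minute; for multi-terabyte disks the same limit translates into days, so choose the value with the real disk size in mind.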

See RAID array

Alerts / Monitoring

The mdadm package ships with monitoring and integrity tools.

Drive failure:

The mdmonitor.service sends out mails on disk failure.

Integrity checks:

A cron job at /etc/cron.d/mdadm runs a monthly integrity check that compares copies or checks parity. Inconsistencies are reported via mail.

Missing drives:

A reminder for missing/failed drives is mailed daily by /etc/cron.daily/mdadm.

Grow RAID / Move to Larger Disks

In reality, you’d replace all disks with larger ones. For this example, we just resize the virtual disks.

Increase size of virtual disks:

truncate -s 550M disk1 disk2 disk3 disk4 disk5 disk6 disk7
for dev in $l1 $l2 $l3 $l4 $l5 $l6 $l7; do losetup --set-capacity $dev; done

Check current size:

mdadm -D /dev/md12

Increase size of RAID:

mdadm /dev/md12 --grow --size max

Check if size increased:

mdadm -D /dev/md12
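
As a sanity check for the reported size, the usable capacity of a RAID 10 is roughly the number of members times the member size, divided by the number of copies. A minimal sketch, assuming the default layout with 2 copies; raid10_capacity_mib is a hypothetical helper, and the real array is slightly smaller because of metadata and rounding:

```shell
#!/bin/sh
# Upper bound for the usable RAID 10 capacity in MiB.
# $1: number of active members, $2: size per member in MiB,
# $3: copies of each block (default 2)
raid10_capacity_mib() {
    devices=$1
    member_mib=$2
    copies=${3:-2}
    echo $(( devices * member_mib / copies ))
}

raid10_capacity_mib 6 500    # before resizing the members
raid10_capacity_mib 6 550    # after resizing the members
```

If the size shown by mdadm -D stays near the first value, one of the members was probably not resized.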

Clean up loopback devices:

losetup -d $l1 $l2 $l3 $l4 $l5 $l6 $l7
rm disk[1-7]

RAID Won’t Start

Check out Assemble Run in the official wiki.