Linux software RAID guide
Software RAID under Linux is a rock-solid way to build RAID arrays at no cost. I have personally never had problems with software RAID under Linux, while I have seen expensive hardware RAID cards fail and corrupt arrays. It is a little slower than hardware RAID, but it removes a possible point of failure from the system and, again, it is free. The only drawback is that configuration can sometimes be tricky. This guide is a collection of commands you can use in different situations.
Creating a new array
First use fdisk to create a partition on each hard drive. Make sure the partition type is set to Linux raid autodetect (code fd).
Then do something like:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
Viewing array details
Simply do :
cat /proc/mdstat
or for a specific array
mdadm --query --detail /dev/md0
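When you only want to know whether something is wrong, the member status at the end of each array entry in /proc/mdstat is the part to look at: [UU] means both members are up, while an underscore (for example [U_]) marks a missing member. The following is a minimal sketch that lists arrays with a missing member. degraded_arrays is a name made up for this guide, not an mdadm command, and the optional file argument is only there so the function can be tried on a saved copy of /proc/mdstat.

```shell
# Print the name of every md array whose member status (e.g. "[U_]")
# shows at least one missing member.
# degraded_arrays is a hypothetical helper, not part of mdadm.
# Usage: degraded_arrays [FILE]   (FILE defaults to /proc/mdstat)
degraded_arrays() {
    awk '
        # Lines such as "md0 : active raid1 ..." start a new array entry.
        /^md/ { name = $1 }
        # The status field looks like [UU] (healthy) or [U_] (degraded).
        match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_") > 0)
                print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument reads the live /proc/mdstat and prints nothing when every array is healthy.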
Adding a new drive or a hotspare to an array
First make sure the partition you are adding (e.g. /dev/sdc1) is the same size as the ones already in the array (/dev/sda1 or /dev/sdb1). You can do that with fdisk, or by copying the partition table with sfdisk:
sfdisk -d /dev/sda | sfdisk /dev/sdc
Warning: be careful to specify the right source and destination drives when using sfdisk, or you could blank out the partition table on your good drive.
Then add the partition to the array:
mdadm --add /dev/md0 /dev/sdc1
Removing a failed drive
Having two drives configured in a RAID1 mirror allows the server to continue to function when either drive fails. When a drive fails completely, the kernel RAID driver automatically removes it from the array.
However, a drive may start having seek errors without failing completely. In that situation the RAID driver may not remove it from service, and performance will degrade. Luckily, you can manually remove a failing drive using the "mdadm" command. For example, to manually mark /dev/sda1 on /dev/md0 as failed:
mdadm /dev/md0 --fail /dev/sda1
and now we can remove it:
mdadm --remove /dev/md0 /dev/sda1
Once it is removed, it is safe to power down the server and replace the failed drive.
Removing a working drive
Sometimes you might want to remove a working drive from an array but keep it in the system. If you follow the previous steps (removing a failed drive), it might reappear in the array after rebooting. To remove it once and for all, do the following:
mdadm /dev/md0 --fail /dev/sda1
mdadm --remove /dev/md0 /dev/sda1
mdadm --zero-superblock /dev/sda1
Removing an array
Here are the steps to completely remove an array and make sure it doesn't reappear after rebooting (in this example I assume you want to remove /dev/md0, which contains /dev/sda1 and /dev/sdb1):
mdadm --stop /dev/md0
mdadm --zero-superblock /dev/sda1
mdadm --zero-superblock /dev/sdb1
/usr/share/mdadm/mkconf force-generate /etc/mdadm/mdadm.conf
Preparing the new drive
Once the system has been rebooted with the new unformatted replacement drive in place, some manual intervention is required to partition the drive and add it to the RAID array.
The new drive must have an identical (or nearly identical) partition table to the other. You can use fdisk to manually create a partition table on the new drive identical to the table of the other, or, if both drives are identical, you can use the "sfdisk" command to duplicate the partition table. For example, to copy the partition table from the second drive sdb onto the first drive sda, the sfdisk command is as follows:
sfdisk -d /dev/sdb | sfdisk /dev/sda
Warning: be careful to specify the right source and destination drives when using sfdisk, or you could blank out the partition table on your good drive.
Once the partitions have been created, you can add them to the corresponding RAID devices using "mdadm --add" commands. For example:
mdadm --add /dev/md0 /dev/sda1
Once added, the Linux kernel immediately starts re-syncing contents of the arrays onto the new drive. You can monitor progress via "cat /proc/mdstat". Syncing uses idle CPU cycles to avoid overloading a production system, so performance should not be affected too badly. The busier the server (and larger the partitions), the longer the re-sync will take.
Note that you don't have to wait until all partitions are re-synced: servers can be online and in production while syncing is in progress. No data will be lost, and eventually all drives will become synchronized.
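While a rebuild is running, /proc/mdstat shows a progress line for the array, containing something like "recovery = 12.6%" (or "resync = ..." for a resync). If you would rather see one short line per array than the full file, the sketch below pulls out just that part. resync_progress is a made-up helper name, and the optional file argument exists only so it can be tried on a saved copy of /proc/mdstat.

```shell
# Print "<array>: recovery = NN.N%" (or "resync = ...") for every array
# that is currently rebuilding; print nothing when no rebuild is running.
# resync_progress is a hypothetical helper, not part of mdadm.
# Usage: resync_progress [FILE]   (FILE defaults to /proc/mdstat)
resync_progress() {
    awk '
        # Lines such as "md0 : active raid1 ..." start a new array entry.
        /^md/ { name = $1 }
        # The progress line carries e.g. "recovery = 12.6%".
        match($0, /(recovery|resync) *= *[0-9.]+%/) {
            print name ": " substr($0, RSTART, RLENGTH)
        }
    ' "${1:-/proc/mdstat}"
}
```

Running it under "watch", or simply re-running it every few minutes, gives a compact view of how far along each array is.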
Rebuild speed
If for whatever reason you need to reduce the rebuild speed, you can cap it as follows (the value is in KB/s, so this example is about 10 MB/s):
echo 10000 > /proc/sys/dev/raid/speed_limit_max
And to speed it up, raise the guaranteed minimum (about 50 MB/s here):
echo 50000 > /proc/sys/dev/raid/speed_limit_min
In theory, the rebuilding process has a low priority, so it will adjust to disk usage. For example, if you are running an I/O-intensive process, the rebuild speed will be decreased automatically to almost nothing (e.g. 1 MB/s). There are situations where you need to adjust the rebuild speed manually, especially with virtualization. It happened once that my Xen host decided to check its arrays and my Xen VM became unresponsive; the fix was simply to decrease the rebuild speed manually.
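If you change these limits often, a small wrapper can catch typos before anything is written to /proc. This is only a sketch under a few assumptions: set_raid_speed_limit is a name made up for this guide, and the optional third argument (the directory holding the speed_limit files) is there purely so the function can be tried against a scratch directory instead of the live system.

```shell
# Write a rebuild speed limit after checking the arguments.
# set_raid_speed_limit is a hypothetical helper, not part of mdadm.
# Usage: set_raid_speed_limit min|max KB_PER_SEC [DIR]
#   DIR defaults to /proc/sys/dev/raid (writing there requires root).
set_raid_speed_limit() {
    which=$1
    value=$2
    dir=${3:-/proc/sys/dev/raid}

    # Only the two limits the kernel exposes are accepted.
    case $which in
        min|max) ;;
        *) echo "first argument must be min or max" >&2; return 1 ;;
    esac

    # Reject empty or non-numeric values such as "10mb".
    case $value in
        ''|*[!0-9]*) echo "speed must be a whole number of KB/s" >&2; return 1 ;;
    esac

    echo "$value" > "$dir/speed_limit_$which"
}
```

For example, "set_raid_speed_limit max 10000" writes 10000 to /proc/sys/dev/raid/speed_limit_max. Values written this way do not survive a reboot; the same settings are also available to sysctl as dev.raid.speed_limit_min and dev.raid.speed_limit_max.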
Array restoration
Let's say you have two partitions that belonged to a RAID array from a previous installation (or whatever the case may be) and you want to restore the array. Simply do:
mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1
Mdadm email notifications when a drive fails
Simply edit mdadm.conf :
vi /etc/mdadm/mdadm.conf
... and add a MAILADDR line (with your email address) to the file, for example:
[...]
MAILADDR you@yourdomain.com
[...]
and restart mdadm:
/etc/init.d/mdadm restart
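Before relying on the notifications, it is worth checking that the address really ended up in the file. The sketch below prints the address from the first MAILADDR line it finds. mdadm_mailaddr is a made-up helper name, and it deliberately ignores the abbreviated keyword forms that mdadm.conf allows, so treat it as a quick sanity check rather than a full parser.

```shell
# Print the address from the first MAILADDR line of an mdadm.conf file.
# mdadm_mailaddr is a hypothetical helper, not part of mdadm.
# Usage: mdadm_mailaddr [FILE]   (FILE defaults to /etc/mdadm/mdadm.conf)
mdadm_mailaddr() {
    # The keyword is matched case-insensitively; abbreviated keywords
    # are not handled by this sketch.
    awk 'toupper($1) == "MAILADDR" { print $2; exit }' \
        "${1:-/etc/mdadm/mdadm.conf}"
}
```

An empty result means no MAILADDR line was found. To check that mail is actually delivered, mdadm's monitor mode has a --test option that generates a test alert for each array, so "mdadm --monitor --scan --oneshot --test" should send one message per array, assuming the server has a working mail setup.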
/dev/md0 appears as /dev/md_d0 after reboot
I recently had a little problem under Ubuntu. After creating an array (e.g. md0), I would reboot and it would reappear as /dev/md_d0. This is due to mdadm.conf not having the proper array UUIDs. It can be solved by doing the following:
- Stop the array
- Auto detect arrays
- Rewrite mdadm.conf
- Update initramfs
mdadm --stop /dev/md_d0
mdadm --auto-detect
/usr/share/mdadm/mkconf force-generate /etc/mdadm/mdadm.conf
update-initramfs -u
Summary
Linux software RAID is far more cost-effective and flexible than hardware RAID, though it is more complex and requires manual intervention when replacing drives. In most situations, software RAID performance is as good as (and often better than) an equivalent hardware RAID solution, all at a lower cost and with greater flexibility. When all you need are mirrored drives, software RAID is often the best choice.

