software_raid

Overview

RAID stands for Redundant Array of Inexpensive Disks; see Wikipedia for an overview of the RAID levels.

IMPORTANT: a RAID is no replacement for backups! Make sure to back up the data on the RAID regularly.

Setup

Raid Setup

The mdadm tool manages Linux software RAIDs:

 sudo apt-get install mdadm

Prepare the disks:

 fdisk /dev/sd[abcd]

Create a primary partition on each disk and set its type to Linux raid autodetect (hex code fd). Do this for all disks you want to combine into the RAID.
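
For reference, an interactive fdisk session for one disk looks roughly like this (a sketch; the exact prompts vary between fdisk versions):

#!highlight bash
 sudo fdisk /dev/sda
 # n        -> create a new partition
 # p        -> primary
 # 1        -> partition number 1
 # <Enter>  -> accept the defaults (use the whole disk)
 # t        -> change the partition type
 # fd       -> Linux raid autodetect
 # w        -> write the partition table and exit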

Create a RAID level 1 device node md0 with 2 hard disks (if you partitioned the disks as above, pass the partitions, e.g. /dev/sda1 /dev/sdb1, instead of the whole disks):

 mdadm --create --verbose /dev/md0 --level=1 --run --raid-devices=2 /dev/sda /dev/sdb
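
For RAID levels with redundancy, the array synchronises in the background after creation; you can watch the progress with:

#!highlight bash
 watch cat /proc/mdstat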

Format the new device as ext3:

 mkfs.ext3 /dev/md0

Write the RAID configuration to mdadm's config file:

 mdadm --detail --scan --verbose > /etc/mdadm/mdadm.conf
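
Note that the shell redirection itself needs root permissions; when working via sudo, wrap the whole command in a root shell, for example:

#!highlight bash
 sudo sh -c 'mdadm --detail --scan --verbose > /etc/mdadm/mdadm.conf'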

You should add a mail contact (MAILADDR) to the config so that it finally looks like this (the example shown is from a four-disk RAID 6):

 ARRAY /dev/md0 level=raid6 num-devices=4 UUID=595ee5d4:d8fe61ac:e35eacf0:6e4b8477
    devices=/dev/sda,/dev/sdb,/dev/sdc,/dev/sdd
 MAILADDR mail@bla.org

Create a mountpoint and edit /etc/fstab so the new RAID can be mounted automatically:

 /dev/md0      /mnt/raid     ext3    defaults    1 2
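
A minimal sketch of creating the mountpoint and testing the new entry:

#!highlight bash
 sudo mkdir -p /mnt/raid
 sudo mount /mnt/raid   # picks up the options from the new fstab entry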

Make sure the RAID is assembled and mounted at boot by putting the following into /etc/rc.local:

 mdadm -As
 mount /mnt/raid

mdadm -As reads the RAID configuration from the /etc/mdadm/mdadm.conf we created above.

Troubleshooting

Device or Resource Busy

When trying to create a RAID array on Ubuntu Karmic (9.10) you might get an error saying “Device or resource busy”.

The culprit might be the dm-raid driver having taken control of the RAID devices.

#!highlight bash
 sudo apt-get remove dmraid libdmraid<version>

Removing the packages also generates a new initrd without the dm-raid driver.

Just reboot afterwards, and try mdadm --create again.

Problems when assembling

If you get error messages when assembling the RAID with mdadm -As, check the config in /etc/mdadm/mdadm.conf. Try manually assembling the RAID using something like

#!highlight bash
mdadm --assemble /dev/md0 /dev/sda /dev/sdb

If this works, then the UUID in mdadm.conf is most likely wrong. To find the correct UUID, manually assemble the RAID (see above), then use

#!highlight bash
sudo mdadm --detail /dev/md0

to display the details. Copy the UUID into mdadm.conf.
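
To show just the UUID line, a grep like this does the job:

#!highlight bash
 sudo mdadm --detail /dev/md0 | grep UUID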

Restoring a RAID array

IMPORTANT: DO NOT USE mdadm --create on an existing array. Use --assemble (see below).

If you have an existing (mdadm) RAID array, you can tell mdadm to automatically find and use it:

#!highlight bash
 sudo mdadm --assemble --scan # scanning tries to guess which partitions are to be assembled

Or you may explicitly choose the partitions to use:

#!highlight bash
  sudo mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1

Usage

Raid monitoring

Installing mdadm activates a monitoring daemon which is started at boot. To see if it is running, do

#!highlight bash
 ps ax | grep monitor

You should see something like

 5785 ?        Ss     0:00 /sbin/mdadm --monitor --pid-file /var/run/mdadm/monitor.pid --daemonise --scan --syslog

If you added a mail address (MAILADDR) to mdadm.conf, the daemon will send warning mails in case of RAID failures.
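
To check that mail delivery actually works, mdadm can send a test mail for each array in a one-shot monitor run:

#!highlight bash
 sudo mdadm --monitor --scan --oneshot --test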

Access via smb

Install the Samba server:

#!highlight bash
 sudo apt-get install samba

Edit /etc/samba/smb.conf to make the shares accessible.

 [DATA]
 path = /mnt/raid/bla/
 browseable = yes
 read only = no
 guest ok = no
 create mask = 0644
 directory mask = 0755
 force user = rorschach

Create the users who should be allowed to access the shares and give them passwords.

#!highlight bash
 sudo useradd -s /bin/true rorschach   # Linux user who may not log in to the system
 sudo smbpasswd -L -a rorschach        # add the Samba user
 sudo smbpasswd -L -e rorschach        # enable the Samba user
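
Finally, verify the configuration and restart the Samba daemon so the smb.conf changes take effect (the init script name may differ between releases):

#!highlight bash
 testparm                        # syntax-check /etc/samba/smb.conf
 sudo /etc/init.d/samba restart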

Failures

RAID Health

#!highlight bash
 mdadm --detail /dev/md0

shows output like the following for a healthy RAID:

 /dev/md0:
         Version : 00.90.03
   Creation Time : Thu Apr 17 11:21:06 2008
      Raid Level : raid6
      Array Size : 781422592 (745.22 GiB 800.18 GB)
   Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
    Raid Devices : 4
   Total Devices : 4
 Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Fri Apr 18 09:46:39 2008
           State : active
  Active Devices : 4
 Working Devices : 4
  Failed Devices : 0
   Spare Devices : 0

      Chunk Size : 256K

            UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
          Events : 0.15

     Number   Major   Minor   RaidDevice State
        0       8        0        0      active sync   /dev/sda
        1       8       16        1      active sync   /dev/sdb
        2       8       32        2      active sync   /dev/sdc
        3       8       48        3      active sync   /dev/sdd

Simulated failure

#!highlight bash
 mdadm --manage --set-faulty /dev/md0 /dev/sda

to set one disk as faulty. mdadm replies with

 mdadm: set /dev/sda faulty in /dev/md0

Check the syslog to see what happens

#!highlight bash
 tail -f /var/log/syslog

The event has been detected and a mail has been sent to the admin.

 Apr 18 10:17:39 INES kernel: [77650.308834]  --- rd:4 wd:3
 Apr 18 10:17:39 INES kernel: [77650.308836]  disk 1, o:1, dev:sdb
 Apr 18 10:17:39 INES kernel: [77650.308839]  disk 2, o:1, dev:sdc
 Apr 18 10:17:39 INES kernel: [77650.308841]  disk 3, o:1, dev:sdd
 Apr 18 10:17:39 INES mdadm: Fail event detected on md device /dev/md0, component device /dev/sda
 Apr 18 10:17:39 INES postfix/pickup[30816]: 86B902CA824F: uid=0 from=
 Apr 18 10:17:39 INES postfix/cleanup[32040]: 86B902CA824F: message-id=<20080418081739.86B902CA824F@INES.arfcd.com>
 Apr 18 10:17:39 INES postfix/qmgr[14269]: 86B902CA824F: from=, size=861, nrcpt=1 (queue active)
 Apr 18 10:17:39 INES postfix/smtp[32042]: 86B902CA824F: to=, relay=s0ms2.arc.local[172.24.10.6]:25, delay=0.46, delays=0.22/0.04/0.1/0.1, dsn=2.6.0, status=sent
 Apr 18 10:17:39 INES postfix/qmgr[14269]: 86B902CA824F: removed
 Apr 18 10:18:39 INES mdadm: SpareActive event detected on md device /dev/md0, component device /dev/sda

Now the RAID details look like this:

 sudo mdadm --detail /dev/md0
 /dev/md0:
         Version : 00.90.03
   Creation Time : Thu Apr 17 11:21:06 2008
      Raid Level : raid6
      Array Size : 781422592 (745.22 GiB 800.18 GB)
   Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
    Raid Devices : 4
   Total Devices : 4
 Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Fri Apr 18 10:19:10 2008
           State : clean, degraded
  Active Devices : 3
 Working Devices : 3
  Failed Devices : 1
   Spare Devices : 0

      Chunk Size : 256K

            UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
          Events : 0.20

     Number   Major   Minor   RaidDevice State
        0       0        0        0      removed
        1       8       16        1      active sync   /dev/sdb
        2       8       32        2      active sync   /dev/sdc
        3       8       48        3      active sync   /dev/sdd

        4       8        0        -      faulty spare   /dev/sda

"Exchange" disks

Remove the old disk from the raid

#!highlight bash
 mdadm /dev/md0 -r /dev/sda

Add the new disk to the raid

#!highlight bash
 mdadm /dev/md0 -a /dev/sda

Now you should see a recovery

 sudo mdadm --detail /dev/md0
 /dev/md0:
         Version : 00.90.03
   Creation Time : Thu Apr 17 11:21:06 2008
      Raid Level : raid6
      Array Size : 781422592 (745.22 GiB 800.18 GB)
   Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
    Raid Devices : 4
   Total Devices : 4
 Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Fri Apr 18 10:25:41 2008
           State : clean, degraded, recovering
  Active Devices : 3
 Working Devices : 4
  Failed Devices : 0
   Spare Devices : 1

      Chunk Size : 256K

  Rebuild Status : 1% complete

            UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
          Events : 0.62

     Number   Major   Minor   RaidDevice State
        4       8        0        0      spare rebuilding   /dev/sda
        1       8       16        1      active sync   /dev/sdb
        2       8       32        2      active sync   /dev/sdc
        3       8       48        3      active sync   /dev/sdd

and

#!highlight bash
cat /proc/mdstat
 Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
 md0 : active raid6 sda[4] sdd[3] sdc[2] sdb[1]
       781422592 blocks level 6, 256k chunk, algorithm 2 [4/3] [_UUU]
       [>....................]  recovery =  3.9% (15251968/390711296) finish=105.1min speed=59486K/sec

 unused devices: <none>

Real failure

To recover data from a RAID1, you can try to mount one of the disks as a separate disk

#!highlight bash
 sudo mount -t ext3 /dev/<the device> <mountpoint>  # you NEED to specify the filesystem type manually!

Benchmarking

#!highlight bash
 sudo tiobench --size 66000 --threads 1 --threads 8

to test read and write performance with 1 and 8 threads.
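
For a quick raw read-speed check of the array device (without going through the filesystem), hdparm can serve as a rough alternative:

#!highlight bash
 sudo hdparm -t /dev/md0   # timed buffered disk reads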
