Backblaze pod, somebody set us up the RAID

Just for fun, here are some short notes on setting up the old Backblaze pod. This pod is the older chassis, but with the new SuperMicro X9SCL-F motherboard, an Intel i3-2120 and 16 GB of RAM. It uses the older SATA controllers (not the new hardware RAID controller). The reason this is a franken-pod and not a new build is that the newer Backblaze units (hardware RAID) use a backplane that requires a new chassis, a different power supply and the RAID card itself, all of which were pushing the project over budget. Although there are online instructions for building a pod, I like mine on one page.

In a nutshell, install Debian on the main disk. For this I used the netinstall image with non-free drivers. I had to pull the SATA controllers to do the install, as GRUB was getting confused and rescue mode was not working, so removing them was simpler. Near the end, when the installer asks which packages to install, I selected just the SSH server and unchecked the rest; there is no need for desktop services and the like here. Finish that up, shut down, reinstall the controllers, and boot up. Now let's configure this box o' drives.
Install a few tools we need first:

apt-get install hdparm mdadm parted xfsprogs lvm2

You will need to partition all of the disks. I leave writing a small one-line script for parted as an exercise to the reader (a rough sketch follows below), but by hand it would look like:

parted /dev/sdn mklabel gpt
parted /dev/sdn mkpart primary 2048s 100%
parted /dev/sdn set 1 raid on


That would be done for all 45 drives (not the boot disk, of course).
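Here is a rough sketch of the loop version, assuming the 45 data drives enumerate as /dev/sdb through /dev/sdat the way they did on this box; adjust the globs so your boot drive is excluded.

# partition every data drive; -s keeps parted from prompting
for dev in /dev/sd[b-z] /dev/sda[a-t]; do
    parted -s "$dev" mklabel gpt
    parted -s "$dev" mkpart primary 2048s 100%
    parted -s "$dev" set 1 raid on
done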
Next issue, and the most important one with a Backblaze pod: knowing where each drive is physically located in the chassis. The Backblaze/45 Drives people handle this by choosing which SATA cables go to which controllers, which works with the exact controllers and motherboard they use. If you use different hardware it may not match, and if someone else wires up the cables (as in my case) it will not match. The method I present here takes about 15 minutes with the machine open in front of you, but when you are done you know where the drives are. When you built the pod you should have noted each drive and its position: arbitrarily choose one end to start at, call that space 1 and the last space 45, then write down the serial number of each drive (it is printed on the drive) and its position. After that is done, in Debian, get the serial number associated with each device name by doing:

hdparm -I /dev/sdn |grep Serial


for each device. Again, it is left as an exercise to script that out for all of your devices (a sketch follows below). Write down the device name next to the drive in your list. This will come in very handy when a drive fails.
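If you want the whole list in one shot, a quick loop along the same lines works, again assuming the /dev/sdb through /dev/sdat naming used here:

for dev in /dev/sd[b-z] /dev/sda[a-t]; do
    printf '%s  ' "$dev"
    hdparm -I "$dev" | grep Serial
done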
Now we just need to make the arrays. Since we do not have stacks of Backblaze units all backing each other up, we use RAID 6 here rather than RAID 5. Each array takes every third disk, which distributes the drives more evenly over the controllers and the backplane, as shown in the whiteboard example below.

[Image: drive_distribution (whiteboard sketch of how every third drive maps to each of the three arrays)]

Create the RAID arrays with the mdadm tool you installed previously. You will note in the commands below that the first drive is not /dev/sda; that is because Debian assigned /dev/sda to the USB stick during installation. /dev/sdau is the 500 GB boot drive; it comes last because it sits on the motherboard SATA controller, and Debian enumerates the drives on the PCIe SATA controllers first.

mdadm --create --verbose /dev/md0 --level=6 --raid-devices=14 /dev/sdb1 /dev/sde1 /dev/sdh1 /dev/sdk1 /dev/sdn1 /dev/sdq1 /dev/sdt1 /dev/sdw1 /dev/sdz1 /dev/sdac1 /dev/sdaf1 /dev/sdai1 /dev/sdal1 /dev/sdao1 --spare-devices=1 /dev/sdar1

mdadm --create --verbose /dev/md1 --level=6 --raid-devices=14 /dev/sdc1 /dev/sdf1 /dev/sdi1 /dev/sdl1 /dev/sdo1 /dev/sdr1 /dev/sdu1 /dev/sdx1 /dev/sdaa1 /dev/sdad1 /dev/sdag1 /dev/sdaj1 /dev/sdam1 /dev/sdap1 --spare-devices=1 /dev/sdas1

mdadm --create --verbose /dev/md2 --level=6 --raid-devices=14 /dev/sdd1 /dev/sdg1 /dev/sdj1 /dev/sdm1 /dev/sdp1 /dev/sds1 /dev/sdv1 /dev/sdy1 /dev/sdab1 /dev/sdae1 /dev/sdah1 /dev/sdak1 /dev/sdan1 /dev/sdaq1 --spare-devices=1 /dev/sdat1

cat /proc/mdstat


The output of cat /proc/mdstat should show all three arrays there, but auto-read-only with the resync pending. You will have to force them to get a move on.

mdadm --readwrite /dev/md0
mdadm --readwrite /dev/md1
mdadm --readwrite /dev/md2


Now you should see them all in a state of active and resyncing:

cat /proc/mdstat
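The initial resync on arrays this size takes a good while. If you want to keep an eye on it as it crawls along, something like this is handy:

watch -n 10 cat /proc/mdstat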


At this point we should not forget to record the arrays in mdadm's config so that they will come online at boot.

mdadm --examine --scan


When mdadm prints the ARRAY lines for the three arrays, simply add them to the end of /etc/mdadm/mdadm.conf (right under the line that says: # definition of existing MD arrays).
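If you prefer to do that in one shot, something like the following works; the update-initramfs step is a good idea on Debian so the updated config is picked up early in boot:

mdadm --examine --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u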
You can do anything you want with the drives now, but personally I like to wait until they are done syncing before hammering them with data. Ready? Great. Our game plan is to initialize the arrays for use with LVM, create a single volume group spanning all three RAID arrays, then create a logical volume, format it and put it to use.

pvcreate /dev/md0
pvcreate /dev/md1
pvcreate /dev/md2
vgcreate -s 64M backblaze2 /dev/md0 /dev/md1 /dev/md2
vgdisplay backblaze2


OK, so we initialized the three RAID arrays as physical volumes with pvcreate, created a single volume group from all three with vgcreate, and then displayed it with vgdisplay. This volume group is named backblaze2; you can name yours fred, or whatever you like. Let us now create the logical volume and then format it. Look carefully at the output of vgdisplay backblaze2 and note the line that says: "Free PE / Size"

You will want to take the number that corresponds to the “Free PE” value and use it with -l in the next command. In my case it was 34337853. The name after -n is the name of the logical volume; you can use fred-backup or whatever you like.

lvcreate -l 34337853 backblaze2 -n backblaze2-backup
lvdisplay /dev/backblaze2/backblaze2-backup
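As an aside, if you would rather not copy the extent count by hand, lvcreate can be told to grab everything that is free; on any reasonably recent LVM2 this should be equivalent:

lvcreate -l 100%FREE -n backblaze2-backup backblaze2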

So, if that all looks good, let us go ahead and format it with xfs and mount it on /mnt/data.

mkfs.xfs /dev/backblaze2/backblaze2-backup
mkdir /mnt/data
mount /dev/backblaze2/backblaze2-backup /mnt/data


At this point if it all looks good, you can add an entry to fstab that looks like the following to auto mount the filesystem at boot:

/dev/backblaze2/backblaze2-backup /mnt/data  xfs  defaults  0 0
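A quick sanity check of the fstab entry before you rely on it at the next reboot, assuming the filesystem is still mounted by hand from the step above:

umount /mnt/data
mount -a
df -h /mnt/data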


You should now be good to go with the drive portion of the backblaze. Next up, how to install Amanda on the pod.


2 Responses to Backblaze pod, somebody set us up the RAID

  1. Alec Weder says:

    The biggest issue with RAID is unrecoverable read errors.
    If you lose a drive, the RAID has to read 100% of the remaining drives even if there is no data on portions of them. If you get a read error during the rebuild, the entire array will die.

    http://www.enterprisestorageforum.com/storage-management/making-raid-work-into-the-future-1.html

    A UER on SATA of 1 in 10^14 bits read means a read failure every 12.5 terabytes. A 500 GB drive has 0.04E14 bits, so in the worst case rebuilding that drive in a five-drive RAID-5 group means transferring 0.20E14 bits. This means there is a 20% probability of an unrecoverable error during the rebuild. Enterprise class disks are less prone to this problem:

    http://www.lucidti.com/zfs-checksums-add-reliability-to-nas-storage
