Very Basic RAID Explanation - Unadulterated Nerdery

Teaching RAID for Network+ and Security+ was always pretty easy because things stayed at the surface level and you never really had to apply the concept to anything meaningful. Here’s a quick recap of what I would have told you:

We needed to group discs together and we wanted to get some performance out of it. If pure performance is your game, RAID 0 is your name. Instead of a single disc’s 25MB/s, three discs working together give you up to three times that (75MB/s). The drawback is that your data is split across all three discs, so if you lose even one of them, you lose all of the data.
The high performance of RAID 0 was great, but losing all of your data scared you. Enter RAID 1. You got so paranoid you reverted to the “2 is one, 1 is none” theory. You have two 1TB drives. The first drive has your data, and the second drive has an exact copy of the first drive. The redundancy is nice, but the downside is that performance is not that great and you have to pay for two 1TB discs just to store 1TB of data.
You liked the idea of performance, but the lack of redundancy scared you. Enter RAID 4, a blend of redundancy and performance. It introduces the idea of data discs and a parity disc. The data discs contain your information, your data. The parity disc is used to recover a data disc when one fails.
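
To put rough numbers on the three levels above, here is a minimal Python sketch. The function name `raid_summary` and its parameters are made up for illustration, and it assumes identical discs, a single parity disc for RAID 4, and at least three discs in a RAID 4 group. It ignores everything a real controller does and only counts usable capacity and how many failed discs the array can survive.

```python
def raid_summary(level, disc_count, disc_size_tb):
    """Return (usable_tb, disc_failures_survived) for a simplified array.

    Hypothetical helper for illustration only: assumes identical discs and
    ignores controller overhead, hot spares, and rebuild behaviour.
    """
    if disc_count < 2:
        raise ValueError("these RAID levels need at least two discs")
    if level == 0:
        # All capacity is usable, but one failed disc loses everything.
        return disc_count * disc_size_tb, 0
    if level == 1:
        # Every disc holds the same copy, so only one disc's worth is usable.
        return disc_size_tb, disc_count - 1
    if level == 4:
        if disc_count < 3:
            # Assumption of this sketch: two data discs plus one parity disc.
            raise ValueError("this sketch assumes RAID 4 has at least three discs")
        # One disc is set aside for parity; the rest hold data.
        return (disc_count - 1) * disc_size_tb, 1
    raise ValueError("only RAID 0, 1, and 4 are covered here")


for level, discs in [(0, 3), (1, 2), (4, 3)]:
    usable, survivable = raid_summary(level, discs, 1)
    print(f"RAID {level}, {discs} x 1TB: {usable}TB usable, "
          f"survives {survivable} failed disc(s)")
```

The RAID 1 line matches the example above: two 1TB discs, 1TB of usable space, one disc can die.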

This is how the conversation ended the first couple of times I taught this. One day, sitting around the office, the guy who taught me most of what I know said it didn’t make any sense that you could recover so much information with just a single disc. I thought about it, and didn’t have much of an answer. When I came back into the office, he said, “Hey Capt, I figured it out.” I’ve simplified his explanation a bit since then, but here is the general gist:

For learning purposes, each disc can only store a single bit. Disc 1 has a 1, Disc 2 has a 0, Disc 3 has a 1. The data is added up (1+0+1=2). Because the parity disc only has a single bit, it stores whether the answer is even (0) or odd (1). Our data discs added up to 2, so our parity disc stores a 0.

Let’s take a look at what happens when a disc fails. Disc 2 just failed. The information we still have: Disc 1 is a 1, Disc 3 is a 1. We also know that because the parity disc is a 0, the sum of our data discs must be even. The sum of the surviving discs is 2, which is already even, so if we wrote a 1 on Disc 2 the sum would be odd (which would be wrong). So there you have it. Disc 2 had to be a 0. This is how RAID 4 works.

If you lost the parity disc, that would not be a problem. All you would need to do is insert another parity disc and re-compute the parity value from the data discs. We just completed an exercise of what happens when you lose a single data disc. If you lost two data discs, there would be no way of telling what their contents were, because two 0s and two 1s would result in the same answer (even). There would also be a problem if you lost a data disc and the parity disc, because then you could not re-create the missing data.

You probably won’t see RAID 4 with most of the gear we have. RAID 5 is somewhat similar in concept to RAID 4, but the differences are outside of the scope of this post. More on that here.
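
The even/odd walkthrough above can be tried out in a few lines of Python. This is only a sketch of the one-bit-per-disc toy example, not how any particular controller is implemented, and the function names `compute_parity` and `rebuild_missing` are invented for illustration. It relies on the fact that "is the sum even or odd" is the same thing as XOR-ing the bits together.

```python
from functools import reduce
from itertools import product
from operator import xor


def compute_parity(data_bits):
    """Return 0 if the bits sum to an even number, 1 if odd (XOR of all bits)."""
    return reduce(xor, data_bits, 0)


def rebuild_missing(surviving_bits, parity_bit):
    """Recover the one missing data bit from the survivors and the parity bit."""
    return compute_parity(surviving_bits) ^ parity_bit


# Disc 1 = 1, Disc 2 = 0, Disc 3 = 1, as in the example above.
data = [1, 0, 1]
parity = compute_parity(data)
print("parity disc stores:", parity)

# Disc 2 fails: rebuild it from Disc 1, Disc 3, and the parity disc.
recovered = rebuild_missing([data[0], data[2]], parity)
print("Disc 2 must have been:", recovered)

# Discs 2 and 3 both fail: more than one pair of values fits the parity,
# so there is no way to tell which pair was really on the discs.
candidates = [
    pair for pair in product((0, 1), repeat=2)
    if compute_parity([data[0], *pair]) == parity
]
print("possible (Disc 2, Disc 3) values:", candidates)
```

With two data discs gone, the last step finds two different pairs that both agree with the parity disc, which is exactly why a single parity disc can only cover a single failure.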

To recap, you can lose a data disc, or the parity disc, but not both at the same time. You also cannot lose more than one data disc at a time. Hopefully this post has prepared you for the next time you need to initialize hard drives on a server, or click through the settings during a filer’s initial setup.

Jasper Bongertz says:

January 12, 2015 at 5:13 pm

Hi John,

nice article, but I think RAID 5 is a bit different from your explanation – it doesn’t use a single parity disk. That’s what RAID 3 and 4 do. RAID 5 distributes the parity information over all disks to avoid the bottleneck of always having to write the parity information to a single disk.

BTW, with large drives my favorite is RAID 6, where you can lose 2 disks. The reason for that is that when you lose the first, rebuilding with a new disk can take hours, sometimes days. If another disk fails during that rebuild phase (which is more probable than under normal operations because of the high read/write load, and touching areas that may not have been active for a long time) RAID 5 is doomed. RAID 6 can survive another dead disk.

Cheers,
Jasper

John says:

January 12, 2015 at 5:41 pm

Jasper,
You are absolutely right, and I am tracking that RAID 5 is actually distributed parity, where RAID 4 more closely resembles the admittedly simplistic example I used. I probably should have just called it RAID 4. The reason I didn’t is that I thought distributed parity went outside the concept of VERY basic, and most of the gear I have gotten my hands on only gave me the option of 0, 1, or 5. I also like RAID 6, and am planning on a future post covering it. Thanks for the diligence and keeping me honest. I’ll edit the post to make sure this is clear.

Leo Thuringer says:

February 5, 2015 at 8:54 pm

Don’t forget that read errors (URE) affect RAID more than most people think.
If we were to lose a single drive, and any of the surviving drives experience an unrecoverable read error (URE), the entire array will fail.

http://forums.storagereview.com/index.php/topic/34094-is-raid-56-dead-due-to-large-drive-capacities/

As drives increase in size, any drive failure will always be accompanied by a read error. So RAID 6 will give you no more protection than RAID 5 does now, but you’ll pay more anyway for extra disk capacity and slower write performance.

http://www.lucidti.com/zfs-checksums-add-reliability-to-nas-storage
