Very Basic RAID Explanation

Teaching RAID for Network+ and Security+ was always pretty easy because things stayed very surface-level and you never really had to apply the concept to anything meaningful.  Here’s a quick recap of what I would have told you:

  • We needed to group discs together and we wanted to get some performance out of it.  If pure performance is your game, RAID 0 is your name.  Instead of one disc giving you 25MB/s, three discs working together give you roughly three times that (75MB/s) in the ideal case.  The drawback is that if you lose even one of those three discs, you lose all of the data.
  • The high performance of RAID 0 was great, but losing all of the data scared you.  Enter RAID 1.  You got so paranoid you reverted to the “two is one, one is none” theory.  You have two 1TB drives.  The first drive has your data, and the second drive has an exact copy of the first drive.  The redundancy is nice, but the downside is that performance is not that great (writes are no faster than a single disc) and you have to pay for two 1TB discs just to store 1TB of data.
  • You liked the idea of performance, but the lack of redundancy scared you.  Enter RAID 4, a blend of redundancy and performance.  It introduces the idea of data discs and a parity disc.  The data discs contain your information, your data.  The parity disc is used to recover a data disc when one fails.
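The trade-offs in the three bullets above can be put into numbers with a small sketch.  The function names here are made up for illustration, the throughput figure is the ideal case only (real arrays rarely scale perfectly), and the RAID 4 layout assumes one dedicated parity disc as described above.

```python
# Rough comparison of the three RAID levels from the recap.
# All function names are hypothetical helpers, not part of any real tool.

def usable_capacity_tb(level, disc_count, disc_size_tb):
    """Space left for your data after redundancy is paid for."""
    if level == 0:
        return disc_count * disc_size_tb        # striping: every disc holds data
    if level == 1:
        return disc_size_tb                     # mirroring: every disc is a copy
    if level == 4:
        return (disc_count - 1) * disc_size_tb  # one disc is set aside for parity
    raise ValueError(f"RAID level {level} is not covered here")

def failures_tolerated(level, disc_count):
    """How many discs can fail at once without losing data."""
    if level == 0:
        return 0               # lose one disc, lose everything
    if level == 1:
        return disc_count - 1  # one surviving copy is enough
    if level == 4:
        return 1               # any single disc, data or parity
    raise ValueError(f"RAID level {level} is not covered here")

def raid0_ideal_throughput_mbs(disc_count, per_disc_mbs):
    """Best-case RAID 0 speed: every disc contributes its full rate."""
    return disc_count * per_disc_mbs

if __name__ == "__main__":
    print("RAID 0, three 25MB/s discs:", raid0_ideal_throughput_mbs(3, 25), "MB/s")
    print("RAID 1, two 1TB discs:", usable_capacity_tb(1, 2, 1), "TB usable")
    print("RAID 4, four 1TB discs:", usable_capacity_tb(4, 4, 1), "TB usable,",
          failures_tolerated(4, 4), "failure tolerated")
```

The numbers line up with the recap: three 25MB/s discs in RAID 0 top out at 75MB/s, and two 1TB discs in RAID 1 store 1TB.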

This is how the conversation ended the first couple of times I taught this.  One day, sitting around the office, the guy who taught me most of what I know said it didn’t make any sense that you could recover so much information with just a single disc.  I thought about it, and didn’t have much of an answer.  When I came back into the office, he said, “Hey Capt, I figured it out.”  I’ve simplified it a bit since his explanation, but here is the general gist:

For learning purposes, pretend each disc can only store a single bit.  Disc 1 has a 1, Disc 2 has a 0, Disc 3 has a 1.  The data is added up (1+0+1=2).  Because the parity disc also holds only a single bit, it stores whether that sum is even (0) or odd (1).  Our data discs added up to 2, so our parity disc stores a 0.

Let’s take a look at what happens when a disc fails.  Disc 2 just failed.  The information we still have: Disc 1 is a 1, Disc 3 is a 1.  We also know that because the parity disc is a 0, the sum of all our data discs must be even.  The sum of the surviving discs is already even (1+1=2), so if we wrote a 1 on Disc 2 the sum would be odd (which would be wrong).  So there you have it.  Disc 2 had to be a 0.  This is how RAID 4 works.

If you lost the parity disc, that would not be a problem: all you would need to do is insert a replacement disc and re-compute the parity value from the data discs.  We just completed an exercise of what happens when you lose a single data disc.  If you lost two data discs, there would be no way of telling what their contents were, because two 0s and two 1s would result in the same answer (even).  There would also be a problem if you lost a data disc and the parity disc at the same time, because then you could not re-create the missing data.

You probably won’t see RAID 4 on most of the gear we have.  RAID 5 is somewhat similar in concept to RAID 4, but the differences are outside of the scope of this post.  More on that here.
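The even/odd walkthrough above can be sketched in a few lines.  This is a toy model under the same assumption as the example (each disc holds one bit), and the function names are invented for illustration; a real controller applies the same idea to whole blocks of data.

```python
from itertools import product

def parity(data_bits):
    """Parity disc value: 0 if the data bits sum to even, 1 if odd."""
    return sum(data_bits) % 2

def rebuild(surviving_bits, parity_bit):
    """Recover ONE missing data bit from the survivors and the parity bit."""
    # The missing bit is whatever makes the total match the stored parity.
    return (parity_bit - sum(surviving_bits)) % 2

def candidates(surviving_bits, missing_count, parity_bit):
    """Every combination of missing bits that agrees with the parity bit."""
    return [combo for combo in product((0, 1), repeat=missing_count)
            if (sum(surviving_bits) + sum(combo)) % 2 == parity_bit]

if __name__ == "__main__":
    discs = [1, 0, 1]            # Disc 1, Disc 2, Disc 3
    p = parity(discs)            # 1+0+1=2, which is even, so parity is 0
    print("parity disc:", p)

    # Disc 2 fails: only Disc 1 and Disc 3 survive.
    print("rebuilt Disc 2:", rebuild([1, 1], p))

    # Disc 2 and Disc 3 fail together: more than one answer fits.
    print("possible (Disc 2, Disc 3):", candidates([1], 2, p))
```

With one disc missing there is exactly one value that fits, so the rebuild is certain.  With two missing, `candidates` returns more than one combination, which is why a single parity disc cannot cover a double failure.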

To recap, you can lose a data disc or the parity disc, but not both at the same time.  You also cannot lose more than one data disc at a time.  Hopefully this post has prepared you for the next time you need to initialize hard drives on a server, or click through the settings during a filer’s initial setup.
