Page 1: RAID Levels
Intro:
RAID: Redundant Array of Independent (or Inexpensive) Disks. We have already seen what terminology and technology are used when describing RAID in general, but how do these technologies fit into the actual RAID standard? Each RAID level corresponds to a type of RAID, which determines which technologies are used and how. If you missed part 1, now is your chance to read up on RAID again.
Levels Are Not Rankings:
RAID 5 may be a higher level than RAID 1, but that does not mean that RAID 5 is superior to RAID 1 in every way. A RAID level is just a technique being used and does not correspond to any hierarchy. When looking at a RAID controller, it is important to understand the benefits of each RAID level, as most controllers only support a subset of levels. Originally, only RAID 1 through RAID 5 were defined, but other levels have been added since the 1988 Berkeley paper was written.
RAID 0: Striping
RAID 0 is the bastard child of RAID. It has a zero for a number because it is not redundant at all. RAID 0 is a straight stripe with no parity. The benefits of using RAID 0 are great, though. With RAID 0, you will have increased read and write speeds as well as a faster random access time. The drawback is that if one drive in the array fails, the entire array is lost, including the data on the drives that still work. When using RAID 0, it is very important to keep backups of critical data handy just in case the inevitable happens. All drives will break as they wear out. It is better to be safe than sorry.
I personally have not seen any server system use RAID 0 unless the data was not important. I have read that there are some fileserver systems that use RAID 0 to get more space out of a single logical device, but this technique is asking for trouble. RAID 0 is typically used when speed is needed and redundancy is not a priority. Any sort of editing, such as for video or audio, would benefit from using RAID 0 if the files are large and there are many transforms being done on the data. A system that leans heavily on its swap space will also benefit from RAID 0. Many gamers with high-end systems use RAID 0 for that extra push as well.
Hard Disk Requirements: Minimum of two drives for a stripe; more can be added from there.
Array Capacity: Smallest drive space * number of drives (100% efficiency if same size drives are used).
Fault Tolerance: Zero. Hopefully, you are backing up your data.
Cost: Very inexpensive and supported on almost all controllers.
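To make the idea of a straight stripe concrete, here is a minimal Python sketch of how a logical block number could be mapped onto the drives in a RAID 0 array. The function name locate_block and the one-block stripe unit are assumptions made for illustration; real controllers use configurable stripe sizes and their own layouts.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple:
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes a stripe unit of exactly one block, which is a simplification.
    """
    if num_drives < 2:
        raise ValueError("a stripe needs at least two drives")
    drive = logical_block % num_drives    # consecutive blocks rotate across the drives
    offset = logical_block // num_drives  # which row (stripe) the block lands in
    return drive, offset


# Show where the first six logical blocks land on a three-drive stripe.
for block in range(6):
    print(block, locate_block(block, num_drives=3))
```

Because consecutive blocks land on different drives, a large sequential read or write keeps every drive busy at once. It also shows why one failed drive ruins the array: every file of any size has pieces on every drive.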
RAID 1: Mirroring or Duplexing
The true first RAID level is mirroring. Mirroring involves taking one set of data and duplicating it to multiple locations. In the simplest of cases, data block 1 goes to drive A and drive B. You can expand this to duplexing by putting each drive on its own controller. The effect is the same. This RAID level is the simplest form of redundancy and offers high fault tolerance. A basic mirror uses two drives. If the same size drives are used to form the mirror, the space loss is 50% due to redundancy. Either drive can fail and the array will keep working.
I have used and seen servers and home systems that use RAID 1 for all their storage needs. Many people pick RAID 1 since drives are cheap nowadays and that little extra redundancy really helps between backups in case a drive does fail. Even if you have great backups, getting back up and running is usually a hassle. RAID 1 helps with that problem by keeping the system running through a drive failure. RAID 1 offers better read performance than a single drive, but writes are no faster and can be slightly slower, since every write must go to both drives.
Hard Disk Requirements: Two drives, ideally of the same size
Array Capacity: Smallest drive space (50% efficiency if above condition is met)
Fault Tolerance: Good. Any drive can fail and the array will still function.
Cost: Moderate due to the need to duplicate components. RAID 1 is supported on nearly every controller.
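The mirroring behavior described above can be sketched as a toy Python model. The Mirror class, its method names, and the use of dictionaries as stand-ins for drives are all hypothetical; this is only meant to show that every write is duplicated while any surviving copy can answer a read.

```python
class Mirror:
    """Toy mirror: each 'drive' is a dict of block number -> bytes (an assumption)."""

    def __init__(self, num_drives: int = 2):
        self.drives = [{} for _ in range(num_drives)]
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to every healthy drive, so writes gain no speed.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy copy can serve the read.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]
        raise IOError("all drives in the mirror have failed")

    def fail(self, index: int) -> None:
        # Simulate a dead drive by marking it failed and dropping its contents.
        self.failed.add(index)
        self.drives[index] = {}


mirror = Mirror()
mirror.write(1, b"data block 1")  # lands on drive A and drive B
mirror.fail(0)                    # drive A dies
print(mirror.read(1))             # still readable from drive B
```

A real controller would also spread reads across both drives, which is where the read performance gain comes from; the sketch leaves that out for brevity.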
RAID 2: Bit-level Striping with Hamming Code (ECC)
If RAID 2 were actually used today, you would see systems with 10+ drives in them. This RAID level is mostly gone and is no longer even considered for use today. The ECC built into hard drives made this RAID level unnecessary. It is not worth the bother to learn the details.
RAID 3: Byte-level Striping With Parity (Dedicated)
Data is striped like RAID 0 across all drives in the array, but at the byte level instead of the block level. RAID 3 contains one dedicated parity drive, but this is not a single point of failure. Any one drive in a RAID 3 array can fail and the array will still work. The dedicated parity drive is usually a performance bottleneck because it must be written to every time anything is sent to the array. You will lose one drive's worth of space to parity.
I have not seen anything using RAID 3. I believe that RAID 3 and RAID 4 are basically overshadowed by the better (opinion) RAID 5. RAID 3 and RAID 4 are very similar; only the size of the data block being sent is different.
Hard Disk Requirements: Minimum of three drives.
Array Capacity: Smallest drive space * (number of drives - 1) (efficiency would be (number of drives - 1)/number of drives)
Fault Tolerance: Good. Any drive in the array can fail and the array will still function.
Cost: Moderate due to the need for a special controller and three hard drives.
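Parity in these levels is commonly computed with XOR, which is what lets any single drive be rebuilt from the others. The Python sketch below illustrates this with three tiny data drives and one parity drive. The helper name xor_blocks and the sample data are assumptions for illustration, not taken from any particular controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]  # contents of three data drives
parity = xor_blocks(data)           # contents of the dedicated parity drive

# The second data drive fails: XOR the survivors with the parity to rebuild it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])  # True
```

The same arithmetic explains the bottleneck: any change to any data drive changes the parity, so the parity drive takes part in every write.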
RAID 4: Block-level Striping With Parity (Dedicated)
RAID 3 and RAID 4 are very much the same, the only difference being the size of the data blocks. In RAID 4, the block size is larger than a byte and can comprise multiple KB of data. As with RAID 3, RAID 4 can survive a single drive failure and the array will still function. Since RAID 4 uses a dedicated parity drive, there is a performance bottleneck because that drive must be written to every time something is sent to the array. You lose a single drive's worth of space to parity.
As with RAID 3, I have not personally seen any implementation of RAID 4. Most businesses that use striping with parity are using RAID 5 or a combination of RAID levels.
Hard Disk Requirements: Minimum of three drives.
Array Capacity: Smallest drive space * (number of drives - 1) (efficiency would be (number of drives - 1)/number of drives)
Fault Tolerance: Good. Any drive in the array can fail and the array will still function.
Cost: Moderate due to the need for a special controller and three hard drives.
RAID 5: Block-level Striping With Parity (Distributed)
RAID 5 shares the same traits as RAID 3 and RAID 4, the difference being that RAID 5 does not contain a dedicated parity drive. Instead, in a RAID 5 array, the parity is distributed along with the data to each drive in the array. This simple change removes the performance bottleneck of the dedicated parity drive that occurs with RAID 3 and RAID 4. Performance is generally higher than that of RAID 3 and RAID 4, yet the controller still needs to do extra work to calculate parity. You lose a single drive's worth of space to parity. One thing to note is that an array rebuild will take some time, because the missing data has to be recalculated from the data and parity on every remaining drive. Make sure you replace a broken drive as soon as possible.
This RAID level is the most popular one for servers. I have seen it used on low-end to high-end servers and everything in between. Many controllers for workstations now support RAID 5. Drives are cheap. Use RAID 5 over RAID 0 for that extra data protection.
Hard Disk Requirements: Minimum of three drives.
Array Capacity: Smallest drive space * (number of drives - 1) (efficiency would be (number of drives - 1)/number of drives)
Fault Tolerance: Good. Any drive in the array can fail and the array will still function.
Cost: Low to moderate due to the need for three hard drives. Most controllers now support RAID 5.
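To see why distributing the parity removes the bottleneck, the Python sketch below counts how many parity writes each drive receives when one full stripe after another is written. The rotation rule used here is just one possible scheme chosen for illustration, and the function names are hypothetical; actual controllers use several different layouts.

```python
from collections import Counter


def parity_drive(stripe: int, num_drives: int, dedicated: bool) -> int:
    """Return the index of the drive that holds parity for a given stripe."""
    if dedicated:
        return num_drives - 1                      # RAID 3/4 style: always the last drive
    return (num_drives - 1 - stripe) % num_drives  # assumed rotation for RAID 5


def parity_writes(num_stripes: int, num_drives: int, dedicated: bool) -> Counter:
    """Count parity writes per drive when every stripe is written once."""
    return Counter(parity_drive(s, num_drives, dedicated) for s in range(num_stripes))


print("dedicated:  ", dict(parity_writes(12, 4, dedicated=True)))
print("distributed:", dict(parity_writes(12, 4, dedicated=False)))
```

With a dedicated parity drive, all twelve parity writes land on the same drive. With the rotation, each of the four drives takes three, so no single drive is involved in every write.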
RAID 6: Block-level Striping With Dual Parity (Distributed)
What is RAID 6? This is a newer form of RAID. It is basically RAID 5 with more protection built in. Instead of distributing a single set of parity so that one drive can fail, it stores a second, independent set of parity so that any two drives can fail and the array will still function. RAID 6 can handle more failures, but due to the extra parity, write performance will suffer. When servers need high availability, RAID 6 is a good option to use over RAID 5... if your controller supports it. Rebuilds will take an extremely long time due to the dual parity calculation.
I have seen controllers that support RAID 6, and they are expensive. RAID 6 will be an asset to servers that need higher availability than RAID 5 can provide.
Hard Disk Requirements: Minimum of four drives.
Array Capacity: Smallest drive space * (number of drives - 2) (efficiency would be (number of drives - 2)/number of drives)
Fault Tolerance: Great. Any two drives in the array can fail and the array will still function.
Cost: High due to the need for four hard drives and a special controller.
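The capacity formulas listed for each level can be collected into one small Python helper for comparison. The function name usable_capacity and the 500 GB drive sizes are assumptions for illustration. The formulas and minimum drive counts are the ones given in this article, and RAID 1 is treated as a plain mirror whose capacity equals the smallest drive.

```python
def usable_capacity(level: int, drive_sizes: list) -> int:
    """Usable capacity, in the same unit as drive_sizes, per the formulas above."""
    minimum_drives = {0: 2, 1: 2, 3: 3, 4: 3, 5: 3, 6: 4}
    if level not in minimum_drives:
        raise ValueError(f"RAID {level} is not covered here")
    n = len(drive_sizes)
    if n < minimum_drives[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_drives[level]} drives")
    smallest = min(drive_sizes)  # every drive only contributes as much as the smallest
    if level == 0:
        return smallest * n
    if level == 1:
        return smallest
    if level in (3, 4, 5):
        return smallest * (n - 1)  # one drive's worth of space goes to parity
    return smallest * (n - 2)      # RAID 6: two drives' worth go to parity


for level in (0, 1, 5, 6):
    sizes = [500, 500] if level == 1 else [500] * 4
    capacity = usable_capacity(level, sizes)
    print(f"RAID {level}: {capacity} GB usable, {capacity / sum(sizes):.0%} efficiency")
```

Mixing drive sizes wastes space at every level, because the smallest drive sets the limit for the rest.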
Conclusion:
While many power users use either RAID 0 or RAID 1, there are many people who combine RAID levels for extra performance and protection. Stay tuned for another article that explains multiple RAID levels.