PLS Explain RAID to me

That Mason Guy · June 4, 2017

I have no understanding of what RAID is for or what the different numbers mean. I really just don't understand it. If someone could explain it to me, that'd be great.

Oshino Shinobu · June 4, 2017

RAID = Redundant Array of Independent Disks/Drives.

It is primarily used as a way to provide some data redundancy so that if a drive dies you don't lose all of your data. There are several different versions of RAID, each with different benefits and drawbacks. The main ones:

RAID 0: While technically not RAID as it doesn't provide any redundancy, it is still classed as such. In RAID 0, data is split (striped is a more accurate term) across all the drives in the array. It improves read and write performance, though all data on the array is lost if any one of the drives fails. Unlike other versions of RAID, RAID 0 actually increases your chances of losing data. RAID 0 requires a minimum of 2 drives and can theoretically span an unlimited number of drives (obviously, practical limitations will come into play, so don't expect to have 1000s of drives in RAID 0). Capacity of the array is the capacity of the smallest drive in the array multiplied by the number of drives in the array.

RAID 1: Simple mirroring. All data is written to all drives in the array, so as long as one drive in the array remains, you do not lose data. Like RAID 0, it requires at least 2 drives and can theoretically scale infinitely. As all data is written to all drives, the capacity is that of the smallest drive in the array. RAID 1 becomes very cost inefficient when you add more than a few drives. Personally, I wouldn't advise anything over 3 drives for RAID 1.
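
To put rough numbers on the difference between RAID 0 and RAID 1 described above, here is a minimal sketch. It assumes every drive fails independently with the same probability p over some period. That is a simplification (real drives in one box tend to fail in correlated ways), and the 3% figure is made up purely for illustration.

```python
def raid0_loss_probability(p: float, drives: int) -> float:
    """RAID 0 loses everything if ANY single drive fails."""
    return 1 - (1 - p) ** drives


def raid1_loss_probability(p: float, drives: int) -> float:
    """RAID 1 loses data only if EVERY drive in the mirror fails."""
    return p ** drives


p = 0.03  # hypothetical per-drive failure chance, for illustration only
for n in (1, 2, 3, 4):
    print(
        f"{n} drive(s): RAID 0 loss {raid0_loss_probability(p, n):.4%}, "
        f"RAID 1 loss {raid1_loss_probability(p, n):.4%}"
    )
```

With more drives the RAID 0 figure climbs and the RAID 1 figure shrinks, which is the trade-off described above.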

RAID 10: Basically RAID 1+0. It operates in the same way as RAID 0, but each stripe of data is mirrored via RAID 1. RAID 1 requires at least 4 drives but becomes pretty cost inefficient when going much over that.

RAID 5: Data is striped across the drives, along with a recovery/parity partition on each drive. You can lose any one drive and then add a new one back in to rebuild the array. You cannot lose two drives at once without losing data. RAID 5 requires at least 3 drives but it is generally considered too inefficient with fewer than 4 drives. Capacity is that of the smallest drive multiplied by the number of drives minus 1 drive.

RAID 6: Like RAID 5 but with double parity, so you can lose any two drives at a time. Requires 4 drives or more. Capacity is the same as RAID 5 but minus 2 drives.
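
The capacity rules above can be collected into one small helper. This is only a sketch: the function name `usable_capacity` is made up for illustration, and the RAID 10 rule (half the drives, assuming two-way mirrors and an even drive count) is my assumption since the post above does not state it.

```python
def usable_capacity(level: str, drive_sizes_gb: list[float]) -> float:
    """Usable capacity in GB for the RAID levels described above."""
    minimum_drives = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    n = len(drive_sizes_gb)
    if n < minimum_drives[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_drives[level]} drives")

    smallest = min(drive_sizes_gb)  # every drive is treated as the smallest one
    if level == "0":
        return smallest * n          # striping only, nothing lost to redundancy
    if level == "1":
        return smallest              # every drive holds the same data
    if level == "5":
        return smallest * (n - 1)    # one drive's worth of parity
    if level == "6":
        return smallest * (n - 2)    # two drives' worth of parity
    # RAID 10: assumption, two-way mirrors striped together
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even number of drives for RAID 10")
    return smallest * (n // 2)


for level in ("0", "1", "5", "6", "10"):
    print(f"RAID {level}: {usable_capacity(level, [4000, 4000, 4000, 4000])} GB")
```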

RAID 2, 3 and 4 are also a thing, but are not as commonly used. This is a good article to read: http://searchstorage.techtarget.com/definition/RAID

EDIT: It's important to note that RAID is not an alternative to a backup. RAID does not protect against data corruption, theft, fire/water damage, malware (WannaCry, for example) or many other things. If you have to choose between RAID and a backup, go for the backup.

ChalkChalkson · June 4, 2017

59 minutes ago, Oshino Shinobu said:

RAID 1 requires at least 4 drives

I am pretty sure that that was a typo, pretty sure you meant to say RAID 10.

1 hour ago, Oshino Shinobu said:

along with a recovery/parity partition on each drive

More technically: the parity drive stores the XOR of all the other drives.

XOR is the same as adding all the bits up and writing down the last digit in binary.

Example: if you have 3 drives in RAID 5, you have 2 data drives. Let's imagine the data on these looks like this (using X as a replacement for 1 because of formatting):

X00X0X0XXX0X0X00X0

0X0X0XX0X0X0X00X0X

then the data on the parity looks like this:

XX0000XX0XXXXX0XXX

If you add more drives, it is the same as doing this operation on the first 2 drives and then pretending the result is a drive that replaces those 2.

If one of the drives now fails, you can just do the same operation with all the remaining drives and the parity, and you get the data that was on the missing drive.
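
A minimal sketch of that XOR trick, using the same X/0 notation and the two example data drives from above. The helper names (`to_bits`, `to_pattern`, `xor_all`) are made up for illustration, and a real array works on whole blocks rather than short bit strings.

```python
from functools import reduce


def to_bits(pattern: str) -> list[int]:
    """Turn the X/0 notation into a list of bits (X means 1)."""
    return [1 if ch == "X" else 0 for ch in pattern]


def to_pattern(bits: list[int]) -> str:
    """Turn a list of bits back into the X/0 notation."""
    return "".join("X" if bit else "0" for bit in bits)


def xor_all(*drives: list[int]) -> list[int]:
    """XOR any number of equally long drives together, position by position."""
    return [reduce(lambda a, b: a ^ b, column) for column in zip(*drives)]


drive_a = to_bits("X00X0X0XXX0X0X00X0")
drive_b = to_bits("0X0X0XX0X0X0X00X0X")

parity = xor_all(drive_a, drive_b)
print("parity:   ", to_pattern(parity))

# Pretend drive A died: XOR the survivors with the parity to get it back.
rebuilt_a = xor_all(drive_b, parity)
print("rebuilt A:", to_pattern(rebuilt_a))
assert rebuilt_a == drive_a
```

Because `xor_all` takes any number of drives, the same call works unchanged for arrays with more than two data drives.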

For RAID 6 you need a second function that works similarly but is far more complex in its details.

H0R53 · June 4, 2017

4 hours ago, That Mason Guy said:

I have no understanding of what RAID is for or what the different numbers mean. I really just don't understand it. If someone could explain it to me, that'd be great.

Do you not have access to Google and WikiPedia? Holy shit.

RAID 0

Diagram of a RAID 0 setup
RAID 0 (also known as a stripe set or striped volume) splits ("stripes") data evenly across two or more disks, without parity information, redundancy, or fault tolerance. Since RAID 0 provides no fault tolerance or redundancy, the failure of one drive will cause the entire array to fail; as a result of having data striped across all disks, the failure will result in total data loss. This configuration is typically implemented having speed as the intended goal.[2][3] RAID 0 is normally used to increase performance, although it can also be used as a way to create a large logical volume out of two or more physical disks.[4]

A RAID 0 setup can be created with disks of differing sizes, but the storage space added to the array by each disk is limited to the size of the smallest disk. For example, if a 120 GB disk is striped together with a 320 GB disk, the size of the array will be 120 GB × 2 = 240 GB. However, some RAID implementations allow the remaining 200 GB to be used for other purposes.

The diagram in this section shows how the data is distributed into Ax stripes on two disks, with A1:A2 as the first stripe, A3:A4 as the second one, etc. Once the stripe size is defined during the creation of a RAID 0 array, it needs to be maintained at all times. Since the stripes are accessed in parallel, an n-drive RAID 0 array appears as a single large disk with a data rate n times higher than the single-disk rate.
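
As a rough illustration of the A1:A2, A3:A4 layout just described, the mapping from a logical block to a disk can be sketched as simple round-robin arithmetic. The function name `locate_block` is hypothetical, and real implementations work in units of the configured stripe size rather than single numbered blocks.

```python
def locate_block(logical_block: int, disks: int) -> tuple[int, int]:
    """Return (disk index, stripe index) for a 0-based logical block."""
    return logical_block % disks, logical_block // disks


for block in range(4):
    disk, stripe = locate_block(block, disks=2)
    print(f"A{block + 1} -> disk {disk}, stripe {stripe}")
```

Consecutive blocks land on different disks, which is why they can be read or written in parallel.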

Performance
A RAID 0 array of n drives provides data read and write transfer rates up to n times higher than the individual drive rates, but with no data redundancy. As a result, RAID 0 is primarily used in applications that require high performance and are able to tolerate lower reliability, such as in scientific computing[5] or computer gaming.[6]

Some benchmarks of desktop applications show RAID 0 performance to be marginally better than a single drive.[7][8] Another article examined these claims and concluded that "striping does not always increase performance (in certain situations it will actually be slower than a non-RAID setup), but in most situations it will yield a significant improvement in performance".[9][10] Synthetic benchmarks show different levels of performance improvements when multiple HDDs or SSDs are used in a RAID 0 setup, compared with single-drive performance. However, some synthetic benchmarks also show a drop in performance for the same comparison.[11][12]

RAID 1

Diagram of a RAID 1 setup
RAID 1 consists of an exact copy (or mirror) of a set of data on two or more disks; a classic RAID 1 mirrored pair contains two disks. This configuration offers no parity, striping, or spanning of disk space across multiple disks, since the data is mirrored on all disks belonging to the array, and the array can only be as big as the smallest member disk. This layout is useful when read performance or reliability is more important than write performance or the resulting data storage capacity.[13][14]

The array will continue to operate so long as at least one member drive is operational.[15]

Performance
Any read request can be serviced and handled by any drive in the array; thus, depending on the nature of I/O load, random read performance of a RAID 1 array may equal up to the sum of each member's performance,[a] while the write performance remains at the level of a single disk. However, if disks with different speeds are used in a RAID 1 array, overall write performance is equal to the speed of the slowest disk.[14][15]

Synthetic benchmarks show varying levels of performance improvements when multiple HDDs or SSDs are used in a RAID 1 setup, compared with single-drive performance. However, some synthetic benchmarks also show a drop in performance for the same comparison.[11][12]

RAID 2

Diagram of a RAID 2 setup
RAID 2, which is rarely used in practice, stripes data at the bit (rather than block) level, and uses a Hamming code for error correction. The disks are synchronized by the controller to spin at the same angular orientation (they reach index at the same time), so it generally cannot service multiple requests simultaneously. Extremely high data transfer rates are possible.[16][17]

With all hard disk drives implementing internal error correction, the complexity of an external Hamming code offered little advantage over parity so RAID 2 has been rarely implemented; it is the only original level of RAID that is not currently used.[16][17]

RAID 3

Diagram of a RAID 3 setup of six-byte blocks and two parity bytes, shown are two blocks of data in different colors.
RAID 3, which is rarely used in practice, consists of byte-level striping with a dedicated parity disk. One of the characteristics of RAID 3 is that it generally cannot service multiple requests simultaneously, which happens because any single block of data will, by definition, be spread across all members of the set and will reside in the same location. Therefore, any I/O operation requires activity on every disk and usually requires synchronized spindles.

This makes it suitable for applications that demand the highest transfer rates in long sequential reads and writes, for example uncompressed video editing. Applications that make small reads and writes from random disk locations will get the worst performance out of this level.[17]

The requirement that all disks spin synchronously (in a lockstep) added design considerations to a level that provided no significant advantages over other RAID levels, so it quickly became useless and is now obsolete.[16] Both RAID 3 and RAID 4 were quickly replaced by RAID 5.[18] RAID 3 was usually implemented in hardware, and the performance issues were addressed by using large disk caches.[17]

RAID 4

Diagram 1: A RAID 4 setup with dedicated parity disk with each color representing the group of blocks in the respective parity block (a stripe)
RAID 4 consists of block-level striping with a dedicated parity disk. As a result of its layout, RAID 4 provides good performance of random reads, while the performance of random writes is low due to the need to write all parity data to a single disk.[19]

In diagram 1, a read request for block A1 would be serviced by disk 0. A simultaneous read request for block B1 would have to wait, but a read request for B2 could be serviced concurrently by disk 1.

RAID 5

Diagram of a RAID 5 setup with distributed parity with each color representing the group of blocks in the respective parity block (a stripe). This diagram shows left asymmetric algorithm
RAID 5 consists of block-level striping with distributed parity. Unlike in RAID 4, parity information is distributed among the drives. It requires that all drives but one be present to operate. Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data is lost.[5] RAID 5 requires at least three disks.[20]

In comparison to RAID 4, RAID 5's distributed parity evens out the stress of a dedicated parity disk among all RAID members. Additionally, write performance is increased since all RAID members participate in serving of the write requests. Although it won't be as efficient as a non RAID setup, (because parity must still be written), this is just no longer a bottleneck.[21]
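
To make "distributed parity" concrete, here is a small sketch that prints which disk holds the parity block in each stripe. It assumes the left-asymmetric layout named in the caption above: parity starts on the last disk and moves one disk to the left with each stripe, while data blocks fill the remaining disks from left to right. The function name and the A1/Ap labels are only for illustration.

```python
def raid5_stripe_layout(stripe: int, disks: int) -> list[str]:
    """Block labels for one stripe in a left-asymmetric RAID 5 layout."""
    parity_disk = (disks - 1) - (stripe % disks)  # rotates right-to-left
    label = chr(ord("A") + stripe)                # A, B, C, ... (fine for a few stripes)
    layout = []
    data_index = 1
    for disk in range(disks):
        if disk == parity_disk:
            layout.append(f"{label}p")            # parity block for this stripe
        else:
            layout.append(f"{label}{data_index}")
            data_index += 1
    return layout


for stripe in range(4):
    print(" ".join(raid5_stripe_layout(stripe, disks=4)))
```

Each disk ends up holding the parity for a different stripe, so no single disk takes all the parity writes the way the dedicated disk does in RAID 4.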

RAID 6

Diagram of a RAID 6 setup, which is identical to RAID 5 other than the addition of a second parity block
RAID 6 extends RAID 5 by adding another parity block; thus, it uses block-level striping with two parity blocks distributed across all member disks.[22]

According to the Storage Networking Industry Association (SNIA), the definition of RAID 6 is: "Any form of RAID that can continue to execute read and write requests to all of a RAID array's virtual disks in the presence of any two concurrent disk failures. Several methods, including dual check data computations (parity and Reed-Solomon), orthogonal dual parity check data and diagonal parity, have been used to implement RAID Level 6."[23]

Performance
RAID 6 does not have a performance penalty for read operations, but it does have a performance penalty on write operations because of the overhead associated with parity calculations. Performance varies greatly depending on how RAID 6 is implemented in the manufacturer's storage architecture—in software, firmware, or by using firmware and specialized ASICs for intensive parity calculations. RAID 6 can read up to the same speed as RAID 5 with the same number of physical drives.[24]

That Mason Guy · June 5, 2017

7 hours ago, H0R53 said:

Do you not have access to Google and WikiPedia? Holy shit.

<snip>

I may have access to Google and Wikipedia, but it was kindly explained to me in the first reply, which made sense because it wasn't filled with a whole bunch of random crap that makes no sense. Now I understand what each level is and what it does. No need for your comment, buddy; you should've just scrolled up.

Vapare · June 5, 2017

Just adding to the explanation on top.

Blake · June 5, 2017

13 hours ago, H0R53 said:

Do you not have access to Google and WikiPedia? Holy shit.

<snip>

Can you show more understanding than just copying and pasting? Holy shit.

18 hours ago, That Mason Guy said:

I have no understanding of what RAID is for or what the different numbers mean. I really just don't understand it. If someone could explain it to me, that'd be great.

Basically, RAID just means reading and writing to multiple HDDs at the same time (this isn't 100% true, but it'll help you get your head around what is actually happening). In a mirror, your computer writes the same thing to all HDDs, so they are exact copies. In a span, the first part goes to drive 1 and the second part goes to drive 2. In a parity RAID, it writes a bit to each drive, then calculates a way to recreate the data should one of the other parts disappear, and writes that to one of the disks.
