
Raid Card

LordMastodon
Solved by leadeater
1 minute ago, LordMastodon said:

Can you just post something saying to get that RAID card, the SSDs and the CacheVault battery so I can give it Best Answer?

Just mark that post (yours just now) as the best answer; it doesn't worry me, I'm just happy to help out.

7 minutes ago, LordMastodon said:

I found this one, which is PCIe 3.0, so I'll go with it.

Not really worth the extra $$$; the only time you'll be pushing constantly high network traffic is during a backup, and you'll be disk-throughput limited before you're network or PCIe bus limited.
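To put rough numbers on that point, here is a back-of-the-envelope sketch in Python. The per-lane, per-SSD and link figures are typical nominal values I am assuming, not numbers from this thread:

# Rough bandwidth sanity check for "disk limited before PCIe bus limited".
# All figures are assumed nominal values, not measurements from this build.
PCIE2_LANE_MBPS = 500            # usable throughput per PCIe 2.0 lane, ~MB/s
SATA_SSD_MBPS   = 550            # sequential throughput of one SATA SSD, ~MB/s

pcie2_x8  = 8 * PCIE2_LANE_MBPS  # a typical x8 controller slot
four_ssds = 4 * SATA_SSD_MBPS    # the four SSDs discussed in this build
ten_gbe   = 10_000 / 8           # 10Gb/s expressed in MB/s, ignoring overhead

print(f"PCIe 2.0 x8 slot : ~{pcie2_x8} MB/s")
print(f"4x SATA SSDs     : ~{four_ssds} MB/s")
print(f"10GbE link       : ~{ten_gbe:.0f} MB/s")

Even a PCIe 2.0 x8 slot has comfortable headroom over both the drives and a 10GbE link, which is the point being made above.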


4 minutes ago, scottyseng said:

Worst for me was being shocked by 480V... I was lucky that the circuit tripped. The worst accident we've seen related to electricity was when one of our friends who does electrical work got shocked by the same 480V and fell 18 feet onto concrete in a department store.

 

Yeah, the PCIe 2.0 will be fine. We're still not even bottlenecking PCIe 2.0 with today's tech.

 
 

Damn. That sounds painful. I sometimes get slightly shocked when putting things in or taking things out of ports when I'm in Israel, where they use the European outlets, but it's not a big deal. 

 

I forgot, this is probably the most dangerous thing I've done with electricity. I was pulling one of those European-to-American outlet adapters out of the outlet on a power strip; it was super old and stuck in there, so I had to pull really hard. Apparently, while pulling it, I managed to bridge some contact or something, because there was just a straight-up explosion, white sparks flew everywhere and the power went out in that whole side of the room. I guess all that happened was that I tripped the breakers, so someone came down, flipped them back, and everything was back to normal. Except that my laptop was plugged into the same strip, and it just straight up died the second the explosion happened, as in turned off and would not turn back on. So I had to take it apart and disconnect/reconnect every single wire and ribbon cable inside that darned thing, just to make sure they weren't shorted or something, and when I put it back together it turned on. That confused me to no end, but I didn't want to tempt fate, so I left it unplugged and let it sit there until the next day, when someone checked it out and made sure it was fine.


3 minutes ago, leadeater said:

Not really worth the extra $$$; the only time you'll be pushing constantly high network traffic is during a backup, and you'll be disk-throughput limited before you're network or PCIe bus limited.

 

Alright, I'll stick with the PCIe 2.0 card then.


@leadeater @scottyseng Thank you both so much for helping, you were both great fun and you really did solve pretty much everything. Thanks to you guys, I think I'm probably going to get this approved and I'll save the school roughly $2,400, something they'll be pretty happy about.


The stupidest thing I've done was when I was around 8-10 and wanted to make my slot car set go faster. I got a power cable, cut the end off, exposed the wires, then tried to connect it to the metal tracks; it wasn't making good contact, so I pressed down with each thumb :P. I got a 230V shock which hurt like hell and melted the cars to the track.

 

With that I'm going to sleep, will read any updates tomorrow.


Just now, leadeater said:

The stupidest thing I've done was when I was around 8-10 and wanted to make my slot car set go faster. I got a power cable, cut the end off, exposed the wires, then tried to connect it to the metal tracks; it wasn't making good contact, so I pressed down with each thumb :P. I got a 230V shock which hurt like hell and melted the cars to the track.

 

With that I'm going to sleep, will read any updates tomorrow.

 

Damn. That's actually something I can imagine myself trying, which scares the hell out of me.


@scottyseng My boss is kind of unsure as to whether we should even continue with RAID; his main argument is that when you replace a failed hard drive in a RAID 5/6 array there's only a 70% chance of getting it back up and running perfectly, which makes it kind of unreliable. What do you think?


3 minutes ago, LordMastodon said:

@scottyseng My boss is kind of unsure as to whether we should even continue with RAID; his main argument is that when you replace a failed hard drive in a RAID 5/6 array there's only a 70% chance of getting it back up and running perfectly, which makes it kind of unreliable. What do you think?

Well, I have no idea how he came up with a 70% figure... The array begins to heal itself as soon as you replace the failed hard drive (it does take quite some time to heal, though). Yes, there is a chance of a second drive and possibly a third drive failing while the RAID6 is healing the array, but it's kind of rare. I'd also mention that you're using SSDs, which will heal faster than hard drives.

 

RAID10 can be killed in the exact same way, where a second drive dies while the array is healing (if that second drive happens to be the other drive in the affected RAID1 pair, the array is gone). RAID10 just has a faster healing time (it only needs to copy data, versus parity-based RAID5/6, which has to do calculations to rebuild the data).

 

You could stand there for weeks debating how likely drive failures are to take out RAID6 vs RAID10... heck, you could even have all four drives fail at the same time... that's why we have backups.

 

Yeah, I still vote for RAID6 because the boost in sequential speeds is worth it (RAID6 is able to read and write from all four drives at the same time), not to mention you don't lose half the space.
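For a concrete picture of the trade-off being debated, here is a small illustrative sketch in Python comparing four 1TB drives under the layouts mentioned above; it uses idealized capacity formulas and nothing vendor-specific:

# Rough comparison of 4 x 1TB drive layouts discussed in this thread.
# Illustrative only; assumes idealized capacity formulas.
DRIVES = 4
SIZE_TB = 1.0

layouts = {
    # name: (usable capacity in TB, worst-case failures survived, best-case failures survived)
    "RAID 0":  (DRIVES * SIZE_TB,       0, 0),
    "RAID 5":  ((DRIVES - 1) * SIZE_TB, 1, 1),
    "RAID 6":  ((DRIVES - 2) * SIZE_TB, 2, 2),
    "RAID 10": (DRIVES * SIZE_TB / 2,   1, 2),  # 2 only if the failures hit different mirror pairs
}

for name, (usable, worst, best) in layouts.items():
    print(f"{name:8} usable={usable:.1f}TB  survives {worst} (worst case) to {best} (best case) drive failures")

The RAID 6 row is the "lose any two drives without losing half the space" point made above.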


10 minutes ago, scottyseng said:

Well, I have no idea how he came up with a 70% figure... The array begins to heal itself as soon as you replace the failed hard drive (it does take quite some time to heal, though). Yes, there is a chance of a second drive and possibly a third drive failing while the RAID6 is healing the array, but it's kind of rare. I'd also mention that you're using SSDs, which will heal faster than hard drives.

 

RAID10 can be killed in the exact same way, where a second drive dies while the array is healing (if that second drive happens to be the other drive in the affected RAID1 pair, the array is gone). RAID10 just has a faster healing time (it only needs to copy data, versus parity-based RAID5/6, which has to do calculations to rebuild the data).

 

You could stand there for weeks debating how likely drive failures are to take out RAID6 vs RAID10... heck, you could even have all four drives fail at the same time... that's why we have backups.

 

Yeah, I still vote for RAID6 because the boost in sequential speeds is worth it (RAID6 is able to read and write from all four drives at the same time), not to mention you don't lose half the space.

 

Yeah, what he means is that there's only a 70% chance that the array will even be usable once it's done healing. His strategy would essentially be to have all four drives in RAID 0 (but not actually), which means doing backups every 20 minutes or so. He also claims that rebuilding takes longer than reimaging or cloning (although to do either of those you'd have to take one drive out of the system, meaning you'd only have two in there the whole time). Basically, his opinion is that RAID is expensive (I agree with that), unreliable (not true), and less efficient (also not true).


8 hours ago, LordMastodon said:

Yeah, what he means is that there's only a 70% chance that the array will even be usable once it's done healing. His strategy would essentially be to have all four drives in RAID 0 (but not actually), which means doing backups every 20 minutes or so. He also claims that rebuilding takes longer than reimaging or cloning (although to do either of those you'd have to take one drive out of the system, meaning you'd only have two in there the whole time). Basically, his opinion is that RAID is expensive (I agree with that), unreliable (not true), and less efficient (also not true).

Please give him a good smack for me. Any amount of thought about that frankly terrible math would mean millions of RAID arrays around the world would have failed over the last 10 years; I've never seen that news story.

 

Here is a snippet from one of many threads on this very forum regarding UREs and how wrong people can be when looking at them. It is a real issue, and as disks get larger the problem gets worse; that is why ZFS and ReFS/Storage Spaces exist.

On 1/14/2016 at 11:25 AM, leadeater said:

 

 

Well, even though the math is rather sound in the many URE calculation blogs etc., the reality is far from it. Unless you are using a RAID card with absolutely zero features other than doing RAID itself (rare, and one's own fault for using one), rebuild failures or corruption from UREs are nowhere near the numbers they indicate.

 

For a very long time now, hardware RAID cards have done preventative maintenance and actively fixed errors so that UREs do not cause array failure. Background scrubbing/patrol reads check for bad blocks and fix them using another copy or by rebuilding from parity data; during a rebuild, if a bad block is found that would cause the rebuild to stall or fail, the card will try another copy if it can (RAID 10 & 6).

 

Even Linux mdadm software RAID can run these checks on a schedule to prevent UREs/bit rot from causing long-term damage, fixing bad blocks while it is still possible to do so.

 

People need to stop and think before looking at the raw probability numbers and accepting them as the true likelihood of actual failure. If those numbers were accurate, RAID would not have been used for so long, nor would it still be in use now. Also, hardware manufacturers would not sit around doing nothing about the issue; they have come up with tools to help prevent it.

 

The only array that I have seen fail to rebuild was a 16x 1TB RAID 5 on the cheapest 3Ware controller possible, built in 2008. What actually happened was the card failed during a rebuild, so it was replaced, and the array would get stuck at 90% rebuild; this could have been a URE or corruption from the failed card. I cannot know for certain which it was.

 

I have built and maintained many RAID arrays, like hundreds, with many different disk counts from 3 disks to 32+ and sizes beyond 200TB. If those numbers were the real-world probability of failure, then almost every single array I have built should have failed; this has not been the case.

 

This is not to say there hasn't been a real concern over the issue in the IT industry; this is why RAID 6 is now the preferred parity RAID. Other storage technologies have also been introduced, like ZFS and other custom software features on enterprise storage systems.

 

TL;DR: The indicated failure probability is not the real-world likelihood; don't take the numbers at face value, but do keep them in mind, as it is important to know about UREs and how to prevent damage from them.
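To show what "the raw probability numbers" look like, here is a sketch in Python of the naive URE calculation those blogs run, assuming the commonly quoted 1-in-10^14 bit error rate; as the post above argues, scrubbing and retries mean real rebuilds fail far less often than this suggests:

# Naive URE-during-rebuild estimate, as used in the blog posts referenced above.
# Assumes every bit read during a rebuild has an independent 1-in-10^14 chance
# of an unrecoverable read error; real controllers scrub, retry and repair,
# so treat this as an upper bound, not a prediction.
URE_RATE = 1e-14          # errors per bit read (typical consumer HDD spec)
TB = 1e12                 # bytes

def p_ure_during_rebuild(bytes_read, ure_rate=URE_RATE):
    bits = bytes_read * 8
    return 1 - (1 - ure_rate) ** bits

# Rebuilding a 4 x 1TB RAID 5 means reading ~3TB from the surviving drives;
# the same array with 4TB drives means reading ~12TB.
print(f"Naive P(URE) for a ~3TB rebuild read : {p_ure_during_rebuild(3 * TB):.1%}")
print(f"Naive P(URE) for a ~12TB rebuild read: {p_ure_during_rebuild(12 * TB):.1%}")

The second figure is why the post notes the problem gets worse as disks get larger, even though the raw numbers overstate the real-world risk.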


8 minutes ago, leadeater said:

-snip-

Yeah, I was going to reply earlier, but my brain started shutting down when I read that RAID arrays "had a 70% chance of working after they healed"...

 

I figured I would get a funnier response waiting for yours first. haha


@LordMastodon

I do agree with the sentiment to not use RAID, however. If the server is going to be running Windows then use Storage Spaces + ReFS (don't use a parity configuration with that few disks); for Linux, use ZFS or BTRFS.

 

The reason not to use a parity configuration with Windows Storage Spaces is that you need dedicated journal SSDs, which are used as write-back caches to speed up writes; without them, write performance is only about 60MB/s-100MB/s. Normally you would have 2 SSDs set as journal disks and then 4+ HDDs as a parity tier for the actual storage.

 

With Server 2016 there is a new feature called 'Multi-Resilient Virtual Disks'. What's nice about this is that, on the pool of 4 SSDs you intend to use, you create a two-way mirror tier (like RAID 10) of around 100GB-200GB and then a parity tier (like RAID 6) for the rest of the space. All writes, even modifications of existing data, go to the faster mirror tier and then get written down to the parity tier after the fact.
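As a rough capacity sketch of that layout on the four 1TB SSDs discussed in this thread (Python; the 75% single-parity efficiency and the 200GB mirror-tier size are my assumptions, not settings from the post):

# Rough capacity split for a multi-resilient virtual disk on 4 x 1TB SSDs.
# Assumptions (not from the thread): single parity at 75% efficiency for the
# parity tier, 200GB usable for the two-way mirror tier.
DRIVES, DRIVE_GB = 4, 1000
MIRROR_USABLE_GB = 200                              # upper end of the 100GB-200GB suggestion

raw_total     = DRIVES * DRIVE_GB
mirror_raw    = MIRROR_USABLE_GB * 2                # two-way mirror stores everything twice
parity_raw    = raw_total - mirror_raw
parity_usable = parity_raw * (DRIVES - 1) / DRIVES  # assumed single parity, one column lost

print(f"mirror tier : {MIRROR_USABLE_GB}GB usable ({mirror_raw}GB raw)")
print(f"parity tier : {parity_usable:.0f}GB usable ({parity_raw}GB raw)")
print(f"total usable: {MIRROR_USABLE_GB + parity_usable:.0f}GB of {raw_total}GB raw")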


5 hours ago, leadeater said:

@LordMastodon

I do agree with the sentiment to not use RAID, however. If the server is going to be running Windows then use Storage Spaces + ReFS (don't use a parity configuration with that few disks); for Linux, use ZFS or BTRFS.

 

The reason not to use a parity configuration with Windows Storage Spaces is that you need dedicated journal SSDs, which are used as write-back caches to speed up writes; without them, write performance is only about 60MB/s-100MB/s. Normally you would have 2 SSDs set as journal disks and then 4+ HDDs as a parity tier for the actual storage.

 

With Server 2016 there is a new feature called 'Multi-Resilient Virtual Disks'. What's nice about this is that, on the pool of 4 SSDs you intend to use, you create a two-way mirror tier (like RAID 10) of around 100GB-200GB and then a parity tier (like RAID 6) for the rest of the space. All writes, even modifications of existing data, go to the faster mirror tier and then get written down to the parity tier after the fact.

 

It turns out that my boss was still iffy about RAID 6 (at first he thought I meant RAID 5 + 1), despite my assurances that it's the best redundancy you can get with a fully RAIDed set of four drives. In the end I concocted a plan to have two of the drives in a RAID 0 for performance, and then every <amount of time goes here> do a backup of the RAID 0 to both of the other drives, or better yet figure out some sort of auto-imaging software, perhaps using something like IPMI. That means two drives could fail at the exact same time and we wouldn't have to care. It also means that potentially up to 3 drives could fail simultaneously (as long as two of them were in the RAID array) and we'd still be fine. I'm not really sure you can get better redundancy out of any combination of 4 drives. The boss also said he wanted to make sure we combined redundancy and performance, so I think the above is probably the best way to do it.


12 minutes ago, LordMastodon said:

-snip-

Yeah, there are RAID51 and RAID61, but that's a lot of space loss... RAID61 is pretty much a RAID1 of RAID6 arrays... not too popular (and not exactly standard either). RAID50 and 60 are what you usually see if more speed is needed from a RAID5 or RAID6. There are a few other rare RAID levels as well.

 

Well, while you could lose three drives, it has to be the right three drives, otherwise your data is gone. I'd rather have RAID6, where I can lose any two drives.

 

Also, you're missing the other goal of RAID, which is to lose less time migrating data from a backup. Yes, you can have two separate non-RAID drives back up the RAID0 array, but if the RAID0 array dies, you have to build another RAID0 array, or shut down the system and run it off the backup on one of the two non-RAID drives. You have no downtime if you lose any one or two drives in RAID6; you can immediately take the drive out and slide a new one in (provided you have a hot-swap backplane) without having to turn the system off.

 

Another thing to consider is that the RAID0 array is larger than either of the two single drives (all drives are 1TB). If the data in the RAID0 array outgrows the space available on the two single drives, you won't be able to image the RAID0 array any more (assuming you want to snapshot all of the data in the RAID0 array as-is, with no compression or otherwise).

 

Finally, if the RAID0 dies and your snapshot is 20 minutes behind, you've potentially lost up to 19-ish minutes of work. Multiply that by how many employees lost their work during that time frame and by their hourly pay, and it adds up. You also have to factor in how long it'll take to switch the system over to the backup on the single drive.

 

It depends on how much downtime you can afford, though. Most enterprises running hardware RAID controllers want as little downtime as possible, so a hot-swap RAID6 that stays online even when degraded is preferred over a solution that involves transferring data from a backup after a drive failure.

 

You are correct on the drive-loss tolerance for the RAID levels, though: with only four drives, being able to lose any two drives is as good as it gets (RAID6).
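To put a number on the "it adds up" point, a back-of-the-envelope sketch in Python; the staffing, pay and switchover figures are made-up assumptions, not numbers from this thread:

# Back-of-the-envelope cost of one RAID0 failure under the 20-minute snapshot plan.
# All inputs below are illustrative assumptions.
EMPLOYEES      = 30     # assumed staff working when the array dies
HOURLY_PAY     = 25.0   # assumed average hourly cost per employee, $
WORK_LOST_MIN  = 19     # worst case with a 20-minute snapshot interval
SWITCHOVER_MIN = 60     # assumed time to bring the server back up from a backup drive

minutes_lost = WORK_LOST_MIN + SWITCHOVER_MIN
cost = EMPLOYEES * HOURLY_PAY * minutes_lost / 60
print(f"~${cost:,.0f} of paid time lost per RAID0 failure ({minutes_lost} min x {EMPLOYEES} people)")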


47 minutes ago, LordMastodon said:

It turns out that my boss was still iffy about RAID 6 (at first he thought I meant RAID 5 + 1), despite my assurances that it's the best redundancy you can get with a fully RAIDed set of four drives. In the end I concocted a plan to have two of the drives in a RAID 0 for performance, and then every <amount of time goes here> do a backup of the RAID 0 to both of the other drives, or better yet figure out some sort of auto-imaging software, perhaps using something like IPMI. That means two drives could fail at the exact same time and we wouldn't have to care. It also means that potentially up to 3 drives could fail simultaneously (as long as two of them were in the RAID array) and we'd still be fine. I'm not really sure you can get better redundancy out of any combination of 4 drives. The boss also said he wanted to make sure we combined redundancy and performance, so I think the above is probably the best way to do it.

Well, the main problem is that if one disk in a RAID 0 dies, everything is dead; do you mean RAID 1? After the array is dead from that single disk failure, what is the restore process and how long does it take? Having a backup is only half the story.

 

As for backups, how they are done is also important; you will never get full system images inside an acceptably small time window, so there would be too much data loss. The system load from taking those full system images would also be catastrophic for the user experience. Backups happen at night for a reason, and they are usually incremental backups of changes, not a full image every time.

 

There is pretty much a standard magic formula for configuring servers that has existed since the 90s and will continue in some shape forever. Servers have 2 dedicated OS disks in a RAID 1 mirror. The reason for this is that you can read the data even without a RAID card if you need to; you can also pop a disk out before maintenance, so if something goes wrong you swap disks and you are back to where you were before you changed anything.

 

After this you have dedicated data disks, and which RAID configuration you pick depends on the application type. For a file server you use RAID 5/6, as this has the best $/GB and very good performance for that workload. For a database server you use RAID 10, not because it is safer (it is not), but because it has lower write latency, which gives better random I/O performance, which is key to database server performance.

 

He is creating an issue where there isn't one. I've used RAID 6 on a SAN that had 150+ disks in a single RAID group, and there were multiple RAID groups of that size.

 

As for performance, that is also a non-issue: you are using SSDs on a file server. There is no configuration that could make this perform slowly, other than hardware parity RAID without a battery (BBU or CacheVault). A single SSD performs better than a tray of 24 10k rpm SAS disks, and by a decent amount, so imagine what 4 are like. The 4 disks in a RAID 6 will have higher performance than the proposed plan and will have better uptime; disk failures won't require a restore or take the system offline.

 

The proposed plan is just so bad I didn't know where to start or how to properly respond; it is the depiction of watching a train wreck in slow motion before it drives off a cliff.
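On the RAID 10 vs parity write-latency point above, here is a sketch in Python using the usual rule-of-thumb write-penalty factors; the penalty values are the standard textbook ones and the per-SSD IOPS figure is an assumption, not a measurement from this build:

# Effective random-write IOPS using the common RAID write-penalty rule of thumb.
# Penalties are the usual textbook values (RAID1/10: 2 IOs per write, RAID5: 4, RAID6: 6).
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}
DRIVES, IOPS_PER_DRIVE = 4, 50_000          # assumed per-SSD random-write IOPS

for level, penalty in WRITE_PENALTY.items():
    effective = DRIVES * IOPS_PER_DRIVE / penalty
    print(f"{level:7}: ~{effective:,.0f} random-write IOPS from {DRIVES} drives")

This is why RAID 10 is the usual pick for write-latency-sensitive database workloads, while parity RAID wins on $/GB for file serving.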


10 minutes ago, scottyseng said:

Yeah, there are RAID51 and RAID61, but that's a lot of space loss... RAID61 is pretty much a RAID1 of RAID6 arrays... not too popular (and not exactly standard either). RAID50 and 60 are what you usually see if more speed is needed from a RAID5 or RAID6. There are a few other rare RAID levels as well.

 

Well, while you could lose three drives, it has to be the right three drives, otherwise your data is gone. I'd rather have RAID6, where I can lose any two drives.

 

Also, you're missing the other goal of RAID, which is to lose less time migrating data from a backup. Yes, you can have two separate non-RAID drives back up the RAID0 array, but if the RAID0 array dies, you have to build another RAID0 array, or shut down the system and run it off the backup on one of the two non-RAID drives. You have no downtime if you lose any one or two drives in RAID6; you can immediately take the drive out and slide a new one in (provided you have a hot-swap backplane) without having to turn the system off.

 

Another thing to consider is that the RAID0 array is larger than either of the two single drives (all drives are 1TB). If the data in the RAID0 array outgrows the space available on the two single drives, you won't be able to image the RAID0 array any more (assuming you want to snapshot all of the data in the RAID0 array as-is, with no compression or otherwise).

 

Finally, if the RAID0 dies and your snapshot is 20 minutes behind, you've potentially lost up to 19-ish minutes of work. Multiply that by how many employees lost their work during that time frame and by their hourly pay, and it adds up. You also have to factor in how long it'll take to switch the system over to the backup on the single drive.

 

It depends on how much downtime you can afford, though. Most enterprises running hardware RAID controllers want as little downtime as possible, so a hot-swap RAID6 that stays online even when degraded is preferred over a solution that involves transferring data from a backup after a drive failure.

 

You are correct on the drive-loss tolerance for the RAID levels, though: with only four drives, being able to lose any two drives is as good as it gets (RAID6).

Thanks, you addressed the downtime issue very well, which, as you said, is why RAID is used in the first place. Backups that exist on the same system as the source data are not backups.

 

RAID = uptime

Backups = data loss protection

 

Different tools for different purposes.


1 hour ago, leadeater said:

Well, the main problem is that if one disk in a RAID 0 dies, everything is dead; do you mean RAID 1? After the array is dead from that single disk failure, what is the restore process and how long does it take? Having a backup is only half the story.

Ah, to get the desired three-drive-failure figure, it has to be both of the drives in the RAID0 that die plus one of the two separate backup drives... that leaves the other backup drive as the last drive standing... but yeah, it has to be the right three drives, otherwise the data's gone... It took me some thinking to work out the three-drive figure, because I kept picturing the two separate backup drives and one of the drives in the RAID0 array dying, which kills all of the data...

 

Still not a great idea though... too much downtime and potential for data loss (since you have to overwrite the existing data on the two separate SSDs in order to image to them). That's a lot of data traffic being used to mirror the RAID0 to the two separate drives as well... every 20 minutes...
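A tiny Python sketch of the "right three drives" point; it models only whether a copy of the data still exists (ignoring the downtime and snapshot-staleness issues discussed above), and the drive labels are mine:

# Which failure combinations still leave a copy of the data under the proposed
# plan: drives A+B in RAID 0 (live data), C and D each holding a periodic image.
from itertools import combinations

RAID0   = {"A", "B"}        # the two striped drives holding the live data
BACKUPS = {"C", "D"}        # the two drives holding backup images

def copy_survives(failed):
    raid0_intact  = RAID0.isdisjoint(failed)
    backup_intact = bool(BACKUPS - set(failed))
    return raid0_intact or backup_intact

for n in (1, 2, 3):
    for failed in combinations(sorted(RAID0 | BACKUPS), n):
        status = "a copy survives" if copy_survives(failed) else "ALL DATA LOST"
        print(f"fail {failed}: {status}")

Only the three-drive combinations that include both RAID0 members and one backup drive leave data behind; lose both backup drives plus either RAID0 member and everything is gone, which is the point made above.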


12 minutes ago, scottyseng said:

Ah, to get the desired three-drive-failure figure, it has to be both of the drives in the RAID0 that die plus one of the two separate backup drives... that leaves the other backup drive as the last drive standing... but yeah, it has to be the right three drives, otherwise the data's gone... It took me some thinking to work out the three-drive figure, because I kept picturing the two separate backup drives and one of the drives in the RAID0 array dying, which kills all of the data...

 

Still not a great idea though...too much downtime.

Yeah, I was just looking at the original source data configuration; the three-drive-failure thing doesn't really fly with a client when one disk fails and the server goes offline. Then you have to wait an hour or more for a technician to show up, and then the issue has to be found. Once the issue is found and it's determined that a restore is required, what's the plan at that point?

 

Do you boot the server off one of the backup disks to finish out the day and then, at the end of the day, recreate the RAID 0 array with the hot-spare SSD (you have one, right???), or do you do that straight away and have to wait for the restore to finish?

 

With that single disk failure the server could in theory be down for half, or even the whole, working day; that is one pissed-off client. "But you assured me that the server was extremely resilient and 3 disks could fail without an issue." Perfect scenarios don't happen; plan for the worst case. If a disk is going to fail, it will be the most critical and most impactful disk.

 

Edit: I was way too loose when I said 'everything is dead'; I really meant the server is offline at that point.


@scottyseng @leadeater If I'm honest, I'm really not a fan of that configuration I concocted either. There's really no reason whatsoever not to just use RAID 6, and I think my boss may not fully understand it. His email said:

Quote

We can do full system images of the system drive in 20 minutes every day if we wanted to, and we can do full data backups to multiple drives and the cloud every day in real time. When a drive fails in a RAID array and you put in a new one and try to rebuild, the percentage of getting a perfect replacement up and running is only about 70%, and it takes longer than reimaging a new drive or copying data. So my opinion is, RAID might be expensive, unreliable and less efficient, with the price of storage being cheap these days.

 

And I really don't agree with any of those things. I really think my boss may just be trying to save himself a little bit of work by not having to set up a RAID 6 (not that hard), or something similar.

 

Side note: Is that RAID card hot-swappable? Because otherwise I'd have to shut down the system every time it failed anyway...


2 minutes ago, LordMastodon said:

-snip-

To be honest, I actually find trying to set up what your boss wants harder than a RAID6 setup... though if you're going to do what the boss wants, you probably don't need a RAID card and can stick with the onboard SATA ports...

 

The RAID card itself? No, anything PCIe isn't hot-swappable; pulling out a PCIe card while the system is running will make it crash.

 

The drives connected to the RAID card? Yes, they're hot-swappable. I don't know if they're hot-swappable with breakout cables (I'm pretty sure they are), but yeah, with a server backplane, they're hot-swappable.

 

I would just try to tell your boss... a full image, at the worst case, is 960GB... every 20 minutes, that's three full images an hour, roughly 3TB an hour. If it happens every hour of the day, that's 24 x 3... roughly 72TB per day... That's quite a lot of abuse.

 

I would also make sure your boss doesn't fire you if this system backfires... I also wouldn't trust the cloud if you have a lot of data; on normal internet connections, it would take days or weeks to upload a TB of data in the worst case. My friends ask me why I made myself a home server when I could just upload everything to the cloud... yeah... even on 75/75 Mb/s Verizon FiOS, uploading my data (9TB) would take quite a long time...
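The arithmetic behind those two figures, as a small Python sketch; the 960GB image, 20-minute interval, 9TB and 75 Mb/s inputs come from the posts above:

# Daily write volume of a 20-minute full-image schedule, and upload time for
# 9TB over a 75 Mb/s link. Inputs are the figures quoted in the posts above.
IMAGE_GB, INTERVAL_MIN = 960, 20
images_per_day = 24 * 60 // INTERVAL_MIN
daily_tb = images_per_day * IMAGE_GB / 1000
print(f"{images_per_day} full images/day -> ~{daily_tb:.0f}TB written per day")

UPLOAD_TB, LINK_MBPS = 9, 75
upload_days = UPLOAD_TB * 8e6 / LINK_MBPS / 86_400   # TB -> megabits, then seconds -> days
print(f"{UPLOAD_TB}TB over a {LINK_MBPS} Mb/s uplink: ~{upload_days:.1f} days of continuous upload")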


15 minutes ago, scottyseng said:

To be honest, I actually find trying to set up what your boss wants harder than a RAID6 setup... though if you're going to do what the boss wants, you probably don't need a RAID card and can stick with the onboard SATA ports...

 

The RAID card itself? No, anything PCIe isn't hot-swappable; pulling out a PCIe card while the system is running will make it crash.

 

The drives connected to the RAID card? Yes, they're hot-swappable. I don't know if they're hot-swappable with breakout cables (I'm pretty sure they are), but yeah, with a server backplane, they're hot-swappable.

 

I would just try to tell your boss... a full image, at the worst case, is 960GB... every 20 minutes, that's three full images an hour, roughly 3TB an hour. If it happens every hour of the day, that's 24 x 3... roughly 72TB per day... That's quite a lot of abuse.

 

I would also make sure your boss doesn't fire you if this system backfires... I also wouldn't trust the cloud if you have a lot of data; on normal internet connections, it would take days or weeks to upload a TB of data in the worst case. My friends ask me why I made myself a home server when I could just upload everything to the cloud... yeah... even on 75/75 Mb/s Verizon FiOS, uploading my data (9TB) would take quite a long time...

 

I agree with you, because it means having multiple drives connected in different places, and were anyone else to come along and start servicing this server, they'd have absolutely no clue what the hell was going on unless we provided a manual, and, as we all know from TFTS, no one RTFMs.

 

Yeah, I meant the drives connected to the card, not the card itself. If they're hot-swappable, that's good, because it means that if a drive fails in the middle of the week and we can't reach the server until the weekend, we can keep a drive or two in a cabinet or something and tell someone there to open up the machine, take out the drive that failed (we can number them with a label-maker and tell the person exactly which numbered drive failed), make a new label and insert the drive. Instead of having to deal with shutting it off gracefully and then turning it back on and making sure everything still works, they can do it while the machine is running.

 

It is quite a lot of abuse, and even data center SSDs probably aren't meant to handle that. I'll tell him.

 

My boss definitely won't fire me if this system backfires; he's a pretty damn good boss (except for this particular incident). My job is at an actual storefront (although it's part-time), and I'm pretty much his only reliable part-time employee because I regularly show up. I also do a lot of things to make his life a whole lot easier, like automation; I've written programs, Bash scripts, etc. He's not going to fire me.

 

We're using a 10Gb connection, so I don't think uploading to the cloud will be a huge issue, but then again we don't have to upload every image (if we even end up doing images) to the cloud; we could do twice-daily cloud uploads or something.

 

Anyway, it seems all of this may have become a slightly moot point, because it turns out that the only two approved retailers we can buy from are Dell and Best Buy (don't worry, I want to strangle the Board of Directors as well), and Best Buy carries all of one part on our list, the RAM. The only reason we even went with a custom build was to get away from Dell's ludicrous overpricing and lack of customizability. So it seems that if my boss can't convince the Board to approve Newegg and Amazon, we'll be forced to go with Dell (sad faces all around).


12 hours ago, LordMastodon said:

We're using a 10Gb connection, so I don't think uploading to the cloud will be a huge issue, but then again we don't have to upload every image (if we even end up doing images) to the cloud; we could do twice-daily cloud uploads or something.

 

Anyway, it seems all of this may have become a slightly moot point, because it turns out that the only two approved retailers we can buy from are Dell and Best Buy (don't worry, I want to strangle the Board of Directors as well), and Best Buy carries all of one part on our list, the RAM. The only reason we even went with a custom build was to get away from Dell's ludicrous overpricing and lack of customizability. So it seems that if my boss can't convince the Board to approve Newegg and Amazon, we'll be forced to go with Dell (sad faces all around).

You may have a 10Gb network connection locally, but it's unlikely they have a 10Gb internet connection. We as a university have 10Gb and 40Gb WAN connections across the country, but our actual internet connection is slightly less than 1Gb, and even then we only get about 20Mb-40Mb of throughput to Azure.

 

One key point I would like to make is that when buying a Dell server the only configuration options are standard RAID setups; if that is good enough for millions of customers, it's good enough for that school. Just something to step back and think about.

 

I do agree there is a high cost to going with Dell/HP etc., but those costs are there for a reason. These vendors have to pay employees to design the different components of each product line, run extensive testing and validation of every possible configuration, and then pass this off to pre-sales engineers and sales/marketing departments. There is also the cost of support centers and spare parts with express delivery. Everything adds up and has to scale across the world; there are a lot of overheads.

 

To put some cost perspective on the time I spent helping you: if I were to go out to a customer and give the same amount of time, you would be looking at between $2000 and $4000 per day. I enjoyed helping, though; it's something I like to do :).

 

I actually used to work for an IT service company that worked exclusively in the education sector; here's a post I did a little while ago with a typical setup for a school of 1400 students, ~120 staff, 600 school-owned devices, ~60 Aruba APs and a large student BYOD presence.

On 7/8/2016 at 2:57 PM, leadeater said:

For my old job, the standard spec we installed for schools is below; we only serviced the education sector, FYI.

 

2x IBM x3650 M4/M5 with 2x Intel E5-2630 64GB RAM (Now lenovo)

1x IBM DS3524 (Now V3700) (Also now Lenovo)

1x Allied Telesis x900-24XS or a stack of Allied Telesis x610's

1x FortiGate 600C (router provided by ISP in bridge mode)

1x 12 3.5" Bay NAS (take your pick, QNAP/Synology/WD etc) for backups in different building

3x Eaton 9130 3000VA UPS with network module

VMware vSphere Essentials Plus

Veeam Backup & Replication

Microsoft licensing was taken care of by a government education agreement, which was almost all-you-can-eat.

 

Virtual server list below (I know I've forgotten some VMs but you get the idea):

<schoolcode>-DC01

<schoolcode>-DC02

<schoolcode>-FS01 (staff file server)

<schoolcode>-FS02 (student file server)

<schoolcode>-PS01 (print server, print costing application also usually PaperCut)

<schoolcode>-TS01 (terminal server, staff)

<schoolcode>-TS02 (terminal server, student)

<schoolcode>-EX01 (Exchange)

<schoolcode>-DB01 (Database server)

<schoolcode>-AS01 (Application server, student management system)

<schoolcode>-AS02 (Application server, LMS usually Moodle)

<schoolcode>-AS03 (Application server, usually a financial application)

<schoolcode>-LIB01 (Application server, Library application)

<schoolcode>-NM01 (Network management)

<schoolcode>-VC01 (vCenter)

<schoolcode>-VMA01 (VMware integration server to do clean shutdown of VMs and hosts during power failure, interfaces with the UPS network modules)

 

The above spec is for a school of around 1400 students with 6 computer labs of 31 computers (3 labs being Macs), and a total computer count, including staff laptops, of about 600. There was also a wireless network of about 60 access points (details not shown) for student BYOD. You can scale the server spec up or down to match the size of the school, or increase the number of servers, but 2 is the minimum. We had around 1000 schools following this model; it's important to keep things similar across clients, as it makes support much simpler and cheaper.

 

Edge network cabinets used Allied Telesis x510's, as many as required. Desktops were HP EliteDesk 800 series and again subsidized through a government purchasing scheme called All of Government (AOG). Macs had to be purchased at full cost.

 

Networking costs were also taken care of by the government under the School Network Upgrade Program (SNUP), state schools only had to pay for 20% of the cost.

 

Happy to help further :).

