
Confirmation Before Going for Linux Server

Just now, Vitalius said:

He's using 4TB drives.

His chances of a URE during a rebuild are pretty significant with 4TB drives and 3 or more of them (RAID 5's minimum is 3 drives). In fact, I think it's over 70% at 12TB of total storage (3 × 4 = 12).

Err - you're claiming that there's a 70% chance that your RAID array (a 3x 4TB RAID5) will encounter an unrecoverable error during a rebuild? Based on what source?

 

I looked up the drive in question: the Seagate 4TB Desktop drive (ST4000DM000). It has a URE (Unrecoverable Read Error) rate of 1 in 10^14 bytes - that equals one bad byte out of every 100TB of reads. Even if we scale that by the number of drives in the array, that leaves us with roughly 3 UREs per 100TB. So, at best, we're looking at something like a 12% chance.
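To spell out the arithmetic behind that estimate (a rough sketch, taking the quoted spec at face value as a per-byte rate and assuming a rebuild reads the full 12TB, i.e. 1.2 × 10^13 bytes):

P(\text{URE during rebuild}) \approx 1 - \left(1 - 10^{-14}\right)^{1.2 \times 10^{13}} \approx 1 - e^{-0.12} \approx 11\%

which is the same ballpark as the simpler 12TB / 100TB = 12% approximation.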

 

Is that a little high? Yes. I would never take that percentage in a business production environment. But would I take that chance at home? Yes, I probably would.

 

On top of that, since the OP is most likely going to be using Linux mdadm software RAID, the URE risk would be drastically reduced, because software RAID can schedule automatic scrubbing of the array to verify integrity (if a corrupted block is detected for any reason, it can be restored from parity - or if a byte of the parity is corrupted, it can be recalculated from the original data).
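For reference, kicking off a scrub with md looks something like this (a minimal sketch run as root, assuming the array is /dev/md0; Debian's mdadm package also ships a monthly checkarray cron job that does the same thing):

    # manually start a scrub of the array (assumes /dev/md0)
    echo check > /sys/block/md0/md/sync_action
    # watch progress
    cat /proc/mdstat
    # count of inconsistencies found once the check finishes
    cat /sys/block/md0/md/mismatch_cnt
    # rewrite inconsistent stripes using parity
    echo repair > /sys/block/md0/md/sync_action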

 

UREs are a much bigger risk in traditional hardware RAID, because most RAID cards aren't bit-level aware. They don't see the filesystem the way software RAID like mdadm and ZFS does.


Just a couple thoughts on operating system - it sort of sounds like you're new to setting all of this up, @DocSwag, so with that in mind I'd keep your first foray into this as simple as possible. It's definitely good to go the Debian route, since there is a ton of support out there for it. On Linux, the most popular RAID solution is mdadm (software RAID); a sketch of the setup is below. If you want checksumming for your files, you can use Btrfs (a younger brother to ZFS, if you will).
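Roughly what the mdadm route looks like (the device names and mount point here are placeholders - substitute your own):

    # create a 3-disk RAID 5 array (hypothetical device names)
    sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
    # put a filesystem on it and mount it
    sudo mkfs.ext4 /dev/md0
    sudo mount /dev/md0 /mnt/storage
    # record the array config so it assembles at boot
    sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf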

 

Staying with the Linux route, I would look at containers / Docker to segregate your services. For example, if you want to run Plex, I'd separate it out into its own container (likewise if you install Plex Home Theater).
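Something along these lines, for instance (the host paths are placeholders, and you should check the current Plex image docs for the exact options):

    # run Plex Media Server in a container (host paths are placeholders)
    docker run -d --name plex \
      -p 32400:32400 \
      -v /srv/plex/config:/config \
      -v /srv/media:/data \
      plexinc/pms-docker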

 

However, because you're new, I'd push you towards something more turn-key like FreeNAS (or even Windows). Just about everything you could want exists as a port, which means you just fire up a jail and install said port. The nice thing is that the initial setup of FreeNAS and basic file sharing is very easy. Then you can grow and learn your way into creating jails and installing ports.
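For example, once you've created a jail, installing something like Plex from a shell inside it is roughly this (package and service names from memory - verify them against the ports tree):

    # from a shell inside the jail
    pkg install plexmediaserver
    sysrc plexmediaserver_enable="YES"
    service plexmediaserver start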

 

What it boils down to is whether you want to learn, what you want to learn, and whether you consider the learning process fun.


I use unRAID, the same thing 7 Gamers 1 CPU is based on. I would check whether the specific program you want to use is available in Docker format - a quick way to check is below. An unRAID server is super easy to set up and has rock-solid data protection. It is still a Linux environment, but packaged up in an easy-to-use form with great support.
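From any machine with Docker installed, for instance (the app name here is just an example):

    # search Docker Hub for an image of the app you want
    docker search plex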


20 hours ago, dalekphalm said:

Err - you're claiming that there's a 70% chance that your RAID array (a 3x 4TB RAID5) will encounter an unrecoverable error during a rebuild? Based on what source?

 

I looked up the drive in question: the Seagate 4TB Desktop drive (ST4000DM000). It has a URE (Unrecoverable Read Error) rate of 1 in 10^14 bytes - that equals one bad byte out of every 100TB of reads. Even if we scale that by the number of drives in the array, that leaves us with roughly 3 UREs per 100TB. So, at best, we're looking at something like a 12% chance.

 

Is that a little high? Yes. I would never take that percentage in a business production environment. But would I take that chance at home? Yes, I probably would.

 

On top of that, since the OP is most likely going to be using Linux mdadm software RAID, the URE risk would be drastically reduced, because software RAID can schedule automatic scrubbing of the array to verify integrity (if a corrupted block is detected for any reason, it can be restored from parity - or if a byte of the parity is corrupted, it can be recalculated from the original data).

 

UREs are a much bigger risk in traditional hardware RAID, because most RAID cards aren't bit-level aware. They don't see the filesystem the way software RAID like mdadm and ZFS does.

So I've got two things to say about that. One is long-winded but specific, and the other is simple and short.

Simple first:

UREs are rated in bits, not bytes. So 10^14 is 100 terabits, which is 12.5(ish) TB. The long-winded thing is a conservative (IMO) example of a typical rebuild from a home user's perspective, presuming he uses his RAID adequately but not excessively. It comes out to 14TB of reads by the time the rebuild is done, which, assuming the URE rate is spot on, means you statistically expect to hit at least one URE.
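The unit conversion, spelled out:

10^{14}\ \text{bits} \div 8 = 1.25 \times 10^{13}\ \text{bytes} = 12.5\ \text{TB}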

If you need a source: http://blog.dshr.org/2015/05/unrecoverable-read-errors.html

Quote

Trevor Pott has a post at The Register entitled Flash banishes the spectre of the unrecoverable data error in which he points out that while disk manufacturers' quoted Bit Error Rates (BER) for hard disks are typically 10^-14 or 10^-15....

BER is the same figure as URE, just written the other way around. Rather than 1 error per 10^14 bits, it's written as a rate of 10^-14 - essentially "how often it happens" per bit read. It's more intuitive to use the URE notation, though.

 

Unless something has changed and URE rates are quoted in bytes now - and everywhere I look, they aren't - RAID 5 in this situation with 4TB drives is a bad idea long term. It's basically pointless. You may as well use each drive independently.

 

Also, you don't scale the rate by the number of drives. It only has to happen to one drive, and assuming they're all the same model, the per-drive rate doesn't change - it only needs to happen once to stop the rebuild. It *could* happen more than once, but that doesn't matter as long as it happens once.

Long-winded:

Spoiler

So, it's more than just the reads from a total rebuild on its own.

What I mean is, you'd have to consider the total lifetime reads of the array up to the point of a rebuild. If a drive dies immediately after building the array, the chances are much lower.

For example, in a 12TB, 3-drive array, you have 8TB of usable space and 4TB of parity space. If you filled it halfway up, you'd have used 4TB of the 8TB, but you'd have actually written at least 6TB due to parity data.

Every time there's a write, there's a read to verify that it was successful. So that's at least 6TB of actual reads. Then consider the non-write-verification reads. That's a lot harder to guess, but let's be conservative and say 2TB of reads on top of the 6TB, leaving us with 8TB of reads over the lifetime of the RAID up to the point a drive dies and a rebuild is necessary. This presumes roughly 3 years before a drive fails, which seems average or reasonable.

A rebuild consists of reading all the parity data on the two living drives and pushing the recovered data to the new drive. Then it requires reading the data on the two living drives and pushing the recalculated parity from it to the new drive.

All the parity data on the two living drives is (2TB/3) × 2, so 4/3 of a TB. All the non-parity data on those two drives is (4TB/3) × 2, so 8/3 of a TB. Add those together and it's 4TB, which checks out: if you had 6TB across 3 drives, 2 drives should hold 4TB.

That's reading the parity data and reading the non-parity data, all to bring the new drive in sync with the RAID. That requires writing all of that data to the new drive, which means it needs to be verified as well.

So:

6TB of reads to verify writes as successful (everything up to the rebuild).
2TB of reads from just reading the data normally, for whatever reason.
4/3 TB of reads to turn parity data into the lost data and push it to the new HDD.
4/3 TB of reads to verify the writes of the recovered data to the new HDD.
8/3 TB of reads of the non-parity data on the two living drives to compute parity for the new drive.
2/3 TB of reads to verify the writes of the parity data to the new drive.

Total of the fractions is 18/3, or 6TB of reads overall for a rebuild of a 6TB (real storage including parity) RAID 5 array.

So total reads end up being 14TB over its lifetime once the rebuild is done. The math seems complex, but it's easy once you see that the reads caused by a rebuild equal the amount of data on the entire RAID, including parity.
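Putting that together with the 12.5TB-per-URE figure from the simple part above (a back-of-the-envelope step):

14\ \text{TB of reads} \div 12.5\ \text{TB per URE} \approx 1.1\ \text{expected UREs}

so on average you expect to hit at least one somewhere along the way.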

 

 


Yeah, I would never even think about RAID until you have a complete backup going. If you only have 2 drives in one machine, run backups from one to the other.

On 12/11/2016 at 6:38 PM, DocSwag said:

For files, do I just plug the server into the modem or router and it'll be available to all our other computers? Or do I need to do a little more?

Install and configure Samba.
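A minimal sketch of that on a Debian-based distro (the share name and path are placeholders):

    # install Samba
    sudo apt install samba
    # add a share to /etc/samba/smb.conf, e.g.:
    #   [storage]
    #   path = /mnt/storage
    #   read only = no
    # give an existing Linux user a Samba password
    sudo smbpasswd -a yourusername
    # apply the config
    sudo systemctl restart smbd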


  • 3 weeks later...

I've had a storage server running at home for a good while now. It runs Gentoo Linux. For the storage drives, I have a RACKABLE SE3016-SAS enclosure hooked up via a SAS cable, running software RAID 5 right now. I need to get a new drive and upgrade it to 6 one day. I like that enclosure because when I get a new drive, I just slap it in and the system sees it. I've never had any issues other than my own mistakes. For the OS, I'd either get a small SSD or use an extra HDD you have lying around. You don't need speed, just something to boot the OS. You'll need to set up Samba in Linux - that's how you share files with Windows. A basic setup isn't too hard.
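If "upgrade it to 6" means converting to RAID 6 with an added disk, the reshape looks roughly like this (purely illustrative device names and drive counts - say a 5-disk RAID 5 going to a 6-disk RAID 6; the reshape takes a long time):

    # add the new disk, then reshape RAID 5 -> RAID 6 (illustrative names/counts)
    mdadm --add /dev/md0 /dev/sdf
    mdadm --grow /dev/md0 --level=6 --raid-devices=6 --backup-file=/root/md0-reshape.bak
    # the reshape runs in the background; watch it with:
    cat /proc/mdstat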

 

I just upgraded my server to a real server board running dual Xeons, and I have ESXi set up on it now. I've got 4 servers running on it doing several things: web/mail/media/streaming.


1 hour ago, FullBoat said:

I've had a storage server running at home for a good while now. It runs Gentoo Linux. For the storage drives, I have a RACKABLE SE3016-SAS enclosure hooked up via a SAS cable, running software RAID 5 right now. I need to get a new drive and upgrade it to 6 one day. I like that enclosure because when I get a new drive, I just slap it in and the system sees it. I've never had any issues other than my own mistakes. For the OS, I'd either get a small SSD or use an extra HDD you have lying around. You don't need speed, just something to boot the OS. You'll need to set up Samba in Linux - that's how you share files with Windows. A basic setup isn't too hard.

 

I just upgraded my server to a real server board running dual Xeons, and I have ESXi set up on it now. I've got 4 servers running on it doing several things: web/mail/media/streaming.

Uh... I already got the server and set up Samba on it...

