Do You really Need ECC RAM for a home NAS???

25 minutes ago, AshleyAshes said:

Again, redundancy. The entire ZFS file system is purpose-built to avoid such concerns. The ZFS file system is built like a brick shit house but you clearly have no idea how it works. Unless the data is in memory and being written to the file system, it will just sit there. Even a corrupted OS doesn't jump up and start eating the file system like some kind of random Skynet creation. Only data in the process of being written, either in memory or needing to be read from a cache drive and copied to the ZFS filesystem, would be at risk. And these are the rare situations where ECC can be a benefit. A random cosmic ray flipping a bit in your memory will not cause you to lose all data. Protection against this is literally why ZFS exists.

 

And it makes sense in an enterprise situation.  There's a LOT of data moving in and out of memory to meet the demands of whatever service and you need to maximize uptime.  Meanwhile most end users who have a NAS are just hoarding files, usually media, files that will sit on the storage and rarely be read and when they are read they will be read for a pretty brief time (Relatively speaking).  Hard drive failures, now those are cause for a doom scenario.  Or a PSU failure that just up and burns out everything attached to it.  Power surge in your electrical grid.  Your idiot roommate flooding the place.  These are the things you should fear.  Insisting on ECC memory for a home user to avoid the 'monster' of 'total data loss', that's just ignorant nerds trying to out fearmonger each other.

 

15 minutes ago, djdwosk97 said:

You can choose to believe what you want, but I'll still be sitting here knowing full well that my data isn't at risk of eating itself alive every time a scrub is done or a file gets written. And there is literally no reason to go with FreeNAS if you're not going to use ECC memory, the reason to use ZFS is to take advantage of its superior data corruption prevention scheme -- which doesn't do jack without ECC memory. 

 

Quote

One major feature that distinguishes ZFS from other file systems is that it is designed with a focus on data integrity by protecting the user's data on disk against silent data corruption caused by data degradation, current spikes, bugs in disk firmware, phantom writes (the previous write did not make it to disk), misdirected reads/writes (the disk accesses the wrong block), DMA parity errors between the array and server memory or from the driver (since the checksum validates data inside the array), driver errors (data winds up in the wrong buffer inside the kernel), accidental overwrites (such as swapping to a live file system), etc.

Recent research shows that none of the currently widespread filesystems—such as UFS, Ext,[8] XFS, JFS, or NTFS—nor hardware RAID (which has some issues with data integrity) provide sufficient protection against such problems.[9][10][11][12] Initial research indicates that ZFS protects data better than earlier efforts.[13][14] It is also faster than UFS[15][16] and can be seen as a replacement

 

Quote

For ZFS, data integrity is achieved by using a Fletcher-based checksum or a SHA-256 hash throughout the file system tree.[17] Each block of data is checksummed and the checksum value is then saved in the pointer to that block—rather than at the actual block itself. Next, the block pointer is checksummed, with the value being saved at its pointer. This checksumming continues all the way up the file system's data hierarchy to the root node, which is also checksummed, thus creating a Merkle tree.[17] In-flight data corruption or phantom reads/writes (the data written/read checksums correctly but is actually wrong) are undetectable by most filesystems as they store the checksum with the data. ZFS stores the checksum of each block in its parent block pointer so the entire pool self-validates.[17]

When a block is accessed, regardless of whether it is data or meta-data, its checksum is calculated and compared with the stored checksum value of what it "should" be. If the checksums match, the data are passed up the programming stack to the process that asked for it; if the values do not match, then ZFS can heal the data if the storage pool provides data redundancy (such as with internal mirroring), assuming that the copy of data is undamaged and with matching checksums.[18] If the storage pool consists of a single disk, it is possible to provide such redundancy by specifying copies=2 (or copies=3), which means that data will be stored twice (or three times) on the disk, effectively halving (or, for copies=3, reducing to one third) the storage capacity of the disk.[19] If redundancy exists, ZFS will fetch a copy of the data (or recreate it via a RAID recovery mechanism), and recalculate the checksum—ideally resulting in the reproduction of the originally expected value. If the data passes this integrity check, the system can then update the faulty copy with known-good data so that redundancy can be restored

Source: https://en.wikipedia.org/wiki/ZFS#Data_integrity
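To make the quoted mechanism a bit more concrete, here is a minimal Python sketch of the idea: the checksum lives with the parent pointer rather than next to the data, and a read that finds a bad copy repairs it from a good one. This is a toy model, not ZFS code. The class name `MirroredBlock`, its fields, and the choice of SHA-256 for the toy are all assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest used as the block checksum in this toy model."""
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """One logical block stored as several copies (hypothetical model).

    The checksum is kept apart from the copies, standing in for the
    parent block pointer described in the quote above.
    """

    def __init__(self, data: bytes, copies: int = 2):
        self.copies = [bytearray(data) for _ in range(copies)]
        self.parent_checksum = checksum(data)

    def read(self) -> bytes:
        # Find the first copy whose checksum matches the stored value.
        good = None
        for copy in self.copies:
            if checksum(bytes(copy)) == self.parent_checksum:
                good = bytes(copy)
                break
        if good is None:
            raise IOError("every copy fails its checksum; block is unrecoverable")
        # Self-heal: overwrite any copy that differs from the verified one.
        for i, copy in enumerate(self.copies):
            if bytes(copy) != good:
                self.copies[i] = bytearray(good)
        return good


if __name__ == "__main__":
    block = MirroredBlock(b"family photos")
    block.copies[0][0] ^= 0x01          # simulate a silent bit flip on one disk
    print(block.read())                 # served from the good copy
    print(bytes(block.copies[0]))       # the damaged copy has been repaired
```

Note what the sketch does not cover: the checksum is computed from whatever is in memory at write time, so data that is already wrong in RAM before that point gets a checksum that matches the wrong data. That gap is the part of the argument where ECC comes in.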

 

ZFS in no way requires ECC for its advanced data integrity features to work. ECC is just one more layer of protection, for the case where the RAM is faulting badly enough to corrupt data without crashing the system and the damage would exceed what ZFS can repair in the files affected by this corruption. The way that ZFS works means corruption happens at the file level and not the volume level.

 

It is extremely rare that an entire pool would be compromised by anything other than more disks failing than the pool is configured to handle.

 

I personally will always use ECC, will recommend it if asked but will not say that it is mandatory for a home NAS.

 

Edit: Fixed up some awful spelling and grammar errors, tired and in the middle of an upgrade :P


36 minutes ago, leadeater said:

 

 

 

Source: https://en.wikipedia.org/wiki/ZFS#Data_integrity

 

ZFS in no way requires ECC for its advanced data integrity features to work. ECC is just one more layer of protection, for the case where the RAM is faulting badly enough to corrupt data without crashing the system and the damage would exceed what ZFS can repair in the files affected by this corruption. The way that ZFS works means corruption happens at the file level and not the volume level.



It is extremely rare that an entire pool would be compromised by anything other than more disks failing than the pool is configured to handle.



I personally will always buy ECC, will recommend it if asked but will not say that it is mandatory for a home NAS.

Checksumming and scrubs together are one of the reasons why ZFS protects your data's integrity as well as it does. If you run a scrub with non-ECC memory, and the memory is functioning perfectly, then you don't have a problem. Try to do the same thing with a bad stick and things are going to get bad, and fast. ECC memory will prevent that level of corruption, as it will either determine that the checksum is invalid or halt the system, depending on the severity of the error at hand.



And even if you're just reading data, ZFS will correct any data it finds to be incorrect (assuming of course you're using some form of parity RAID), so even if you have scrubs disabled you can still end up corrupting your data with bad RAM if it's not ECC.



Perfectly functioning non-ECC is just as good as ECC RAM, but if you care about the integrity of your data then you should be planning for the worst case scenario. That's why you run some form of redundant RAID; that's why you should have some form of a backup; that's why you should have a UPS; that's why you should use server-grade hardware. The idea is to minimize the number of points of failure in order to reduce the likelihood of having a problem.

PSU Tier List | CoC

Gaming Build | FreeNAS Server

Spoiler

i5-4690k || Seidon 240m || GTX780 ACX || MSI Z97s SLI Plus || 8GB 2400mhz || 250GB 840 Evo || 1TB WD Blue || H440 (Black/Blue) || Windows 10 Pro || Dell P2414H & BenQ XL2411Z || Ducky Shine Mini || Logitech G502 Proteus Core

Spoiler

FreeNAS 9.3 - Stable || Xeon E3 1230v2 || Supermicro X9SCM-F || 32GB Crucial ECC DDR3 || 3x4TB WD Red (JBOD) || SYBA SI-PEX40064 sata controller || Corsair CX500m || NZXT Source 210.


18 minutes ago, djdwosk97 said:

Checksumming and scrubs together are one of the reasons why ZFS protects your data's integrity as well as it does. If you run a scrub with non-ECC memory, and the memory is functioning perfectly, then you don't have a problem. Try to do the same thing with a bad stick and things are going to get bad, and fast. ECC memory will prevent that level of corruption, as it will either determine that the checksum is invalid or halt the system, depending on the severity of the error at hand.

 

 

Perfectly functioning non-ECC is just as good as ECC, that's not the case when a stick runs into a problem. 

1) If you try to do anything on any computer with a damaged/incorrectly operating stick of RAM, things are going to 'get bad and fast'. Ya know, like all of those operating system files that are always in memory.

 

2) ECC memory's single-bit error protection in no way offers protection from data corruption that would be caused by damaged/incorrectly operating RAM. ECC memory -only- offers protection from single bit flips typically caused by outside radiation such as cosmic rays.

 

So you apparently also don't know what ECC memory actually does either now, super.  But you are still trying to argue here, it's important to blindly and ignorantly stick to your guns!


6 minutes ago, AshleyAshes said:

1) If you try to do anything on any computer with a damaged/incorrectly operating stick of RAM, things are going to 'get bad and fast'. Ya know, like all of those operating system files that are always in memory.

 

2) ECC memory's single-bit error protection in no way offers protection from data corruption that would be caused by damaged/incorrectly operating RAM. ECC memory -only- offers protection from single bit flips typically caused by outside radiation such as cosmic rays.

 

So you apparently also don't know what ECC memory actually does either now, super.  But you are still trying to argue here, it's important to blindly and ignorantly stick to your guns!

Except ECC memory will halt the system in the event of a serious issue, whereas non-ECC memory won't even know that anything is wrong and will continue to make a mess of things.
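For anyone following along, the "fix a single-bit error, flag or halt on a worse one" behaviour being argued about is usually a SECDED code (single error correction, double error detection). Below is a minimal Python sketch of that idea on 4 data bits, using Hamming(7,4) plus one overall parity bit. It is only an illustration: real ECC DIMMs typically protect 64 data bits with 8 check bits and do this in hardware, and the function names `encode` and `decode` are made up for this example.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits as an 8-bit SECDED codeword.

    Index 0 holds the overall parity bit; indices 1..7 form a Hamming(7,4)
    codeword with parity bits at positions 1, 2 and 4.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code


def decode(code: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, data) where status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    # The syndrome is the XOR of the positions of all set bits; 0 for a valid word.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_broken = sum(code) % 2 == 1
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        # Odd number of flips: assume one, located at `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Parity looks fine but the syndrome is non-zero: two bits flipped.
        return "uncorrectable", None
    data = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, data


if __name__ == "__main__":
    word = encode(0b1011)
    one_flip = list(word)
    one_flip[5] ^= 1
    two_flips = list(one_flip)
    two_flips[2] ^= 1
    print(decode(word))       # ('ok', 11)
    print(decode(one_flip))   # ('corrected', 11)
    print(decode(two_flips))  # ('uncorrectable', None)
```

The "uncorrectable" branch is the point where a real system would raise a machine check or halt rather than hand back data it knows is bad. Three or more flips in one word can fool a code like this, which is outside what the sketch tries to show.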

 

I'd really suggest having a look at the ECC vs. non ECC thread on the FreeNAS forums. It's worth a read, as are the dozens of other articles on the matter. 



2 minutes ago, djdwosk97 said:

Except ECC memory will halt the system in the event of a serious issue, whereas non-ECC memory won't even know that anything is wrong and will continue to make a mess of things.

Do you have any idea how rare a multi-bit error happening to ECC memory on a single end-user computer would be? We're not talking about server farms with 30+ blades in just one rack, lined up rack after rack after rack; we're talking about one end user's NAS box. ...Well, we can tell how rare it is, because people are running desktops and laptops with non-ECC memory and yet their computers are not randomly blowing up in significant numbers, and when they do it's almost always drive or other hardware failure. ECC protects against an uncommon threat, one that large machine fleets or highly sensitive data are worth protecting against, but you are trying to convince people that a simple NAS box without ECC marks some kind of doom scenario, and that's patently false. Now you are trying to grasp at every straw you can just to argue that in some limited way you were 'technically' right.



1) ZFS offers significant storage protection advantages even with non-ECC memory over other storage solutions, especially single-disk solutions, despite your claims.

 

2) The kinds of errors that ECC memory protects against are exceptionally uncommon threats to end user scenarios, despite your fear mongering.


10 hours ago, AshleyAshes said:

Do you have any idea how rare a multi-bit error happening to ECC memory on a single end-user computer would be? We're not talking about server farms with 30+ blades in just one rack, lined up rack after rack after rack; we're talking about one end user's NAS box. ...Well, we can tell how rare it is, because people are running desktops and laptops with non-ECC memory and yet their computers are not randomly blowing up in significant numbers, and when they do it's almost always drive or other hardware failure. ECC protects against an uncommon threat, one that large machine fleets or highly sensitive data are worth protecting against, but you are trying to convince people that a simple NAS box without ECC marks some kind of doom scenario, and that's patently false. Now you are trying to grasp at every straw you can just to argue that in some limited way you were 'technically' right.



1) ZFS offers significant storage protection advantages even with non-ECC memory over other storage solutions, especially single-disk solutions, despite your claims.

 

2) The kinds of errors that ECC memory protects against are exceptionally uncommon threats to end user scenarios, despite your fear mongering.

So memory related issues are rare, therefore you should just pretend like they don't exist? It's not fear mongering in any way. The likelihood of there being some kind of catastrophic damage to your server is also quite low, so then there shouldn't be any advantage to having a backup of your already redundantly-stored data, but that's not the case. The fact of the matter is that if you care about your data then your intention should be to minimize every reasonable source of risk there is. That means using server grade hardware and not getting the cheapest shit around, that means getting a UPS, that means running some form of redundant RAID, and keeping backups of your data. With the exception of a backup (which can become cost prohibitive as it literally doubles the cost of everything) any one failure point will completely negate ANY precautions you've taken. 

 

1) As do plenty of other OSes that are significantly lighter. The big plus of ZFS is all the checksumming and the fact that on any read/scrub the system will try to correct any data that is perceived to be incorrect. If you happen to have bad memory, non-ECC is going to "correct" (read: destroy) most of your data; with ECC memory the system will detect the error and either fix it or halt, depending on the severity of the error.

 

2) Except for the fact that something as simple as scrubbing a drive (in order to prevent corruption) with non-ECC memory with a stuck bit is going to corrupt a lot of your data. That's an extremely common situation, and is by no means fear mongering. 



1 hour ago, djdwosk97 said:

Checksumming and scrubs together are one of the reasons why ZFS protects your data's integrity as well as it does. If you run a scrub with non-ECC memory, and the memory is functioning perfectly, then you don't have a problem. Try to do the same thing with a bad stick and things are going to get bad, and fast. ECC memory will prevent that level of corruption, as it will either determine that the checksum is invalid or halt the system, depending on the severity of the error at hand.

 

 

Perfectly functioning non-ECC is just as good as ECC, that's not the case when a stick runs into a problem. 

True, that's pretty much the scenario in my second sentence, I guess. I'm still firmly of the belief that if you're going to be running a properly built and sized ZFS system, ECC memory does not add enough cost to realistically justify not using it. RAM just isn't that expensive, and if you ignore the significantly incorrect RAM requirements people give for ZFS, it will be only around 10%-20% of the total system cost.



1GB-5GB of RAM per 1TB is only recommended if you are using deduplication and putting more load on the system than a home user would ever get close to.



Our NetApp 8060s at work have 64GB RAM per controller, or 128GB RAM per pair, and each pair is certified for 9600TB of storage using 1200 disks. Your basic home NAS isn't ever going to need more than 16GB of RAM. Also note that the NetApp file system and ZFS are very similar.


5 minutes ago, leadeater said:

Our NetApp 8060s at work have 64GB RAM per controller, or 128GB RAM per pair, and each pair is certified for 9600TB of storage using 1200 disks. Your basic home NAS isn't ever going to need more than 16GB of RAM. Also note that the NetApp file system and ZFS are very similar.

I'd go as far as saying that even 8GB is sufficient for most people. I ran my server off of 8GB and the only issue I had was when I tried running two concurrent 1080p Plex streams I would get the occasional (once every 30 minutes or so) stutter (I would get a similar stutter with three transcoded streams too). Now I still don't know if that stuttering was caused by 8GB of RAM, the CPU in my server, or the fact that 2/2 and 2/3 devices being used to test were on wifi. I had initially assumed it was a memory issue, but even now I'll get an occasional stutter (albeit far less frequently) and I'm currently running a 1230v2 and 32GB of RAM...so I now think it was actually a network issue entirely. 



  • 4 years later...
On 4/6/2016 at 10:01 AM, RedWulf said:

ECC isn't highly required; it's like server-oriented drives or gold-rated PSUs. We recommend it, but odds are you'll be fine without it.

So even if I'm building a FreeNAS and set my drive to ZFS without ECC memory, it's OK? I'm trying to build a NAS based on a retired machine with i5 8500.


I think you'll usually hear two sides of the story.

 

Those who have run a small environment for a little while and haven't run into corruption-type issues.

Those who have been in the game a while and have had issues where features such as ECC would have prevented corruption.

 

I have had some RAM issues in the past, with bit flips on sticks that passed memtest86+: I noticed that some of my files had changed hashes and had weird artifacts in documents. After hours of testing, a specific pattern in memtester on Linux indicated a discrepancy; after replacing the modules, all was fine.
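For anyone who wants to catch this kind of "file changed hash" problem on files that should never change (media, archives), a minimal Python sketch is below. It records a SHA-256 manifest of a directory once and later reports which files no longer match. The function names `file_sha256`, `build_manifest` and `changed_files` are invented for this example, and it is a rough check rather than a replacement for scrubs or backups.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(path.relative_to(root)): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """List files whose current digest differs from the manifest.

    Files that have gone missing are reported too, since their digest
    can no longer be confirmed.
    """
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

A legitimate edit changes the hash as well, so this only tells you something on data you expect to be static. It also cannot say where the damage came from (disk, cable, RAM); it just shows that something changed.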

 

Personally, I find the cost premium is less than the troubleshooting effort and data restoration/integrity costs, so it ends up being less expensive to 'do it properly'.

PC : 3600 · Crosshair VI WiFi · 2x16GB RGB 3200 · 1080Ti SC2 · 1TB WD SN750 · EVGA 1600G2 · Define C 


Well, I look at this two ways:

 

1) If you already have the hardware, such as non-ECC RAM, AND have backups of your files already, or are going to have proper backups... then it doesn't matter so much.

 

2) If you're buying new, then as others have pointed out, you might as well get ECC RAM while you're at it. If cost constraints make it necessary, then maybe get half as much RAM and buy the other half when you can. Although TBH I can't see such a minimal price difference making any noticeable difference. See below.

 

8GB ECC £44  https://uk.pcpartpicker.com/product/gQHRsY/kingston-8-gb-1-x-8-gb-ddr4-2400-memory-ksm24es88me

8GB Non-ECC £28  https://uk.pcpartpicker.com/product/NJzkcf/adata-8-gb-1-x-8-gb-ddr4-2666-memory-ad4u266638g19-s

That's a £16 difference, going by the cheapest modules possible, and a similar amount for a 16GB kit, which works out to roughly £1 per GB more expensive. TBH if you can't afford £1 per GB extra then you might want to rethink computing entirely.
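The arithmetic behind those figures, as a small Python sketch. The 8GB prices are the two listed above. The post gives no 16GB prices, only "a similar amount", so the £16 premium for the 16GB kit is an assumption, and the helper names are made up for the example.

```python
def price_premium(ecc_price: float, non_ecc_price: float) -> float:
    """Extra cost of the ECC module over the non-ECC one."""
    return ecc_price - non_ecc_price


def per_gb(premium: float, capacity_gb: int) -> float:
    """Spread a price premium over the module or kit capacity."""
    return premium / capacity_gb


premium_8gb = price_premium(44, 28)      # prices quoted in the post
print(premium_8gb)                       # 16
print(per_gb(premium_8gb, 8))            # 2.0 (pounds per GB on the 8GB modules)

ASSUMED_16GB_PREMIUM = 16                # assumption: "a similar amount" for a 16GB kit
print(per_gb(ASSUMED_16GB_PREMIUM, 16))  # 1.0 (pounds per GB on the 16GB kit)
```

So the "roughly £1 per GB" figure applies to the 16GB kit; on the 8GB modules listed, the same £16 works out to £2 per GB.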

 

 

 

Please quote my post, or put @paddy-stone if you want me to respond to you.

Spoiler
  • PCs:- 
  • Main PC build  https://uk.pcpartpicker.com/list/2K6Q7X
  • ASUS x53e  - i7 2670QM / Sony BD writer x8 / Win 10, Elemetary OS, Ubuntu/ Samsung 830 SSD
  • Lenovo G50 - 8Gb RAM - Samsung 860 Evo 250GB SSD - DVD writer
  •  
  • Displays:-
  • Philips 55 OLED 754 model
  • Panasonic 55" 4k TV
  • LG 29" Ultrawide
  • Philips 24" 1080p monitor as backup
  •  
  • Storage/NAS/Servers:-
  • ESXI/test build  https://uk.pcpartpicker.com/list/4wyR9G
  • Main Server https://uk.pcpartpicker.com/list/3Qftyk
  • Backup server - HP Proliant Gen 8 4 bay NAS running FreeNAS ZFS striped 3x3TiB WD reds
  • HP ProLiant G6 Server SE316M1 Twin Hex Core Intel Xeon E5645 2.40GHz 48GB RAM
  •  
  • Gaming/Tablets etc:-
  • Xbox One S 500GB + 2TB HDD
  • PS4
  • Nvidia Shield TV
  • Xiaomi/Pocafone F2 pro 8GB/256GB
  • Xiaomi Redmi Note 4

 

  • Unused Hardware currently :-
  • 4670K MSI mobo 16GB ram
  • i7 6700K  b250 mobo
  • Zotac GTX 1060 6GB Amp! edition
  • Zotac GTX 1050 mini

 

 


On 4/24/2020 at 5:15 AM, DDonlien said:

So even if I'm building a FreeNAS and set my drive to ZFS without ECC memory, it's OK? I'm trying to build a NAS based on a retired machine with i5 8500.

Firstly, your CPU does not support ECC memory.

 

Second, I suppose you read this whole thread.

Many people say ECC is almost 'required' if you value your data.

Well, ECC is generally a good idea. Yes, it can help in those odd situations where you might get corruption.

But honestly - how often does this happen?

 

I've worked at a telco company that has about 1000 servers. In my 20 years working there, we had just a few situations where RAM sticks went bad, or actually reported ECC errors.

 

For the vast majority of home users, there is no real need to worry about such an issue happening. It's something like a 1 in 10000 chance that you'll be hit by this situation, and honestly there's a much higher probability you'll lose data to other causes: primarily user error, then some other HW or SW issue.

 

So again, ECC is a good thing. Is it required so that your data doesn't miraculously just disappear in a week? No.

 

