NAS do you need ECC?

This is a PSA, so bear with me:
 

Is ECC memory required for FreeNAS/TrueNAS?

  No, but it is highly recommended.
    https://www.ixsystems.com/blog/hardware-guide/#ECC-Memory
    https://openzfs.github.io/openzfs-docs/Performance and Tuning/Hardware.html#ecc-memory

 

Is ECC memory required for X.Y.Z?

  Same as above. Any filesystem is susceptible to "bit flips" in memory, and ECC memory is a mitigation for this problem. No snapshot will save you if the data was already corrupted when it was written.

Commonly you'll hear people say:
  "I don't use ECC RAM and never had a problem." This is anecdotal evidence (a hasty generalization), not proof that the risk isn't there.
You may also hear:

  "I do snapshots, so it isn't a problem."

  It is a problem, because if the data is corrupted at the buffer level, the filesystem has no way of knowing that it was corrupted to begin with.
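
To make the buffer-level point concrete, here is a minimal Python sketch. It is a toy model and not how any real filesystem is implemented: `flip_bit`, `write_block` and `verify_block` are hypothetical helpers, and CRC32 is only a stand-in for whatever checksum a filesystem might use. The bit flip happens in memory before the write, so the checksum is computed over data that is already wrong.

```python
import zlib
from typing import Tuple


def flip_bit(buf: bytes, bit_index: int) -> bytes:
    """Return a copy of buf with one bit inverted, simulating a memory bit flip."""
    out = bytearray(buf)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def write_block(buf: bytes) -> Tuple[bytes, int]:
    """Toy 'write': checksum whatever is in the buffer and store both."""
    return buf, zlib.crc32(buf)


def verify_block(stored: bytes, checksum: int) -> bool:
    """Toy 'scrub': recompute the checksum and compare it with the stored one."""
    return zlib.crc32(stored) == checksum


original = b"account balance: 1000"
in_ram = flip_bit(original, 3)          # corruption happens before the write
stored, checksum = write_block(in_ram)

print(verify_block(stored, checksum))   # True: the checksum matches the bad data
print(stored == original)               # False: the stored data is silently wrong
```

A snapshot taken after this write would faithfully preserve the corrupted block, which is why snapshots alone don't help here.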


I'd love to hear you go into more detail if you don't mind. I've known the basic function of ECC memory for a long time but I've never really learned how important or practical it is.

 

For example, how does a system with non-ECC memory and an NTFS file system function fine, or mostly fine, for years, while for a server file system like ZFS or Btrfs (I could be wrong about the second one) ECC is highly recommended?

 

How do the two systems differ in such a way where ECC becomes a necessity?

Guides & Tutorials:

Testing for RAM Errors w/ MemTest86

How To: Remotely Access a Computer, Server, or NAS

How To: Access Remote Systems at Home/Work Securely from Anywhere with Pritunl

How to Format Storage Devices in Windows 10

A How-To: Drive Sharing in Windows 10

VFIO GPU Pass-though w/ Looking Glass KVM on Ubuntu 19.04

A How-To Guide: Building a Rudimentary Disk Enclosure

Three Methods to Resetting a Windows Login Password

 

Guide/Tutorial in Progress:

iPXE Network Booting to an iSCSI Target

 

In the Queue:

 

 

Don't see what you need? Check the Full List or *PM me, if I haven't made it I'll add it to the list.

*NOTE: I'll only add it to the list if the request is something I know I can do.

24 minutes ago, Windows7ge said:

I'd love to hear you go into more detail if you don't mind. I've known the basic function of ECC memory for a long time but I've never really learned how important or practical it is.

 

For example, how does a system with non-ECC memory and an NTFS file system function fine, or mostly fine, for years, while for a server file system like ZFS or Btrfs (I could be wrong about the second one) ECC is highly recommended?

 

How do the two systems differ in such a way where ECC becomes a necessity?

The problem can occur on any filesystem. ECC is a necessity whenever you need high reliability and/or high availability in a system.
The two main benefits of ECC RAM in this case are its ability to detect that there's an error (so an operation can fail before the data is written when the error can't be corrected) and its ability to self-correct using extra parity bits stored alongside the data (kind of like a RAID 1, but for bits). There are also more techniques involved in the memory controller that I don't personally know.


Regular desktops don't use ECC because of two main factors: price and speed.

1 hour ago, zhnu said:

The problem can occur on any filesystem. ECC is a necessity whenever you need high reliability and/or high availability in a system.
The two main benefits of ECC RAM in this case are its ability to detect that there's an error (so an operation can fail before the data is written when the error can't be corrected) and its ability to self-correct using extra parity bits stored alongside the data (kind of like a RAID 1, but for bits). There are also more techniques involved in the memory controller that I don't personally know.

As I understand it (there may be more than one implementation), a normal non-ECC module transfers data in 64-bit chunks. An ECC module has an additional memory chip and transfers data in 72-bit chunks, with 8 bits being used for parity (I'd compare it more closely to RAID 5 than RAID 1).
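
As a rough illustration of how 8 extra bits can protect 64 data bits, here is a textbook extended Hamming code (single-error correction, double-error detection) in Python. This is a sketch under assumptions: the bit layout and the `encode`/`decode` names are made up for illustration, and real memory controllers may use different codes and do all of this in hardware.

```python
PARITY_POSITIONS = [1, 2, 4, 8, 16, 32, 64]
# Positions 1..71 that are not powers of two hold the 64 data bits.
DATA_POSITIONS = [p for p in range(1, 72) if p not in PARITY_POSITIONS]


def encode(data: int) -> list:
    """Encode a 64-bit integer as a 72-bit codeword (list of 0/1 by position)."""
    assert 0 <= data < 1 << 64
    word = [0] * 72
    for i, pos in enumerate(DATA_POSITIONS):
        word[pos] = (data >> i) & 1
    for p in PARITY_POSITIONS:
        # Parity bit p covers every position whose index has bit p set.
        word[p] = sum(word[pos] for pos in range(1, 72) if pos & p) % 2
    word[0] = sum(word) % 2  # overall parity, used to detect double errors
    return word


def decode(word: list):
    """Return (data, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 72):
        if word[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    overall = sum(word) % 2
    status = "ok"
    if overall == 1:
        # Odd number of flips: treat as a single error at position `syndrome`
        # (a syndrome of 0 means the overall parity bit itself flipped).
        if syndrome >= 72:
            return None, "uncorrectable"
        word[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even parity but non-zero syndrome: two bits flipped.
        return None, "uncorrectable"
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        data |= word[pos] << i
    return data, status


value = 0xDEADBEEFCAFEF00D
codeword = encode(value)

one_flip = list(codeword)
one_flip[17] ^= 1
print(decode(one_flip) == (value, "corrected"))   # True

two_flips = list(codeword)
two_flips[5] ^= 1
two_flips[40] ^= 1
print(decode(two_flips))                          # (None, 'uncorrectable')
```

The first case corresponds to the "self-correct" behaviour, the second to an error that can only be detected, so the operation can fail instead of passing on bad data.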

 

How often this actually prevents corruption in most systems I can't say. I've never heard of bit flips in RAM being a rampant issue, but given that ECC is a nice extra cushion and isn't particularly more expensive than non-ECC (depending on how dense it is and whether it's UDIMM or RDIMM), it's kind of a "why not?" situation. It costs you little or nothing more.

6 minutes ago, Windows7ge said:

As I understand it (there may be more than one implementation), a normal non-ECC module transfers data in 64-bit chunks. An ECC module has an additional memory chip and transfers data in 72-bit chunks, with 8 bits being used for parity (I'd compare it more closely to RAID 5 than RAID 1).

How often this actually prevents corruption in most systems I can't say. I've never heard of bit flips in RAM being a rampant issue, but given that ECC is a nice extra cushion and isn't particularly more expensive than non-ECC (depending on how dense it is and whether it's UDIMM or RDIMM), it's kind of a "why not?" situation. It costs you little or nothing more.

Well, there's also the added benefit that the applications you run benefit from those features too. It might not be important to the average person, but it is very important for a lot of use cases, for example a financial institution where critical data must flow fast and without issues.

1 minute ago, zhnu said:

Well, there's also the added benefit that the applications you run benefit from those features too. It might not be important to the average person, but it is very important for a lot of use cases, for example a financial institution where critical data must flow fast and without issues.

Agreed. I'd love to hear an in-depth analysis of how/why ZFS benefits from ECC. As I understand it, it already has its own internal checksum features to help make sure the data written to the pool isn't corrupt. Of course, if the data handed to ZFS (from memory) is already corrupt, then ZFS won't know the difference, I suppose.

2 minutes ago, Windows7ge said:

Of course if the data handed to ZFS (from memory) is already corrupt then ZFS won't know the difference I suppose.

Exactly, that's the main problem. The OpenZFS link (OpenZFS is what TrueNAS uses in its most recent versions) offers a good explanation of what can happen.

29 minutes ago, Windows7ge said:

Agreed. I'd love to hear an in-depth analysis of how/why ZFS benefits from ECC. As I understand it, it already has its own internal checksum features to help make sure the data written to the pool isn't corrupt. Of course, if the data handed to ZFS (from memory) is already corrupt, then ZFS won't know the difference, I suppose.

ZFS doesn't benefit from ECC more than any other filesystem. Basically, all filesystems and OSes assume RAM is correct and don't check the data.

The ZFS checksumming is for checking the disks, not RAM.
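
For contrast, here is a small Python sketch of the case that checksumming does cover: data that was correct when written and got damaged later on disk. It is a simplified model and not ZFS code. `checksum` and `read_with_repair` are hypothetical names, SHA-256 is just a stand-in checksum, and a two-element list plays the role of a mirror.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Checksum kept separately from the data it protects."""
    return hashlib.sha256(block).digest()


def read_with_repair(copies: list, expected: bytes) -> bytes:
    """Return a copy that matches the checksum and overwrite bad copies with it."""
    good = None
    for copy in copies:
        if checksum(copy) == expected:
            good = copy
            break
    if good is None:
        raise IOError("every copy fails its checksum")
    for i, copy in enumerate(copies):
        if checksum(copy) != expected:
            copies[i] = good  # "self-heal" the damaged copy
    return good


block = b"family photos"
expected = checksum(block)      # computed while the data in RAM was still good
mirror = [block, block]         # two on-disk copies

damaged = bytearray(mirror[0])
damaged[0] ^= 0x01              # a bit rots on the first disk after the write
mirror[0] = bytes(damaged)

data = read_with_repair(mirror, expected)
print(data == block)            # True: the good copy was returned
print(mirror[0] == block)       # True: the damaged copy was repaired
```

Everything here depends on `expected` having been computed from good data, which is exactly the assumption that a bit flip in RAM breaks.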

 

 

3 minutes ago, Electronics Wizardy said:

ZFS doesn't benefit from ECC more than any other filesystem. Basically, all filesystems and OSes assume RAM is correct and don't check the data.

The ZFS checksumming is for checking the disks, not RAM.

 

 

Should I rephrase this so it's clearer?

Quote

  Same as above. Any filesystem is susceptible to "bit flips" in memory, and ECC memory is a mitigation for this problem. No snapshot will save you if the data was already corrupted when it was written.

 

1 minute ago, Electronics Wizardy said:

The ZFS checksumming is for checking the disks, not RAM.

That's what I figured.

 

Would you say it comes down to use case/application as to why ECC isn't standard in desktops? Is it that desktops can stand to deal with bit flips without noticeable issue or...

3 minutes ago, Windows7ge said:

That's what I figured.

 

Would you say it comes down to use case/application as to why ECC isn't standard in desktops? Is it that desktops can stand to deal with bit flips without noticeable issue or...

The main problems are cost (ECC is more expensive) and speed (ECC doesn't clock as high as desktop memory, and it will always be slower due to the verifications it has to make). Desktops are also susceptible to bit flips.

1 minute ago, Windows7ge said:

That's what I figured.

 

Would you say it comes down to use case/application as to why ECC isn't standard in desktops? Is it that desktops can stand to deal with bit flips without noticeable issue or...

Cost, and no one wants to make it standard, it seems.

For most consumer users, bit flips don't seem to matter.

They do keep adding checksumming, though; DDR4, for example, adds checksumming to data transfers.

 

4 minutes ago, zhnu said:

Should I rephrase this so it's clearer?

 

is what  I said wrong? ECC is always and reduces the chance of memory errors.

 

Just now, zhnu said:

The main problems are cost (ECC is more expensive) and speed (ECC doesn't clock as high as desktop memory, and it will always be slower due to the verifications it has to make)

The big issue is that the systems that support ECC don't normally support memory overclocking, so they don't get the high speeds. But ECC should be able to run at the same speeds as non-ECC DIMMs, at least for UDIMMs, assuming the same quality of chips.

1 minute ago, Electronics Wizardy said:

Cost, and no one wants to make it standard, it seems.

For most consumer users, bit flips don't seem to matter.

They do keep adding checksumming, though; DDR4, for example, adds checksumming to data transfers.

 

is what  I said wrong? ECC is always and reduces the chance of memory errors.

 

I was referring to my top post. I think I will rephrase it to make it clearer.

Just now, zhnu said:

I was referring to my top post. I think I will rephrase it to make it clearer.

Yea it was written fine. Don't worry about your wording here.

1 minute ago, zhnu said:

The main problems are cost (ECC is more expensive) and speed (ECC doesn't clock as high as desktop memory, and it will always be slower due to the verifications it has to make). Desktops are also susceptible to bit flips.

2 minutes ago, Electronics Wizardy said:

Cost, and no one wants to make it standard, it seems.

For most consumer users, bit flips don't seem to matter.

They do keep adding checksumming, though; DDR4, for example, adds checksumming to data transfers.

So it comes down to cost and in practice there isn't a real benefit. Alright, makes sense. Good to know as well.

1 minute ago, Electronics Wizardy said:

The big issue is that the systems that support ECC don't normally support memory overclocking, so they don't get the high speeds. But ECC should be able to run at the same speeds as non-ECC DIMMs, at least for UDIMMs, assuming the same quality of chips.

1. Servers don't usually run overclocked, for reliability and stability reasons.
2. ECC will always need to take extra logical steps to ensure data protection.
3. ECC RAM usually requires more testing and validation than regular RAM, which means it usually takes more time to hit the market.


ZFS by design doesn't actually need ECC to function properly, but some weird things can and do happen that ECC memory can help prevent. A couple of years ago I would've said not to bother with it and get a RAID card for the price difference, but nowadays ECC memory is so affordable it's basically a no-brainer.

For a $20 difference per 16GB stick, skipping that extra 1% of security just isn't worth the risk. You can also find earlier ECC memory SKUs on clearance because DDR4 has been around for a while now, so yeah, why even bother with non-ECC?

11 hours ago, dbx10 said:

ZFS by design doesn't actually need ECC to function properly, but some weird things can and do happen that ECC memory can help prevent. A couple of years ago I would've said not to bother with it and get a RAID card for the price difference, but nowadays ECC memory is so affordable it's basically a no-brainer.

For a $20 difference per 16GB stick, skipping that extra 1% of security just isn't worth the risk. You can also find earlier ECC memory SKUs on clearance because DDR4 has been around for a while now, so yeah, why even bother with non-ECC?

The issue with ECC is that you're not just paying for the sticks; the platform has to support it as well, and most desktops don't have ECC support. Ryzen kind of does, but only with UDIMMs. Most cheap ECC DDR4 is registered, so it won't work on Ryzen/Xeon E/i3.

But ZFS isn't any more affected by a lack of ECC than any other filesystem, so getting a RAID card won't help with the non-ECC issues at all; you will have the exact same data integrity issues.

3 minutes ago, Electronics Wizardy said:

The issue with ECC is that you're not just paying for the sticks; the platform has to support it as well, and most desktops don't have ECC support. Ryzen kind of does, but only with UDIMMs. Most cheap ECC DDR4 is registered, so it won't work on Ryzen/Xeon E/i3.

But ZFS isn't any more affected by a lack of ECC than any other filesystem, so getting a RAID card won't help with the non-ECC issues at all; you will have the exact same data integrity issues.

The platforms are dirt cheap on eBay right now. Also, yes, Ryzen works with ECC UDIMMs. Testing shows that it does work even though it's not reporting error corrections. To be honest, I bought a 7-year-old Xeon server with 16GB of ECC RDIMMs for under 400 CAD with shipping; it's all a matter of where you look.
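
On Linux, one way to check whether error corrections are being reported at all is to read the kernel's EDAC counters from sysfs. The sketch below assumes the EDAC driver for the platform is loaded and exposes `ce_count` (corrected errors) and `ue_count` (uncorrected errors) under `/sys/devices/system/edac/mc`; the function name `read_edac_counts` is made up for illustration. An empty result does not prove ECC is inactive, only that nothing is being reported through this interface.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from an EDAC sysfs tree."""
    base = Path(root)
    if not base.is_dir():
        return {}
    counts = {}
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text())
            uncorrected = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            continue  # skip controllers whose counters can't be read
        counts[mc.name] = (corrected, uncorrected)
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("No EDAC memory controllers reported.")
    for name, (corrected, uncorrected) in found.items():
        print(f"{name}: corrected={corrected} uncorrected={uncorrected}")
```

A corrected count that rises over time would suggest that ECC is both working and reporting; a count stuck at zero is ambiguous on its own.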

2 minutes ago, dbx10 said:

The platforms are dirt cheap on eBay right now. Also, yes, Ryzen works with ECC UDIMMs. Testing shows that it does work even though it's not reporting error corrections. To be honest, I bought a 7-year-old Xeon server with 16GB of ECC RDIMMs for under 400 CAD with shipping; it's all a matter of where you look.

The issue with the older platforms is power consumption. If you want a low power system with ecc, you either have to pay much more for xeon e or hope ryzen is working with ecc correctly.

10 minutes ago, Electronics Wizardy said:

The issue with the older platforms is power consumption. If you want a low power system with ecc, you either have to pay much more for xeon e or hope ryzen is working with ecc correctly.

What do you mean by hope? It's well established that udimms work as intended on supported motherboards.

Just now, dbx10 said:

What do you mean by hope? It's well established that udimms work as intended on supported motherboards.

The issue is that it's not fully supported by the manufacturer. A lot of people and companies don't want to use a feature that's not officially supported, as it might be taken out or there might be issues. That's part of the reason why Ryzen isn't used in small servers from companies like Dell.

11 minutes ago, Electronics Wizardy said:

The issue is that it's not fully supported by the manufacturer. A lot of people and companies don't want to use a feature that's not officially supported, as it might be taken out or there might be issues. That's part of the reason why Ryzen isn't used in small servers from companies like Dell.

ASRock Rack has server boards specifically for Ryzen, and ECC support is listed on many manufacturer websites now. Gigabyte AORUS models come to mind.

The reason companies like Dell don't make small Ryzen servers may be validation and certification, but it's more likely about profit.
