
Is ECC necessary? 

 

I will be using my home server for running game servers and basic storage.

 

Also what parts would you recommend with a budget of $1000-$1500?

 

Processor: I was thinking an Intel i5-14600 or a Ryzen 7 7700X

 

RAM: I will likely want 16-32GB of DDR5

 

Power supply must be fully modular

 

Storage: I'm thinking three 8TB Seagate IronWolf NAS drives in a RAID 5 configuration

 

I'm open to all advice as this is my first home server, thanks in advance 

https://linustechtips.com/topic/1608143-home-server-questions-ecc-or-no-ecc/

27 minutes ago, Tallin21 said:

Is ECC necessary? 

 

I will be using my home server for running game servers and basic storage.

Do you earn money with the server and absolutely cannot afford system crashes due to memory errors caused by cosmic rays?

 

Otherwise, no, it is not necessary, but it is certainly one additional factor in keeping the system stable. You will also need to buy a board that supports ECC.

Remember to either quote or @mention others, so they are notified of your reply


51 minutes ago, Tallin21 said:

Is ECC necessary? 

 

I will be using my home server for running game servers and basic storage.

A game server and basic storage doesn't sound like you need ECC to me, unless it is SUPER important that you're able to detect corruption.

 

Do you care about anything stored on that server, and if you do, do you hold copies of it elsewhere (and/or can you easily download it again)?

 

Also, will you actually be monitoring the server, or just leaving it running?


To me, the answer is simple… personally yes, I run ECC.

 

Why spend thousands of dollars on my home server, which I trust to store my entire digital life, but introduce the potential risk of bit flips in RAM that can corrupt data?
 

Is it needed? No… it’s needed as much as a home server itself is needed. But to me, since I want every reasonable possibility of my data being intact, it’s worth it. 

Rig: i7 13700k +Contact Frame - - Asus Z790-P Wifi - - RTX 4080 - - 4x16GB 6000MHz - - Samsung 990 Pro 2TB NVMe Boot + Main Programs - - Crucial P3 2TB NVMe for photo work - - Corsair RM850x - - Sound BlasterX EA-5 - - Corsair XC8 JTC Edition - - Corsair GPU Full Cover GPU Block - - PTM 7950 - - XT45 X-Flow 420 + UT60 280 rads externally mounted - - EK XRES RGB PWM - - Fractal Define S2 - - DellAlienware AW3423DWF 34" -- Logitech Pro X Superlight - - Logitech G710+ - - LTT Northern Lights Deskpad

 

Headphones/amp/dac: Schiit Bifrost Multibit - -  Schiit Lyr 3 - - Fostex TR-X00 - - Sennheiser HD 6xx

 

Homelab/Media Server: Proxmox VE host - - 512 NVMe Samsung 980 RAID Z1 for VM's/Proxmox boot - - Xeon e5 2660 V4- - Supermicro X10SRF-i - - 128 GB ECC 2133 - - 10x8TB WD Red RAID Z2 - - 2x 800 GB SAS SSD’s (1 SLOG, 1 L2Arc) - - 45 HomeLab HL15 15 Drive 4U - - Corsair RM650i - - LSI 9305-16i HBA - - TreuNAS + many other VM’s

 

Unifi UDM Pro in front of full unifi network infrastructure

 

iPhone 17 Pro - - MacBook Air M3


I'm also doing a simple home build, and I am probably going ECC, though only because I found a good deal on it that makes the price difference very small. If the price difference were more than 20€, I'd go for non-ECC. Other than that, whatever you pick, be sure it can use ECC. I had a rough time figuring out whether the platform I picked supports ECC, and ended up going for Ryzen Pro since it's better suited for these kinds of builds.


I've had plenty of crashes and corruption on non-ECC servers, even with settings and hardware that are normally stable for basic usage. Definitely go for it

Ryzen 7 5700X3D (CO -30) - AX370-Gaming 5 - 2x16GB @3600C18 - EVGA RTX 3070 8G XC3

[PBO2] CO -25/-25/-30/-30/-30/-30/-30/-30

[BIOS] Vsoc 1.1 / DRAM XMP

 

i5-6400 4.38GHz @1.36v (162.2 BCLK) - Z170M-Plus - 2x8GB @3244C16- Biostar RX 570 8G w/ MSI Armor cooler

[BIOS] BCLK: 162.2 (x27) / Vcore 1.35 / DRAM 3244 (XMP timings) / FCLK 1GHz (1622) / RebarUEFI patched

 

ROG G531GT : i7-9750H (uv) - GTX 1650 +700mem - 16+8GB @2666 - 1920x1080@145Hz (up to 172Hz) IPS panel

[Throttlestop] FIVR - Vcore -160 / Vcache -105 / iGPU+unslice -125 (IccMax 255)

 

i5-4690K + Z97-AR + Panram Blue DDR3 2800 2x4GB Lightsaber Blue

iMac 21.5" (late 2011) : i5-2400S - Samsung 4x4GB PC3-1333 - HD 6750M 512MB - cheap Winten SSD (MacOS High Sierra) - 1920x1080@60 LCD

Acer Z5610 "Theatre" Core 2 Quad Q9550 - 2x2GB PC3-1333 (Samsung) - 1920x1080@60Hz Touch LCD - great internal speakers


4 hours ago, Tetras said:

A game server and basic storage doesn't sound like you need ECC to me, unless it is SUPER important that you're able to detect corruption.

 

Do you care about anything stored on that server, and if you do, do you hold copies of it elsewhere (and/or can you easily download it again)?

 

Also, will you actually be monitoring the server, or just leaving it running?

I'm not sure if I will be monitoring it as I don't know a ton about servers at the moment


If you can get ECC RAM for the same price as non-ECC RAM, yes, why not? You will not see any difference in system workload.

Dell Precision | CPU: Intel Xeon | RAM: 64Gb | NVme: 2Tb / Type: Raid 0 | Windows 11 Pro For Workstations

Surface Pro X | CPU: SQ2 | RAM: 16Gb | NVMe: 512Gb | Windows 11 Pro

MacBook Pro M1 | RAM: 16Gb | NVMe: 1Tb | MacOS Sonoma

Samsung Galaxy S24 Ultra Titanium Grey

iPhone 16 Pro Max 1Tb Natural Titanium


1 hour ago, UNIXNETWORK said:

If you can get ECC RAM for the same price as non-ECC RAM, yes, why not? You will not see any difference in system workload.

You'd need an ECC motherboard, though. That, and the ECC RAM I've seen costs almost twice as much.


I know: not only the motherboard but also the CPU needs to support ECC RAM. I bought 64GB of ECC RAM for GBP120, which isn't much more expensive than non-ECC at GBP96.


23 minutes ago, UNIXNETWORK said:

I know: not only the motherboard but also the CPU needs to support ECC RAM. I bought 64GB of ECC RAM for GBP120, which isn't much more expensive than non-ECC at GBP96.

Where can I find it for a similar price? I'm in the US


If you have to ask, the answer is always no. Honestly, the best analogy I use is that you’re far more likely to need redundant power supplies and a UPS before ECC will ever help. A single or even several dozen flipped bits will not destroy your data, and most modern systems are capable of correcting or at least recovering from this. In fact, ever getting a flipped bit from a cosmic ray is highly improbable. There’s lots of research on this topic widely available and it’s extremely interesting… well, to some folks I guess, haha!

 

That all having been said, if you don’t care and the cost doesn’t matter, go for it, it can’t hurt anything other than your wallet.

 

Source: I work with extremely high volume cryptographic data processing where it really matters.


2 minutes ago, Echothedolpin said:

If you have to ask, the answer is always no. Honestly, the best analogy I use is that you’re far more likely to need redundant power supplies and a UPS before ECC will ever help. A single or even several dozen flipped bits will not destroy your data, and most modern systems are capable of correcting or at least recovering from this. In fact, ever getting a flipped bit from a cosmic ray is highly improbable. There’s lots of research on this topic widely available and it’s extremely interesting… well, to some folks I guess, haha!

 

That all having been said, if you don’t care and the cost doesn’t matter, go for it, it can’t hurt anything other than your wallet.

 

Source: I work with extremely high volume cryptographic data processing where it really matters.

Awesome, thanks for the info, sounds like an awesome job. It's above my head though, so I'll have to look more into what a cryptographic data processor does.


41 minutes ago, Tallin21 said:

Awesome, thanks for the info, sounds like an awesome job. It's above my head though, so I'll have to look more into what a cryptographic data processor does.

Glad to help! I want to emphasize I have expertise in my own field and it deals specifically with cybersecurity, signaling analysis, radio frequency spectrum usage and some other things. There are other realms of expertise that deal with large data sets in different ways and have very different methods of handling the data in transit. So I can only speak from my experience of course!


5 hours ago, Tallin21 said:

I'm not sure if I will be monitoring it as I don't know a ton about servers at the moment

Arguably the biggest benefit of ECC is not the correction, but that you don't get hidden data corruption, but that only works if either: the PC is set to stop/halt on something it can't fix, or you regularly monitor the error reports.

 

If the PC doesn't halt and you don't check the error reports, you're left with it silently correcting what it can fix and ignoring what it can't fix. In that situation, the benefit isn't that significant, since (at least from what I've seen, which is admittedly not a big sample) known good (i.e. not faulty) memory rarely gets errors in regular usage. It can change though, with e.g. the degree of load/utilisation, capacity, number of sticks, heat.

 

If you don't care about data corruption, because nothing on the PC is important or it is easily recoverable, I don't think it is worth it.
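
For anyone wondering what "checking the error reports" could look like in practice, here is a minimal sketch. It assumes a Linux box where the kernel's EDAC driver is loaded and exposes per-memory-controller counters as `ce_count` (corrected errors) and `ue_count` (uncorrected errors) under `/sys/devices/system/edac/mc/`. The function name and the status messages are my own, and whether your board actually reports anything there depends on the platform.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller_name: (corrected, uncorrected)} from EDAC sysfs counters.

    Returns an empty dict when the directory is missing, which usually means
    the EDAC driver is not loaded or ECC reporting is not active.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC counters found; ECC reporting may not be active.")
    for name, (ce, ue) in counts.items():
        if ue:
            status = "uncorrected errors, investigate now"
        elif ce:
            status = "corrected errors, keep an eye on this memory"
        else:
            status = "clean"
        print(f"{name}: corrected={ce} uncorrected={ue} ({status})")
```

Run from cron or a systemd timer and mailed to yourself, something like this covers the "regularly monitor" half without you having to remember to log in.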

 

11 hours ago, Tallin21 said:

Processor: I was thinking an Intel i5-14600 or a Ryzen 7 7700X

FYI: even though Intel Ark lists ECC as supported on the Intel CPU, you need a different chipset for it, whereas the 7700X can support it on regular boards.


2 hours ago, Tetras said:

Arguably the biggest benefit of ECC is not the correction, but that you don't get hidden data corruption, but that only works if either: the PC is set to stop/halt on something it can't fix, or you regularly monitor the error reports.

 

If the PC doesn't halt and you don't check the error reports, you're left with it silently correcting what it can fix and ignoring what it can't fix. In that situation, the benefit isn't that significant, since (at least from what I've seen, which is admittedly not a big sample) known good (i.e. not faulty) memory rarely gets errors in regular usage. It can change though, with e.g. the degree of load/utilisation, capacity, number of sticks, heat.

 

If you don't care about data corruption, because nothing on the PC is important or it is easily recoverable, I don't think it is worth it.

 

FYI: even though Intel Ark lists ECC as supported on the Intel CPU, you need a different chipset for it, whereas the 7700X can support it on regular boards.

So if you have a 7700X and a non-ECC motherboard, will it still work if you use ECC RAM? Or do you mean that if I had an ECC motherboard and an Intel CPU, it wouldn't work with ECC?


10 hours ago, Tallin21 said:

So if you have a 7700X and a non-ECC motherboard, will it still work if you use ECC RAM? Or do you mean that if I had an ECC motherboard and an Intel CPU, it wouldn't work with ECC?

A 7700X in a non-ECC motherboard won't work. A 7700X in an ECC supporting motherboard should have the ECC function, but the halt status / reporting, I'm not sure.

 

This is an old article on the topic:

https://hardwarecanucks.com/cpu-motherboard/ecc-memory-amds-ryzen-deep-dive/

 

An Intel CPU in an ECC-supporting motherboard should have full support, depending on how you configure it.

 

What I was trying to say though, is that (in my opinion) you really need to decide: do I care about the data? And, if yes: will I configure/monitor the server in a way that makes best use of the ECC?

 

If you don't care about the data and you won't be monitoring it, I don't consider that there's much point.


1 hour ago, Tetras said:

If you don't care about the data and you won't be monitoring it, I don't consider that there's much point.

What do you mean by monitoring the data? I certainly don’t “monitor” my ECC logs, but that doesn’t mean it’s not there, silently protecting my data from potential corruption. 
 

Bit flips in RAM are certainly possible, even though they are rare. And as I said earlier, the point of a server is to be always on, so the chance obviously goes up with longer uptime.
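
To put a rough shape on "the chance goes up with uptime": if you treat bit flips as independent rare events arriving at a constant rate (a simplifying assumption), the chance of seeing at least one in a given number of hours is 1 - exp(-rate * hours). The rate below is a made-up placeholder purely to show the curve, not a measured figure for any real hardware.

```python
import math


def p_at_least_one_flip(rate_per_hour, hours):
    """Probability of one or more events in `hours`, assuming a constant
    average rate and independent events (a Poisson model)."""
    return 1.0 - math.exp(-rate_per_hour * hours)


# Hypothetical placeholder rate, NOT a measurement of any real system.
HYPOTHETICAL_RATE_PER_HOUR = 1e-5

for label, hours in [("1 day", 24), ("1 year", 24 * 365), ("10 years", 24 * 365 * 10)]:
    p = p_at_least_one_flip(HYPOTHETICAL_RATE_PER_HOUR, hours)
    print(f"{label:>8}: {p:.4f}")
```

Whatever the real rate is for your machine, the point is only that the probability climbs toward 1 the longer the box stays on.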
 

And buying used server gear makes a lot of the cost a non-factor. I have 128GB of ECC DDR4, either 2133 or 2400 (can’t remember what speed it is offhand), and it only cost me about 150 bucks when I built it 2 or so years ago. This was well worth it to me, seeing as this machine will be on 24/7 for a decade until it gets replaced and upgraded.


18 minutes ago, LIGISTX said:

What do you mean by monitoring the data? I certainly don’t “monitor” my ECC logs, but that doesn’t mean it’s not there, silently protecting my data from potential corruption. 

Not the data, the error reports.

 

As I said earlier, arguably the biggest benefit is that you don't get hidden corruption, where say: you have a faulty stick, but you don't know about it until e.g. you try to unzip an archive and the whole archive is corrupted.

 

Depending on how the system and the OS are configured, it may not respond to uncorrectable errors, and in that case, if you're not checking the logs, you can still experience data corruption.

 

From the article I linked:

Quote

HOWEVER, things are not quite perfect. On that last line you will notice “1 UE”. That is an uncorrected error (UE), otherwise known as a two-bit error or a hard error. Two-bit errors cannot be corrected by ECC memory. What is supposed to happen when they occur is that they should be detected, logged and ideally the system should be immediately halted. These are considered fatal errors and they can easily cause data corruption if the system is not quickly halted and/or rebooted. Regrettably, only 2 of the 3 steps happened. The hard error was detected and it was logged, but the system kept running. The only reason that it’s the last line on that image is because we immediately took a screenshot just in case the system would halt, but that never happened.
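
As a toy illustration of why a one-bit error can be corrected while a two-bit error can only be detected, here is a sketch of a single-error-correct, double-error-detect (SECDED) code: an extended Hamming code over just 4 data bits. Real ECC memory applies the same idea to much wider words in hardware; the function names and the 4-bit size are my own choices for the example.

```python
DATA_POSITIONS = (3, 5, 6, 7)  # positions 1, 2, 4 hold parity; 0 holds overall parity


def _syndrome(code):
    """XOR of the positions (1..7) of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if code[pos]:
            s ^= pos
    return s


def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (list of 0/1)."""
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    s = _syndrome(code)
    for p in (1, 2, 4):          # set parity bits so the syndrome becomes 0
        if s & p:
            code[p] = 1
    code[0] = sum(code[1:]) % 2  # overall parity makes the total even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    s = _syndrome(code)
    odd_parity = sum(code) % 2
    if s == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        code[s] ^= 1             # single flip: the syndrome is its position
        status = "corrected"
    else:
        return "uncorrectable", None  # even parity but bad syndrome: two flips
    nibble = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, nibble


word = encode(0b1011)
one_flip = word.copy()
one_flip[5] ^= 1
two_flips = word.copy()
two_flips[5] ^= 1
two_flips[6] ^= 1

print(decode(word))       # ('ok', 11)
print(decode(one_flip))   # ('corrected', 11)
print(decode(two_flips))  # ('uncorrectable', None)
```

The decoder knows the two-flip word is bad but has no way to tell which bits flipped, which is why the only safe reactions are to log it and halt. Three or more flips can fool a code like this entirely.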

 

There's also the point that, from what I'm aware of (in some of the studies), a stick that produces an error is far more likely to produce another one. While that may correspond to other factors, like the degree of utilisation and environmental conditions, one of the most likely conclusions is that the stick is faulty.

 

So, if you're getting a lot of errors in the log, I'd be checking if the stick needs replacing before it can do any more damage.

 


Memory errors can be harmless.  Or not.

 

For example, if the error is in an executable and effectively changes a "jump if odd" opcode to a "jump unconditionally", this will be inconsequential if the data being tested is odd.

 

If the error is in data and a 4 (100) is misread as a 5 (101) or 6 (110), but that value is then operated on by an integer divide-by-four, the result is unchanged: 1.
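
A quick sketch of both cases, with helper names made up for the illustration. It flips each of the low three bits of the value 4 in turn and checks whether an integer divide by four hides the damage, then compares a "jump if odd" test with an unconditional jump.

```python
def flip_bit(value, bit):
    """Return value with the given bit position inverted."""
    return value ^ (1 << bit)


def masked_by_divide_by_four(original, corrupted):
    """True if integer division by four gives the same result for both values."""
    return original // 4 == corrupted // 4


original = 4  # binary 100
for bit in range(3):
    corrupted = flip_bit(original, bit)
    verdict = "harmless" if masked_by_divide_by_four(original, corrupted) else "visible"
    print(f"bit {bit}: {original} -> {corrupted}, {corrupted} // 4 = {corrupted // 4} ({verdict})")
# bit 0: 4 -> 5, 5 // 4 = 1 (harmless)
# bit 1: 4 -> 6, 6 // 4 = 1 (harmless)
# bit 2: 4 -> 0, 0 // 4 = 0 (visible)


def jump_if_odd(x):
    return x % 2 == 1


def jump_unconditionally(x):
    return True


for x in (3, 4):
    same = jump_if_odd(x) == jump_unconditionally(x)
    print(f"input {x}: corrupted opcode behaves {'the same' if same else 'differently'}")
# input 3: corrupted opcode behaves the same
# input 4: corrupted opcode behaves differently
```

So the same kind of flip is invisible in one case and changes the result in another, and without error reporting you can't tell which one you got.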

 

The first step, of course, is to KNOW if you have memory errors.  Without ECC memory and EDAC, you're operating blind.

 

There have been several large-scale studies using real machines in production environments that all, basically, say "large memory complements REQUIRE EDAC" -- assuming, of course, you are concerned with reliable operation.   (One study claimed most "software bugs" are, in fact, hardware failures traced to unreliable memory).

 

In the products that I've been designing, the system software actively checks memory for errors and retires pages that are exhibiting them.  When "too many" pages are retired, the hardware is deemed unreliable and must be replaced/repaired.
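
A rough sketch of that retire-and-escalate idea, to make it concrete. This is not the actual product code: the class name and both thresholds are hypothetical, and a real system would do this in the kernel or firmware rather than in Python.

```python
class PageRetirementTracker:
    """Counts detected errors per memory page, retires pages that keep
    failing, and flags the hardware once too many pages are retired."""

    def __init__(self, errors_before_retire=2, max_retired_pages=3):
        # Both thresholds are hypothetical values chosen for the example.
        self.errors_before_retire = errors_before_retire
        self.max_retired_pages = max_retired_pages
        self.error_counts = {}
        self.retired = set()

    def report_error(self, page):
        """Record one detected error; return True if this call retired the page."""
        if page in self.retired:
            return False
        self.error_counts[page] = self.error_counts.get(page, 0) + 1
        if self.error_counts[page] >= self.errors_before_retire:
            self.retired.add(page)
            return True
        return False

    def hardware_unreliable(self):
        """True once more pages have been retired than the allowed maximum."""
        return len(self.retired) > self.max_retired_pages


tracker = PageRetirementTracker()
for page in [7, 7, 12, 12, 30, 30, 41, 41]:
    tracker.report_error(page)

print(sorted(tracker.retired))        # [7, 12, 30, 41]
print(tracker.hardware_unreliable())  # True
```

The useful part of the design is the second threshold: a page or two going bad is survivable, but a growing retired list is the signal to replace the hardware.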

 
