
Is ECC memory still needed?

Hi everyone, 

 

Recently an interesting thought came to mind after watching one of the RTX 3080 overclocking videos by JayzTwoCents. In it he mentioned that modern memory often has auto-correction features, so we no longer see artifacts on screen as often as we used to. Instead, raising the frequency increases the number of corrections needed, which lowers the overall fps and performance.

 

I know the RTX 3080 uses somewhat different memory than DDR4, but as far as I understand, some basic auto-correction is already in place in DDR4 to reach higher frequencies.

 

So I was wondering whether DDR4, or even the upcoming DDR5, already has some level of auto-correction built in to reach higher frequencies. If so, doesn't that make special ECC memory for servers practically obsolete and not worth the extra cost? ECC memory usually runs at lower frequencies than non-ECC memory, but if non-ECC memory already has some error correction, why have ECC modules at all? I understand that it was helpful in the past, when these corrections were necessary, but what about today, with modern chips whose memory must have it to reach high frequencies anyway?

 

Has anyone here actually needed DDR4 ECC in servers? Or is it used nowadays just out of habit, because we needed it with older memory standards?


Some error correcting is not full error correcting. When dealing with databases and business-critical applications, ECC is still very much "needed". Even if the workload can run on standard UDIMMs, there's no reason not to use ECC in server systems for the most part.

 

The lower frequencies are an acceptable trade-off for the stability and data integrity that ECC provides. For businesses, ECC is still seen as a need. For normal systems, it isn't and never really has been.

 

I personally switched to a new platform in my own server that utilises 128GB of ECC DDR4.


ECC in high-end professional workstations and server farms will most likely be a thing for years to come. Non-ECC RAM with a parity bit can detect errors but not correct them, IIRC.


7 minutes ago, Bad5ector said:

ECC in high-end professional workstations and server farms will most likely be a thing for years to come. Non-ECC RAM with a parity bit can detect errors but not correct them, IIRC.

I see, so basically in the case of non-ECC RAM it's not error correction that's happening, but more like error detection. However, couldn't the operation just be re-run on the original data when an error is detected, similar to what happens with graphics cards dropping frames? Couldn't the higher bandwidth make up for the re-runs, if you can almost double the bandwidth without ECC?


4 minutes ago, Bad5ector said:

I don't think it works that way my friend. But hey, I'm not a doctor.

However, when I think about it more, if some corrections are needed to reach higher frequencies, then it cannot be simple detection but must also be correction. Otherwise we would see no difference in performance between higher and lower frequencies, and we would see tons of crashes without corrections. But surely systems do crash if you push the OC high enough. I am no doctor either, just wondering how it works.

 


DDR5 will have chip-level error detection/correction (on-die ECC). This is not the same as module-level ECC and does not replace it, but it does help increase yield and reduce costs, as it allows less-than-perfect chips to be used.

 

This is a problem that is going to grow as we get more and more RAM. A server system might have 1TB of RAM. The chance of an error occurring somewhere in that much memory is much higher than in a typical home system with, for example, 16GB. While there are software techniques you can use to try to detect and correct data errors, they cost performance. And an error in code is just a very bad thing. At best you get a crash, so no more damage can happen. At worst, you're moving bad data around and causing more problems. Doing ECC in hardware presumably works out as the better choice. This is why I consider overclocking a CPU far less risky to data integrity than overclocking RAM. Silent data corruption is really scary to me.
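
To put a rough shape on the capacity argument, here is a small Python sketch. It assumes every GB has the same, independent chance of an error over some fixed period. The value of `P_PER_GB` is purely hypothetical and is not a measured error rate; only the comparison between the two systems is the point.

```python
def p_at_least_one_error(capacity_gb, p_per_gb):
    """Chance of one or more errors, assuming each GB fails independently."""
    return 1 - (1 - p_per_gb) ** capacity_gb

# HYPOTHETICAL per-GB error probability, for illustration only (not a measurement).
P_PER_GB = 1e-4

home = p_at_least_one_error(16, P_PER_GB)      # typical home system
server = p_at_least_one_error(1024, P_PER_GB)  # 1TB server

print(f"16GB home system: {home:.4%}")
print(f"1TB server:       {server:.4%}")
print(f"ratio:            {server / home:.1f}x")
```

Under this simple model, and while the per-GB chance is small, the risk grows roughly in proportion to capacity, which is why the same memory technology is a bigger worry in a 1TB server than in a 16GB desktop.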

 

Yes, there is a difference between error detection and error correction. Detection is simpler, and correction needs more information. A basic technique is called parity. You look at a binary sequence of bits and count the number of 1s (or 0s). You then add a bit on the end to make the total number odd or even. When you later check the sequence, you verify that the condition still holds. If it does not, you know there must have been an error, but you don't know where. Note also that if you have two errors, they can cancel out and you might still think the data is correct. If you add more bits for error correction/detection, it gets more complicated to generate the additional values, but you can do more: the code gets better at detecting errors and starts to be able to locate where the error is. It is quite a big topic to read up on if you are interested.
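
The parity idea above can be sketched in a few lines of Python. The second half uses a Hamming(7,4) code, which is just one textbook example of "adding more bits" so that an error can be located. It is an illustration chosen here, not a description of what any particular memory module implements, and all function names are made up for this sketch.

```python
def parity_bit(bits):
    """Even parity: the extra bit that makes the total number of 1s even."""
    return sum(bits) % 2

def check_parity(word):
    """True if the word (data plus parity bit) still has an even number of 1s."""
    return sum(word) % 2 == 0

def flip(word, i):
    """Return a copy of word with bit i inverted (simulates a memory error)."""
    out = list(word)
    out[i] ^= 1
    return out

def hamming74_encode(d):
    """Encode 4 data bits into 7 bits laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(c):
    """Return (corrected word, 1-based error position); position 0 means no error seen."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4
    fixed = list(c)
    if pos:
        fixed[pos - 1] ^= 1  # the three checks together point at the flipped bit
    return fixed, pos

# Single parity bit: detects one flipped bit, misses two.
data = [1, 0, 1, 1, 0, 0, 1, 0]
word = data + [parity_bit(data)]
one_error = flip(word, 3)
two_errors = flip(one_error, 6)
print(check_parity(word))        # True: stored word is consistent
print(check_parity(one_error))   # False: the single flip is detected
print(check_parity(two_errors))  # True: two flips cancel out and go unnoticed

# Three check bits over four data bits: a single flip can be located and undone.
code = hamming74_encode([1, 0, 1, 1])
damaged = flip(code, 4)
repaired, position = hamming74_correct(damaged)
print(position)          # 5, the 1-based position of the flipped bit
print(repaired == code)  # True
```

With one parity bit you only learn that something is wrong. With three check bits covering overlapping groups, the pattern of failed checks identifies which bit flipped, so it can be flipped back. ECC modules apply the same principle to wider words using extra check bits stored alongside the data.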

