Adrien Angeldust

Is ECC memory still needed?


Posted · Original Poster

Hi everyone, 

 

Recently an interesting thought came to mind after watching one of the RTX 3080 overclocking videos by JayzTwoCents. In it he mentioned that modern memory often has auto-correction features, so we no longer see artifacts on the screen as often as we used to. Instead, increasing the frequency causes more auto-corrections, which lowers the fps and overall performance.

 

I know the RTX 3080 uses somewhat different memory than DDR4, but as I understand it, some basic auto-correction is already in place in DDR4 to reach its higher frequencies.

 

As such, I was wondering whether DDR4, or even the upcoming DDR5, already has some level of auto-correction in order to reach higher frequencies. If so, doesn't that make special ECC memory for servers kind of obsolete and pointless to pay extra for? ECC memory usually runs at lower frequencies than non-ECC memory, but if there is already some error correction in non-ECC modules, why have ECC ones at all? I understand that in the past it was helpful because these corrections were necessary, but what about today, when modern chips must have them to reach high frequencies anyway?

 

Has anyone actually needed DDR4 ECC in servers? Or is it nowadays kept just out of habit, because we needed it with older memory standards?


Some error correction is not full error correction. For databases and business-critical applications, ECC is still very much "needed". Even if the workload can run on standard UDIMMs, for the most part there's no reason not to use ECC in server systems.

 

The lower frequencies are an acceptable trade-off for the stability and data integrity that ECC provides. For businesses, ECC is still seen as a need. For normal systems, it isn't and never really has been.

 

I personally switched to a new platform in my own server that utilises 128GB of ECC DDR4.


ECC in high-end professional workstations and server farms will most likely be a thing for years to come. Non-ECC RAM with a parity bit can detect errors but not correct them, IIRC.

Posted · Original Poster
7 minutes ago, Bad5ector said:

ECC in high-end professional workstations and server farms will most likely be a thing for years to come. Non-ECC RAM with a parity bit can detect errors but not correct them, IIRC.

I see, so basically in the case of non-ECC RAM it's not error correction that's happening, but more like error detection. However, can't the operation just be re-run on the original data if an error is detected, similar to what happens with graphics cards dropping frames? Can't the higher bandwidth make up for the re-runs if you can effectively almost double the bandwidth without ECC?

Posted · Original Poster
4 minutes ago, Bad5ector said:

I don't think it works that way my friend. But hey, I'm not a doctor.

However, when I think about it more, if some correction is required to reach higher frequencies, then it cannot be simple detection; it must also be correction. Otherwise we would see no difference in performance between higher and lower frequencies, and without correction we would observe tons of crashes. But surely systems do crash if you push the OC high enough. I am no doctor either, just wondering how it works.

 


DDR5 will have chip-level error detection/correction. This is not the same as module-level ECC and does not replace it, but it does help increase yield and reduce cost, as it allows less-than-perfect chips to be used.

 

This is a problem that is going to grow as we get more and more RAM. A server system might have 1TB of RAM, and the chance of an error happening somewhere in that is much higher than in a typical home system with, for example, 16GB. While there are software techniques you can use to try to detect and correct data errors, they cost performance. And an error in code is just a very bad thing. At best you get a crash, so no more damage can happen. At worst, you're moving bad data around and causing more problems. Doing ECC in hardware presumably works out to be the better choice. This is why I consider overclocking a CPU far less risky to data integrity than overclocking RAM. Silent data corruption is really scary to me.
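
To put rough numbers on the scaling argument above, here is a minimal sketch. It assumes every bit fails independently at the same constant rate, and the rate itself is a made-up placeholder, not a measured figure for any real memory. The function names are hypothetical too. The only point is that the expected number of errors grows in proportion to capacity.

```python
import math

# HYPOTHETICAL per-bit error rate per hour. This is a placeholder chosen
# for illustration only, NOT a measured value for any real DRAM.
ASSUMED_ERRORS_PER_BIT_HOUR = 1e-15


def expected_errors(capacity_gib, hours, rate=ASSUMED_ERRORS_PER_BIT_HOUR):
    """Expected number of bit errors, assuming independent, constant-rate errors."""
    bits = capacity_gib * 1024**3 * 8
    return bits * rate * hours


def prob_at_least_one(capacity_gib, hours, rate=ASSUMED_ERRORS_PER_BIT_HOUR):
    """Chance of one or more errors, treating errors as a Poisson process."""
    # 1 - exp(-x), written with expm1 so tiny probabilities stay accurate.
    return -math.expm1(-expected_errors(capacity_gib, hours, rate))


if __name__ == "__main__":
    for gib in (16, 1024):
        print(gib, "GiB:", expected_errors(gib, 24), prob_at_least_one(gib, 24))
```

Whatever the real rate is, under these assumptions a 1TB machine expects 64 times as many errors as a 16GB one over the same period.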

 

Yes, there is a difference between error detection and error correction. Detection is simpler, and correction needs more information. A basic technique is called parity. You look at a binary sequence of bits and count the number of 1s (or 0s), then add a bit on the end to make the total odd or even. When you check it later, you verify that the condition still holds. If it doesn't, you know there must have been an error, but you don't know where. Note also that if you have two errors, they can cancel out and you might still think the data is correct. If you add more bits for error correction/detection, it gets more complicated to generate the additional values, but you can do more: it gets better at detecting errors and starts to be able to locate where the error is. It is quite a big topic to read up on if you're interested.
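
The parity idea above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the Hamming(7,4) code used for the correction half is a small textbook stand-in that I swapped in, not the wider code real ECC memory uses.

```python
def parity_bit(bits):
    """Even parity: the extra bit that makes the total number of 1s even."""
    return sum(bits) % 2


def parity_ok(word):
    """word is the data bits followed by the parity bit."""
    return sum(word) % 2 == 0


def hamming74_encode(data):
    """Encode 4 data bits into 7 bits; check bits sit at positions 1, 2 and 4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code):
    """Return (corrected codeword, 1-based position of the flipped bit or 0)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # the failed checks spell out the position
    if position:
        c[position - 1] ^= 1  # flip the bad bit back
    return c, position


if __name__ == "__main__":
    data = [1, 0, 1, 1]

    word = data + [parity_bit(data)]
    word[2] ^= 1                    # one flipped bit
    print(parity_ok(word))          # parity notices, but cannot say where
    word[3] ^= 1                    # a second flipped bit
    print(parity_ok(word))          # the two errors cancel out

    code = hamming74_encode(data)
    damaged = list(code)
    damaged[5] ^= 1                 # flip the bit at position 6
    fixed, position = hamming74_correct(damaged)
    print(position, fixed == code)
```

With one parity bit you only learn that something went wrong. With three check bits over four data bits, the pattern of failed checks points at the flipped bit, so it can be repaired. Two flipped bits still fool this small code, which is why practical schemes add further check bits.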


Main system: Asus Maximus VIII Hero, i7-6700k stock, Noctua D14, G.Skill Ripjaws V 3200 2x8GB, Gigabyte GTX 1650, Corsair HX750i, In Win 303 NVIDIA, Samsung SM951 512GB, WD Blue 1TB, HP LP2475W 1200p wide gamut

Desktop Gaming system: Asrock Z370 Pro4, i7-8086k stock, Noctua D15, Corsair Vengeance Pro RGB 3200 4x16GB, Asus Strix 1080Ti, NZXT E850 PSU, Cooler Master MasterBox 5, Optane 900p 280GB, Crucial MX200 1TB, Sandisk 960GB, Acer Predator XB241YU 1440p 144Hz G-sync

TV Gaming system: Asus X299 TUF mark 2, 7920X @ 8c8t, Noctua D15, Corsair Vengeance LPX RGB 3000 8x8GB, Gigabyte RTX 2070, Corsair HX1000i, GameMax Abyss, Samsung 970 Evo 500GB, LG OLED55B9PLA

VR system: Asus Z170I Pro Gaming, i7-6700T stock, Scythe Kozuti, Kingston Hyper-X 2666 2x8GB, Zotac 1070 FE, Corsair CX450M, Silverstone SG13, Samsung PM951 256GB, Crucial BX500 1TB, HTC Vive

Gaming laptop: Asus FX503VD, i5-7300HQ, 2x8GB DDR4, GTX 1050, Sandisk 256GB + 480GB SSD

