
I remember that a long time ago LTT mentioned they were trying to work with someone to develop a test that actually measures and detects when ECC is useful. It seems like they never did that.

 

So I was wondering if someone else might know if it's possible to just run some loop and detect bit flips that don't get past the OS.

https://linustechtips.com/topic/1298791-creating-a-memory-error/

25 minutes ago, Rugg said:

So I was wondering if someone else might know if it's possible to just run some loop and detect bit flips that don't get past the OS.

If the ECC is working, then generally a 1-bit flip will be corrected and logged by the OS, while a 2+-bit flip can't be corrected and the system will crash.
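To make the 1-bit versus 2-bit distinction concrete, here is a toy sketch of a SECDED (single error correct, double error detect) code. It uses an extended Hamming(8,4) code over one nibble; real ECC DIMMs typically protect 64 data bits with 8 check bits, so this is only an illustration of the principle, not what a memory controller actually runs. The names `encode` and `decode` are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (as a list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity, 1..7: Hamming positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # overall parity makes the total even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of set positions is 0 for a valid word
    odd_parity = sum(code) % 2          # 1 means an odd number of bits flipped

    if syndrome == 0 and odd_parity == 0:
        status = "ok"
    elif odd_parity == 1:
        # Treated as a single flip: the syndrome is the flipped position
        # (0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a nonzero syndrome: detectable, not fixable.
        return "uncorrectable", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

Flipping any one bit of `encode(n)` gives `("corrected", n)` back; flipping any two gives `("uncorrectable", None)`, which is the case where a real system raises an error instead of handing back bad data.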

 

25 minutes ago, Rugg said:

develop a test to actually measure and detect when ECC is useful

I might be mistaken, but I think the paid version of one of the Memtest programs actually has that ability.


22 minutes ago, AbydosOne said:

If the ECC is working, generally if it's a 1-bit flip, it will get logged by the OS; if it's a 2+-bit flip, the system will crash.

And without ECC, how easy should it be to cause a bit flip? I assume that if I just make an array of, say, a million integers all set to zero, they are all going to be zero every time. Or will one of them sometimes not be zero?


9 hours ago, AbydosOne said:

if it's a 2+-bit flip, the system will crash.

Just out of curiosity (the Wikipedia page doesn't mention this), is this because ECC RAM detects the error and shuts down the system? Bit flips on their own may or may not cause a system crash.


After watching this video (super interesting, by the way; HerrSmatzeR#7054 posted it in the LTT Discord), it seems reasonable that if you just fill the RAM up with data, it's possible to get a memory error, and if you read the data back you should then get a different result.
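A minimal sketch of that fill-and-read-back idea, assuming nothing beyond the Python standard library. The function names, the `0xAA` pattern, and the polling interval are arbitrary choices for illustration. A few caveats apply: on healthy hardware flips are rare, so this may run for a very long time and report nothing; reads can be served from CPU cache rather than DRAM; and on a machine with working ECC, corrected errors never reach user space, so a loop like this would not see them.

```python
import time


def fill(size_bytes, pattern=0xAA):
    """Allocate a buffer in which every byte holds the same known pattern."""
    return bytearray([pattern]) * size_bytes


def find_flips(buf, pattern=0xAA):
    """Return (offset, observed_byte) for every byte that no longer matches."""
    if buf.count(pattern) == len(buf):      # fast path: nothing changed
        return []
    return [(i, b) for i, b in enumerate(buf) if b != pattern]


def watch(buf, rounds, interval_s, pattern=0xAA):
    """Re-scan the buffer `rounds` times, sleeping between scans."""
    seen = []
    for _ in range(rounds):
        time.sleep(interval_s)
        for offset, observed in find_flips(buf, pattern):
            seen.append((offset, observed))
            buf[offset] = pattern           # reset so the same flip isn't counted twice
    return seen
```

For example, `watch(fill(1 << 30), rounds=1440, interval_s=60)` would hold 1 GiB and re-check it once a minute for a day. An empty result only means no flip was observed in that buffer, not that the memory is error-free.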

 


20 minutes ago, Sauron said:

Just out of curiosity (the Wikipedia page doesn't mention this), is this because ECC RAM detects the error and shuts down the system? Bit flips on their own may or may not cause a system crash.

Technically, it's not the RAM itself that shuts the system down; it's the OS kernel that halts the system when there is an uncorrectable ECC error.


1 hour ago, Sauron said:

Just out of curiosity (the Wikipedia page doesn't mention this), is this because ECC RAM detects the error and shuts down the system? Bit flips on their own may or may not cause a system crash.

On x86 it causes a machine check exception, for which the OS can install a handler. Windows will typically display a blue screen (which is still better than unknowingly working with possibly corrupted data). But on mission-critical systems the OS or application could attempt recovery.
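Nobody in this thread mentioned it, but on Linux the errors that the kernel handles are usually exposed as counters by the EDAC subsystem, which is probably the closest thing to watching ECC do its job. The sketch below assumes a Linux system with an EDAC driver loaded for the memory controller; the function name `read_edac_counts` is made up, and the `root` parameter exists only so the function can be pointed at a different directory.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Read corrected/uncorrected error counters for each memory controller.

    Returns e.g. {"mc0": {"corrected": 3, "uncorrected": 0}}, or an empty
    dict if no EDAC memory controllers are present under `root`.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text())   # corrected errors
            ue = int((mc / "ue_count").read_text())   # uncorrected errors
        except (OSError, ValueError):
            continue                                  # skip unreadable entries
        counts[mc.name] = {"corrected": ce, "uncorrected": ue}
    return counts
```

A `corrected` count that goes up over time means ECC caught and fixed flips that a non-ECC system would have silently passed on to software.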

