Jump to content

Creating a memory error

Rugg

I remember a long time ago LTT mentioned that they were trying to work with someone to develop a test to actually measure and detect when ECC is useful, seems like they never did that.

 

So I was wondering if someone else might know if its possible to just run some loop and detect bit flips that don't get past the OS.

Link to comment
Share on other sites

Link to post
Share on other sites

25 minutes ago, Rugg said:

So I was wondering if someone else might know if its possible to just run some loop and detect bit flips that don't get past the OS.

If the ECC is working, generally if it's a 1-bit flip, it will get logged by the OS; if it's a 2+-bit flip, the system will crash.

 

25 minutes ago, Rugg said:

develop a test to actually measure and detect when ECC is useful

I might be mistaken, but I think the paid version of one of the Memtest programs actually has that ability.

Main System (Byarlant): Ryzen 7 5800X | Asus B550-Creator ProArt | EK 240mm Basic AIO | 16GB G.Skill DDR4 3200MT/s CAS-14 | XFX Speedster SWFT 210 RX 6600 | Samsung 990 PRO 2TB / Samsung 960 PRO 512GB / 4× Crucial MX500 2TB (RAID-0) | Corsair RM750X | Mellanox ConnectX-3 10G NIC | Inateck USB 3.0 Card | Hyte Y60 Case | Dell U3415W Monitor | Keychron K4 Brown (white backlight)

 

Laptop (Narrative): Lenovo Flex 5 81X20005US | Ryzen 5 4500U | 16GB RAM (soldered) | Vega 6 Graphics | SKHynix P31 1TB NVMe SSD | Intel AX200 Wifi (all-around awesome machine)

 

Proxmox Server (Veda): Ryzen 7 3800XT | AsRock Rack X470D4U | Corsair H80i v2 | 64GB Micron DDR4 ECC 3200MT/s | 4x 10TB WD Whites / 4x 14TB Seagate Exos / 2× Samsung PM963a 960GB SSD | Seasonic Prime Fanless 500W | Intel X540-T2 10G NIC | LSI 9207-8i HBA | Fractal Design Node 804 Case (side panels swapped to show off drives) | VMs: TrueNAS Scale; Ubuntu Server (PiHole/PiVPN/NGINX?); Windows 10 Pro; Ubuntu Server (Apache/MySQL)


Media Center/Video Capture (Jesta Cannon): Ryzen 5 1600X | ASRock B450M Pro4 R2.0 | Noctua NH-L12S | 16GB Crucial DDR4 3200MT/s CAS-22 | EVGA GTX750Ti SC | UMIS NVMe SSD 256GB / Seagate 1.5TB HDD | Corsair CX450M | Viewcast Osprey 260e Video Capture | Mellanox ConnectX-2 10G NIC | LG UH12NS30 BD-ROM | Silverstone Sugo SG-11 Case | Sony XR65A80K

 

Camera: Sony ɑ7II w/ Meike Grip | Sony SEL24240 | Samyang 35mm ƒ/2.8 | Sony SEL50F18F | Sony SEL2870 (kit lens) | PNY Elite Perfomance 512GB SDXC card

 

Network:

Spoiler
                           ┌─────────────── Office/Rack ────────────────────────────────────────────────────────────────────────────┐
Google Fiber Webpass ────── UniFi Security Gateway ─── UniFi Switch 8-60W ─┬─ UniFi Switch Flex XG ═╦═ Veda (Proxmox Virtual Switch)
(500Mbps↑/500Mbps↓)                             UniFi CloudKey Gen2 (PoE) ─┴─ Veda (IPMI)           ╠═ Veda-NAS (HW Passthrough NIC)
╔═══════════════════════════════════════════════════════════════════════════════════════════════════╩═ Narrative (Asus USB 2.5G NIC)
║ ┌────── Closet ──────┐   ┌─────────────── Bedroom ──────────────────────────────────────────────────────┐
╚═ UniFi Switch Flex XG ═╤═ UniFi Switch Flex XG ═╦═ Byarlant
   (PoE)                 │                        ╠═ Narrative (Cable Matters USB-PD 2.5G Ethernet Dongle)
                         │                        ╚═ Jesta Cannon*
                         │ ┌─────────────── Media Center ──────────────────────────────────┐
Notes:                   └─ UniFi Switch 8 ─────────┬─ UniFi Access Point nanoHD (PoE)
═══ is Multi-Gigabit                                ├─ Sony Playstation 4 
─── is Gigabit                                      ├─ Pioneer VSX-S520
* = cable passed to Bedroom from Media Center       ├─ Sony XR65A80K (Google TV)
** = cable passed from Media Center to Bedroom      └─ Work Laptop** (Startech USB-PD Dock)

 

Retired/Other:

Spoiler

Laptop (Rozen-Zulu): Sony VAIO VPCF13WFX | Core i7-740QM | 8GB Patriot DDR3 | GT 425M | Samsung 850EVO 250GB SSD | Blu-ray Drive | Intel 7260 Wifi (lived a good life, retired with honor)

Testbed/Old Desktop (Kshatriya): Xeon X5470 @ 4.0GHz | ZALMAN CNPS9500 | Gigabyte EP45-UD3L | 8GB Nanya DDR2 400MHz | XFX HD6870 DD | OCZ Vertex 3 Max-IOPS 120GB | Corsair CX430M | HooToo USB 3.0 PCIe Card | Osprey 230 Video Capture | NZXT H230 Case

TrueNAS Server (La Vie en Rose): Xeon E3-1241v3 | Supermicro X10SLL-F | Corsair H60 | 32GB Micron DDR3L ECC 1600MHz | 1x Kingston 16GB SSD / Crucial MX500 500GB

Link to comment
Share on other sites

Link to post
Share on other sites

22 minutes ago, AbydosOne said:

If the ECC is working, generally if it's a 1-bit flip, it will get logged by the OS; if it's a 2+-bit flip, the system will crash.

And without ECC how easy should it be to cause a bit flip? I assume if I just make like a million integer array full of the number zero, they are all going to be zero, every time, or will it sometime not be zero?

Link to comment
Share on other sites

Link to post
Share on other sites

4 hours ago, Rugg said:

And without ECC how easy should it be to cause a bit flip?

Pretty easy, see e.g. the Rowhammer-exploit.

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.

Link to comment
Share on other sites

Link to post
Share on other sites

9 hours ago, AbydosOne said:

if it's a 2+-bit flip, the system will crash.

Just out of curiority (the wikipedia page doesn't mention this), is this because ECC ram detects the error and shuts down the system? Bit flips on their own may or may not cause a system crash

Don't ask to ask, just ask... please 🤨

sudo chmod -R 000 /*

Link to comment
Share on other sites

Link to post
Share on other sites

After watching this video (super interesting btw, HerrSmatzeR#7054 put it in the ltt discord), it seems reasonable that if you just fill the ram up with data, its possible you can get a memory error and if you read it back you should get a different result.

 

Link to comment
Share on other sites

Link to post
Share on other sites

20 minutes ago, Sauron said:

Just out of curiority (the wikipedia page doesn't mention this), is this because ECC ram detects the error and shuts down the system? Bit flips on their own may or may not cause a system crash

It's technically not the RAM itself that shuts the system down, it's the OS kernel that halts the system in case there is an uncorrectable ECC-error.

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.

Link to comment
Share on other sites

Link to post
Share on other sites

1 hour ago, Sauron said:

Just out of curiority (the wikipedia page doesn't mention this), is this because ECC ram detects the error and shuts down the system? Bit flips on their own may or may not cause a system crash

On x86 it causes a machine check exception, for which the OS can install a handler. Windows will typically display a blue screen (which is still better then unknowingly working with possibly corrupted data). But on mission critical systems the OS/application could attempt recovery.

Link to comment
Share on other sites

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×