
Hi readers,

 

First, this is not a question; it is my experience with a desktop PC used as a storage server, with uptime counted in months. I'm looking to replace my storage server, and I'm surprised that small appliances (2-4 bay devices) still don't use ECC RAM. Even in forums, I see ECC recommended for home NAS only with ZFS, but the recommendation should extend at least to encrypted storage.

So here is my story that made me require ECC RAM back in 2013 (and why I would love to use a low-power, compact, 2-bay Celeron J3355 NAS, but won't because of this):

My old server was a Celeron G530 running Linux, with TrueCrypt volumes and no RAID (only 2x2TB drives, with occasional offline backups).

After some months of uptime, I had to restart it (UPS power was running out), and on startup I noticed fsck was showing errors on some volumes.

I let it fix the errors, then checked the affected files and found slight corruption. Video files had 1-5 seconds of corruption but were still usable overall; ISO files I had to get again.

So I started monitoring, restarting after 1 month, then 2 months, and finally 3 months (when corruption appeared again).

At the scheduled 3-month test (before I knew anything was wrong), I did the following:

- unmounted the partition (volume still unlocked)

- ran fsck on the partition: no errors

- locked the volume

- unloaded all crypto modules

- unlocked the volume

- ran fsck on the encrypted partition again.

This time fsck showed errors, and the affected files had all been written within the last 2 weeks (I had made checksums this time, and only writes from that period were affected).
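For anyone wanting to repeat this kind of check, here is a minimal Python sketch of a checksum manifest. It is not the tooling used in the post (which is not described); the function names `build_manifest` and `find_corrupted` are made up for illustration, and SHA-256 is an assumed choice of hash. It records a digest and modification time per file, then reports which files no longer match and how long ago each was written, which is how damage limited to recent writes would show up.

```python
import hashlib
import os
import time


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large ISOs do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's relative path to its digest and modification time."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            rel = os.path.relpath(path, root)
            manifest[rel] = {
                "sha256": sha256_of(path),
                "mtime": os.path.getmtime(path),
            }
    return manifest


def find_corrupted(root, manifest):
    """Return (relative path, days since recorded write) for changed files."""
    now = time.time()
    damaged = []
    for rel, entry in sorted(manifest.items()):
        path = os.path.join(root, rel)
        # A missing file counts as damaged, same as a digest mismatch.
        if not os.path.isfile(path) or sha256_of(path) != entry["sha256"]:
            damaged.append((rel, (now - entry["mtime"]) / 86400))
    return damaged
```

The manifest should live somewhere other than the volume being tested (for example dumped with `json.dump` to another machine), and the comparison is most telling after the volume has been unmounted and unlocked again, since a check done right after writing may only read back cached data.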

I don't know whether it was a software bug or indeed a RAM issue, but I doubt a software bug (an overwrite) would have such a silent effect: old and new data both looked fine, but once the volume was reloaded, only the new data turned out to be altered.

 

So I've been using an Asus P9D-I with an i3-4130 and ECC RAM since then, and I've even had 300+ days of uptime without problems (granted, I switched to LUKS instead of TrueCrypt shortly afterwards).

But I'm annoyed that I can't switch to a more compact layout (I thought about the QNAP TS-253B, where I can add a PCIe/NVMe adapter and replace its native OS).

https://linustechtips.com/topic/1122615-ecc-ram-for-home-storage/

If you won't need more than 4 storage drives per system, consider the HP MicroServers. I have two of them. Both run low-end CPUs, one AMD and one Intel, and both support ECC RAM. Older ones are quite cheap used.



I can definitely understand why your situation would put you off non-ECC for good. However, I'm running a self-built J3355 NAS that has been up 24/7 with only 2 reboots in 2019 so far (both after updates), so I don't really 'believe' in the need for ECC in home storage servers anymore. Then again, I'm not doing anything mission-critical, and I don't really know whether ZFS has corrected any errors so far. Still, I'm surprised you suffered so much corruption... I would suspect the (storage) controllers more than the RAM at that point...


I've never had any problems using QNAP NASes, and I doubt what you had was memory errors; it could equally be a bad disk controller or SATA controller. For ZFS, ECC doesn't actually add much, as the checksums and data integrity checking it does as part of reads and writes will catch any errors ECC would, and more. I still use ECC as it's not really more expensive, but I also go with registered ECC so I can have a lot of it.


Actually, after I wrote this, I decided to bite the bullet and buy a TS-253B.

I still have the old G530 system and a Synology DS114 (both unused), so I'll try to run them in parallel for 3-4 months.

The test will consist of writing an ISO every day.

Now to just see how I can keep them on for that long.
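A daily write test along these lines could be sketched as below. This is only an illustration under assumptions, not what was actually run: instead of copying a real ISO, it writes a reproducible pseudo-random pattern (derived with `hashlib.shake_256` from a seed such as a day label), so the file can be verified later without keeping a second copy. The names `pattern_blocks`, `write_test_file` and `verify_test_file` are hypothetical.

```python
import hashlib
import os


def pattern_blocks(seed, total_bytes, block_size=1 << 20):
    """Yield a reproducible pseudo-random byte stream derived from `seed`."""
    produced = 0
    counter = 0
    while produced < total_bytes:
        want = min(block_size, total_bytes - produced)
        # Each block depends only on the seed and its index.
        yield hashlib.shake_256(f"{seed}:{counter}".encode()).digest(want)
        produced += want
        counter += 1


def write_test_file(directory, seed, total_bytes):
    """Write the pattern for `seed` and ask the OS to flush it to disk."""
    path = os.path.join(directory, f"ecc-test-{seed}.bin")
    with open(path, "wb") as handle:
        for block in pattern_blocks(seed, total_bytes):
            handle.write(block)
        handle.flush()
        os.fsync(handle.fileno())
    return path


def verify_test_file(path, seed, total_bytes):
    """Return the offset of the first mismatching block, or None if intact."""
    offset = 0
    with open(path, "rb") as handle:
        for expected in pattern_blocks(seed, total_bytes):
            if handle.read(len(expected)) != expected:
                return offset
            offset += len(expected)
        if handle.read(1):
            return offset  # file is longer than the pattern
    return None
```

Calling `write_test_file(mount_point, "day-001", size)` once a day and running `verify_test_file` over all earlier days would mimic the plan. Going by the first post, verification probably only means something after the volume has been remounted or the machine restarted, since the corruption there only became visible after a reload.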

 

16 hours ago, porina said:

If you won't need more than 4 storage drives per system, consider the HP MicroServers. I have two of them. Both run low-end CPUs, one AMD and one Intel, and both support ECC RAM. Older ones are quite cheap used.

Actually, I have a Gen8 MicroServer as my manual backup (I replaced the original G1610T with a Xeon E3-1220 v2 for AES-NI), but I just measured its power draw, and 3-4W for iLO alone is too much (the ASMB on the P9D-I is similar), considering that router + modem + P9D-I together draw around 40W (with 1 of 2 HDDs active), which gives me around 2h30m of estimated UPS runtime.
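To put the 3-4W in perspective, here is a rough back-of-the-envelope sketch in Python. The 40W load and the 2h30m estimate come from the post; everything else is an assumption, in particular that runtime scales inversely with load, which real UPS units only approximate because efficiency and battery behaviour change with load.

```python
def implied_capacity_wh(load_w, runtime_h):
    """Usable energy implied by a measured load and an estimated runtime."""
    return load_w * runtime_h


def estimated_runtime_h(capacity_wh, load_w):
    """Runtime under the simplifying assumption of constant usable energy."""
    return capacity_wh / load_w


def format_hm(hours):
    """Format fractional hours as e.g. '2h30m'."""
    total_minutes = round(hours * 60)
    return f"{total_minutes // 60}h{total_minutes % 60:02d}m"


LOAD_W = 40.0    # router + modem + P9D-I, from the post
RUNTIME_H = 2.5  # "around 2h30m", from the post

capacity = implied_capacity_wh(LOAD_W, RUNTIME_H)  # 100.0 Wh
for bmc_w in (3.0, 4.0):
    hours = estimated_runtime_h(capacity, LOAD_W - bmc_w)
    print(f"without a {bmc_w:.0f} W BMC: {format_hm(hours)}")
# prints:
# without a 3 W BMC: 2h42m
# without a 4 W BMC: 2h47m
```

Under these assumptions, a management controller drawing 3-4W costs roughly 12 to 17 minutes of battery time at this load.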
