
Synology NAS non-ECC vs ECC, Checksum Mismatch

Takuan

Hi All.

 

I want(ed) to upgrade the memory of my Synology DS3615xs NAS, which originally shipped with 4GB of Synology ECC memory (2x 2GB sticks) but is expandable up to 32GB (4x 8GB sticks).

 

In January 2018 I searched for RAM. I ran into supply issues, as it was nearly impossible to get any DDR3 240-pin RAM at all. Information on which RAM was compatible was also scarce at best. Synology's prices are just crazy expensive, so even though I took a look there, I quickly ruled out buying from them. On Kingston's website, I went through the compatible RAM finder, and it suggested a module that had been discontinued. I searched, but it was unavailable everywhere. In the end, I bought Kingston RAM, but the non-ECC version of the module that Kingston's website had suggested.

 

After installing the RAM I ran the Synology Assistant RAM test, and it passed. I actually ran it a few times and the result was always PASSED. So I thought I was good.

 

A while back, I started getting "checksum mismatch" warnings on my NAS volume. They happened randomly, with no specific scenario. Just to be sure, I checked the files using a binary comparison tool. And sure enough, the files were corrupted, or at least did not match the originals. I did lots of tests: different drives, one drive vs many, BASIC volume, RAID5, RAID6, RAID10, no RAID, etc. I even swapped the drives completely for a new set I had lying around. All are high-end enterprise Western Digital drives. Same result. I copied data from my laptop to the NAS, 137 files of about 1GB each. After copying, 5-10 of them did not match the source (according to a binary file comparison tool). Only rarely would I get an actual warning from the BTRFS filesystem on the NAS. My NAS warned me about perhaps 2 files, but the binary comparison tool found 5-10 files that did not match the originals.
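The kind of binary comparison described above can be sketched in a few lines. This is only an illustration, not the tool used in the post: the function names `sha256_of` and `find_mismatches` are made up for this example, and it assumes the source folder and the NAS share are both reachable as ordinary directory paths from the machine running the check.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir, copy_dir):
    """Return relative paths whose copy is missing or differs from the source."""
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    mismatched = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        copy_file = copy_dir / relative
        # A missing copy counts as a mismatch; otherwise compare digests.
        if not copy_file.is_file() or sha256_of(source_file) != sha256_of(copy_file):
            mismatched.append(relative.as_posix())
    return mismatched
```

Calling `find_mismatches` with the laptop folder and the mounted share would list the files that need to be recopied. Note that the check itself runs through the RAM of whichever machine executes it.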

 

I also started over completely: reseated the RAM, moved the sticks around, removed all partitions on the drives using my Windows machine, reinstalled the drives and reinstalled DSM. I changed the filesystem to EXT4. Same results as above. Using my binary comparison tool I found 5-10 files that were not the same as the source. It was never the same files; which files came up corrupt or not identical to the source seemed random.
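One way to learn more from a mismatched pair is to look at where and how the two files differ. The sketch below is an assumption about what might be worth checking, not something done in the original tests: the hypothetical helper `diff_bits` reports each differing byte offset together with the number of flipped bits in that byte. A handful of isolated single-bit differences would look different from long runs of damaged bytes, although neither pattern proves a cause on its own.

```python
def diff_bits(path_a, path_b, chunk_size=1 << 20):
    """Return (byte_offset, flipped_bit_count) for each differing byte.

    Assumes both files have the same length; trailing bytes of a longer
    file are not examined.
    """
    differences = []
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if not chunk_a or not chunk_b:
                break
            if chunk_a != chunk_b:
                for index, (byte_a, byte_b) in enumerate(zip(chunk_a, chunk_b)):
                    if byte_a != byte_b:
                        # XOR leaves a 1 in every bit position that differs.
                        flipped = bin(byte_a ^ byte_b).count("1")
                        differences.append((offset + index, flipped))
            offset += len(chunk_a)
    return differences
```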

 

To rule out errors introduced by the Ethernet transfer, I copied all files from my laptop to my NAS, ran a comparison, recopied all mismatched files and ran a new binary comparison. I repeated the comparison until it passed three times in a row before I was satisfied. Then, on the NAS itself, I created 10 more shares and had the NAS internally copy the files to each share, all from the same source on the NAS (the source just verified three times using the binary comparison tool on my laptop). Halfway through, I did a binary check of the source against the original folder on my laptop, and it was still a perfect match. I have done many such tests: renaming files, copying the renamed files, etc. All with the same results. Errors came up only rarely from BTRFS, and the Synology logs only warned me occasionally. But when doing a binary comparison between my laptop and each of the 10 extra shares that the NAS had copied internally, it was clear that the files were not identical. If a mismatch came up, I did another binary check of each affected file individually to make sure that it really was not identical. Sometimes the files turned out to be identical after all, sometimes not. Same results across all 10 extra shares.
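The observation that a file sometimes mismatches on one check and matches on the next suggests testing whether repeated reads of the same file agree with each other. A minimal sketch, with the hypothetical helper name `read_stability`: it hashes one file several times and returns the set of digests seen. More than one digest would mean the data changed between reads, which points at the read path (memory, bus, network) rather than at what is stored on disk. One caveat: the operating system may serve repeated reads from its page cache, so identical digests do not prove that the on-disk data is good.

```python
import hashlib


def read_stability(path, rounds=3, chunk_size=1 << 20):
    """Hash the same file several times and return the set of digests seen."""
    digests = set()
    for _ in range(rounds):
        digest = hashlib.sha256()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
        digests.add(digest.hexdigest())
    return digests
```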

 

I had given up, but then it struck me that I had put non-ECC RAM in. I took the original 4GB from Synology and put it back in. Then I did ALL of the tests above again. Actually, I have been testing like crazy for nearly 3 weeks straight now.

 

After putting the original Synology ECC RAM that came with my NAS back in, I have not been able to reproduce even a single error or mismatch. So far all binary file comparisons have been a match. Whether I transfer the files over the network and compare them, or copy from one share to another on the NAS internally, the binary comparison against the same files and folders on my laptop is always a match. With ECC everything is a match ALL the time. I have not been able to find even one mismatch.

 

Synology has several NAS models which originally ship with non-ECC RAM, and they are capable of being upgraded with ECC and/or non-ECC memory. What I cannot understand, even though I have read almost "everything" I could find about ECC and non-ECC memory, is that the failure rate I have experienced is nowhere near the percentages I have read about when comparing non-ECC and ECC RAM. My failures are way above those numbers. It baffles me even more that non-ECC RAM in my Synology NAS would be the culprit, as DSM on other Synology NAS models seems to work perfectly fine with it. I had a DS2415+ a few years ago, and as far as I know that one ships with non-ECC RAM from Synology. The 4GB upgrade I did to bring it to a total of 6GB, an original Synology 4GB module bought from Synology, also seemed to be non-ECC. I never experienced an error, warning, or binary comparison mismatch on the DS2415+. So why am I getting them now on my DS3615xs when using non-ECC?

 

Does anyone have any insight into this? Why do I see so many mismatched files after transferring from my laptop to my NAS and, more interestingly, why are there so many mismatched files even when the NAS does an internal copy from one share to another (even though they are on the same volume, meaning the same disks) when using non-ECC RAM? And why were NO mismatched files whatsoever produced during copying when using Synology's original 4GB ECC memory?

 

My own thoughts have been circling around a check by Synology that makes all non-Synology RAM fail no matter what. If that is not the case, then what are the technical reasons for this many mismatched files?

 

I do binary comparisons of my files all the time, especially when doing backups. Those are usually done on my Windows machines to external drives. I have NEVER experienced a mismatch on files and folders copied from my laptop to any of my external drives. NEVER. Not even when I transfer over an Ethernet connection to my desktop. My laptop and my other Windows machines all have non-ECC memory.

 

My local network goes through one Netgear switch before reaching the NAS, the same goes for my desktop. Please note, that in order to eliminate the laptop as the culprit, I have also done these tests from my desktop. Same results.

 

To save my files, I had to back up everything. Not easy when I actually had 20+ TB of storage on my NAS. But I copied directly from my NAS to an external SSD a few TB at a time, emptied the external drive and repeated. When comparing the files on my laptop, I found mismatched files after this backup method just as I did when using Ethernet. It was a nightmare. I have managed to save my data, so no problem; it just took a long time. The fact is that it did not matter whether I copied over Ethernet or used directly attached external drives: the mismatched files occurred either way when using non-ECC RAM.

 

So, before investing in 32GB of ECC RAM for the NAS and putting my data back on it, I would like to understand the problem that has occurred here.

 

Kingston non-ECC fails to produce identical files during copying (5-10 files fail out of 137, each about 1GB in size), while Synology ECC produces only successful, identical copies, even over Ethernet.
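To put the numbers above in perspective, the failure rate can be worked out directly. The sketch below assumes that "about 1GB" means 10^9 bytes and that each mismatched file contains at least one wrong bit; the second assumption only gives a lower bound, since the post does not say how many bits differed per file. Under those assumptions, roughly 3.6% to 7.3% of the files were affected per copy run.

```python
FILES_COPIED = 137
BYTES_PER_FILE = 10**9  # assumption: "about 1GB" taken as 10^9 bytes
BITS_COPIED = FILES_COPIED * BYTES_PER_FILE * 8


def file_failure_rate(bad_files, total_files=FILES_COPIED):
    """Fraction of files that did not match the source after one copy run."""
    return bad_files / total_files


def min_bit_error_rate(bad_files, bits=BITS_COPIED):
    """Lower bound on errors per bit, assuming one wrong bit per bad file."""
    return bad_files / bits


for bad in (5, 10):
    print(
        f"{bad} bad files: {file_failure_rate(bad):.1%} of files, "
        f"at least {min_bit_error_rate(bad):.2e} errors per bit copied"
    )
```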

 

What is going on with this? Does Synology block RAM from other manufacturers and vendors? Would I be fine changing my RAM from Kingston non-ECC to ECC even though the brand would not be Synology, or am I stuck with Synology and their astronomical prices because they built in some failure that occurs when not using Synology's original RAM?

 

Sorry for the long post, but I think the info was necessary.

Any help and thoughts on the matter would be greatly appreciated. Thanks.


The normal expected bit error rate quoted when comparing modern non-ECC RAM to ECC RAM only covers errors caused inside the RAM module itself. However, the ECC memory might also be correcting errors occurring in transport between the memory and the CPU. Regardless, I would personally never convert a system that was working fine on ECC to non-ECC memory.
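As a toy illustration of how check bits let a receiver repair a single flipped bit, here is a Hamming(7,4) code in Python. This is not what the NAS memory controller does: ECC DIMMs typically protect each 64-bit word with 8 check bits using a wider single-error-correcting, double-error-detecting code. The principle is the same, though. Because the check bits are stored and travel together with the data, a single-bit flip anywhere between write and read can be located and undone.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = nibble
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Codeword positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return (data_bits, error_position)."""
    bits = list(codeword)
    s1 = bits[0] ^ bits[2] ^ bits[4] ^ bits[6]
    s2 = bits[1] ^ bits[2] ^ bits[5] ^ bits[6]
    s3 = bits[3] ^ bits[4] ^ bits[5] ^ bits[6]
    # The syndrome is the 1-based position of the bad bit; 0 means no error.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        bits[error_position - 1] ^= 1
    return [bits[2], bits[4], bits[5], bits[6]], error_position


stored = hamming74_encode([1, 0, 1, 1])
damaged = list(stored)
damaged[4] ^= 1  # simulate a single bit flipping in storage or in transit
recovered, position = hamming74_decode(damaged)
print(recovered, position)
```

Without the three check bits, the flipped bit in `damaged` would simply be read back as wrong data, which is the situation non-ECC memory is in.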

 

What was the Kingston model number you found that was no longer available? And what is the model number of the Kingston non-ECC RAM that you purchased?

 

My research indicates that this pack should be compatible: KVR16E11K4/32

It is available at multiple places including Amazon.

Looking to buy GTX690, other multi-GPU cards, or single-slot graphics cards: 

 


Hi @brwainer

I only converted because it was the only memory available that I could pay for. In January 2018 all RAM was extremely expensive, and ECC RAM even more so. The Kingston RAM you mention is the exact same one that came up when using Kingston's search tool. Buying on Amazon means tax and import penalties for me, so it was a no-go as it would just be even more expensive, at least when I bought the non-ECC in January 2018. In any case, I would like to know why this problem exists on my setup, even though Synology already has many NAS models that ship with non-ECC memory. It should not matter in my opinion, unless there is some OS restriction? I have some ECC memory available to me now, but before investing any more in this NAS, I would like to know why so much goes wrong with non-ECC memory and, more importantly, whether I can actually use non-Synology ECC RAM in my NAS and have it work perfectly, or whether such RAM will be blocked somehow. What I am also asking, I guess, is what the problem actually is. Is it because of non-ECC RAM or because of non-Synology RAM? Might changing to Kingston ECC produce the same problems?


My guess is that the Synology systems that ship with non-ECC RAM have more properly designed motherboards that don’t introduce errors. Or your board in particular is faulty.


@brwainer

 

If my motherboard is faulty, then it should not be working correctly with ECC RAM, right?

What do you mean by "properly designed"? Is there an improper design?


16 minutes ago, Takuan said:

@brwainer

 

If my motherboard is faulty, then it should not be working correctly with ECC RAM, right?

What do you mean by "properly designed"? Is there an improper design?

Improper trace routing may bring signal lines too close together or to other unrelated traces that introduce noise via EMI. It’s just one possible reason why Synology might equip some systems with ECC by default and others without. Regardless, I don’t think there is any OS-level recognition of the memory brand, and it would be very illogical for Synology to make third party units exhibit corruption. They would have to recognize the modules as third party, and purposefully introduce errors.


@brwainer

Well, yes, that was perhaps far-fetched. I understand about the designs. Cutting costs in production just sends the bill to the consumer either way. That part is clear to me now, thank you for clearing it up. I understand about the ECC error correction code, and to me it just seems extremely weird that ECC memory, in contrast to non-ECC memory, would make any difference when copying, as ECC is not supposed to correct that kind of error. Just as you wrote earlier. Actually, as I understand the purpose of ECC, there should be no major difference in day-to-day computing, but 24/7/365 systems may over time run into problems such as cosmic rays causing bits to flip. I don't believe that I live in an area more prone to this than any other place on earth, and as my Windows machines without ECC RAM work perfectly, I just cannot understand it.

 

If anyone has any idea, other than what brwainer has already listed as a possibility (improper design), why ECC RAM seems to be necessary to ensure correct file transfers and file copying on my NAS, I would indeed appreciate it. It just makes no sense, but there must be some logical explanation on a technical level for why non-ECC creates so many mismatched files and ECC does not. The rate of mismatched files does not make sense.


Does no one have any experience at all with Synology NAS and non-ECC vs ECC upgrades?


I have a bunch of Synology units here at work. The latest one shipped with 4GB of ECC RAM; I ordered generic Kingston ECC RAM of the same speed and timings and had no problems. Over a year now, smooth sailing.

 

I would never change from ECC to non-ECC though. Stick with what it shipped with.


@Lipe123

Great to know that you have Kingston RAM working. That makes me a bit more relaxed.

May I ask exactly which Synology model and which RAM model you are running?

Thank you.


Mine is a bit different, I guess: I have the RS2418+. *edit* lmao, I first wrote RS2416+ because I was looking at my 2416, saw it only showed 2GB of RAM, and started to panic. Then I saw it's a DDR3 SO-DIMM slot and realized I was looking at the wrong one. This happens when you have like 4 of these things in a rack next to each other.

 

The ram I ordered was 2x: https://www.cdw.ca/product/Kingston-Server-Premier-DDR4-16-GB-DIMM-288-pin/4360848?

Mfg.Part: KVR24E17D8/16MA 

 

Of course it shows discontinued now. 

 

This might be worth a look: https://nascompares.com/2019/05/09/synology-nas-unofficial-memory-upgrade-guide/

 


@Lipe123

Thank you very much for your insight. Makes me hopeful for finding a solution as well. Greatly appreciated.


Can anyone explain the technical details behind the problem I have described above? Thanks.
