
ZFS Memory requirements

leadeater

I'd like to throw this out there since I see it all the time and can't be bothered fighting it, so here it is from a ZFS developer.

 

Quote

Some well meaning people years ago thought that they could be helpful by making a rule of thumb for the amount of RAM needed for good write performance with data deduplication. While it worked for them, it was wrong. Some people then started thinking that it applied to ZFS in general. ZFS' ARC being reported as used memory rather than cached memory reinforced the idea that ZFS needed plenty of memory when in fact it was just used in an evict-able cache. The OpenZFS developers have been playing whack a mole with that advice ever since.

I am what I will call a second generation ZFS developer because I was never at Sun and I postdate the death of OpenSolaris. The first generation crowd could probably fill you in on more details than I could with my take on how it started. You will not find any of the OpenZFS developers spreading the idea that ZFS needs an inordinate amount of RAM though. I am certain of that

https://www.reddit.com/r/DataHoarder/comments/5u3385/linus_tech_tips_unboxes_1_pb_of_seagate/ddrngar/
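
To see the "ARC shows up as used memory" effect for yourself, here is a minimal sketch that reads the ARC counters OpenZFS exposes on Linux. The path and the `size` / `c_max` field names are the ones OpenZFS on Linux publishes; the helper names and the `SAMPLE` text are made up for illustration, and other platforms (FreeBSD/FreeNAS) expose the same counters through different interfaces.

```python
from pathlib import Path

# Linux OpenZFS kstat file; other platforms expose these counters differently.
ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")

def parse_arcstats(text):
    """Parse kstat text ('name type data' rows) into a {name: int} dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields with a numeric last field,
        # which skips the two header lines.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats

def arc_summary(stats):
    """Return the current ARC size and its configured ceiling in GiB."""
    gib = 1024 ** 3
    return {
        "arc_size_gib": stats["size"] / gib,
        "arc_max_gib": stats["c_max"] / gib,
    }

# Hypothetical sample so the sketch also runs on a machine without ZFS.
SAMPLE = """\
13 1 0x01 96 26112 8876543210 123456789012
name                            type data
hits                            4    1000
misses                          4    250
size                            4    4294967296
c_max                           4    8589934592
"""

if __name__ == "__main__":
    text = ARCSTATS.read_text() if ARCSTATS.exists() else SAMPLE
    print(arc_summary(parse_arcstats(text)))
```

Whatever `arc_size_gib` reports is cache that can be evicted under memory pressure, so it is not memory ZFS "needs" in the sense the rule of thumb implies.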

 

Quote

A system with 1 GB of RAM would not have much trouble with a pool that contains 1 exabyte of storage, much less a petabyte or a terabyte. The data is stored on disk, not in RAM with the exception of cache. That just keeps an extra copy around and is evicted as needed.
The only time when more RAM might be needed is when you turn on data deduplication. That causes 3 disk seeks for each DDT miss when writing to disk and tends to slow things down unless there is enough cache for the DDT to avoid extra disk seeks. The system will still work without more RAM. It is just that the deduplication code will slow down writes when enabled. That 1GB of RAM per 1TB data stored "rule" is nonsense though. The number is a function of multiple variables, not a constant.

https://www.reddit.com/r/DataHoarder/comments/5u3385/linus_tech_tips_unboxes_1_pb_of_seagate/ddrh5iv/
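
To illustrate the "function of multiple variables" point, here is a rough sketch of how the memory needed to keep the whole dedup table (DDT) cached could be estimated. The 320 bytes per entry figure is a commonly cited approximation and is an assumption here, not something from the quoted posts; the function name is hypothetical. The real number also depends on how much of the data is unique.

```python
DDT_ENTRY_BYTES = 320  # assumed in-core size per DDT entry (commonly cited approximation)

KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4

def ddt_ram_bytes(unique_data_bytes, avg_block_bytes, entry_bytes=DDT_ENTRY_BYTES):
    """Estimate RAM needed to hold one DDT entry per unique block."""
    blocks = -(-unique_data_bytes // avg_block_bytes)  # ceiling division
    return blocks * entry_bytes

if __name__ == "__main__":
    # Same 1 TiB of unique data, three different average block sizes.
    for block in (8 * KIB, 64 * KIB, 128 * KIB):
        need = ddt_ram_bytes(1 * TIB, block)
        print(f"{block // KIB:>4} KiB blocks: {need / GIB:.1f} GiB of DDT")
```

The same terabyte of data gives very different answers depending on block size alone, which is why a single "GB per TB" constant cannot be right. And as the quote says, a DDT that does not fit in cache slows writes down; it does not stop the pool from working.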

 

What makes it even worse is I see those large memory requirements being advised even when deduplication is not going to be used. So please spread the word when necessary :).

 

P.S. I know ram is cheap so 16GB is a fine starting point, just please never use the 1GB-5GB of ram per TB advice ever again.

Link to comment
Share on other sites

Link to post
Share on other sites

wait, i think i missed something about the 1PB video, did linus bring up the ZFS ram thing in there? o.O

 

also, i kinda wanna say "it would be interesting to do testing on how much ram is *actually* advisable"

(and by that i dont specifically mean linus or anyone else, just in general, "it'd be nice if someone did some testing")

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, manikyath said:

wait, i think i missed something about the 1PB video, did linus bring up the ZFS ram thing in there? o.O

 

also, i kinda wanna say "it would be interesting to do testing on how much ram is *actually* advisable"

(and by that i dont specifically mean linus or anyone else, just in general, "it'd be nice if someone did some testing")

Nah just reddit community talk, it's a good read so click the "view the rest of the comments" link and have fun. 

Link to comment
Share on other sites

Link to post
Share on other sites

2 minutes ago, leadeater said:

Nah just reddit community talk, it's a good read so click the "view the rest of the comments" link and have fun. 

been digging through it.. ohboy, the reddit wikipedia specialists are real over there xD

 

EDIT: actually, i may fish some terrible hardware off my shelf and have some fun at some point :D

Link to comment
Share on other sites

Link to post
Share on other sites

I read something similar that was debunking the myth that there is an 8GB minimum. Is requiring ECC RAM also a myth?

             ☼

ψ ︿_____︿_ψ_   

Link to comment
Share on other sites

Link to post
Share on other sites

2 minutes ago, SCHISCHKA said:

I read something similar that was debunking the myth that there is an 8GB minimum. Is requiring ECC RAM also a myth?

In a sense yes, that is too. ZFS is built around data resiliency without any hardware requirements to meet it. Every storage operation does a checksum check, so the risk of writing or even reading bad data is low. There is one situation where you can write bad data to disk (remember this would only affect things at a file level, not the array), and that is uncommon. Funnily enough, the more RAM you have in the system, the more likely it is to happen.

 

Quote

A 2010 paper examining the ability of file systems to detect and prevent data corruption, observed that ZFS itself is effective in detecting and correcting data errors on storage devices, but that it assumes data in RAM are "safe", and not prone to error. Thus when ZFS caches pages, or stores copies of metadata, in RAM, or holds data in its "dirty" cache for writing to disk, no test is made whether the checksums still match the data at the point of use. Much of this risk can be mitigated by use of ECC RAM but the authors considered that error detection related to the page cache and heap would allow ZFS to handle certain classes of error more robustly.[61]

https://en.wikipedia.org/wiki/ZFS#Limitations

http://research.cs.wisc.edu/adsl/Publications/zfs-corruption-fast10.pdf
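
The gap the paper describes can be shown with a toy sketch. SHA-256 from Python's standard library stands in for the filesystem checksum here (an assumption for illustration; this is not ZFS code), and the helper names are made up. A bit flip that happens after the checksum is computed gets caught, while a flip that happens in RAM before the checksum is computed gets checksummed along with the data and goes unnoticed.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    """Stand-in for a filesystem block checksum."""
    return hashlib.sha256(data).digest()

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)

original = b"important file contents"

# Case 1: corruption happens AFTER the checksum was stored (e.g. on disk).
stored_sum = checksum(original)
on_disk = flip_bit(original, 3)
detected_after = checksum(on_disk) != stored_sum

# Case 2: corruption happens in RAM BEFORE the checksum is computed.
in_ram = flip_bit(original, 3)
stored_sum_bad = checksum(in_ram)  # checksum faithfully covers the bad data
detected_before = checksum(in_ram) != stored_sum_bad

if __name__ == "__main__":
    print("flip after checksum detected: ", detected_after)
    print("flip before checksum detected:", detected_before)
```

Case 2 is the one ECC RAM helps with, and it applies to any filesystem that trusts memory, not only ZFS.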

 

My take on building storage arrays/servers, always use ECC it doesn't cost much more and it's not just about data resiliency either.

Link to comment
Share on other sites

Link to post
Share on other sites

18 minutes ago, SCHISCHKA said:

I read something similar that was debunking the myth that there is an 8GB minimum

Looks like 1GB minimum with a goal of 128MB.

 

Quote

I would agree with the 1GB system total recommended minimum, although work being done with ABD will allow ZFS to operate comfortably with much less system RAM. The target is 128MB.

 


Link to comment
Share on other sites

Link to post
Share on other sites

40 minutes ago, SCHISCHKA said:

I read something similar that was debunking the myth that there is an 8GB minimum. Is requiring ECC RAM also a myth?

there's some guy on the forum running freenas with zfs on something with like 2 or 4 gigs of ram i think.

Link to comment
Share on other sites

Link to post
Share on other sites

50 minutes ago, SCHISCHKA said:

Is requiring ECC RAM also a myth?

I don't have time to dig up the quote as I'm getting ready for work, but yes, an original dev of ZFS has dismissed the myth that ZFS without ECC is a doomsday scenario. Now, there are certainly reasons to use ECC, but the reasons are the same on all operating systems and file systems. It's become a myth that without ECC, ZFS presents a magical doomsday scenario where it will suddenly smash up your data like a cat that found its way into the bag of toilet paper rolls, and this is basically false.

 

Again, not that there aren't reasons to use ECC in a system, there's just no extra reasons if the system is running ZFS over anything else.

Link to comment
Share on other sites

Link to post
Share on other sites

4 hours ago, leadeater said:

There is one situation where you can write bad data to disk (remember this would only affect things at a file level, not the array), and that is uncommon. Funnily enough, the more RAM you have in the system, the more likely it is to happen.

Which would be a problem on any file system. If you write a file to RAM and it is then corrupted in there, there is nothing you can do other than getting an original copy of the file again. Yet to my knowledge this isn't a huge problem: most desktops don't have ECC, and you don't notice major file corruption happening.

Link to comment
Share on other sites

Link to post
Share on other sites

I did not see who the author was before clicking this topic, and went in thinking "man sooooo many people keep asking this question." Happily surprised. 

Link to comment
Share on other sites

Link to post
Share on other sites

Thanks for posting this - I myself have been guilty of perpetuating this myth, as it was very widely spread, even among ZFS forums, posts, guides, etc. I took that information as valid, since I thought I was getting it from "experts" of that software.

 

As for the ECC thing? I personally recommend it, if you can afford it in the budget, for a NAS/File System build, but it's certainly not mandatory.

 

I looked up research on this, and found info from a white paper that Google put out about its data centers, and the end result was that a single-bit RAM error was about as common as once every 1.5 years. Now, this was per module, so the more modules, the higher the risk, but still.

 

And having a multi-bit RAM error was about as likely as shooting a bullet out of the air with a smaller bullet, while doing a one-handed hand-stand on top of a unicycle at high speed (We're talking like one in several million chances here, if not even higher)
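
For a feel of how "per module" scales, here is a back-of-the-envelope sketch. It takes the once-per-1.5-years figure quoted above at face value and assumes errors arrive independently at a constant rate (a Poisson model), which is a simplification and not something the white paper is being quoted on. The function name is hypothetical.

```python
import math

def p_at_least_one_error(modules, years, mean_years_between_errors=1.5):
    """Probability of one or more single-bit errors, assuming a Poisson process."""
    rate_per_year = modules / mean_years_between_errors
    return 1.0 - math.exp(-rate_per_year * years)

if __name__ == "__main__":
    for modules in (1, 2, 4, 8):
        p = p_at_least_one_error(modules, years=1.0)
        print(f"{modules} module(s), 1 year: {p:.0%} chance of at least one single-bit error")
```

Under these assumptions more modules do raise the odds noticeably, which fits the "recommended if the budget allows, not mandatory" position.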

For Sale: Meraki Bundle

 

iPhone Xr 128 GB Product Red - HP Spectre x360 13" (i5 - 8 GB RAM - 256 GB SSD) - HP ZBook 15v G5 15" (i7-8850H - 16 GB RAM - 512 GB SSD - NVIDIA Quadro P600)

 

Link to comment
Share on other sites

Link to post
Share on other sites

13 minutes ago, dalekphalm said:

And having a multi-bit RAM error was about as likely as shooting a bullet out of the air with a smaller bullet, while doing a one-handed hand-stand on top of a unicycle at high speed (We're talking like one in several million chances here, if not even higher)

Haha love it, picturing that in my mind right now xD. If only Mythbusters were still around to try it.

Link to comment
Share on other sites

Link to post
Share on other sites

22 minutes ago, leadeater said:

Haha love it, picturing that in my mind right now xD. If only Mythbusters were still around to try it.

At the rate they're going, you might just see that in Fast and the Furious 9 or something :P


Link to comment
Share on other sites

Link to post
Share on other sites

15 minutes ago, dalekphalm said:

Thanks for posting this - I myself have been guilty of perpetuating this myth, as it was very widely spread, even among ZFS forums, posts, guides, etc. I took that information as valid, since I thought I was getting it from "experts" of that software.

Yea, the problem is that on the face of it, it seems to make a lot of sense. Then to top that off it's so perpetuated everywhere that if you do a quick fact check it checks out. It's not until you really do some in-depth research into ZFS itself, technical papers etc., that the memory advice starts to break down. Then you also have to stop and think, what if my array is 800TB? Holy freakin hell this doesn't make any sense at all!

 

I have had the good fortune to have worked with many enterprise storage arrays so I pretty much knew from the get go that this advice was rather shaky. For example a Netapp 8060 controller has 64GB of ram and supports 6PB of raw capacity and you can use the deduplication + compression features (performance load permitting). You can also cluster these controllers together for a maximum of 144PB, flash pool max capacity is much less but that is expected as SSDs can put way more load on CPU due to increased performance but this is a CPU limit not memory.
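
The arithmetic behind that sanity check, as a quick sketch. The numbers are the ones from this post and the first post (800TB array, 1GB-5GB per TB rule, 64GB of RAM for 6PB); the function names are made up, and 1PB is treated as 1000TB.

```python
def rule_of_thumb_ram_gb(capacity_tb, gb_per_tb):
    """RAM the 'X GB per TB' rule would demand for a given capacity."""
    return capacity_tb * gb_per_tb

def actual_gb_per_tb(ram_gb, capacity_tb):
    """RAM-to-capacity ratio of a real system."""
    return ram_gb / capacity_tb

if __name__ == "__main__":
    # An 800TB array under the 1GB-5GB per TB advice.
    print(rule_of_thumb_ram_gb(800, 1), "to", rule_of_thumb_ram_gb(800, 5), "GB of RAM")
    # The controller example: 64GB of RAM for 6PB (6000TB) raw.
    print(f"{actual_gb_per_tb(64, 6000):.4f} GB of RAM per TB")
```

That is roughly two orders of magnitude less RAM per TB than the rule claims is necessary.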

Link to comment
Share on other sites

Link to post
Share on other sites

The other issue is that on the FreeNAS forums CyberJock will fight you to the death on the topic. He's often quoted and referred to in write-ups in FreeNAS guides. Since FreeNAS, I believe, brought ZFS into the community's hands (en masse), it's the go-to for information. Even when given an article from Matt (something), a developer of ZFS, he still argues. Every time you trace ZFS memory requirements (ECC or amount), I'd argue 90% of the time it goes back to CyberJock. He's knowledgeable but he's also very stubborn.

Link to comment
Share on other sites

Link to post
Share on other sites

@leadeater @dalekphalm I'm curious since I am helping someone build a ZFS array right now. I have a few questions.

First of all, I might add I don't care about ECC or non ECC, to me it makes no sense to argue over it.

But I am curious about the behaviour of the filesystem itself, how it can be bottlenecked by certain types of loads and how it will deal with those bottlenecks. If someone can provide or link to a good explanation of different kinds of loads on the filesystem and how it stresses resources, then a mythical rule of thumb won't be necessary.

 

So if one of you can briefly give an example of a load, and how the filesystem deals with it and then what resources it uses for that, then maybe we can get a better idea of how the lack of a certain resource will impact performance.

 

For example if I have no deduplication or compression enabled on a zpool with 2 vdevs of 8x2TB drives in Z2 and no ZIL or L2ARC. And I do a write over iSCSI with 10Gb connection, how does the filesystem handle the writes, and what resources will it use?
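
As a starting point for that example, here is a rough sketch of the basic numbers for the pool described: data capacity and what a saturated 10Gb link would mean per disk. It ignores metadata, padding, allocation overhead and iSCSI protocol overhead, and it assumes writes spread evenly across disks, so treat the output as ballpark only. All function names are hypothetical.

```python
def raidz_data_capacity_tb(vdevs, disks_per_vdev, parity, disk_tb):
    """Data capacity of a pool of identical RAID-Z vdevs, before any overhead."""
    return vdevs * (disks_per_vdev - parity) * disk_tb

def link_gb_per_s(gigabits_per_s):
    """Convert link speed from gigabits to gigabytes per second."""
    return gigabits_per_s / 8

def per_data_disk_gb_per_s(link_gbps, vdevs, disks_per_vdev, parity):
    """Share of the incoming stream per data disk if spread evenly (an assumption)."""
    data_disks = vdevs * (disks_per_vdev - parity)
    return link_gb_per_s(link_gbps) / data_disks

if __name__ == "__main__":
    # 2 vdevs of 8x2TB in Z2 (2 parity disks per vdev), fed by a 10Gb link.
    print(raidz_data_capacity_tb(2, 8, 2, 2.0), "TB of data capacity")
    print(link_gb_per_s(10), "GB/s incoming at line rate")
    print(f"{per_data_disk_gb_per_s(10, 2, 8, 2) * 1000:.0f} MB/s per data disk")
```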

Comb it with a brick

Link to comment
Share on other sites

Link to post
Share on other sites

8 hours ago, .:MARK:. said:

-snip-

I'm not really a ZFS expert, only real experience with it is rescuing broken systems then migrating them to something better. By better I mean has vendor support and doesn't have really basic mistakes that lots of people make when building a ZFS storage server, issues not found in enterprise storage arrays.

 

I do have enough storage experience though to evaluate stuff and spot bullshit when I see it :P.

 

Anyway this video likely will do a better job than I can, there is also someone I was talking to ages ago on this forum who really is a ZFS expert so I'll see if I can find him.

 

I think the part you are most interested in is that ZFS is Copy On Write and will always have ARC.

 

Edit:

This is probably helpful too https://community.oracle.com/docs/DOC-914874

Link to comment
Share on other sites

Link to post
Share on other sites

Holy crap that took ages to find, need to follow people so I can find them :P.

 

@biotoxin Should be someone to provide good information, looks like he hasn't been on in a while but we can only hope.

Link to comment
Share on other sites

Link to post
Share on other sites

On 2/20/2017 at 1:43 PM, .:MARK:. said:

@leadeater @dalekphalm I'm curious since I am helping someone build a ZFS array right now. I have a few questions.

First of all, I might add I don't care about ECC or non ECC, to me it makes no sense to argue over it.

But I am curious about the behaviour of the filesystem itself, how it can be bottlenecked by certain types of loads and how it will deal with those bottlenecks. If someone can provide or link to a good explanation of different kinds of loads on the filesystem and how it stresses resources, then a mythical rule of thumb won't be necessary.

 

So if one of you can briefly give an example of a load, and how the filesystem deals with it and then what resources it uses for that, then maybe we can get a better idea of how the lack of a certain resource will impact performance.

 

For example if I have no deduplication or compression enabled on a zpool with 2 vdevs of 8x2TB drives in Z2 and no ZIL or L2ARC. And I do a write over iSCSI with 10Gb connection, how does the filesystem handle the writes, and what resources will it use?

Buffered is more important than ECC IMO

If you can use BTRFS instead of ZFS I'd say go for it unless you absolutely need that data accuracy

 

My servers regularly get hit with either lots of tiny files or single large files, and I'll say the tiny files are the ones I worry about. Generally with a single large dataset your only concern is having enough RAM to hold temporary information until it can be written, and in most cases 64 to 128GB is sufficient. The reason, though, that huge recommendations like 1GB per TB or more come up is how much gets pulled on every file transfer: when you suddenly have lots of tiny files your RAM will absolutely hit those numbers. My newer server runs nearly 1TB of RAM for a huge array of 8TB drives and I've seen it hit 90% utilization a couple of times already. This is where queue depth becomes important in your chain, in making sure controllers and drives can handle the operations so you aren't frequently having to call back and waste precious clock cycles on grabbing tiny chunks that could otherwise be aggregated.

 

drive buffers generally won't matter much so much as bit density and the ability of the head, as long as you can read and write enough data in each pass it should never come up and I've only ever heard of it as a problem never actually experienced it. This is why there's the focus on faster platters though as suggested in the video by @leadeater

 

Compression can be done on the fly if you have a decent system, and I would recommend it. But with both dedupe and compression disabled in your scenario, assuming ideal environmental conditions and no SSD cache, it'll read out some information, start pulling checksums, then write the new data and read it back for verification of integrity before or simultaneously with the next write process, depending on configuration. In most cases, because of how tedious the process is, you'll need enough space to store most if not all of the download in RAM, as the whole read-write process kills drive speed even in an all-SSD array. Because the system is paranoid it'll read out surrounding tracks as well to make sure no mistakes happened and verify everything surrounding the writes top to bottom.

 

In short you can usually expect any given process to take 5 times longer than normal and this increases with the number of spindles in the array.

 

With a more specific scenario I might have better details, but I'm more experience than theory. I can get you information on thermals, decibels, utilization of system RSS, but in terms of backend mechanics the video already posted and Wendel's video on level1tech should get you started.

 

if you think 2 hours is a lot of time to dedicate just consider the point of these kinds of file systems is to keep all your information in pristine condition for decades across multiple drive failures with no bit rot, it's worth the time it takes to learn.

Spoiler

CPU: TR3960x enermax 360 AIO Mobo: Aorus Master RAM: 128gb ddr4 trident z royal PSU: Seasonic Prime 1300w GPU: 5700xt, 5500xt, rx590 Case: c700p black edition Display: Asus MG279Q ETC: Living the VM life many accessories as needed Storage: My personal cluster is now over 100tb!

Link to comment
Share on other sites

Link to post
Share on other sites

  • 1 year later...

Thanks for the info, this raises a question though.

 

If I am not using dedup, is there any benefit at all to having more than 4GB of ram?

Link to comment
Share on other sites

Link to post
Share on other sites

1 hour ago, taltamir said:

Thanks for the info, this raises a question though.

 

If I am not using dedup, is there any benefit at all to having more than 4GB of ram?

Unless a lot of people will be hitting the NAS at once, or you plan to run other packages (Plex/Nextcloud/Whatever) - then not really. RAM = Cache, data in cache can be read faster (IOPS and overall speed). SMB has some overhead, BSD has very little overhead, and the FreeNAS layer has some overhead. 

 

If you have 4gb laying around, try it and see what happens.
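
A toy sketch of the "RAM = cache" point. It uses a plain LRU cache as a stand-in; ZFS's ARC is a more sophisticated adaptive cache, so this only illustrates the general idea that reads get served from memory once the data you keep coming back to fits in cache. The workload and function name are made up.

```python
from collections import OrderedDict

def hit_ratio(accesses, cache_blocks):
    """Fraction of accesses served from a simple LRU cache of the given size."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)

# Hypothetical workload: the same 100 blocks read over and over, 10 passes.
workload = [i % 100 for i in range(1000)]

if __name__ == "__main__":
    for size in (50, 100, 200):
        print(f"cache of {size} blocks: {hit_ratio(workload, size):.0%} hits")
```

Past the point where the working set fits, extra cache does nothing for this workload, which matches the advice above: for a lightly used NAS, try the 4GB and see.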

Link to comment
Share on other sites

Link to post
Share on other sites

3 hours ago, Mikensan said:

Unless a lot of people will be hitting the NAS at once, or you plan to run other packages (Plex/Nextcloud/Whatever) - then not really. RAM = Cache, data in cache can be read faster (IOPS and overall speed). SMB has some overhead, BSD has very little overhead, and the FreeNAS layer has some overhead. 

 

If you have 4gb laying around, try it and see what happens.

Thank you. I am running plex, does plex require extra ram?

Link to comment
Share on other sites

Link to post
Share on other sites

5 hours ago, taltamir said:

Thank you. I am running plex, does plex require extra ram?

Plex requires a bit extra, but it largely depends on what kind of files you're playing back. Larger files will require more RAM.

 

But even in that case, 8GB is probably more than you'd need.

 

I would probably check RAM usage while playing back the highest res/largest file you have, and see if you come close to maxing out the 4GB.


Link to comment
Share on other sites

Link to post
Share on other sites

Doesn't pfSense still advertise that 8GB of ECC RAM is recommended if you do a ZFS install? That router would practically become a beast amongst <>< in the ocean :P

Link to comment
Share on other sites

Link to post
Share on other sites
