ZFS explanation needed

Xtreme Gamer · July 10, 2013

I know nothing about ZFS. A few forum members and a few YouTube videos have told/shown me what it is and does, but I still don't know enough.

Now I'm asking anyone who can answer this: What is it? How does it work? How reliable is it?

looney · July 10, 2013

I'm not really up to date when it comes to ZFS, but I'll mention some members who I know run a ZFS system.

They will get a notification and should be able to help you. (just in case they don't notice this topic)

@VictorB (his ZFSguru system)

@Eric1024 (his Linux Mint ZFS system)

@alpenwasser (his Arch Linux ZFS system)

@algoat (his FreeNAS 8.3 ZFS system)

Glenwing · July 10, 2013

ZFS is an advanced file system developed with servers in mind, built on the mentality that you should not trust any hardware to compensate for errors. ZFS focuses heavily on error correction and data integrity, using checksums to detect (and, where there is redundancy, repair) problems like silent corruption that hardware RAID, the OS, and the HDD itself cannot detect. Snapshots are also a very neat feature.

EDIT: http://en.wikipedia.org/wiki/ZFS#Features

alpenwasser · July 10, 2013

What is it?

On the most superficial level it's a combination of a volume manager (software RAID) and an actual filesystem. The most basic building blocks are your physical devices (your HDDs, SSDs, USB drives, whatever have you), out of which you build virtual devices (called vdevs). Those vdevs are then put into a storage pool (zpool), which is the thing you actually mount and access with your file browser (a bit like a partition in a normal setup, but that's just a very inaccurate analogy).

EDIT:

Forgot to add: You can create subvolumes inside your storage pool (ZFS calls them datasets), sort of like different partitions, and mount those to different locations. For each of those subvolumes you can then set different policies and quotas. So you don't necessarily need to mount the zpool itself directly; you can stick to mounting only subvolumes. In that case the zpool would be your HDD and the subvolumes the different partitions on it. This is actually what's usually done, but you can mount the entire pool in one chunk if you want to.

/EDIT
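To make the layering above a bit more concrete, here is a toy model in Python of disks, vdevs, a pool and subvolumes. It is only an illustration of the hierarchy, not how ZFS is implemented; the class names, the disk names (`disk0` and so on) and the `/srv/media` mountpoint are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Vdev:
    """A virtual device built from one or more physical disks."""
    layout: str        # e.g. "raidz1" or "mirror"
    disks: list[str]   # hypothetical device names


@dataclass
class Pool:
    """A storage pool (zpool): a set of vdevs plus the subvolumes on it."""
    name: str
    vdevs: list[Vdev]
    datasets: dict[str, str] = field(default_factory=dict)  # name -> mountpoint

    def create_dataset(self, name: str, mountpoint: str) -> str:
        """Register a subvolume under the pool and remember where it mounts."""
        full_name = f"{self.name}/{name}"
        self.datasets[full_name] = mountpoint
        return full_name


# Two raidz1 vdevs in one pool, mirroring the shape of the zeus-tank example.
pool = Pool("zeus-tank", [
    Vdev("raidz1", ["disk0", "disk1", "disk2", "disk3"]),
    Vdev("raidz1", ["disk4", "disk5", "disk6"]),
])
media = pool.create_dataset("media", "/srv/media")
```

The point of the model is just that policies and mountpoints hang off the subvolumes, while redundancy is decided one level down, per vdev.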

As an example, this is the setup I'm using for ZEUS (my main server):

As you can see, my storage pool is called zeus-tank. It consists of two vdevs (raidz1-0 and raidz1-1). One of those vdevs consists of four WD 2 TB RE4 drives and the other one of three 3 TB WD Reds.

Each of these vdevs is run as a raidz1 (more or less RAID5). You could also do RAIDZ2 (more or less RAID6) or triple-parity RAID (RAIDZ3). You could also do mirrors (by far the fastest, but also the most expensive solution).
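As a rough back-of-the-envelope check on what those layouts cost in capacity, the sketch below estimates usable space per vdev. It assumes the simple rule "raidzN gives up N disks' worth of space, a mirror keeps one disk's worth" and ignores metadata and padding overhead, so real numbers will come out somewhat lower. The function name is made up for the example.

```python
def usable_tb(layout: str, disk_count: int, disk_tb: float) -> float:
    """Approximate usable capacity of one vdev in TB (overhead ignored)."""
    parity = {"raidz1": 1, "raidz2": 2, "raidz3": 3}
    if layout == "mirror":
        # Every disk in a mirror holds the same data.
        return disk_tb
    if layout not in parity:
        raise ValueError(f"unknown layout: {layout}")
    return (disk_count - parity[layout]) * disk_tb


# The two raidz1 vdevs described above: 4 x 2 TB and 3 x 3 TB.
pool_tb = usable_tb("raidz1", 4, 2.0) + usable_tb("raidz1", 3, 3.0)
print(pool_tb)
```

For the setup described in the post this gives 6 TB from each vdev.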

An important thing to note is that the vdevs in a storage pool are striped together as something that is basically RAID0. So, if you lose one vdev you usually lose your entire pool (at least afaik, I haven't been using ZFS very long, so there may be stuff I've missed). Therefore, it is important to make the individual vdevs resilient against failures by giving them some redundancy.
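The "lose one vdev, lose the pool" rule can be written down in a few lines. This is a simplified model, assuming each raidz level tolerates as many failed disks as it has parity and a mirror survives as long as one disk is left; the names are invented for the example.

```python
# Failed disks each layout can tolerate in this simplified model.
TOLERATED = {"stripe": 0, "raidz1": 1, "raidz2": 2, "raidz3": 3}


def vdev_survives(layout: str, disk_count: int, failed: int) -> bool:
    """True if a single vdev still has enough healthy disks."""
    if layout == "mirror":
        return failed < disk_count   # at least one copy must remain
    return failed <= TOLERATED[layout]


def pool_survives(vdevs: list[tuple[str, int, int]]) -> bool:
    """vdevs is a list of (layout, disk_count, failed_disks).

    Data is striped across all vdevs, so every one of them has to survive.
    """
    return all(vdev_survives(*v) for v in vdevs)


# One failed disk in each raidz1 vdev: still fine.
print(pool_survives([("raidz1", 4, 1), ("raidz1", 3, 1)]))
# Two failed disks in the same raidz1 vdev: the whole pool is gone.
print(pool_survives([("raidz1", 4, 2), ("raidz1", 3, 0)]))
```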

Something which I don't have in my setup is a cache device (L2ARC, a second-level read cache). Usually you would use an SSD for that, and ZFS would use it to increase performance. I haven't played around with that though, so other people probably know more about that topic.

There are lots of very advanced features you can use in ZFS, some of the more notable ones being compression, snapshots and deduplication. Depending on how well your data can be compressed and how much CPU power you have, compression can actually yield a rather measurable performance increase. I'm not using it for my server yet though, because most of the data I have is not really compressible. I might experiment with it a bit when I get the time.
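How much compression helps depends almost entirely on the data. A quick way to get a feel for that is to compare a repetitive buffer with random bytes. The sketch uses `zlib` from Python's standard library purely as a stand-in compressor; it is not what ZFS runs internally, and the sample data is made up.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


# Highly repetitive data, a stand-in for logs or text.
text_like = b"the quick brown fox jumps over the lazy dog\n" * 1000
# Random bytes, a stand-in for already-compressed media files.
random_like = os.urandom(len(text_like))

print(f"text-like:   {compression_ratio(text_like):.1f}x")
print(f"random-like: {compression_ratio(random_like):.2f}x")
```

The random buffer ends up at roughly 1x (no gain), which is the situation described above for data that is already compressed.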

Deduplication basically makes sure that identical data is only stored once on your drives, but it uses a lot of RAM, so I don't use it.
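The RAM cost comes from having to remember a checksum for every stored block so that duplicates can be spotted. A minimal sketch of the idea, assuming fixed 4096-byte blocks and SHA-256 as the block checksum; the `DedupStore` class and file names are invented for illustration and are not ZFS code:

```python
import hashlib


class DedupStore:
    """Toy block store that keeps each distinct block only once."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = {}   # checksum -> block data (the table that eats RAM)
        self.files = {}    # file name -> list of block checksums

    def write(self, name: str, data: bytes) -> None:
        refs = []
        for off in range(0, len(data), self.block_size):
            block = data[off:off + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if this checksum has not been seen before.
            self.blocks.setdefault(digest, block)
            refs.append(digest)
        self.files[name] = refs

    def read(self, name: str) -> bytes:
        return b"".join(self.blocks[d] for d in self.files[name])


store = DedupStore()
payload = b"A" * 4096 + b"B" * 4096
store.write("one.bin", payload)
store.write("two.bin", payload)   # identical content, no new blocks stored
print(len(store.blocks))
```

Two files with four blocks in total end up using two stored blocks, but the checksum table grows with every distinct block in the pool.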

As for snapshots, they're pretty self-explanatory and are implemented quite well from what I've seen.

Another nice thing about ZFS is that you don't need to buy ridiculously expensive RAID cards; in fact a simple JBOD setup will work best. ZFS doesn't like smart RAID controllers interfering in its thing (it's a bit like two very smart people wanting to do the same thing but with different methods... not good :lol: ). And any cheap SATA controller can give you JBOD.

How does it work?

Internally? I don't have the slightest clue. :D

As an end user, there's really not much you need to do once you have it set up (there's not even that much to do to set it up). The trickiest part I've found is to get my head around what ZFS actually is and is not, same as you. :)

How reliable is it?

In and of itself: very.

There are however a few things to keep in mind: ZFS protects against corruption of on-disk data, as Glenwing mentioned, with checksums and some rather advanced algorithm magic. What it does not protect against is corruption of data somewhere else in your system, most notably in your RAM. Personally I'm currently not using ECC RAM, but if you build a ZFS server from scratch I would definitely recommend going for that.

If you google around for a bit about ECC RAM and ZFS you'll find lots of conflicting information and hearsay. Some people claim that you can lose your entire storage pool if your memory gets a bit-flip during a scrubbing operation of ZFS (ZFS checks the filesystem's integrity and corrects any errors it finds, so if the RAM suddenly gets corrupted ZFS could compare against the corrupted data in your RAM and "correct" the data on disk); other people estimate you only lose the file directly affected. Personally, I have no idea which is true, but if I hadn't already had many of the components for my server build when I did that I would definitely have gone for ECC memory.
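For what it's worth, the basic scrub idea (verify each copy against a stored checksum, repair bad copies from a good one) can be sketched in a few lines. This is a toy version assuming a two-way mirror and SHA-256 checksums, with made-up function names; it is not ZFS's actual scrub code. Note that it simply trusts the checksum and the data it holds in memory, which is exactly the assumption the ECC discussion is about.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def scrub(copies: list[bytes], expected: str) -> list[bytes]:
    """Return repaired copies, using any copy whose checksum still matches."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        # No redundant copy is intact, so nothing can be repaired.
        raise OSError("unrecoverable: no copy matches the stored checksum")
    return [good for _ in copies]


block = b"important data"
expected = checksum(block)

# Second mirror copy has silently rotted on disk.
mirror = [block, b"important dat\x00"]
repaired = scrub(mirror, expected)
print(repaired[1] == block)
```

Without a second copy (or parity) the scrub can still detect the bad block, but it has nothing to repair it from.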

Also, as mentioned, you need to make sure your vdevs do not fail, otherwise all the data you have in your pool will be lost AFAIK. As with any FS you will find stories of people having lost their data for some inexplicable reason, but from what I know ZFS is probably the file system most stringently designed with data integrity in mind and is deployed on large-scale servers, so I think it's pretty reliable.

What's a bit of a shame is that the licensing situation is... not that good. Originally ZFS was released by Sun with OpenSolaris as open source software. But when Oracle bought Sun they made future releases of ZFS closed-source, saying they would release those versions delayed as open source. So the ZFS you get with Oracle's Solaris UNIX is quite a bit more advanced than the open source version currently available. However, there are people working on adding more features to the open source tree (most notably, encryption), so all hope is not lost.

This is what I can think of off the top of my head, feel free to ask more questions.

Also, I'll add anything else that comes to mind.

looney · July 10, 2013

@alpenwasser

you don't mind typing do you :p

alpenwasser · July 10, 2013

@alpenwasser

you don't mind typing do you :P

Well I have a keyboard with Cherry blue switches, so no, I definitely do not. :D

VictorB · July 10, 2013

To get an idea what kind of hardware you need for ZFS you can watch my tutorial.

http://www.youtube.com/watch?v=FuFLwmyyfKs

You can use the same tips with new hardware!

IdeaStormer · July 16, 2013

Great write up @alpenwasser. Concise and down to earth, even though @looney thought it might have been long. Beats having a newbie read the whole ZFS doc :p

Some add ons are:

* Performance is also dependent on how many systems are accessing the zpool(s), whether it be NFS, CIFS or direct. This is where more bandwidth (a 1 Gig card or more, or a 10 Gig card) and a beefier CPU help.

* Many people on here (LTT) will probably use it in a small environment and not see many of the gut-wrenching bad things caused by heavy users adding/removing a ton of files both small and large, which affects things like snapshots, dedupe and other features.

Theo · July 16, 2013


I have no idea what you just said, but I'm giving you my life for writing all that. Did read it, didn't understand it lol

alpenwasser · July 16, 2013

IdeaStormer, on 17 Jul 2013 - 12:25 AM, said:

Great write up @alpenwasser. Concise and down to earth, even though @looney thought it might have been long. Beats having a newbie read the whole ZFS doc :P

Some add ons are:

* Performance is also dependent on how many systems are accessing the zpool(s), whether it be NFS, CIFS or direct. This is where more bandwidth (a 1 Gig card or more, or a 10 Gig card) and a beefier CPU help.

* Many people on here (LTT) will probably use it in a small environment and not see many of the gut-wrenching bad things caused by heavy users adding/removing a ton of files both small and large, which affects things like snapshots, dedupe and other features.

Thanks, and yes, agreed on both points.

Theo, on 17 Jul 2013 - 12:29 AM, said:

I have no idea what you just said, but I'm giving you my life for writing all that. Did read it, didn't understand it lol

Haha, thank you. It becomes a lot clearer once you've played around with it a bit.

There aren't really any completely new concepts in ZFS' components; the revolutionary thing about it is how it fuses existing concepts and mechanisms into something new in a way that hasn't really been done before.

If you want to understand more or less what ZFS is, all you really need to do is familiarize yourself with the most common concepts in storage management and then see how they fit into ZFS.

If you have any specific questions feel free to ask. I'm not the be-all end-all expert on ZFS, but I'd be happy to give it a shot. :)
