
Linus Archive Server - Why Gluster?

Hey,

I just watched the latest WAN show and heard that Linustechtips is using GlusterFS for its archive server. I just wanted to find out why this was chosen, and maybe suggest that Ceph or just a ZFS box might be better. I did a few tutorials on all 3 of these: Gluster, Ceph, and ZFS, and they all have their pros/cons (unfortunately I never did a tutorial comparing them against each other). I suspect the reason for the archive server using GlusterFS is that Linus may want replication across multiple servers that multiple users access simultaneously? If this is the case then I would recommend a Ceph cluster with multiple gateways. There is one major advantage to GlusterFS, and that is that it is incredibly simple to set up in comparison, but Ceph will probably give you a better package in terms of features and flexibility. With a Ceph cluster, you can not only have a CephFS mounted filesystem (like NFS) but you can also set up remote block storage for your VMs and object storage (like Amazon S3).
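For anyone unfamiliar, here is roughly what those three access methods look like from the command line. This is only a rough sketch: the pool, image, host and bucket names are placeholders, and it assumes a cluster that is already deployed and healthy.

```
# Block storage (RBD), e.g. for VM disks:
ceph osd pool create vmpool 128            # pool with 128 placement groups
rbd create vmpool/vm-disk1 --size 102400   # 100 GB image
rbd map vmpool/vm-disk1                    # shows up as a /dev/rbd* block device

# CephFS, mounted like any other filesystem (needs a metadata server):
mount -t ceph ceph-mon1:6789:/ /mnt/cephfs -o name=admin,secret=<admin-key>

# Object storage through the RADOS Gateway, using any S3-compatible client:
s3cmd --host=rgw.example.com mb s3://video-archive
```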


Well,

ZFS isn't a distributed file system.

CephFS for file storage is in beta.

Gluster just works well and has fairly high performance. You could also use something like LizardFS or Storage Spaces Direct.


16 minutes ago, Electronics Wizardy said:

CephFS for file storage is in beta.

You can still use RBD volumes and have them HA across multiple servers using Pacemaker or pNFS. I prefer Ceph over Gluster mainly due to the fine control you can have.


9 minutes ago, Electronics Wizardy said:

Well,

ZFS isn't a distributed file system.

CephFS for file storage is in beta.

Gluster just works well and has fairly high performance. You could also use something like LizardFS or Storage Spaces Direct.

Like I said, I was checking why GlusterFS was chosen and wasn't sure if being distributed was a requirement, but guessed so, which is why I suggested CephFS as well.

Can you link me to where Ceph is in beta? I know the CephFS part was under development last time I was looking into it, but I thought that the block and object storage were not in beta, and the object storage would be perfect for archived videos.


25 minutes ago, programster said:

If this is the case then I would recommend a Ceph cluster with multiple gateways. There is one major advantage to GlusterFS, and that is that it is incredibly simple to set up in comparison, but Ceph will probably give you a better package in terms of features and flexibility. With a Ceph cluster, you can not only have a CephFS mounted filesystem (like NFS) but you can also set up remote block storage for your VMs and object storage (like Amazon S3).

Yeah, I sort of wish they went with Ceph, but Gluster meets the most important criterion for them: simple.

 

Gluster also doesn't mind having disks be part of arrays; it uses the server itself as a node in the system rather than going down to the disk level, which is the preferred method in Ceph. This way you could have each server running ZFS and then layer Gluster over the top, so you get extra features like deduplication, tiering and snapshots.
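As a rough sketch only (pool, dataset and device names made up), the per-server layering would look something like this, with ZFS providing the brick and its extra features underneath Gluster:

```
# On each server: raidz2 pool, with a dataset used as the Gluster brick
zpool create tank raidz2 sda sdb sdc sdd sde sdf
zfs create tank/brick1
zfs set dedup=on tank/brick1       # optional, and RAM hungry
zfs snapshot tank/brick1@daily     # snapshots handled at the ZFS layer, not by Gluster
```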


3 minutes ago, programster said:

Can you link me to where Ceph is in beta? I know the CephFS part was under development last time I was looking into it, but I thought that the block and object storage were not in beta, and the object storage would be perfect for archived videos.

He means purely the CephFS component, not Ceph the entire package. Currently only one metadata server is supported in a production deployment, which is a single point of failure and a performance bottleneck, things that directly go against the principles of Ceph.


And to be honest, for LMG, if they are planning on going with a multiple-server distributed storage platform, Storage Spaces Direct (S2D) would be a better fit.

  • It's Windows, a plus for them
  • It can scale out and scale up
  • It's distributed access and capacity
  • All SMB 3.x features just work
  • Requires less CPU than Ceph for high throughput
  • Designed from the ground up to use NVMe if you wish
  • Has file-level resiliency through ReFS
  • Tight integration with Hyper-V and Scale-Out File Server

7 hours ago, programster said:

Like I said, I was checking why GlusterFS was chosen and wasn't sure if being distributed was a requirement, but guessed so, which is why I suggested CephFS as well.

Can you link me to where Ceph is in beta? I know the CephFS part was under development last time I was looking into it, but I thought that the block and object storage were not in beta, and the object storage would be perfect for archived videos.

I personally haven't played with Ceph as much. Gluster has the advantage of being very easy to set up and not having any metadata servers. I can set up Gluster in 10 min.
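To give an idea of why it's that quick, the whole thing is basically the following (hostnames and paths are made up, and it assumes glusterd is running and the bricks are already formatted and mounted):

```
gluster peer probe server2                  # run from server1, joins the trusted pool
gluster volume create archive replica 2 server1:/bricks/brick1 server2:/bricks/brick1
gluster volume start archive
mount -t glusterfs server1:/archive /mnt/archive   # on any client with the FUSE client
```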


6 hours ago, leadeater said:

And to be honest, for LMG, if they are planning on going with a multiple-server distributed storage platform, Storage Spaces Direct (S2D) would be a better fit.

  • It's Windows, a plus for them
  • It can scale out and scale up
  • It's distributed access and capacity
  • All SMB 3.x features just work
  • Requires less CPU than Ceph for high throughput
  • Designed from the ground up to use NVMe if you wish
  • Has file-level resiliency through ReFS
  • Tight integration with Hyper-V and Scale-Out File Server

And I will strongly disagree with lots of the practices that Linus does in his server room.

Also, Windows Storage Spaces Direct needs Datacenter edition, and for a few servers that can easily add up.

Also, I'm more of a Linux guy and would personally use Gluster and oVirt over Windows Server any day. But that's personal preference.


7 hours ago, leadeater said:

He means purely the CephFS component, not Ceph the entire package. Currently only one metadata server is supported in a production deployment, which is a single point of failure and a performance bottleneck, things that directly go against the principles of Ceph.

Thanks for clearing that up for me.

 

7 hours ago, leadeater said:

Gluster also doesn't mind having disks be part of arrays; it uses the server itself as a node in the system rather than going down to the disk level, which is the preferred method in Ceph. This way you could have each server running ZFS and then layer Gluster over the top, so you get extra features like deduplication, tiering and snapshots.

You can run a Ceph OSD over the top of an array of disks as well, but it's not the preferred method (and for good reasons), as you pointed out. Thinking about it further, using ZFS with Gluster would be a good combo. You could switch off replication and just use distributed bricks (not striping), and use raidz2 for redundancy so you don't eat disks.
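Continuing the ZFS-brick idea, the volume creation for that layout would be something like this (server and dataset names made up). Leaving out the replica/stripe options gives a plain distributed volume, so each file lives whole on exactly one brick and the raidz2 underneath handles disk failures:

```
gluster volume create archive server1:/tank/brick1 server2:/tank/brick1
gluster volume start archive
```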


4 minutes ago, Electronics Wizardy said:

And I will strongly disagree with lots of the practices that Linus does in his server room.

Also, Windows Storage Spaces Direct needs Datacenter edition, and for a few servers that can easily add up.

Also, I'm more of a Linux guy and would personally use Gluster and oVirt over Windows Server any day. But that's personal preference.

I think we're all Linux guys here, and so is Wendel (it was great to see him on this WAN show). I just know Linus is a Windows guy judging from his past videos. I laughed when Linus mentioned during one of his videos that he was considering calling Wendel for advice, but that he would probably give some Linux solution.


6 minutes ago, programster said:

You could switch off replication

Bad idea. If one server goes, you lose it all.

My big complaint with Gluster was the requirement that all the nodes in the volume be the same size, so I'm personally using LizardFS. It has the advantage of allowing mixed drive sizes and setting the redundancy level on a folder level.
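The per-folder redundancy is just a goal you set on a directory through the client mount, roughly like this (paths are made up, and it assumes LizardFS is mounted at /mnt/lizardfs):

```
lizardfs setgoal -r 3 /mnt/lizardfs/projects   # keep 3 copies of everything under projects/
lizardfs setgoal -r 1 /mnt/lizardfs/scratch    # scratch data gets a single copy
lizardfs getgoal /mnt/lizardfs/projects        # check what is currently set
```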


4 minutes ago, Electronics Wizardy said:

Bad idea. If one server goes, you lose it all.

My big complaint with Gluster was the requirement that all the nodes in the volume be the same size, so I'm personally using LizardFS. It has the advantage of allowing mixed drive sizes and setting the redundancy level on a folder level.

If you lost a server, you would only lose the files that were on that server, and when you say lose the server, do you mean it got stolen or blew up? If a disk failed, that is handled by the ZFS redundancy layer; using raidz2 means you can actually have two disks fail simultaneously and still be good to go. If Gluster crapped out, you could just copy the files off, because of using distribution, not striping.


2 minutes ago, programster said:

If gluster crapped out, you could just copy the files off.

That would only get you half of the files; our goal is not to lose any data.

I'd personally never do a striped distributed file system.


1 minute ago, Electronics Wizardy said:

That would only get you half of the files; our goal is not to lose any data.

 

 

Explain to me how you would lose half your data? Has someone walked off with the server or have 3 disks failed at once?


Perhaps I am missing something and one of the servers is in a remote location?


2 minutes ago, programster said:

Explain to me how you would lose half your data? Has someone walked off with the server or have 3 disks failed at once?

Well, oops. Sorry about that.

You won't lose half your data, but to me the big feature of Gluster is HA, and that would take some time to recover.


Yeah, I guess the LMG group doesn't need to worry about the number-of-disks aspect so much as the potential loss of time if a server disconnected, so they could eat the hardware cost and go with the HA approach. For home users such as myself, I am tempted to start using a distributed Gluster cluster, as my raidz2 server is nearly at capacity and I can't be bothered to put in the effort to maintain a Ceph cluster just yet.


2 minutes ago, programster said:

Yeah, I guess the LMG group don't need to worry about the number of disks aspect so much as loss of time so they could eat the hardware cost and go with the HA approach. For home users such as myself, I am tempted to start using a distributed gluster cluster as my raidz2 server is nearly at capacity and I can't be bothered to put in the effort to maintain a Ceph cluster just yet.

I'd also look into LizardFS. It lets you easily plug in drives, add and remove them when needed, and set file redundancy on a folder level.


2 minutes ago, Electronics Wizardy said:

I'd also look into LizardFS. It lets you easily plug in drives, add and remove them when needed, and set file redundancy on a folder level.

I will be sure to start looking into it and putting some tutorials up soon. Thanks for the tip. :)


7 hours ago, programster said:

I think we're all Linux guys here

I'm actually a Windows guy :P. But for storage I'm an enterprise storage array guy, and for virtualization I'm a VMware guy.


7 hours ago, Electronics Wizardy said:

Also, Windows Storage Spaces Direct needs Datacenter edition

You can use Standard edition, unless things have changed since the technical preview. There are new features that require Datacenter, like Storage Replica, but that feature alone isn't worth the price jump when you can use 3rd-party tools.

 

Edit:

Well holy crap they did change it to Datacenter only, that bloody sucks big time.


On 1/28/2017 at 7:41 PM, programster said:

Hey,

I just watched the latest WAN show and heard that Linustechtips is using GlusterFS for its archive server. I just wanted to find out why this was chosen, and maybe suggest that Ceph or just a ZFS box might be better. I did a few tutorials on all 3 of these: Gluster, Ceph, and ZFS, and they all have their pros/cons (unfortunately I never did a tutorial comparing them against each other). I suspect the reason for the archive server using GlusterFS is that Linus may want replication across multiple servers that multiple users access simultaneously? If this is the case then I would recommend a Ceph cluster with multiple gateways. There is one major advantage to GlusterFS, and that is that it is incredibly simple to set up in comparison, but Ceph will probably give you a better package in terms of features and flexibility. With a Ceph cluster, you can not only have a CephFS mounted filesystem (like NFS) but you can also set up remote block storage for your VMs and object storage (like Amazon S3).

If I recall, he said something like "we're getting someone from the vendor to install it because it's above our heads".


@Blake I think they said they were getting help; I'm not sure he said it was above their heads. If Gluster is too difficult, then Ceph definitely will be.

 

After learning from here that GlusterFS does support different-sized bricks (since 3.6), I decided to test it out with a fresh tutorial and am pretty pleased with it. Now I just need to take it a bit further and document expanding/shrinking a distributed volume. I'm not sure GlusterFS has the logic to gracefully move the data to the rest of the cluster when removing a brick, which is a shame.

 

Quote

Data residing on the brick that you are removing will no longer be accessible at the Gluster mount point. Note however that only the configuration information is removed - you can continue to access the data directly from the brick, as necessary.

 

Looks like one would have to remove the brick and then manually re-add the files to the cluster.
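So the manual approach would presumably be something along these lines (volume, server and path names are made up):

```
# Drop the brick from the volume; the files stay behind on the old brick
gluster volume remove-brick archive server3:/tank/brick1 force
# Then push them back in through the Gluster mount point,
# skipping Gluster's internal .glusterfs metadata directory
rsync -a --exclude='.glusterfs' /tank/brick1/ /mnt/archive/
```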

