My dream FINALLY came True

James
3 hours ago, maskmcgee said:

Again, 10 years of storing every video file ever recorded and a grand total of 0 uses of that archive. It's obvious why he doesn't want to spend 10k a year storing this, because it'd never be used.

Which is exactly why a dense, shelf-stable tape library makes the most sense as the deep archive. It's a "just in case we might need it" archive they keep to occasionally dip into, and it could be expanded at relatively little cost.

 

But even that's a hard value proposition to argue when a hard drive manufacturer can hook you up with tens of thousands of dollars' worth of hard drives for free (or at a dramatically reduced cost).

I sold my soul for ProSupport.


7 hours ago, Needfuldoer said:

Which is exactly why a dense, shelf-stable tape library makes the most sense as the deep archive. It's a "just in case we might need it" archive they keep to occasionally dip into, and it could be expanded at relatively little cost.

 

But even that's a hard value proposition to argue when a hard drive manufacturer can hook you up with tens of thousands of dollars' worth of hard drives for free (or at a dramatically reduced cost).

It's also worth noting that they've been specifically told, when it comes to the Petabyte Project, that they need to use all of the hard drives they're given for projects or return them. I think the first Petabyte Project is what led to the whole building-servers-for-other-YouTubers thing.


10 hours ago, Ultraforce said:

It's also worth noting that they've been specifically told, when it comes to the Petabyte Project, that they need to use all of the hard drives they're given for projects or return them. I think the first Petabyte Project is what led to the whole building-servers-for-other-YouTubers thing.

Building and using those hard drives in a new server doesn't preclude LTT from also building and maintaining a properly managed tape library for the archival footage, and/or using the new server to balance the network load coming from their editors.

 

This is one of the things I've come across with the push for more and more virtualisation: yes, it's great that you don't have to own, maintain, and power as many bare-metal boxes anymore, but if you run two AMD EPYC servers, each with 128 PCIe 4.0 lanes, you get 256 PCIe 4.0 lanes combined between the two.

Virtualisation isn't going to give you that.

 

This is one of the things you "sacrifice" when running more virtual machines vs. bare-metal systems, and the same goes for storage bandwidth bottlenecks.

 

If they have two Petabyte servers like that (or really, 2.25 PB servers), at least two of them should be clustered together in an HA pair, so that if one server goes down for whatever reason it doesn't completely stop their editors from working.

 

That's not how their plan (as described) has been laid out.

With that many hard drives given to them (and with the up-to-4% AFR on Seagate drives, a bunch of those really should be set up as hot spares), there are far better ways to deploy the hardware they've got, ways that better protect their business.

(i.e. they're also not the plucky home-lab users who started their YouTube channel over 10 years ago, doing a lot of their early stuff from their house, anymore.)
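To put a rough number on that hot-spare point, here's a quick back-of-the-envelope sketch; the fleet size is purely an assumption for illustration, not their actual drive count:

# Back-of-the-envelope: expected annual drive failures at a given AFR.
# The fleet size below is an assumed, illustrative figure.
def expected_annual_failures(drive_count: int, afr: float) -> float:
    """Expected number of failed drives per year across a fleet."""
    return drive_count * afr

FLEET = 180   # assumption: ~180 data drives across the big arrays
AFR = 0.04    # the up-to-4% annual failure rate mentioned above

failures = expected_annual_failures(FLEET, AFR)
print(f"~{failures:.1f} expected drive failures per year")             # ~7.2
print(f"i.e. roughly one dead drive every {365 / failures:.0f} days")  # ~51

Even at half that AFR you'd still be replacing a drive every few months, so leaving a few of the spare Exos drives slotted in as hot spares is cheap insurance.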

All told, I probably have somewhere between 1/20th and 1/10th of their total raw storage capacity, and yet, as a home user, I have more LTO-8 tapes than they do. (I'm up to a total of 40 LTO-8 12 TB tapes now.)

I don't understand what Linus' excuse is.

IB >>> ETH


[image: maxresdefault.jpg]

CPU: Ryzen 5800X3D | Motherboard: Gigabyte B550 Elite V2 | RAM: G.Skill Aegis 2x16GB 3200 @ 3600MHz | PSU: EVGA SuperNova 750 G3 | Monitor: LG 27GL850-B, Samsung C27HG70 | 
GPU: Red Devil RX 7900XT | Sound: ODAC + Fiio E09K | Case: Fractal Design R6 TG Blackout | Storage: MP510 960GB and 860 Evo 500GB | Cooling: CPU: Noctua NH-D15 with one fan

FS in Denmark/EU:

Asus Dual GTX 1060 3GB. Used maximum 4 months total. Looks like new. Card never opened. Give me a price. 


One of the first steps they should take is to spin up that 2U Gigabyte server for all the 'office' documents. Using it as a completely overkill home NAS is good for views, but with all their administrative work (plus how big Creator Warehouse is becoming), editors slamming the media array shouldn't be able to take the entire company down.

 

5 hours ago, alpha754293 said:

If they have two Petabyte servers like that (or really, 2.25 PB servers), at least two of them should be clustered together in an HA pair, so that if one server goes down for whatever reason it doesn't completely stop their editors from working.

This is getting out of my wheelhouse (I'm only climbing out of the valley of despair when it comes to at-scale enterprise storage), but if they need to keep that many spinning drives online, I think they should all be one large managed pool instead of spread around an increasingly complicated stack of individual servers.

 

New^4 Vault could look something like this:

• Two controller nodes, both connected to a stack of drive shelves (either by InfiniBand or SAS)
• VDEVs spread across multiple shelves (so an entire shelf could go offline without taking the cluster down; see the sketch after this list)
• Both controllers have a 100GbE connection back to the editor switch and a direct connection to each other
• The controllers are set up as a high-availability pair and addressed by a shared virtual IP and machine name
• Both the Whonnock server and the hypothetical 'office NAS' I mentioned earlier could replicate to the storage pool
• They'd have an even bigger number to point at (3.6 PB raw capacity on the cluster!)
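To make the "VDEVs spread across multiple shelves" idea concrete, here's a minimal sketch of the layout logic; the shelf count, drives per shelf, and device labels are all assumptions purely for illustration:

# Minimal sketch: build RAIDZ2 vdevs so that each vdev takes exactly one drive
# from every shelf. Losing an entire shelf then costs each vdev only a single
# member, which RAIDZ2 tolerates, so the pool stays online.
# Shelf count, drives per shelf, and device labels are illustrative assumptions.

SHELVES = 4
DRIVES_PER_SHELF = 12

# Hypothetical device labels: shelf index + slot index.
shelves = [
    [f"shelf{s}-slot{d}" for d in range(DRIVES_PER_SHELF)]
    for s in range(SHELVES)
]

# One vdev per slot position, each spanning all shelves.
vdevs = [[shelves[s][d] for s in range(SHELVES)] for d in range(DRIVES_PER_SHELF)]

for i, members in enumerate(vdevs):
    print(f"raidz2 vdev {i:2d}: {' '.join(members)}")

The same pattern scales to wider vdevs; the property that matters is just that no vdev ever has more members on one shelf than it has parity drives.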

 

6 hours ago, alpha754293 said:

That's not how their plan (as described) has been laid out.

Their short-term plan (lifeboat the data off each Vault pair one at a time so they can be repaved) is sound for getting back to a stable status quo, but I agree it doesn't address their long-term problem.

 

6 hours ago, alpha754293 said:

With that many hard drives given to them (and with the up-to-4% AFR on Seagate drives, a bunch of those really should be set up as hot spares), there are far better ways to deploy the hardware they've got, ways that better protect their business.

(i.e. they're also not the plucky home-lab users who started their YouTube channel over 10 years ago, doing a lot of their early stuff from their house, anymore.)

This hits the nail on the head! It feels like the server closet is still in the bathroom at the Langley house, even as the rest of the company has grown and matured. Mixing servers and network infrastructure in the same rack is asking for a cable management nightmare. The existing server room is going to run out of space eventually (you could argue it already has); they should have a plan to build a bigger server room elsewhere in the unit and turn the old server room into an IDF (switch closet). Since it's at the core of the "office" part of the facility, it's perfect for that! Move the servers and UPS into a new room with space for a tape library and the above-described hypothetical deep-storage SAN.

I sold my soul for ProSupport.


19 hours ago, Needfuldoer said:

I think they should all be one large managed pool instead of spread around an increasingly complicated stack of individual servers.

Hopefully I've read and understood what you wrote correctly, but on this point: having distributed storage isn't the problem, or the "death knell", that it once was.

Gluster makes that pretty easy.
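As a rough sketch of what that could look like (hostnames and brick paths are placeholders, and this assumes glusterd is already installed and running on every node):

# Minimal sketch: a replicated Gluster volume across the two storage servers,
# plus a small arbiter node whose brick only stores metadata, to avoid
# split-brain. Hostnames and brick paths are placeholders.
import subprocess

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Build the trusted pool (run from storage-a).
run(["gluster", "peer", "probe", "storage-b.example.lan"])
run(["gluster", "peer", "probe", "arbiter.example.lan"])

# Two full data replicas plus one metadata-only arbiter brick.
run(["gluster", "volume", "create", "archive",
     "replica", "3", "arbiter", "1",
     "storage-a.example.lan:/bricks/archive",
     "storage-b.example.lan:/bricks/archive",
     "arbiter.example.lan:/bricks/archive-arbiter"])
run(["gluster", "volume", "start", "archive"])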

 

But even if they didn't want to use Gluster, and instead threw Proxmox VE on BOTH of the storage servers and used it to join them into an HA cluster, my understanding is that Proxmox can handle all of that, which would make the setup, configuration, and management of said HA cluster a breeze (relatively speaking, compared to what it used to be, or to how difficult it can be with a more specialised/custom deployment strategy).
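A rough sketch of the steps involved, with placeholder node names and VM ID (I'm only printing the commands here rather than running them, since each one belongs on a specific node):

# Minimal sketch of the Proxmox VE CLI steps to join two storage nodes into a
# cluster and put a guest under HA management. Node names and the VM ID are
# placeholders; a two-node cluster still needs a third quorum vote (a QDevice
# or a small third node) before HA failover is actually safe.
steps = {
    "storage-a (first node)": [
        "pvecm create lmg-storage",               # create the cluster
    ],
    "storage-b (second node)": [
        "pvecm add storage-a.example.lan",        # join the existing cluster
    ],
    "any node, once quorate": [
        "ha-manager add vm:100 --state started",  # let HA restart/relocate the guest
        "ha-manager status",                      # check resources and quorum
    ],
}

for node, cmds in steps.items():
    print(f"# on {node}:")
    for cmd in cmds:
        print(f"    {cmd}")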

 

(But if you're going with a custom deployment strategy, chances are you've already learned from prior deployments what to do and what not to do, and what does and doesn't work for you.)

 

19 hours ago, Needfuldoer said:

Two controller nodes, both connected to a stack of drive shelves (either by InfiniBand or SAS)

 

19 hours ago, Needfuldoer said:

Both controllers have a 100GbE connection back to the editor switch and a direct connection to each other


Personally (and perhaps ironically, thanks to Linus), I'd vote InfiniBand.

 

I watched the video they posted a couple of years ago about how 100 Gbps networking isn't all that expensive anymore ($/Gbps/port), and that's what led me to deploy my own 100 Gbps InfiniBand setup in the basement of my house. (Still rockin' it!)

It doesn't have to be super expensive. I bought my Mellanox 36-port externally managed switch for under $3,000 CAD, and I just have a Linux system running the OpenSM subnet manager to "drive" said switch. And that's so easy to do.
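For anyone wondering how easy: on a Debian/Ubuntu box it's basically a package install and a service enable. A sketch, assuming the stock opensm package and service name (which can differ on other distros):

# Minimal sketch: install OpenSM and keep it running so there's always a
# subnet manager on the fabric (needed with an externally managed switch).
# Package and service names are for Debian/Ubuntu and may differ elsewhere.
import subprocess

for cmd in (
    ["apt-get", "install", "-y", "opensm"],      # install the subnet manager
    ["systemctl", "enable", "--now", "opensm"],  # start it now and at every boot
):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)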

 

It's too bad that Mellanox doesn't use their VPI technology on the switch side because they can totally do that.

 

Given that I have said IB switch and OpenSM running: I recently installed Windows 10 on my AMD Ryzen 9 5950X system, and it picked up the Mellanox ConnectX-4 card right away; IPoIB (which shows up as an Ethernet device even though the ports are set to the IB link type) worked right away as well.

In other words, the editors could have a 100 Gbps connection to the server, or, like you said, the server-to-server link could be dual 100 Gbps (yay PCIe 4.0 x16), and yes, you can bond IB ports like that.

(My LTO-8 tape drive at home is on a system that runs CAE Linux 2018 (which is based on Ubuntu 16.04 LTS) and that's on the IB network.)

NFSoRDMA is a godsend. (No pun intended.)
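The client side of that is just an extra mount option; a minimal sketch with placeholder server and export names:

# Minimal sketch: mount an NFS export over RDMA (NFSoRDMA). Server name and
# export path are placeholders; 20049 is the conventional NFS/RDMA port, and
# the client needs an RDMA-capable NIC with the rpcrdma module loaded.
import subprocess

subprocess.run(
    ["mount", "-t", "nfs",
     "-o", "rdma,port=20049",
     "whonnock.example.lan:/export/footage",
     "/mnt/footage"],
    check=True,
)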

 

And if they still have their all-NVMe flash storage server, they can run NVMe-oF as well.

 

An alternative configuration with the dual-port 100 Gbps IB cards is to run one port in IB mode and the other port in ETH mode, and then have both a 100 Gbps IB switch AND a 100 GbE switch. That way, if they don't want to run IB for everything (Ethernet is a little easier to administer), they can run everything through RoCE instead.
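A quick way to sanity-check which mode each port is actually running in on Linux (a sketch that just reads the standard sysfs entries; device names will vary):

# Minimal sketch: report the link layer (InfiniBand vs Ethernet) and state of
# every port on every RDMA device, using the standard sysfs layout at
# /sys/class/infiniband/<device>/ports/<port>/{link_layer,state}.
from pathlib import Path

for dev in sorted(Path("/sys/class/infiniband").glob("*")):
    for port in sorted((dev / "ports").glob("*")):
        link_layer = (port / "link_layer").read_text().strip()
        state = (port / "state").read_text().strip()
        print(f"{dev.name} port {port.name}: {link_layer} ({state})")

On VPI cards the per-port link type itself is normally switched with mlxconfig (the LINK_TYPE_P1/LINK_TYPE_P2 parameters), and the check above should reflect the change after a driver reload or reboot.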

 

The possibilities are endless.

 

It probably wouldn't be a bad idea for some members of the LTT staff to jump onto the StorageReview.com forums and start talking to IT storage professionals to figure out how best to set up the storage servers, drives, etc. that they DO have, in order to optimise between R&P, SWaP, and TCO.

IB >>> ETH


  • 2 months later...

I really liked this video, but I have some questions I hope someone will take the time to answer 😊

  1. What CPU cooler did you end up ordering for this build/motherboard? (Noctua NH-U9 TR4-SP3? I don't really care about the noise, just the performance.)
  2. Did you "change the drive settings" before adding the Seagate Exos drives to the array (SeaChest_PowerControl), as recommended here: LINK? (See the sketch after this list.)
  3. What ECC RAM did you use for the build? (Some brands work but don't POST correctly on boot.)
  4. I also found that the ASRock ROMED8-2T is the only option if you want PCIe 4.0; did you find any alternatives?
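For question 2, the usual recommendation for Exos drives is to disable the EPC power-saving states with Seagate's SeaChest tools. A rough sketch of that; the device path is a placeholder, and the exact flags should be double-checked against your SeaChest version's help output:

# Rough sketch: disable the EPC (power-saving) feature set on a Seagate Exos
# drive with SeaChest_PowerControl, which is what the linked recommendation
# covers. The device path is a placeholder; verify flag names against your
# SeaChest version before running anything.
import subprocess

DEVICE = "/dev/sg4"  # placeholder: pick the right handle from the scan output

# List attached drives so you can find the right /dev/sg* handle.
subprocess.run(["SeaChest_PowerControl", "--scan"], check=True)

# Disable the EPC feature set on the chosen drive.
subprocess.run(["SeaChest_PowerControl", "-d", DEVICE,
                "--EPCfeature", "disable"], check=True)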

I am upgrading my Unraid home server (storage/game server/Dockers/media server and much more), but the power consumption of old Xeon servers is "bad", especially now that electricity prices are so high.

 

Also, I really want the new HW to support multiple NVMe drives for cache, so EPYC seems to be the way to go!

 

