
These Servers are TOO EXPENSIVE

nicklmg
13 minutes ago, twstdude0to1 said:

Ahh, that's unfortunate. In a future video he could turn the storage server into an expandable storage box and Mini-SAS it to the NVMe server. This could effectively give him huge amounts of storage to load into the cold tier.

For the amount he's actually spending, just buying an EMC/Dell/NetApp storage array would be vastly better. As mentioned by @joshuaauger, the last time I got pricing Isilon was cheap enough, and that storage platform is more in line with their workload needs.

 

The problem with a lot of the gear that will just work properly and is designed for this workload is the rather high entry cost: because of minimum deployment sizes, these systems have poor cost per TB at small scale, but at 500TB+ they start to become much more attractive.

 

I do disagree with Ceph being a good choice here; SMB gateway performance is not good, and I've never seen anyone get the speeds Linus would want. Ceph RBD backing a VM serving out SMB would have a better chance, but overall anything to do with Ceph is, in my opinion, not a good fit for LMG because no one there has the experience as a sysadmin/engineer to safely implement and maintain it. Things are fine when they work, but if something breaks or performance drops unexpectedly you need to know exactly what you're doing to resolve it quickly.

 

Linus is too much of a tight ass to buy any of that though; not sure why he's willing to buy RED cameras but not a proper storage solution. His server videos are highly entertaining to watch, in a 'here's what not to do' kind of way lol.


Why didn't you try the "Intel Cache Acceleration Software"? It's free if you use it together with Intel SSDs, and my finding was that it's the best SSD caching solution (I tried many, including those integrated into RAID controllers). No idea if it's an option for you, but it might be. In theory you should be able to combine a drive (or a RAID of whatever kind) with either a single SSD or, for example, a RAID 0 of SSDs. I'm using it with "ordinary" Intel U.2 SSDs to accelerate our Hyper-V servers.


Supermicro will cheerfully sell you a 2U server with 528TB of NVMe goodness for a cool $250,000.

 

But I am guessing the poor guy would go bankrupt if he had to spend THAT KIND of money.

 

Alternatively, you could try storage expansion using a ruler server. But even 256TB of NVMe like that in a slim 1U chassis will set you back $150K.

 

I would have suggested slapping an Intel P4608 into each editing rig for fast LOCAL caching, but as I understand it you need those files to be sharable between editors from a central fast pool. Maybe you want to change how you share files (ring network between editing rigs)?

 

If price is the pain point here, your only real option is to source some more SSDs (possibly higher-capacity ones) and just expand the system you're already using now.


1 hour ago, joshuaauger said:

But tiering with what OS? ;) That's the hard part for people, but if you have an HPE Apollo Gen 9 or Dell R720xd you can run Qumulo OS on it, which has a lot of enterprise features and things geared towards editorial/video production workflows.

 

But what do I know.

The MSA does tiering itself.


1 hour ago, leadeater said:

For the amount he's actually spending, just buying an EMC/Dell/NetApp storage array would be vastly better. As mentioned by @joshuaauger, the last time I got pricing Isilon was cheap enough, and that storage platform is more in line with their workload needs.

  

The problem with a lot of the gear that will just work properly and is designed for this workload is the rather high entry cost: because of minimum deployment sizes, these systems have poor cost per TB at small scale, but at 500TB+ they start to become much more attractive.

  

I do disagree with Ceph being a good choice here; SMB gateway performance is not good, and I've never seen anyone get the speeds Linus would want. Ceph RBD backing a VM serving out SMB would have a better chance, but overall anything to do with Ceph is, in my opinion, not a good fit for LMG because no one there has the experience as a sysadmin/engineer to safely implement and maintain it. Things are fine when they work, but if something breaks or performance drops unexpectedly you need to know exactly what you're doing to resolve it quickly.

  

Linus is too much of a tight ass to buy any of that though; not sure why he's willing to buy RED cameras but not a proper storage solution. His server videos are highly entertaining to watch, in a 'here's what not to do' kind of way lol.

Agreed, Ceph isn't the correct choice here. Getting good IOPS and latency out of Ceph is like trying to get blood from a stone, and that's with NVMe HHHL DC-grade SSDs up front. Ceph is more of a scale-out platform than a ball-breaker built for breaking IOPS records.

 

@LinusTech Did any disk/IO metrics get retrieved from Storage Spaces and/or the network during the testing, to make sure no 'bottlenecks' or configuration issues became the reason for the slowdowns? I don't know your exact network layout, but I can guess a lot from the videos; drop me a PM or email if you want any input/assistance, as debugging performance issues with storage and networks is unfortunately my day job :(
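For what it's worth, the built-in performance counters would answer most of that. A minimal sketch of what I'd capture on the server while an editor scrubs a timeline; the counter paths are the standard Windows ones, but the sample window and output path are just examples:

```powershell
# Capture disk latency/throughput and NIC throughput for a minute, then open the
# .blg file in Performance Monitor and look for where the bottleneck actually is.
$counters = @(
    '\PhysicalDisk(*)\Avg. Disk sec/Read',
    '\PhysicalDisk(*)\Avg. Disk sec/Write',
    '\PhysicalDisk(*)\Disk Bytes/sec',
    '\Network Interface(*)\Bytes Total/sec'
)
Get-Counter -Counter $counters -SampleInterval 2 -MaxSamples 30 |
    Export-Counter -Path 'C:\Temp\storage-baseline.blg' -FileFormat BLG
```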

 

It would probably be more sensible to manage the tiers yourself: use redundant 'high-performance scratch' NVMe virtual disk(s) for active editing and lower-performance SSD virtual disk(s) as hot storage that content is relocated to once a video is finalised. You could also drop it down to SATA-based storage once it has sat in that SSD hot storage for a number of months, to save on space/cost per GB. Obviously an enterprise-grade SAN is going to have feature sets which take care of this automatically, but the purchase and running/support/warranty costs may not be viable long term. A large portion of this can be automated to take any manual intervention away from the end users/editors, as no one wants to overly confuse users ;) (something along the lines of the sketch below)
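A rough sketch of the sort of scheduled job I mean, not a drop-in script; the share paths, the 14-day threshold and the 'Finished' drop folder are all made-up examples:

```powershell
# Relocate finished projects from the NVMe scratch share to the SSD hot-storage share
$scratch = '\\server\Scratch\Finished'   # editors drop completed projects here
$hot     = '\\server\HotStorage'
$ageDays = 14

Get-ChildItem -Path $scratch -Directory |
    Where-Object { $_.LastWriteTime -lt (Get-Date).AddDays(-$ageDays) } |
    ForEach-Object {
        # Copy first, verify, then remove, so a failed move never loses a project
        Copy-Item -Path $_.FullName -Destination $hot -Recurse
        if (Test-Path (Join-Path $hot $_.Name)) {
            Remove-Item -Path $_.FullName -Recurse -Force
        }
    }
```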

 

 

Please quote or tag me if you need a reply


1 hour ago, leadeater said:

For the amount he's actually spending, just buying an EMC/Dell/NetApp storage array would be vastly better. As mentioned by @joshuaauger, the last time I got pricing Isilon was cheap enough, and that storage platform is more in line with their workload needs.

 

The problem with a lot of the gear that will just work properly and is designed for this workload is the rather high entry cost: because of minimum deployment sizes, these systems have poor cost per TB at small scale, but at 500TB+ they start to become much more attractive.

 

I do disagree with Ceph being a good choice here; SMB gateway performance is not good, and I've never seen anyone get the speeds Linus would want. Ceph RBD backing a VM serving out SMB would have a better chance, but overall anything to do with Ceph is, in my opinion, not a good fit for LMG because no one there has the experience as a sysadmin/engineer to safely implement and maintain it. Things are fine when they work, but if something breaks or performance drops unexpectedly you need to know exactly what you're doing to resolve it quickly.

 

Linus is too much of a tight ass to buy any of that though; not sure why he's willing to buy RED cameras but not a proper storage solution. His server videos are highly entertaining to watch, in a 'here's what not to do' kind of way lol.

These are my thoughts exactly. Every time he does a video on his storage system, my thought is: why isn't Linus actually investing in a real enterprise storage solution like a Dell Compellent? It's vastly better than the 45 Drives solution he's using, and the price point is competitive once you enter the petabyte space. Management of those systems is fairly easy, expansion is easier, and adding performance tiers into the mix is also vastly easier.


1 hour ago, leadeater said:

The last time I got pricing Isilon was cheap enough, and that storage platform is more in line with their workload needs.

I seriously doubt Linus would get the performance he wants for 6 editors from an Isilon.

 

Maybe from an EMC XIO... but never the Isilon.

Can Anybody Link A Virtual Machine while I go download some RAM?

 


Yeah, there's a huge leap between enterprise setups and consumer ones. I would say most SAN solutions can get pretty expensive once you start throwing enterprise storage into the mix, but @leadeater did say he found some good pricing. However, the big problem is still the lack of the knowledge required to properly set up the storage; there are tons of settings and configurations involved in getting an optimally performing array. I work with Storage Spaces all day long and there are still tiny little things that make it upset.

 

One solution I've set up for someone is a Storage Spaces Direct cluster with enough cache to handle the workloads. Once you have enough HDD storage to handle at least 25% of the "cold" workload, users tend not to notice, because not everyone is pulling from "cold" storage all at once.

Also, there are commands to pin files/folders to hot storage (see the sketch below). It shouldn't be hard to write some sort of automation to automatically pin critical files and unpin ones that no longer need it. Furthermore, I think he should have taken a better look at the setup to see if there are any bottlenecks on the networking side. Two Optane 900Ps can do about 5GB/s max, but his networking maxes out around 1GB/s, which might have resulted in higher latency.
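The pinning commands I mean are the Storage Spaces file-tiering cmdlets. A minimal sketch; the drive letter, file path and tier name are made-up examples:

```powershell
# Pin the active project to the SSD tier so the heat map can't demote it
Set-FileStorageTier -FilePath 'D:\Projects\Current\timeline.prproj' -DesiredStorageTierFriendlyName 'SSDTier'

# Unpin it once the video ships
Clear-FileStorageTier -FilePath 'D:\Projects\Current\timeline.prproj'

# Pins/unpins only take effect after the next tier optimization pass
Optimize-Volume -DriveLetter D -TierOptimize
```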


44 minutes ago, Falconevo said:

@LinusTech Did any disk/IO metrics get retrieved from Storage Spaces and/or the network during the testing, to make sure no 'bottlenecks' or configuration issues became the reason for the slowdowns?

Sadly, from what I could see in the video, none of the testing was valid; the background job that moves data between the tiers was never run, so everything was coming off those few HDDs, hence the garbage performance.


35 minutes ago, unijab said:

I seriously doubt Linus would get the performance he wants for 6 editors from an Isilon.

 

Maybe from an EMC XIO... but never the Isilon.

The newer generation and the performance nodes would, either the All-Flash or the Hybrid nodes. Certainly the Archive nodes wouldn't, but that's not what they're for.

 

The really important information here is that NVMe-class performance is not required here, at all. Standard server mixed-use or write-intensive SATA SSDs would more than do the job, with performance to spare. Seriously, even enough HDDs would do the job; I can say that with 100% confidence since we have a media design course here that runs multiple labs off a ProMax storage array with no SSDs in it at all. If a 'generic' storage company, one I didn't recommend, can do it all on HDDs with far more active users than LMG has, then basically anyone could. Video editing isn't actually IOPS-demanding, nor is it that latency-sensitive; the issue is that it has a critical usability failure point: you either have enough performance or you don't, and everyone bitches. Other workloads just go slower when you hit the performance limits of the storage, with no direct impact on users; you have to get into really serious performance issues before, say, VM datastores become problematic rather than just slow.


It's almost like he knew this wouldn't work and used the opportunity to release a video to pay for said $8k of NVMe drives.

MOAR COARS: 5GHz "Confirmed" Black Edition™ The Build
AMD 5950X 4.7/4.6GHz All Core Dynamic OC + 1900MHz FCLK | 5GHz+ PBO | ASUS X570 Dark Hero | 32 GB 3800MHz 14-15-15-30-48-1T GDM 8GBx4 |  PowerColor AMD Radeon 6900 XT Liquid Devil @ 2700MHz Core + 2130MHz Mem | 2x 480mm Rad | 8x Blacknoise Noiseblocker NB-eLoop B12-PS Black Edition 120mm PWM | Thermaltake Core P5 TG Ti + Additional 3D Printed Rad Mount

 


Uhhh where can I get a jacket like that???

 

That's a sweet jacket :)

Phone 1 (Daily Driver): Samsung Galaxy Z Fold2 5G

Phone 2 (Work): Samsung Galaxy S21 Ultra 5G 256gb

Laptop 1 (Production): 16" MBP2019, i7, 5500M, 32GB DDR4, 2TB SSD

Laptop 2 (Gaming): Toshiba Qosmio X875, i7 3630QM, GTX 670M, 16GB DDR3


Watching the video, it really seems to me like you need a scalable private cloud server. I just got trained on installation and maintenance of Infinidat InfiniBox servers. They might be worth taking a look at.


Hi, new guy here, just had to create an account to set some things straight on this topic as presented in the video.  Sorry in advance for the length.

 

First off, the error when creating a tiered virtual disk is almost always due to the fact that the Windows GUI goofs up the calculation of the maximum amount of space you can use for the tiers. I've seen this over and over, and the solution is to change the units to GB and then manually enter numbers slightly smaller than the maximums shown. Just knock 5-10GB off the number shown for each tier and the creation will go through fine (or do it in PowerShell, as sketched below). I've never seen it have anything to do with the number of columns.
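A hedged sketch of the PowerShell equivalent, with pool and tier names as made-up examples; it just asks the pool for the supported maximums and backs off 10GB per tier:

```powershell
# Ask Storage Spaces what the tiers can actually hold for a mirrored layout
$ssdMax = (Get-StorageTierSupportedSize -FriendlyName 'SSDTier' -ResiliencySettingName Mirror).TierSizeMax
$hddMax = (Get-StorageTierSupportedSize -FriendlyName 'HDDTier' -ResiliencySettingName Mirror).TierSizeMax

# Create the tiered virtual disk slightly under the maximums so creation doesn't fail
New-VirtualDisk -StoragePoolFriendlyName 'MediaPool' -FriendlyName 'TieredSpace' `
    -StorageTiers (Get-StorageTier -FriendlyName 'SSDTier'), (Get-StorageTier -FriendlyName 'HDDTier') `
    -StorageTierSizes ($ssdMax - 10GB), ($hddMax - 10GB) `
    -ResiliencySettingName Mirror -WriteCacheSize 1GB
```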

 

With regard to columns: if you use the GUI and want a simple setup, create the initial storage space using the minimum number of drives, which in the case of mirroring is 2 flash and 2 spinning. Then add the remaining drives afterwards. If you don't do this, you will need to add the same number of drives all at once later in order to increase capacity (in the case of the video, that would be 4 more spinning disks, for example). This can also be solved by being up to speed on PowerShell and column concepts; see the sketch after this paragraph. For those of you using Microsoft's System Center Data Protection Manager with Storage Spaces in a virtual environment, you will have the same issue. My mode of operation is to create a simple space using one 2TB virtual disk on the SAN (or some other redundant storage), at which point I can add 2TB virtual disks at any time later. Do be aware of how columns work, since to maximize performance for your build you MAY want to add 4 or more spinning disks at a time.
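The gist of the PowerShell/column route is simply to set the column count explicitly when the virtual disk is created, instead of letting the GUI infer it from however many drives happen to be in the pool. A sketch with names and sizes as made-up examples, not the exact command set I use:

```powershell
# 2 columns x 2 data copies = mirrored data striped across two column sets,
# so capacity later grows in increments of 4 matching HDDs
New-VirtualDisk -StoragePoolFriendlyName 'MediaPool' -FriendlyName 'Archive' `
    -ResiliencySettingName Mirror -NumberOfDataCopies 2 -NumberOfColumns 2 -Size 16TB

# Later: add matching disks to the pool, then grow the virtual disk
Add-PhysicalDisk -StoragePoolFriendlyName 'MediaPool' -PhysicalDisks (Get-PhysicalDisk -CanPool $true)
Resize-VirtualDisk -FriendlyName 'Archive' -Size 32TB
```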

 

Next, ALL writes go directly to the flash tier until it is full.  The 1GB write cache is a reserved write cache for when the flash tier is full.  It is safe to say that even if you increase it to the maximum of ~100GB (don't bother), it is not going to be of much use.  Your goal is to always have plenty of flash space available for incoming writes.  For video editing, the best solution here would be to have enough flash to hold all of the video data that could be generated for any active projects.  If the flash tier and reserved write cache are both full, writes go directly to spinning disk.  Don't let that happen!

 

Windows maintains a heat map of the data blocks. It evaluates this on a schedule (typically nightly; it can be set manually to almost anything) and moves data between the tiers; see the sketch below for kicking it off by hand. Your goal here is for data to sit in flash until it is very rarely accessed, at which point it is moved to spinning disk. The problem is that cold data has to be accessed quite a bit before the heat map decides it needs to be moved back to flash. RAM caching can help a little here. The upshot is that spinning disks could still be OK for this if everyone else is working out of the flash tier and there is just the occasional person accessing the cold data. Spindle count could be an important thing to consider; 4 disks, as shown in the video, is really not going to be realistic.
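If you don't want to wait for the nightly window, the tier optimization pass can be kicked off by hand. A small sketch; the drive letter is an example, and the scheduled-task path is the one I believe Server 2012 R2 uses:

```powershell
# Run the heat-map evaluation and tier moves immediately on volume D:
Optimize-Volume -DriveLetter D -TierOptimize -Verbose

# Or trigger the built-in task that normally does this overnight
Start-ScheduledTask -TaskPath '\Microsoft\Windows\Storage Tiers Management\' -TaskName 'Storage Tiers Optimization'
```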

 

So a good solution to the problem presented is to calculate how many writes will be generated over the period of perhaps 2 weeks.  That is the minimum amount of space you'll want in your flash tier.  You also will need to either gradually start using the system or seed it with the data and allow time for the cold data to be flushed from the flash tier so that you have enough space for new writes.  If you fill up the flash tier, you can just forget about performance.

 

I've used this solution for many builds.  It's ideal for backup servers, and also works fine for large file servers.  If VMs are involved, you will want to set up a second set of flash for the OS virtual disks (recommended) or save enough space in your flash tier to create a second virtual disk that is not tiered, but flash only (not ideal but works OK).  Allowing VM OS data to end up in the slower spinning disks is going to be VERY painful.  Just don't do it.

 

As a side note, deduplication for any data that is not video is also an option, and for many workloads it can be a huge advantage (a sketch of enabling it is below). Be aware that the Windows implementation is actually a combined compression+dedupe: the data is compressed first, then evaluated for dedupe, and there is no way to dedupe the data without compressing it. Retrieving that data requires it to be rehydrated, which impacts performance and latency somewhat. As with any system that does post-processing for dedupe/compression, it is not ideal for certain write-intensive workloads, although in the case of the aforementioned backup solution it works perfectly well. Your flash calculation there is based on the writes from the daily backups.
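For the non-video shares, enabling it is a one-liner plus a bit of tuning. A hedged sketch; the volume letter and extension list are examples, and I'm assuming -ExcludeFileType behaves as documented for the Deduplication module:

```powershell
# Enable dedupe on the general file-share volume and keep already-compressed video out of it
Enable-DedupVolume -Volume 'E:' -UsageType Default
Set-DedupVolume -Volume 'E:' -ExcludeFileType 'mp4','mov','r3d' -MinimumFileAgeDays 3

# Kick off an optimization job now instead of waiting for the schedule
Start-DedupJob -Volume 'E:' -Type Optimization
```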

 

Also forgot to say that of course you need enough flash for writes plus your hot data...

 


I feel Linus' pain. Trying to work out how to set up a customised Storage Spaces array was not my idea of fun. I found the documentation for the PowerShell commands to be somewhat vague, requiring a lot of trial and error. Still, being able to set up a RAID0/JBOD hybrid that could be expanded as necessary (up to 63TB anyway) without downtime and managed within the familiarity of a Windows environment was extremely helpful during the rapid transition from a small to medium business when storage needs were growing faster than the IT budget.

 

5 hours ago, leadeater said:

For the amount he's actually spending, just buying an EMC/Dell/NetApp storage array would be vastly better.

I just had a look at the Dell ones because our office will likely need to upgrade to a professional turn-key solution in the next year or so, and I gotta ask: is it normal for those systems to only offer up to 6TB drives? I thought physical space was typically at a premium in data centres. Even upgrading to 100TB RAID10 with our slightly outdated 12TB drives feels like a formidable number of drives.

 

I'm guessing it has to do with proven reliability, and possibly the better rebuild times of smaller drives? Strikes me as substantially increasing the risk of drive failure though.

If you want good hardware recommendations, please tell us how you intend to use the hardware. There's rarely a single correct answer.


Linus and friends:  

 

Please check this video out from Puget Systems:

 

 

Basically it sounds like what you're trying to do, albeit on FreeNAS, not Windows Server. Isn't one of your big 48-drive rackmount video servers already running FreeNAS? They seemed to come out with some pretty decent/impressive results overall, and that might be applicable to the sort of workloads that occur internally at LMG... Completely plug and play as well, no Linux/*BSD/Unix or Windows magic involved either...

 


1 hour ago, Cyanara said:

I just had a look at the Dell ones because our office will likely need to upgrade to a professional turn-key solution in the next year or so, and I gotta ask: is it normal for those systems to only offer up to 6TB drives?

No, they should have larger options (ours have 10TB disks in them). Sometimes it can depend on workload performance too, as the larger drives they have supply of are slower than the 6TB/8TB disks, and more disks also means more performance. We also use 60-bay disk shelves.

 

1 hour ago, Cyanara said:

I found the documentation for the PowerShell commands to be somewhat vague, requiring a lot of trial and error.

Very true. The documentation is extremely poor, and there is very little distinction made between Storage Spaces and Storage Spaces Direct, which are two very different things built on the same storage subsystem architecture; there are features that are Storage Spaces Direct only, for example. There's also the 'new way' and the 'old way' of doing tiering, but the 'new way', from what I can tell, is only supported on Storage Spaces Direct, even though I have been able to do it on a single-server setup, so... um? What?

 

The new way is much better than the old way, which relies on scheduled tiering tasks; it's truly dynamic.


1 minute ago, dmotles said:

Note that Isilon grinds to a halt if you exceed 86% capacity.

So do NetApps if you fill the aggregate to more than 90%.


1 minute ago, leadeater said:

So do Netapp's if you fill the aggregate more than 90%.

but not Qumulo! :D


Hey Linus,

 

I work with Storage Spaces and Storage Spaces Direct at my workplace.

If you'd like some help with setting up a Storage Spaces or S2D (Storage Spaces Direct) environment, let me know.

 

There are a lot of documented best practices, but there are also some field best practices regarding how to set up Storage Spaces for good performance. You can run into huge bottlenecks when it's not configured as it should be, or when using the wrong hardware.

 

There are also some hardware requirements you should meet, especially for disks, so that you don't run into throttling issues. This is down to disk firmware in most cases (there have been cases where people saw their performance quadruple after a firmware upgrade on the disks). Disk sizing matters too: if you under-budget the cache tier, you'll run into performance issues. The rule of thumb is that cache should be 15% of the cold data in size (before mirroring/parity); a rough way to check your current ratio is sketched below.
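A rough, hedged way to sanity-check an existing pool against that rule of thumb; the pool name is an example, and this assumes MediaType is reported correctly for your drives:

```powershell
# Sum raw SSD (cache) capacity vs raw HDD (cold) capacity in the pool
$disks = Get-StoragePool -FriendlyName 'MediaPool' | Get-PhysicalDisk
$cacheBytes = ($disks | Where-Object MediaType -eq 'SSD' | Measure-Object -Property Size -Sum).Sum
$coldBytes  = ($disks | Where-Object MediaType -eq 'HDD' | Measure-Object -Property Size -Sum).Sum
'{0:P1} cache-to-cold ratio (rule of thumb: ~15%)' -f ($cacheBytes / $coldBytes)
```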

 

Also, ReFS vs NTFS is something to look into. ReFS uses a block cache (in-memory cache) on top of the cache tier (SSD or NVMe).

 

If you're going to invest in new hardware, a recommendation would be 2 x S2D servers (as a minimum) set up in a two-way mirror.

 

Also, to shed some light on the way Storage Spaces works with cache: all writes go to the fastest tier first; NVMe > SSD > HDD.

NVMe will work as write cache for SSD and HDD.

NVMe will only work as a read cache for HDD; it's not a read cache for SSD.

SSD will work as both write and read cache for HDD.

 

Data will be cached based on a usage algorithm, so it will only shuffle files down from the cache when they're rarely used, or when a file on the slow tier is consistently used more than a file in the cache tier. This also means all newly written files will be cached for a while. By default the cache will also shuffle out files once it's 70% full, so that it has free room for new writes.

 

Storage Spaces / S2D doesn't really tier data either, unless you have a three-tier setup: NVMe, SSD and HDD. With only two 'tiers', one will be cache and the other will be storage.

In a three-tier solution, NVMe will be the cache, SSD the fast tier and HDD the cold tier.
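On an S2D cluster you can see (and, if you really need to, change) that cache behaviour. A hedged sketch, assuming the cluster already has Storage Spaces Direct enabled:

```powershell
# Show whether the cache is enabled and how it fronts each media type
Get-ClusterStorageSpacesDirect | Select-Object CacheState, CacheModeSSD, CacheModeHDD

# The defaults match the behaviour described above; they can be overridden, e.g.:
Set-ClusterStorageSpacesDirect -CacheModeSSD WriteOnly -CacheModeHDD ReadWrite
```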


Many ways to skin a cat, but archiving with breadcrumbs was a solution I used at a previous job. Leaving the folder structures intact, a user would get their project archived and the project folder itself would flip to a shortcut. I feel like today you would instead use DFS + PowerShell, or maybe even symbolic links, to make it even more seamless (see the sketch below). It was a solution I had to figure out because engineers were too "busy" to archive their own projects. It was fun in the end though.
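The symlink flavour of that would look roughly like this; not what I actually ran back then, and the paths are made-up examples:

```powershell
$project = 'D:\Projects\OldProject'
$archive = '\\archive\Projects\OldProject'

# Copy the finished project to the archive share, remove the original,
# then leave a breadcrumb behind so it still appears in place for the user
# (creating the symlink needs admin rights)
Copy-Item -Path $project -Destination $archive -Recurse
Remove-Item -Path $project -Recurse -Force
New-Item -ItemType SymbolicLink -Path $project -Target $archive | Out-Null
```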


39 minutes ago, Marvin_Nor said:

Storage Spaces / S2D doesn't really tier data either, unless you have a three-tier setup: NVMe, SSD and HDD. With only two 'tiers', one will be cache and the other will be storage.

Even two tiers is still tiering: both tiers contribute to the total capacity of the provisioned virtual disk, unlike caching, where the cache does not contribute. It's not really 'caching' either; it acts like caching because writes happen in the fastest tier, but you want that. The difference is that new data stays in the fast tier until the tier hits its high-water mark or the data ages out, whereas caching writes the data to the underlying array as soon as possible, and that data lives in both places.

 

Storage Spaces only stores a given data chunk in a single tier; chunks are migrated between tiers, so you won't have the same data chunk in two tiers at the same time. The exception to that is modified data (which is technically new data), because it's written to the fast tier and then replaces the expired chunk.

 

Most of the new fancy features are ReFS only though.

