
Our data is GONE... Again

jakkuh_t

I agree with those talking about using a tape library. Since most of the data is going to be archival, a Linear Tape File System (LTFS) could be used. You can get devices like what used to be called StrongBox NAS - now called StrongLink LTFS. It connects to the backend tape library running LTFS and has a disk cache, so the tape library appears as a NAS device on the network. You can migrate older data from your live file system across to the archival LTFS system, which helps alleviate the pressure of always expanding disk storage as the data grows.

You can pick up some older tape libraries capable of supporting up to LTO8 on the second-hand market - if they come out of large data centers they usually have a full slot licence (which is usually a big cost). You just need to replace the LTO4/5/6 drives they are usually sold with for LTO8 drives. The LTO8 tape drives alone, though, may cost more than the second-hand library - LTO8 SAS drives are around US$7500 and LTO8 Fibre Channel around US$8200. A library with 166 slots storing LTO8 uncompressed at 12TB per tape would give you a 2PB archive, and even a 133-slot library would give you around 1.6PB.


Did the LTT team ever consider Storj as a backup solution? I mentioned this video already in the Storj forum. The Storj support team is for sure willing to help prevent data loss... AGAIN!

 

BR

 


1 hour ago, Nystemy said:

I think you are missing my point.

 

The majority of the data is dormant and in its own storage pool; it never changes except when new files are added. Ie, there should never be any incrementals to deal with here. Just writing the files to tape in chronological order, along with the file paths they belong to, is sufficient as a backup, since the data itself never changes. To be fair, they could just copy over the project folders from the vault one by one onto tapes. No need to even be fancy with keeping file paths intact (other than keeping file linking in projects intact, but this is rather easily solved by proper folder naming conventions - my own archive of furry yiff isn't much different from this...).

 

The active portion of the database, ie the current projects in the pipe, will however see files that do get edited, but these are stored on another server. They get incrementally backed up onto the Vault on a daily basis and fully backed up every week. That data is put into its own storage pool and isn't mixed with the otherwise dormant data. (IIRC)
 

The active data will however need a fancier backup system, but this is only a few TB, and LMG has access to 6Gb/s up to the larger internet (and 10Gb/s to the Vancouver Internet Exchange), so they could likely store that backup on some web service where they would only really need a handful of TBs. (Likely only storing a simple mirror of whatever is in the pool on the Vault; it should be sufficiently up to date.)

 

So how am I not addressing the RPO, if I may ask?

There are two datasets here: one huge one with data that never changes, and one small one that handles ongoing projects, and these two are largely isolated from each other.

"We crashed?  No worries, we can restore from tape.  Grab literally every tape we got..... yes, even the ones that we first wrote about 4 years ago.  What do you mean one went bad?  Can't we just ignore that one tape and restore?  We can't... then are we... hosed?... Oh... oh no"

 

This ---^

 

Cycling through the tapes roots out the bad ones and keeps the full tapes updated - decreasing the RPO. I get your point: save on infrastructure and time spent backing up. However, keeping the full tapes up to date on a monthly (or even just regular) basis will save on recovery times and reduce the potential for tapes going bad. Take no chances.

 

Now if you're arguing that infinite incremental tapes won't increase RPO, then we're going to have to agree to disagree.

"There is probably a special circle of Hell reserved for people who force software into a role it was never designed for."
- Radium_Angel


How many times in the entire history of every one of their channels have they ever used a clip from a previous video which didn't make it into the YouTube video? 0 times?

 

The 'data' is 100% useless. Never used, never will be. 


8 minutes ago, maskmcgee said:

How many times in the entire history of every one of their channels have they ever used a clip from a previous video which didn't make it into the YouTube video? 0 times?

 

The 'data' is 100% useless. Never used, never will be. 

No, they do - just not very often. Unless I missed the sarcasm.

"There is probably a special circle of Hell reserved for people who force software into a role it was never designed for."
- Radium_Angel


DON'T REPLACE ALL THE FAILING DRIVES AT THE SAME TIME!!!!!!!!!!!!!!!!

Put the drives you guys took out back in, and see if the zpools come up again!

The errors reported by "zpool status" don't matter at this point... you only got those because you replaced all the faulty drives at the same time, especially on the machine where the drives only had errors but were still online in the pool!!


Put everything back the way it was, with the faulty drives in their original places, and see if the pools come up.
If they don't, try to import them again with "zpool import -a" and see.

If they do come back up, reboot again just to make sure they come back up on their own. (Don't worry about errors for now.) Now you replace 1 drive at a time, and let the zpool resilver!!!

Once one drive has resilvered, you reboot, check that all pools are online, and replace another... and so on.

NEVER replace all the drives at the same time!!! Even with unavailable drives, replace one first and let it resilver... that way when you replace the second, it will have the data on the first new drive to work with, putting less strain on all the old drives!!
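Roughly, the one-at-a-time sequence looks something like this (pool and device names here are just placeholders - use whatever "zpool status" actually shows you):

    zpool import -a                    # bring the pools back up with the old drives in place
    zpool status tank                  # note ONE faulted device, e.g. da7
    zpool replace tank da7 da20        # swap that ONE drive for ONE new drive
    zpool status tank                  # wait for the resilver to finish
    # reboot, confirm all pools are ONLINE, then repeat for the next drive

That way only one drive's worth of data is being rebuilt at any moment, and the pool keeps whatever redundancy it still has the whole time.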
 
You have to put the original unavailable/faulting drives back in the pools anyway if you want to dump the data onto other storage, or else you won't be able to access much of it.

I had this EXACT problem before with ZFS when people replaced all the faulting drives at the same time, thinking that was the "safest" thing to do!!
By putting everything back the way it was and bringing the pools online again with "zpool import -a" (or even importing by pool ID), I was able to fix the pools by replacing the drives one by one, letting them resilver in between. No data was lost, despite the billions of errors zpool status spat out!

I even fixed a standard Linux RAID5 on a QNAP NAS with 3 disks faulting by doing the same. I put all 6 drives in a Linux machine and replaced them one by one, with a rebuild in between.

I take care of servers and storage for 2 VFX studios overseas, and I have had my fair share of people getting anxious when storage starts to fail and deciding the best thing is to replace all the "broken" drives at once, instead of doing it in steps, carefully, waiting for the software to do its thing on one drive at a time.

I understand it does seem like the safest thing is to take all the broken stuff out, but you have to go slowly.
 
By the way, I live in Vancouver, if you guys want some help.
 
 

22 minutes ago, maskmcgee said:

Example?

A couple of random clicks yielded results (@ 1:51):  

  *edit*

 

Strike all of that; I misunderstood you. But I'm sure they do. Also, having the source video to edit how they want is important. It's their IP, after all.

"There is probably a special circle of Hell reserved for people who force software into a role it was never designed for."
- Radium_Angel


1 hour ago, Bigun said:

"We crashed?  No worries, we can restore from tape.  Grab literally every tape we got..... yes, even the ones that we first wrote about 4 years ago.  What do you mean one went bad?  Can't we just ignore that one tape and restore?  We can't... then are we... hosed?... Oh... oh no"

 

This ---^

 

Cycling through the tapes roots out the bad ones and keeps the full tapes updated - decreasing the RPO. I get your point: save on infrastructure and time spent backing up. However, keeping the full tapes up to date on a monthly (or even just regular) basis will save on recovery times and reduce the potential for tapes going bad. Take no chances.

 

Now if you're arguing that infinite incremental tapes won't increase RPO, then we're going to have to agree to disagree.

Yet again you totally missed my point.... I fully understand what you are talking about, and it really is how one should do things for a lot of database backup applications, but it isn't applicable for this kind of database. (Now, I wouldn't be surprised if most databases concerned with exclusively accumulating static files have poorly implemented backup systems; it seems to be a very widespread approach that frankly doesn't fit well for this sort of database.)

 

It seems like you have forced yourself into thinking exclusively in terms of a "smart" solution to database backups, completely missing the forest for the trees and seemingly forgetting the most basic backup methods around... (though this is fairly typical of a lot of "professionals"). All while pointing out all the issues of trying to be "smart", while the "stupid" way of just copying the folders onto another storage device like a plebeian (though likely through a scheduled backup script, to be fair) has none of the issues you bring up, while also offering a selection of solutions to the only problem left.

 

That problem being, what if a tape goes bad?

No problem, you just don't have the data on that specific tape. (If you say, That isn't how it works, then you haven't read anything of what I have said, at all. And I am not surprised.)

And considering the price of tapes, one can just split each tape into two halves, where one half stores the latter half of the prior tape, giving you two copies of all the data in the archive, though at twice the price. This greatly reduces one's risk of losing data. (Not that tapes tend to go bad; with a shelf life of 30 years and generally fairly minimal magnetic susceptibility, they usually don't bit rot much either, and basic stream error correction when writing to the tape will fix most of those errors.)

 

One can use other, fancier tape storage solutions, like a RAID 5/6-style array of 4-8 tapes, keeping them in smaller groups if one wants to be more cost effective. This requires a lot more fiddling to get working, but if one or two tapes in the group fail, no worries - and statistically that isn't likely to happen. (Though technically one wouldn't rely on RAID here, but rather some other error correcting scheme; RAID 6 can technically only fix single errors, since on a double error it can't point out which member is wrong, it only knows that there is an error. (RAID usually gets around this because two drives are outright "gone" and the remaining ones can be assumed to be telling the truth.))

 

And for a database that only stores static files, there is no concept of "up to date" files: when a file is placed on the storage array, it will never change, never get edited, and never move. Ie, the tape storing a copy of those files will never need to be updated, ever. There are no incremental backups at all to consider... (And if you are still convinced that I am talking about an endless string of incremental backups, then read again.... I have always talked about the exact same thing, ie straight-up copies of the folders, or the more nuanced method of doing that so we can add files to folders arbitrarily: saving out files chronologically, by only taking files added after a specified date onto our current backup disk. And this is where scripts really help.)

 

So in the end, to clarify.
I am NOT talking about incremental backups. (How you have managed to miss that for multiple posts is impressive, especially when I clearly state that I am not talking about incremental backups and snapshots of whole databases... but rather about taking periodic chronological copies of the files within the database and storing them on other media, in a way that preserves the file paths so that dynamically linked files in projects "just work".)

I should also add, I am not talking about "best practices for database backups in general", but rather about how to efficiently back up a database containing exclusively static files.
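To make it concrete, the sort of "chronological copy" I keep talking about is nothing more exotic than something like this (the date, paths and tape device are just placeholders for the sake of the example):

    # gather every file added to the archive since the previous tape was written
    find /vault/archive -type f -newermt "2021-08-01" -print0 > /tmp/new-files.lst
    # write those files to the current tape with their full paths preserved
    tar --null --files-from=/tmp/new-files.lst -cvf /dev/nst0

No snapshots, no incremental chains, just plain copies of files that will never change again.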


13 minutes ago, Nystemy said:

"No problem, you just don't have the data on that specific tape. (If you say, That isn't how it works, then you haven't read anything of what I have said, at all. And I am not surprised.)"

Oh, got it.  I guess Linus shouldn't have had his data on those bad disks, problem solved.

Also, define incremental backups and explain with mental gymnastics how it's any different from what you're proposing.

Also, now that the convo has devolved into ad hominem, I'm done responding to you.  Have fun!

"There is probably a special circle of Hell reserved for people who force software into a role it was never designed for."
- Radium_Angel


2 minutes ago, Bigun said:

Oh, got it.  I guess Linus shouldn't have had his data on those bad disks, problem solved.

I provided two solutions to that problem; you reflected on neither.

I guess this is a complete waste of my time. But I hope someone found it interesting to know that databases handling exclusively static files can be backed up differently than a database with a more typical mix.


Hello,

 

Interesting video...

 

You clearly need a storage solution that can be treated like cattle, not house-pets... Here is my suggestion for such a solution.

 

ZFS might be that solution for a pure scale-up (staying in the same chassis) approach, but when you start to scale out, which is clearly the case here, you need to consider a storage solution that, in your case, is designed for both scale-out AND self-healing.

 

It's my understanding, from the video, that you have 2 old servers, 2 newer servers, and plan to buy a few more new servers.

 

At this scale, it starts to make sense to look into a scale-out orientated storage solution named Ceph https://ceph.io/en/

 

Why Ceph?

Glad you asked. It's a software defined storage (SDS) solution, designed with 2 main goals in mind.

  1. Data integrity
    1. Self-healing
  2. Scalability (many chassis, 1 cluster.)

Here's a promo video from back in 2017, that gives a quick intro to what ceph is.

 

What does Ceph run on?

Ceph, as the truly software defined storage solution it is, can run on any x86_64 hardware you decide to buy or have around (it can also run on newer 64-bit ARM).

 

What does it cost?

It's open source https://github.com/ceph/ceph

You can spend a bit of $ on a subscription from one of the companies that offer enterprise support on it, such as Red Hat, 45 Drives and others

 

I hope you will consider looking into this before you make a final decision on your next step

 

My suggestion:
From the  info that was provided in the video, I would do something along the lines of:

  1. Have a closer look at Ceph
  2. Assuming it's decided to go with ceph:
    1. Have a look at, at least, 3 new chassis, big enough to hold the content of "New Vault", preferably in a replica-3 setup, or, if the pros/cons of erasure coding come out in EC's favour, set up the new pool on the cluster in a 4+2* EC configuration, with 2 chunks on each node (this must be changed to 1 chunk on each node when all servers are in the cluster)
    2. Verify hardware integrity of the "new vault" chassis, followed by joining them to the new ceph cluster
      1. SMR drives are a no-go
    3. Move "Old vault" data to the ceph cluster
    4. Verify hardware integrity of the "old vault" chassis, followed by joining them to the new ceph cluster
      1. SMR drives are a no-go
    5. Buy another nice screen for the office, that shows the live status of the cluster via grafana (included in the Ceph deployment)
      1. Grafana dashboard json's for manual import: https://github.com/ceph/ceph/tree/master/monitoring/grafana/dashboards
    6. Ensure that you always have enough free space to allow for 1 node failure.

* 4+2 means that the data will be chopped up into 4 chunks, then 2 additional chunks will be calculated (similar to RAID6), and those 4+2=6 chunks will then be stored on the cluster in accordance with the failure domain configuration (2 chunks per node to begin with, and when all data is moved over and we have >=4+2+1 nodes in the cluster, the failure domain configuration can be changed to 1 chunk per node)
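For reference, a rough sketch of what creating such a pool could look like (pool and profile names here are made up, and the "2 chunks per node" placement would need a custom CRUSH rule on top of this, so treat it as a starting point only):

    # define a 4+2 erasure-code profile, with hosts as the failure domain
    ceph osd erasure-code-profile set lmg-ec-4-2 k=4 m=2 crush-failure-domain=host
    # create the archive pool using that profile
    ceph osd pool create vault-archive erasure lmg-ec-4-2
    # only needed if the pool will sit behind CephFS or RBD
    ceph osd pool set vault-archive allow_ec_overwrites true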


22 minutes ago, Windows7ge said:

Hmn...I suddenly feel compelled to add a regular scrub job to crontab...

mdadm?

"There is probably a special circle of Hell reserved for people who force software into a role it was never designed for."
- Radium_Angel


Just now, Bigun said:

mdadm?

ZFS. I know it's supposed to do weekly scrubs on its own, but I set it up on FreeBSD 13.0, so I don't actually know if it's doing them unless I check zpool status - and I haven't seen it reporting regular scrubs.

Think I'll set a weekly manual scrub to err on the side of caution.
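Something along these lines should do it (pool name is a placeholder); on FreeBSD the built-in periodic scripts can also handle it, if I remember right:

    # root's crontab: scrub every Sunday at 03:00
    0 3 * * 0 /sbin/zpool scrub tank

    # or the FreeBSD way, in /etc/periodic.conf:
    daily_scrub_zfs_enable="YES"
    daily_scrub_zfs_default_threshold="7"   # days between scrubs

Either way, zpool status will then show the date of the last completed scrub.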


I would really love to know what keeps going wrong in detail for these guys? I love the channel but I do see a lot of “radical” stuff done in the name of content when really they just needed a basic reliable setup that just does the job it needs to do reliably. 
 

Unfortunately though for YouTube catering to a “gaming” audience, reliable and good enough may not be the most exciting thing to pull in viewers. 


It's sad to think another data loss has happened. As many others have said, hiring someone from the outside to periodically check that the backups are indeed running as they should, or getting someone to come in once a week to check that all systems are running as they should, doesn't sound like a bad idea.

 

I keep seeing ZFS above too, time to google what the hell that is.

Useful threads: PSU Tier List | Motherboard Tier List | Graphics Card Cooling Tier List ❤️

Baby: MPG X570 GAMING PLUS | AMD Ryzen 9 5900x /w PBO | Corsair H150i Pro RGB | ASRock RX 7900 XTX Phantom Gaming OC (3020Mhz & 2650Memory) | Corsair Vengeance RGB PRO 32GB DDR4 (4x8GB) 3600 MHz | Corsair RM1000x |  WD_BLACK SN850 | WD_BLACK SN750 | Samsung EVO 850 | Kingston A400 |  PNY CS900 | Lian Li O11 Dynamic White | Display(s): Samsung Oddesy G7, ASUS TUF GAMING VG27AQZ 27" & MSI G274F

 

I also drive a volvo as one does being norwegian haha, a volvo v70 d3 from 2016.

Reliability was a key thing and its my second car, working pretty well for its 6 years age xD


1 hour ago, William Payne said:

I would really love to know what keeps going wrong in detail for these guys? I love the channel but I do see a lot of “radical” stuff done in the name of content when really they just needed a basic reliable setup that just does the job it needs to do reliably. 
 

Unfortunately though for YouTube catering to a “gaming” audience, reliable and good enough may not be the most exciting thing to pull in viewers. 

The one time it was a bad install of a circuit - not their fault. Another was a backplane on one of the servers they had a long time ago. Both out of their control. This one though was... yeah, we should have followed through.

MSI x399 sli plus  | AMD theardripper 2990wx all core 3ghz lock |Thermaltake flo ring 360 | EVGA 2080, Zotac 2080 |Gskill Ripjaws 128GB 3000 MHz | Corsair RM1200i |150tb | Asus tuff gaming mid tower| 10gb NIC


I'm concerned about how Linus runs his company. He needs to hire a storage administrator or at least an IT admin. I wonder if they ever deploy security patches to their servers and networking hardware?

 

Also, he can just back up to AWS Glacier for archive storage.


This opportunity to make new content  must outweigh any minor inconvenience of losing some relatively inconsequential data. Seems like a win.

 

I see that they have done a lost-data video before and used that as an opportunity to show some recovery options. They also did a video decrying how some other YouTubers decided not to let Linus's team build their servers. More a hardware issue there, but still...

 

At least it's content that doesn't feature toys.


LMG has grown exponentially since the days of the Langley house, but in many ways it feels like the server room is still in that en-suite bathroom. Sure, there's more and faster equipment, but it seems like there's still a lot of tinkering going on in what should be a relatively static environment.

 

The right tool for a production company the size of LMG is a tape library backed up by an archive management server from someone like Telestream or Masstech. Online, recently used data lives on a regular spinning-drive server, and the management system rotates it out to the tape library. (You can think of it kind of like a bigger, slower version of ZFS, with the spinning-drive server as the ARC/SLOG and the tapes as the zpool.) As files sit on the spinning drive pool unused, they get migrated to cold storage on the tapes. If someone needs footage off the library, they do a pull request from the server and it reads that request off the tapes to the live storage. They can add additional storage by just buying more tapes, but once you have more tapes than slots they'll need someone to feed the machine the tapes it asks for.

 

That's the way to go for a deep cold storage archive that only needs to be accessed occasionally. Tapes are far more shelf stable than hard drives.

 

https://spectralogic.com/products/spectra-stack-tape-library/

https://www.telestream.net/kumulate/kumulate.htm

 

19 hours ago, Nystemy said:

Googling a bit, an LTO 8 drive costs about 5 grand.
And the LTO 8 tape is about 25$/TB, and this is what I can just straight up buy now...

Where did you find that price? B&H has 12 TB LTO-8 tapes brand new for $80 USD each, before any partner discounts. Now that LTO-9 is the new hotness in massive datacenter storage, and the LTO-8 patent bickering is settled, prices dropped.

 

https://www.bhphotovideo.com/c/product/1662715-REG/fujifilm_16551221_lto_8_ultrium_12tb.html

 

Yes, the drives and libraries are expensive up-front, but that solution scales far more cost-effectively than spinning drives (and LMG is far beyond that point). They could theoretically match the storage capacity of each Petabyte Project with 100 LTO-8 tapes, for a cost of about $8,000 USD.
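Back-of-the-envelope, using that B&H price: 100 tapes x 12 TB native per LTO-8 tape = 1.2 PB raw, and 100 x $80 = $8,000 - so roughly one Petabyte Project per hundred tapes, before any compression.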

 

Even without a library, that opens up a very affordable backup solution: off-site physical media at a climate-controlled storage facility or archival company like Iron Mountain. 100 tapes will fit in a Legal size file box.

 

3 hours ago, GodAtum said:

I'm concerned about how Linus runs his company. He needs to hire a storage administrator or at least an IT admin. I wonder if they ever deploy security patches to their servers and networking hardware?

Same. Anthony and Jake are pretty good at what they do, but like Linus said server maintenance isn't their primary job. They really need a dedicated IT admin and media manager at this point.

 

The production server room really shouldn't be a sandbox anymore either, if I'm honest. That's your production equipment, your bread and butter, the core of your business, that everything relies on. Stop messing around with it!!! Yes, "I accidentally brought my server room to the brink of disaster... AGAIN" makes for entertaining (if butt-clenching) content, but it takes what has been described on multiple occasions as LMG's most valuable assets and puts them in jeopardy. As someone who works tech support at a much larger media production company than LMG, that makes me worry. No bueno.

 

3 hours ago, GodAtum said:

Also, he can just back up to AWS Glacier for archive storage.

At Petabyte Project's scale, they're probably looking at $5,000/mo in storage costs with Glacier. An LTO library would pay for itself within a couple years at that rate.
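Rough math on that, assuming roughly 1 PB sitting in Glacier's standard tier at something like $4 per TB per month (before request and retrieval fees): 1,000 TB x ~$4/TB/mo ≈ $4,000/mo, so $5,000/mo is the right ballpark once they're past a petabyte - and any large restore costs extra on top.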

 

11 hours ago, William Payne said:

I would really love to know what keeps going wrong in detail for these guys? I love the channel but I do see a lot of “radical” stuff done in the name of content when really they just needed a basic reliable setup that just does the job it needs to do reliably. 
 

Unfortunately though for YouTube catering to a “gaming” audience, reliable and good enough may not be the most exciting thing to pull in viewers. 

They've done some serious videos in the past, like the time OG Whonnock died (F.U. hardware RAID) and the time Anthony, Linus, and Jake spent a weekend blowing up their old infrastructure while trying to install a new PFSense firewall. I like those kind of meat-and-potatoes videos, even when things get done in a "knows just enough to be dangerous" kind of way. 

 

I for one, would love to see some more serious IT content, like a collab with someone like Wendell from Level 1 Techs or even Jeff from Craft Computing. You're right that "big number go BRRRRR" inevitably attracts a wider audience, though.

 

11 hours ago, MultiGamerClub said:

I keep seeing ZFS above too, time to google what the hell that is.

ZFS is basically a magic filesystem. Think of it like nested RAID in software, with some serious caching capabilities to speed it up. But with great power comes great responsibility, and it's easy to build a drive pool that's nowhere near as resilient as you think it is.
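A classic example of that last point (device names made up):

    # six drives in raidz2: any two of them can die and the pool survives
    zpool create tank raidz2 da0 da1 da2 da3 da4 da5
    # later someone "just adds space" with a lone disk; ZFS warns about the
    # mismatched replication level, so -f is needed to force it through
    zpool add -f tank da6
    # now losing da6 alone takes out the entire pool

That's exactly the kind of foot-gun that bites people who treat ZFS as magic.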

I sold my soul for ProSupport.


Linus has a direct line to Wendell.

My guess is Linus has asked Wendell what the costs were.

IDK what Jeff does as a day job.

 

MSI x399 sli plus  | AMD theardripper 2990wx all core 3ghz lock |Thermaltake flo ring 360 | EVGA 2080, Zotac 2080 |Gskill Ripjaws 128GB 3000 MHz | Corsair RM1200i |150tb | Asus tuff gaming mid tower| 10gb NIC


4 minutes ago, Needfuldoer said:

The right tool for a production company the size of LMG is a tape library backed up by an archive management server from someone like Telestream or Masstech. Online, recently used data lives on a regular spinning-drive server, and the management system rotates it out to the tape library. (You can think of it kind of like a bigger, slower version of ZFS, with the spinning-drive server as the ARC/SLOG and the tapes as the zpool.) As files sit on the spinning drive pool unused, they get migrated to cold storage on the tapes. If someone needs footage off the library, they do a pull request from the server and it reads that request off the tapes to the live storage. They can add additional storage by just buying more tapes, but once you have more tapes than slots they'll need someone to feed the machine the tapes it asks for.

 

That's the way to go for a deep cold storage archive that only needs to be accessed occasionally. Tapes are far more shelf stable than hard drives.

 

https://spectralogic.com/products/spectra-stack-tape-library/

https://www.telestream.net/kumulate/kumulate.htm

 

Where did you find that price? B&H has 12 TB LTO-8 tapes brand new for $80 USD each, before any partner discounts.

 

Yes, the drives and libraries are expensive up-front, but that solution scales far more cost-effectively than spinning drives (and LMG is far beyond that point). They could theoretically match the storage capacity of each Petabyte Project with 100 LTO-8 tapes, for a cost of about $8,000 USD.

 

Even without a library, that opens up a very affordable backup solution: off-site physical media at a climate-controlled storage facility or archival company like Iron Mountain. 100 tapes will fit in a Legal size file box.

I just took the price from the first brick and mortar store near where I live. The statement was rather about "even a bad deal on LTO-8 tape isn't all that expensive." (Though the person I replied to talked about multiple small automated libraries + tapes, so their cost calculation was a bit excessive because of that, while my idea of a backup for this specific scenario was simply to stack tapes onto a bookshelf at home manually. I am not talking about a "general best practice backup for databases" but rather "something I consider reasonable for LMG's database, given the information Linus has provided about their requirements throughout the years.")

 

I know LTO tapes are cheap and exceptionally cost competitive against hard drives for pure backup applications where one hopefully won't need to ever read them.

In regards to storing them, yes, a climate controlled environment is a decent thing, though putting them in a zip-lock bag in a closet is good enough dust protection, and humidity can relatively easily be controlled by adding a desiccant bag along with the tape (they should preferably be kept dry). Then it is mostly just temperature cycling left to worry about, and most people keep their homes more or less temperature regulated these days. Going to a climate controlled storage facility is another step up, but usually with a decent price tag attached, and the main thing such facilities offer is theft protection; if the data isn't valuable in that regard, then it is a bit silly to pay for such security.

 

And yes, in regards to size, a cheap bookshelf probably stores a good few tens of PB with ease.

 

For a smaller company like LMG dealing with a fair bit of data that can be fun to archive, I personally don't see any major reason to go all out on having a "proper" backup. Their vault of old content isn't mission critical according to their own statements; it's more of a "fun" long term project.

 

Though they are moving up into the area where a tape library can potentially make sense. But it depends on how they work with their data. HDDs have the major advantage of superior access times compared to tape: tape is up at minutes, HDDs are at milliseconds; take another similar step down in latency and we aren't far from DRAM access times.

 

For some applications a tape library never makes sense, regardless of its cost per GB advantage; access time can matter a lot. And Adobe's suite of programs is apparently prone to crashing/bugging out when faced with "long" access times, to the point that LMG's main storage server uses flash only and two 25Gb/s connections (or did they switch to 100Gb/s recently? I don't remember...). So projects linking to files out in the vault would require fairly low access times, and this is a reason why I don't think using tape in an active fashion is worthwhile in this situation.

 

A good example of the supremacy of HDDs in online storage servers is Google Drive and YouTube: you can access any file, instantly (well, it might take a second, got to spin up the drive). They simply couldn't use tape for primary storage, and no caching solution would have a hit rate of 100%. It is many exabytes of data, and tape is cheaper per GB, but the access times make it a complete no-go. No user is going to accept 10-30 seconds of waiting for the library to get over to the tape, followed by on average another minute of waiting to get to the right place on the tape before they can access their file - and that is if a drive is even free to read it. But I also wouldn't be the slightest bit surprised to see some impressive tape libraries at Google, since tape is still an exceptionally cost effective cold backup.

 

But if LMG do use the vault to store data other than old video, like graphic design for their merchandise, or various paperwork and other documents, then they should at least have a basic backup. Or at the very least make sure that when an HDD gets flagged as "bad" it actually alerts them, so they can fix it within the same day, preferably having at least 1 hot spare in each storage server as well. Not to mention that one can always back up data at the file level onto tapes, and putting them on a bookshelf works fine for the static data, and partly for the active data as well (and one can make two copies of all files, tape is sufficiently cheap).

 

And somehow this post became a book...


Linus needs to install a telecom rack if he wants to summon Wendell.

 

1 hour ago, Nystemy said:

For some applications a tape library never makes sense, regardless of its cost per GB advantage; access time can matter a lot. And Adobe's suite of programs is apparently prone to crashing/bugging out when faced with "long" access times, to the point that LMG's main storage server uses flash only and two 25Gb/s connections (or did they switch to 100Gb/s recently? I don't remember...). So projects linking to files out in the vault would require fairly low access times, and this is a reason why I don't think using tape in an active fashion is worthwhile in this situation.

That's where having a storage management server (and ideally a human media manager) creates a tiered storage system, a lot like they have now:

 

Active projects live on New New New Whonnock

Recent (but inactive) projects live on a spinning drive pool

Archived footage lives on tape

 

When someone needs footage from the archive, they submit a request to the management server. The server then grabs the appropriate tape and restores the files to online, active storage (either the HDD pool or to Whonnock). As inactive projects age out, they get moved to the Vault, then finally archived to tape, freeing up online storage space. The tape library is deep storage in this use case, not a backup. (Being able to back up to redundant tapes that can be cycled out to off-site storage is a nice bonus.)

 

This does mean they wouldn't be able to just dip into the archive and immediately grab whatever files they want whenever they want them anymore, but I believe the scalability and resilience of a library is appropriate for the kind of workload they have now.

 

And I think the problems they were seeing with their edit machines stem from working with massive raw 8K footage and how sub-optimal SMB is for editing over. IIRC there are two 100 gig connections between Whonnock and the editors' switch, then each edit system gets a dedicated 25 gigabit link to the switch. That should be more than plenty, especially now that their working storage is a full NVME machine.

 

And if they aren't doing so already, all their non-video documents (office files, finance, HR, merch projects, even a redundant copy of the video editors' templates, etc) should live on their own server. A 24- or 48-bay 2.5" 2U server full of SAS drives, SATA SSDs, or NVME SSDs would be more than enough for this. It could still replicate its backups and archives to the bulk pool and tape, but the live storage for the non-video operations should get its own server with a data retention, compression, and snapshot policy that's more appropriate for that kind of workload. That way, if the editors fill up (or crash) Whonnock, it doesn't affect the entire operation. Those files are much smaller than raw 8K footage anyway, so they can stay active a lot longer without needing to be flushed out to cold storage as often (if ever).

I sold my soul for ProSupport.


1 hour ago, Needfuldoer said:

Linus needs to install a telecom rack if he wants to summon Wendell.

 

That's where having a storage management server (and ideally a human media manager) creates a tiered storage system, a lot like they have now:

 

Active projects live on New New New Whonnock

Recent (but inactive) projects live on a spinning drive pool

Archived footage lives on tape

 

When someone needs footage from the archive, they submit a request to the management server. The server then grabs the appropriate tape and restores the files to online, active storage (either the HDD pool or to Whonnock). As inactive projects age out, they get archived to tape, freeing up online storage space. The tape library is deep storage in this use case, not a backup. (Being able to back up to redundant tapes that can be cycled out to off-site storage is a nice bonus.)

 

This does mean they wouldn't be able to just dip into the archive and grab whatever files they want whenever they want them anymore, but I believe the scalability and resilience of a library is appropriate for the kind of workload they have now.

 

And I think the problems they were seeing with their edit machines stem from working with massive raw 8K footage and how sub-optimal SMB is for editing over. IIRC there are two 100 gig connections between Whonnock and the editors' switch, then each edit system gets a dedicated 25 gigabit link to the switch. That should be more than plenty, especially now that their working storage is a full NVME machine.

It is a good question what is a more appropriate solution.

HDDs aren't all that expensive, and their advantage in access time is a decent value-add to consider. Though, their old 4K videos from back when they only really had Brandon in the filming department are likely not all that much content in terms of data, and that was the case for a good few years; though they have been recording on RED for a few years now, and have switched to 12K Sony if I recall correctly.

But in short, a lot of the really old content that gets referenced more seldom these days (from before they moved to their office) is likely a fairly small chunk of data compared to what has been produced since. So they would likely still need a fairly huge storage server with spinning disks regardless. (And some of the really old video from the first year or three is likely just full HD, but I haven't checked. It would be fun to see a graph of data usage over time; I suspect it is steadily increasing in terms of TB/day.)

 

And it isn't just LTT that pulls out old videos; I have seen both Techquickie and TechLinked do it as well, although more rarely.

Though, most of the time when callbacks are made, it is usually to previous appearances, shows/conferences, prior states of a technology, or prior projects they have done on the same topic. And considering how many conferences, guests and projects LTT/LMG have managed to go through over the years, it leaves the door open for a callback almost regardless of what is shown, ending up with a more shotgun approach to referencing older content; it is hard to know what is going to be referenced.

 

But my point is.

It usually isn't worthwhile to take on the extra complexity of adding another tier to one's storage solution if the higher tier(s) are required to store 75-90% of the content anyway for smooth operation. Not to mention that it can throw a wrench into overall project planning, but this I think is a lesser issue.

 

Going for a "backup only" approach to tape adds fewer complexities. Not to mention that one doesn't have to fear the day one's expensive tape library starts overflowing.

