Our data is GONE... Again

rushboarduk · January 30, 2022

I am smiling even as I say this (as the parties involved may likely reply a big, fat NO) but, on a side note, LTT could try and reach out to do a video on 5D storage...

The storage capacity of one of these crystal mediums is 360TB, and will last longer than Humanity, but has its own drawbacks. Such a technology will also, probably, never grace the consumer/commercial market in its current state, though the way it handles a data format like RAW video would be interesting to cover.

jde3 · January 30, 2022

Oh no! - I've been saying this for years that you guys need a full time sysadmin. You guys are great, but I went to collage for this stuff and I've had experience with ZFS since 2006 when it came out in Solaris.

Your data requirements are SO large for your business that 2x parity runs the risk of un-correctable disk errors on the remaining devices preventing you from resilvering the pool that as far as I can tell from your explanation that is the issue you ran into. It sounds like you have a good plan here for recovery hot spares can help with this but you HAVE to cron jobs setup and change them immediately when they fail.

You might be able to recover quite a lot of that but it's going to be tricky. I remember ZFS dev's talking about a way to recover partial data from a failed array but I don't know the status of that or if it was ever implemented. Too bad you're in Canada I'd help.. Actually... Contact Alan Jude, he lives in CA. He writes books on ZFS and he's a good guy. He's the guy I would go to if I ever was ever in the deep waters without a paddle. If Alan can't fix you up, nobody can.

Good luck guys!

Lurick · January 30, 2022

Sounds like ya'll need Pure or Netapp

(and to hire someone to dedicate to this)

erylflynn · January 31, 2022

Have you considered setting up a monitoring server? Free software I love for this is check_MK which I install and use from OMD the Open Monitoring Distro. https://omdistro.org

They have ZFS monitoring baked in I believe. And if you need more some easy python scripting could take care of the rest.

GDRRiley · January 31, 2022

On 1/29/2022 at 10:07 AM, manikyath said:

i cant quote exact figures because it's too long ago and actually sort of sensitive information, but a 24-bay robotic library (because you dont want to swap tapes on a daily basis...) with some matching backup software on the storage server side isnt all THAT expensive in the grand scheme of things, and once it's set up you basicly have a (fairly) flat cost per TB stored to grow the archive.

I'll mention a large IBM mutli cabinet tape library is only in the 1/4 million USD range

not ruining raid Z3 with this large of disks was a bad move from the start. and not updating software, the question was not if but when will it

Sean Amos · January 31, 2022

If you want a managed solution for offsite backups/archives, we use AWS S3 Glacier Deep Archive. It's not listed in the Backblaze comparison and it's $0,00099per GB, so about $1 per TB.

What's the catch, why's it so cheap? To take a file(s) out of archival can take up to 12 hours. It really is designed for highly infrequently accessed data that you want to keep around for long periods of time.

dogwitch · January 31, 2022

3 hours ago, Sean Amos said:

If you want a managed solution for offsite backups/archives, we use AWS S3 Glacier Deep Archive. It's not listed in the Backblaze comparison and it's $0,00099per GB, so about $1 per TB.

What's the catch, why's it so cheap? To take a file(s) out of archival can take up to 12 hours. It really is designed for highly infrequently accessed data that you want to keep around for long periods of time.

am look on that with some core data base stuff and my family pictures. just want a in case of that dirt cheap .

alpha754293 · January 31, 2022

On 1/29/2022 at 12:15 PM, jakkuh_t said:

This is absolutely ridiculous.

You guys have a video where Linus bought a $5500 tape drive

And you aren't even using it.

At 1-2 PB of data, there is absolutely NO reason for you NOT to be using a LTO-8 tape library at this point where you have anywhere between 2-4 tape drives in order to handle multiple backups and retrieve operations simultaneously.

The video above makes a patently INCORRECT/false statement:

"Backing up over a petabyte of data is really expensive. Either we would need to build a duplicate server array to backup to or we could backup to the cloud. But even using the economical option, Backblaze B2, it would cost us somewhere between five and ten-thousand US dollars per month."

The part where this statement is incorrect is that Linus (and the team who wrote this) left out, crucially, the option for using local (and/or offsite) tape backup storage solutions.

This is stupid. (I was going to try and write something nicer, but on second thought, no, this is just really stupid.)

For trying to backup 2 PB of data, you can buy one hundred and sixty-seven (167) 12 TB uncompressed/30 TB compressed LTO-8 tapes for $66.25 each from https://ltoworld.com/collections/lto-media/products/quantum-lto-8 which would total up to $11063.75 USD. (Double it if you want a father-son backup topology. Triple it if you want a grandfather-father-son backup topology.)

As of this time of writing, a dual magazine, 16-slot LTO-8 tape autoloader from Quantum is $4798 USD (https://www.backupworks.com/Quantum-superloader-3-LTO-8-16-Slot.aspx). The ATTO ExpressSAS host bus adapter is $497 (https://www.backupworks.com/atto-expressSAS-12GB-ESAH-1280-GT0.aspx) and the external mini SAS cable going from (SFF-8644 to SFF-8088) is $79.86 (https://www.backupworks.com/HD-Mini-SAS-to-Mini-SAS-2m.aspx).

All said and done, that totals up to $16438.61 USD which is what you would spend in about a-month-and-a-half, trying to backup 2 PB of data.

And that, you can throw it into basically ANY plain old system that you want, meaning if you have a system that has enough PCIe lanes, ANY system that is used to run the tape autoloader will work for you.

(I have an Intel Core i7-3930K (6 cores, 3.2 GHz) system with 4x Crucial 8 GB DDR3-1600 unbuffered, non-ECC RAM on an Asus X79 Sabertooth motherboard, and a Mellanox 100 Gbps Infiniband card (MCX456A-ECAT) that runs my tape backup system at home, albeit I'm only using a single, manual drive (no autoloader).)

Compare and contrast this to the fact that to buy one HUNDRED (100) Seagate Exos 20 TB drives from Newegg.com, that's $524.99 USD per drive * 100 = $52499.00 USD just in the hard drives alone, without the host system. (https://www.newegg.com/seagate-exos-x20-st20000nm007d-20tb/p/N82E16822185011?quicklink=true).

Therefore; like I said the statement "to backup 2 PB of data needing to build a duplicate server" and/or using cloud backup is a patently false statement because it leaves out the local tape backup option, which CLEARLY shows that it is the cheaper option. Even if you TRIPLED the number of tapes from 167 to 501 (for a grandfather-father-son) backup topology, you'd STILL only be out $38566.11 USD.

And I'm sure that Evan Sackstein from Backupworks.com would be able to put together a quote for whatever you needs and budget is going to be along with all of the hardware that you are going to need to get you up and running quickly (i.e. whether you actually WANT all 167 tapes to be able to and be ready to be accessed at ANY moment, or whether you would want to save a bit of money (or quite a lot of money), skip the auto loader, and just manually manage it yourself.)

I mean, that depends on how much you are or am willing to spend on it.

I have said this to people over and over and over again - ZFS is NOT a backup.

This video just proves/shows this point.

The exact failure mode that I have talked about (which is supposed to be rare in probability) is EXACTLY what happened here.

And for an operation like Linus Tech Tips, there is absolutely NO reason why they aren't running a LTO-8 tape backup.

Sure, it's not as fast as being able to have 2 PB of data, ALWAYS ON, and ALWAYS live, but the question is "do you REALLY need all of that data to be live, all of the time, when even, in the video, Linus admits that the team RARELY touches or needs some footage from the archive?"

(And yes, doing the initial backup of 2 PB of data is going to suck big time, but that is the result of NOT having the tape backup system up and running all along and waiting until you have experienced a catastrophic failure like this in order to try and recover your data instead of having deployed said tape backup library system when you got to around the 100-200 TB mark so that you would have been implemented the best practices for data management, archiving, backing your data up, and just general management.)

(Sidebar: I used Ubuntu to compile the LTFS that's needed to run my tape backup system, so if you need help with that (instead of connecting your tape drive over Thunderbolt to a Mac), I'm sure that you should have enough parts to be able to piece together a system that you can use as the system that just runs the tape backup system/drive/autoloader/tape library, and you can run upto 8 drives on a single SAS host bus adapter card. (Again, that's up to you how much you would want to spend in initial capital expediture/investment cost, because you only really need that many drives for your initial backup run of trying to save 2 PB of data all in one go, but after that initial backup, you might only new a handful of drives if you have multiple people trying to pull footage from your archive, at the same time.))

But yeah, this is otherwise, stupid.

And there is NO reason why you guys aren't rocking LTO-8 tape backups already, locally, on site.

*edit*
If you ONLY want to backup around 1.2 PB of data, at 12 TB per LTO-8 tape, you would only need 100 tapes. At $66.25 USD/tape, that would work out to be $6625 USD for the tapes alone.

The cost of the autoloader, the SAS HBA, and the external SAS cable remains the same, so your total would come out to $11999.86 USD.

By comparison, to backup 1.2 PB using the Seagate Exos 20 TB drives at $524.99 USD/drive, and for sixty (60) drives, that would still run you $31499.40.

In other words, the tape solution is STILL the more cost effective solution.

alpha754293 · January 31, 2022

20 hours ago, jde3 said:

You guys are great, but I went to collage for this stuff and I've had experience with ZFS since 2001 when it came out in Solaris.

That's awesome that you had access to the development source code.

I wasn't able to get into ZFS until it was rolled in to the main Solaris 10 6/06 ("U2") release which came after ZFS was released in build 27 of OpenSolaris.

jde3 · January 31, 2022

28 minutes ago, alpha754293 said:

That's awesome that you had access to the development source code.

I wasn't able to get into ZFS until it was rolled in to the main Solaris 10 6/06 ("U2") release which came after ZFS was released in build 27 of OpenSolaris.

I'm the same, Solaris 10. It *feels* like 20 years lol. First system I used it on for serious data was a SunFire X4500 with 48 spindles and it's been part of my job in some form or fashion ever since. Current business uses ZFS on FreeBSD as auxiliary storage next to NetApp.

alpha754293 · January 31, 2022

2 hours ago, jde3 said:

I'm the same, Solaris 10. It *feels* like 20 years lol. First system I used it on for serious data was a SunFire X4500 with 48 spindles and it's been part of my job in some form or fashion ever since. Current business uses ZFS on FreeBSD as auxiliary storage next to NetApp.

15 years ago. 20 years ago. Close enough, right?

Yeah, the first time that I used ZFS also in a serious fashion was I used to have a home built server with an AMD Opteron processor, I think, and an Adaptec 16 port SATA RAID HBA, and sixteen (16) 500 GB drives for a total of 8 TB of storage, which, back in 2006/2007, was a lot.

I finally jumped off the Solaris train when my Solaris VM finally died and I couldn't bring it back up (or not without spending a LOT more time than what I would like to spend trying to spin it back up), so the dumb, web server hosting tasks has now moved onto SLES12 SP1, and I use ZFS now in a TrueNAS Core 12.0 U1.1 server at home.

It's interesting to me whenever I am talking to people about ZFS and they are trying to "educate" me about said ZFS because they don't know about my history in regards to working with ZFS.

(And my most recent attempt at deploying ZFS on Solaris was just prior to the holidays, before I had installed TrueNAS Core 12 on my server because I wanted to see if I could get Solaris going on the system. (It had mixed results.) And the performance wasn't that much better in Solaris vs. TrueNAS, so I switched over to TrueNAS instead.)

If my Solaris VM didn't die recently, I would still be able to claim that I'm STILL using ZFS on Solaris (in some form), but pity that's not the case. (I actually had to set up a Solaris 10 PXE and DHCP server in order to be able to jumpstart the Solaris installation on my dual Xeon E5310 server. THAT was an adventure unto itself.)

isfot · January 31, 2022

You should check out amazon glacier. Storage price is $0.00099 per GB.

That is $990 per PB.

Only downside is that it can take several hours to retrieve files.

https://aws.amazon.com/s3/pricing/

jessejarvi · February 1, 2022

U$5000 to $10 000 per month for backup storage? Bullshit. 2PB on Azure Archive blob storage with a 3y reserved instance? $1200-1600 per month.

It takes maybe 2 hours to implement this with Veeam if starting from scratch.

Simple enterprise IT stuff.

Sean Amos · February 1, 2022

8 hours ago, jessejarvi said:

U$5000 to $10 000 per month for backup storage? Bullshit. 2PB on Azure Archive blob storage with a 3y reserved instance? $1200-1600 per month.

It takes maybe 2 hours to implement this with Veeam if starting from scratch.

Simple enterprise IT stuff.

Exactly. AWS, Azure and GCP have Archive storage designed as a managed alternative to tape storage that are all very competitive in price. They didn't do their research properly.

Needfuldoer · February 1, 2022

Those services are fine for a cold backup, but they get you on access times (and fees) for data retrieval. If LMG wants to maintain the ability to dip into their entire back catalog of footage at any time, an on-premises library makes sense. Then they can have a set of active tapes, and a set of backups at a climate controlled storage facility or document archival service like Iron Mountain.

9 hours ago, jessejarvi said:

Simple enterprise IT stuff.

That's just it, they're big enough now that they need someone with enterprise IT experience as a dedicated system administrator, and they don't have one.

moatmote · February 1, 2022

Using Stablebit CloudDrive and DrivePool, along with several google business accounts, you could've easily backed up all this data to Google Drive for something on the scale of a couple hundred dollars a month (mainly for the multiple g-suite accounts you'd use to get around the 750GB per day upload limit). CloudDrive and DrivePool have built in redundancy options as well, so you wouldn't be completely at the mercy of google drive not randomly replacing your files with old copies when data loss happens. In fact, I emailed you a while back about this when you made a video that mentioned google drive. The absolute gold standard in online backups? No, of course not. But cheap, and better than nothing.

I've used this method to keep a stable plex server of ~100TB for years now.

jde3 · February 1, 2022

What they are doing is fine, they just need to do it the right way. On the hardware they have a ZFS array like that is perfect.. all they need is an experienced storage admin (with FreeBSD experience) and they are good to go.

justTechDev · February 1, 2022

Since you are at petabyte scale and potentially growing with variable needs/speed of access , do look at object Storage solutions like CEPH (its very similar to S3 storage but on your own premises), these is what we used to use in enterprise with high data influx, rapid updates and storage of like > 10 PB (Where I used to work as developer). You would need some one time expertise configuring which I believe floatplane team can help you guys out.

Pros : High Availability data storage on commodity grade hardware ( No special high density CPU requirement), Inbuilt offsite data replication capabilities (config know-how req.)
Cons: Initial technical expertise from cloud capable network storage engineers.

Disclaimer: I am not a network engineer just have worked on products that needed that scale and archival capabilities but open-source.

jessejarvi · February 2, 2022

19 hours ago, Needfuldoer said:

Those services are fine for a cold backup, but they get you on access times (and fees) for data retrieval. If LMG wants to maintain the ability to dip into their entire back catalog of footage at any time, an on-premises library makes sense. Then they can have a set of active tapes, and a set of backups at a climate controlled storage facility or document archival service like Iron Mountain.

Yep, It’s only meant to be an archival solution as an offsite backup.

XWAUForceflow · February 2, 2022

Man... this was a disaster just waiting to happen, though I guess all in all it's not that big of a deal. (At least seeing how cool Linus was about it in the video) I really do hope that Linus takes the intro he made to heart himself. He needs someone whose only job in the company is to do the IT. Manage it, plan it, implement it and MAINTAIN it. Doesn't mean there can't be videos about it, doesn't mean the gang cannot chime in and help, but boy do you need one person that's dedicated to doing this.

Looking in here seeing all the 'solutions' on how to do backup for LTT? NONE of those will work if nobody is actually actively maintaining them. And with the myriad of other things that folks at LTT (and apparently especially the 'IT guys') do, those things will always get lost. And honestly, loosing your archive data is probably the least problematic there. I honestly hope your cyber-security is better maintained than this.

LapsedMemory · February 2, 2022

At the very least, I hope this at least taught you to turn on the "Email notification on HD failure" feature on these arrays. Would have prevented this entire mess, as you could have fixed the issue after the first drive 'failure' instead of catching it after multiple failures.

jde3 · February 8, 2022

LTT and community please read this. (for the love of humanity..)
https://storagemojo.com/2010/02/27/does-raid-6-stops-working-in-2019/

tl;dr: bigger drives = longer rebuilds + more latent errors -> greater chance of RAID 6 failure. If you business depends on it, RAID 7-3 or better now.

I understand LTT is using a striped raidz 6 (like raid 60) but the same thing still applies due to the sector count and failure rate of the drives. I know large storage numbers are attractive but do it right and you won't have any more problems.

googleuser3212 · February 8, 2022

Welp

myxiplx · February 8, 2022

1 hour ago, jde3 said:

LTT and community please read this. (for the love of humanity..)
https://storagemojo.com/2010/02/27/does-raid-6-stops-working-in-2019/

tl;dr: bigger drives = longer rebuilds + more latent errors -> greater chance of RAID 6 failure. If you business depends on it, RAID 7-3 or better now.

I understand LTT is using a striped raidz 6 (like raid 60) but the same thing still applies due to the sector count and failure rate of the drives. I know large storage numbers are attractive but do it right and you won't have any more problems.

That article is over a decade old, and made poor assumptions even at the time, RAID-6 is still fine for the vast majority of use cases and you really can't take the word of Sun's ZFS engineering team as gospel. They were smart guys, but really didn't have a lot of experience building enterprise storage solutions. ZFS is an interesting concept, but it's handling of read errors and recovery abilities are below par. Every single enterprise vendor that's built a product based upon ZFS has needed to do some very heavy lifting under the covers to resolve its problems.

And you can also disprove that article by simply looking at the market, the vast majority of enterprise arrays sold over the last decade or so use RAID-6 or similar (Dell, HPE, Pure, IBM, Hitachi, NetApp). Dual-parity is the de-facto standard across the board for every vendor, and you simply don't get customers losing data due to it. I've been in enterprise storage for well over a decade and can't remember the last time I heard of a customer losing data due to a double disk failure, not for any vendor I've worked for.

The problem ZFS has is that it does not do full scrubs of the drives, so it never fully ensures that drives are healthy. It only scrubs the existing data, and doesn't work closely enough with the underlying disk. And because it's never actively health checking the drives, nor re-mapping their bad sectors, nor proactively failing drives which encounter large numbers of bad sectors it's very, very common for errors to occur during rebuilds with it. ZFS waits until *after* you've started losing data before it begins a rebuild, and that rebuild may well be going on with several other disks having errors that it's also not fixed.

Enterprise arrays simply don't do that. Drives are scanned proactively to ensure they're healthy, they actively monitor the drives to ensure that scenario cannot happen, and they all include ways to alert the user when faults occur at a minimum. Most enterprise storage vendors now include call-home monitoring as standard, meaning as soon as there's any chance a drive has faults that could cause data loss, it gets taken offline and the array automatically raises a support ticket. Heck, a decent number of vendors actually open the ticket before the drive even gets taken offline: They can spot a drive that's starting to fail in advance and have it replaced ahead of time, well before there's any risk whatsoever.

A few vendors support N+3 parity protection, but it's largely a marketing gimmick and is rarely deployed as there's simply no need for it.

jde3 · February 8, 2022

49 minutes ago, myxiplx said:

you really can't take the word of Sun's ZFS engineering team as gospel.

Suit yourself They are only the experts on it.. what do they know?

You can do the math tho. You have a set sector failure rate but you add more and more sectors. What happens to overall reliability? a 1TB drive is not the same as a 12TB drive.

Sign In

Our data is GONE... Again

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites