
alpha754293

Member
  • Posts: 209
  • Joined
  • Last visited

Reputation Activity

  1. Like
    alpha754293 got a reaction from Needfuldoer in My Ryzen cluster now has (almost) 100 Gbps networking between each other   
    I have two AMD Ryzen systems, both with the Ryzen 9 5950X processor, but one has an Asus X570 TUF Gaming Pro WiFi motherboard whilst the other has an Asus ROG STRIX X570-E Gaming WiFi II motherboard.

    Earlier today, I was trying to diagnose an issue with my 100 Gbps network connection between the two Ryzen nodes and the microcluster headnode, where, upon running ib_send_bw, I was only getting around 14 Gbps.

    Now that I've taken the discrete GPUs out of each of the systems, I'm getting 96.58 Gbps on my micro HPC cluster.

    Yay!!!

    (I can't imagine there being too many people who have Ryzen systems with 100 Gbps networking tying them together.)
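    For context, assuming these are 4X EDR links (that assumption is mine): each lane signals at 25.78125 Gbaud with 64b/66b encoding, so the usable line rate works out to 4 x 25 Gbps = 100 Gbps, which puts 96.58 Gbps at roughly 96.6% of line rate. A quick sketch of that arithmetic:

    ```python
    # Rough line-rate check, assuming a 4X EDR InfiniBand link.
    lanes = 4
    signaling_gbaud = 25.78125      # EDR per-lane signalling rate
    encoding = 64 / 66              # 64b/66b line encoding overhead

    line_rate_gbps = lanes * signaling_gbaud * encoding   # = 100.0 Gbps
    measured_gbps = 96.58                                 # ib_send_bw result above

    print(f"Line rate:  {line_rate_gbps:.1f} Gbps")
    print(f"Efficiency: {measured_gbps / line_rate_gbps:.1%}")   # ~96.6%
    ```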
  2. Like
    alpha754293 got a reaction from Lurick in My Ryzen cluster now has (almost) 100 Gbps networking between each other   
    I have two AMD Ryzen systems, both with the Ryzen 9 5950X processor, but one has an Asus X570 TUF Gaming Pro WiFi motherboard whilst the other has an Asus ROG STRIX X570-E Gaming WiFi II motherboard.

    Earlier today, I was trying to diagnose an issue with my 100 Gbps network connection between the two Ryzen nodes and the microcluster headnode, where, upon running ib_send_bw, I was only getting around 14 Gbps.

    Now that I've taken the discrete GPUs out of each of the systems, I'm getting 96.58 Gbps on my micro HPC cluster.

    Yay!!!

    (I can't imagine there being too many people who have Ryzen systems with 100 Gbps networking tying them together.)
  3. Like
    alpha754293 got a reaction from dogwitch in Dream Has Too Much Money (SPONSORED)   
    I dunno.

    I guess that depends on how much footage their systems are trying to ingest/render simultaneously.

    I guess that's also the advantage of having enterprise-grade NVMe SSDs, such that at 30 TB of raw capacity, even a 1 DWPD rating still means that you can write 30 TB/day.
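    To put rough numbers on that endurance rating (a minimal sketch; the 5-year warranty period here is an assumption for illustration, not a spec for any particular drive):

    ```python
    # Rough endurance math for an enterprise NVMe SSD.
    # Assumptions: 30 TB raw capacity, 1 DWPD rating, 5-year warranty.
    capacity_tb = 30        # raw capacity in TB
    dwpd = 1.0              # rated drive writes per day
    warranty_years = 5      # assumed warranty period (illustration only)

    daily_write_budget_tb = capacity_tb * dwpd                   # 30 TB/day
    total_tbw = daily_write_budget_tb * 365 * warranty_years     # ~54,750 TB

    print(f"Daily write budget: {daily_write_budget_tb:.0f} TB/day")
    print(f"Rated endurance over warranty: {total_tbw:,.0f} TB (~{total_tbw / 1000:.1f} PB)")
    ```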
  4. Like
    alpha754293 got a reaction from kumicota in The $1,000,000 Unboxing. (SPONSORED)   
    So....interestingly enough -- if you actually google/YouTube "NFS vs. iSCSI vs. SMB", there are videos (supported by evidence) showing that SMB/CIFS is actually THE more widely supported data transport protocol compared to NFS. (Sidenote: CIFS is actually a "dialect" of SMB. Source: https://en.wikipedia.org/wiki/Server_Message_Block#CIFS)
     
    More importantly, if you also YouTube NFS on Windows, you will find that NFS can actually perform WORSE on Windows clients than SMB (because NFS isn't native to Windows, so it was added on as an "afterthought" of sorts), whereas SMB, being a Microsoft product, is in Windows "natively". (Why do you think that *nix servers always have to install Samba ex post facto? Although to be fair, you also often have to install NFS after the fact as well, if the default OS install image doesn't include the nfs-common and/or nfs-kernel-server packages.)

    The backend can be whatever you want it to be in order to be able to facilitate server-to-server transfers. (e.g. I run NFSoRDMA for backend transfers on systems and clients that support NFSoRDMA over 100 Gbps Infiniband at home).

    The front end can also be whatever it is that you want it to be. For my Windows clients at home that either can't or aren't able to support NFSoRDMA on 100 Gbps Infiniband, I use SMB.

    Unless you're going to be deploying 200 GbE ConnectX-6 cards in every single workstation, it would be irrelevant whether you're using SMB or not. (e.g. if said Windows clients/desktop editing workstations are ONLY using 10 GbE, you won't even see/notice a difference between SMB and NFS here.)

    It'd be an entirely different story if they were going to be using SMB Multi-channel and/or SMB Direct.

    But I think that they've already switched over to TrueNAS Core at least once before, so it is not unreasonable to surmise that they are looking to at LEAST go with TrueNAS Core, or perhaps the potentially better option for them would be to actually move to TrueNAS Scale, once Scale is more stable for production environment usage/deployments. (On that note, they would probably be wise to stick with the more stable TrueNAS Core for production deployments.)

    And for the record, as someone who runs NFSoRDMA on 100 Gbps Infiniband (granted, I'm using spinning rust SAS 12 Gbps drives instead of NVMe 4.0 x4 SSDs), it really doesn't matter whether you're using NFS or SMB. The speeds are about the same.

    You're never going to be able to hit the theoretical peak bandwidth speeds anyways (which is what all of the advertised speeds are).
  5. Like
    alpha754293 got a reaction from igormp in The $1,000,000 Unboxing. (SPONSORED)   
    So....interestingly enough -- if you actually google/YouTube "NFS vs. iSCSI vs. SMB", there are videos (supported by evidence) showing that SMB/CIFS is actually THE more widely supported data transport protocol compared to NFS. (Sidenote: CIFS is actually a "dialect" of SMB. Source: https://en.wikipedia.org/wiki/Server_Message_Block#CIFS)
     
    More importantly, if you also YouTube NFS on Windows, you will find that NFS can actually perform WORSE on Windows clients than SMB (because NFS isn't native to Windows, so it was added on as an "afterthought" of sorts), whereas SMB, being a Microsoft product, is in Windows "natively". (Why do you think that *nix servers always have to install Samba ex post facto? Although to be fair, you also often have to install NFS after the fact as well, if the default OS install image doesn't include the nfs-common and/or nfs-kernel-server packages.)

    The backend can be whatever you want it to be in order to be able to facilitate server-to-server transfers. (e.g. I run NFSoRDMA for backend transfers on systems and clients that support NFSoRDMA over 100 Gbps Infiniband at home).

    The front end can also be whatever it is that you want it to be. For my Windows clients at home that either can't or aren't able to support NFSoRDMA on 100 Gbps Infiniband, I use SMB.

    Unless you're going to be deploying 200 GbE ConnectX-6 cards in every single workstation, it would be irrelevant whether you're using SMB or not. (e.g. if said Windows clients/desktop editing workstations are ONLY using 10 GbE, you won't even see/notice a difference between SMB and NFS here.)

    It'd be an entirely different story if they were going to be using SMB Multi-channel and/or SMB Direct.

    But I think that they've already switched over to TrueNAS Core at least once before, so it is not unreasonable to surmise that they are looking to at LEAST go with TrueNAS Core, or perhaps the potentially better option for them would be to actually move to TrueNAS Scale, once Scale is more stable for production environment usage/deployments. (On that note, they would probably be wise to stick with the more stable TrueNAS Core for production deployments.)

    And for the record, as someone who runs NFSoRDMA on 100 Gbps Infiniband (granted, I'm using spinning rust SAS 12 Gbps drives instead of NVMe 4.0 x4 SSDs), it really doesn't matter whether you're using NFS or SMB. The speeds are about the same.

    You're never going to be able to hit the theoretical peak bandwidth speeds anyways (which is what all of the advertised speeds are).
  6. Like
    alpha754293 got a reaction from igormp in The $1,000,000 Unboxing. (SPONSORED)   
    I would urge caution against using GlusterFS, with each of the storage nodes acting as a Gluster volume brick, in order to tie all of the storage nodes back to the cluster headnode.
     
    The process of removing a Gluster volume brick is non-trivial, or at the very least, you can't detach a Gluster brick from a Gluster volume quickly (even after issuing the command to shrink the Gluster volume so that you can detach the brick from it).
     
    Also, if you are planning on using TrueNAS, please note that as of TrueNAS Core 12.0-U1.1, NFS over RDMA is NOT supported nor operational.

    (See my thread here, more specifically, "edit #2" for specific deployment details: https://www.truenas.com/community/threads/truenas-12-0-u1-1-infiniband-support.90996/)

    Without SOME form of RDMA, you're going to run into issues trying to hit/maximise your available storage bandwidth capacity.
  7. Like
    alpha754293 got a reaction from igormp in The $1,000,000 Unboxing. (SPONSORED)   
    For those that might be interested, Patrick from ServeTheHome actually reviewed an Inspur system about 8 months ago, so if you want to learn a little bit more about the Nvidia 8xA100 system, you can get an in-depth taste for it here:
     
  8. Informative
    alpha754293 got a reaction from Tamesh16 in Why Overheat your CPU on Purpose?   
    The problem with using F@H for testing/benchmarking is that the workload varies significantly depending on the work unit that their servers send to your system/component.

    Therefore, if you are trying to find out how hot your system gets (e.g. testing whether your thermal management solution is sufficient and efficient) and/or whether your system is stable, the variability with F@H makes a cross-platform comparison like this virtually impossible unless you purposely COPY (and retain a copy of) the work unit data so that you can start up the client with the former command-line tool (rather than with its vastly prettier GUI), which forces F@H to work on the same WU over and over again.

     
     
    According to the OECD PISA results, it is no secret that the general population of many countries is bad at math. There are some countries where they typically fare a little bit better, but there are also many that fare significantly worse as well.
     
    For example, this video mentions that LINPACK solves the linear algebra system Ax=b, but very few people might actually know or understand that it uses Gaussian elimination with partial pivoting to perform that solve, and there might be even more people who DON'T know that the "rests" in between are actually the system checking the result for the vector, x, by computing the residual, where, per the aforementioned linear algebra system Ax=b, Ax-b should equal 0.

    But in computer science (and also in numerical analysis), Ax-b does NOT always give you exactly 0; the value that it returns is called the residual, and you want to try and minimise that residual as much as possible by balancing performance with accuracy/precision.
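    As a minimal sketch of that residual check (this is not HPL itself -- just the same idea on a small random system with NumPy; HPL additionally scales the residual by the matrix norms and machine epsilon before comparing it against a pass/fail threshold):

    ```python
    # Minimal sketch of the LINPACK-style residual check (not HPL itself):
    # solve Ax = b, then see how far Ax - b actually is from 0.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 2000
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)

    # np.linalg.solve uses LU factorisation with partial pivoting (LAPACK gesv),
    # i.e. the same Gaussian-elimination-with-partial-pivoting approach mentioned above.
    x = np.linalg.solve(A, b)

    residual = A @ x - b    # exactly 0 in exact arithmetic, but not in floating point
    print("||Ax - b||_inf =", np.max(np.abs(residual)))   # tiny, but not zero
    ```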
     
    This is the range of results that you can get with the AMD Ryzen 9 5950X (on an Asus X570 TUF Gaming Pro WiFi motherboard, with 4x 32 GB DDR4-3200 unbuffered, non-ECC RAM):
     

     
    And here is the same test, on an Intel Core i9-12900K on an Asus Z690 Prime-P D4 motherboard, also with 4x 32 GB DDR4-3200 unbuffered, non-ECC RAM:

     
    And here is what you can get when you overlay the results between the two platforms over top of each other:

  9. Like
    alpha754293 reacted to FliP0x in The Steam Deck is Incomplete   
    Here are a few reasons off the top of my head that could be selling points for some:
    - Can be used as Mini PC (connect to a dock and use as a regular PC)
    - Can be used as a Multimedia player
    - Plays full PC games and not console ports
    - Can be used as an emulation machine
    - It's portable
    - Will have a huge community
    - It's customizable
    - Attractive entry price (for what you get)
    - Perfect for Steam in-house streaming
     
  10. Like
    alpha754293 got a reaction from Needfuldoer in My dream FINALLY came True   
    Hopefully, I've read and understood what you wrote correctly, but on this point - having distributed storage is not really as much of a problem nor as much of a "death knell" as it used to be.

    Gluster makes that pretty easy.
     
    But even if they didn't want to use Gluster -- let's say that they threw Proxmox VE on BOTH of the storage servers and then used said Proxmox to join each of said storage servers into a HA cluster -- my understanding is that Proxmox can handle all of that, which would make the setup, configuration, and management of said HA cluster a breeze (relatively speaking - compared to what it used to be and/or compared to how difficult it can be if you have a more specialised/custom setup/deployment strategy.)
     
    (But if you're going to be going with a custom deployment strategy, chances are, you have already learned from prior deployments of what to do and what not to do as well as what works for you and what doesn't work for you.)

     
     

    Personally, (and perhaps ironically, thanks to Linus), I'd vote Infiniband.
     
    I watched their video that they posted a couple of years ago talking about how 100 Gbps networking isn't all that expensive anymore ($/Gbps/port) and that's what led me to deploy my own 100 Gbps Infiniband setup in the basement of my house. (Still rockin' it!)

    It doesn't have to be super expensive. I bought my Mellanox 36-port externally managed switch for < $3000 CAD and I just have a Linux system running the OpenSM subnet manager on it in order to "drive" said switch. But that's so easy to do.
     
    It's too bad that Mellanox doesn't use their VPI technology on the switch side because they can totally do that.
     
    Given that I have said IB switch, and OpenSM running, I recently installed Windows 10 on my AMD Ryzen 9 5950X system and it picked up the Mellanox ConnectX-4 card right away and IPoIB (showed up as an ethernet device even though the ports were set to IB link type mode) worked right away as well.

    In other words, the editors could have a 100 Gbps connection to the server or, like you said, between servers it can be dual 100 Gbps (yay PCIe 4.0 x16) (and yes, you can bond IB ports like that).

    (My LTO-8 tape drive at home is on a system that runs CAE Linux 2018 (which is based on Ubuntu 16.04 LTS) and that's on the IB network.)

    NFSoRDMA is a godsend. (No pun intended.)
     
    And if they still have their all NVMe flash storage server, they can also run NVMeoF as well.
     
    An alternative configuration, with the dual port 100 Gbps IB cards, is that they can have one of the ports running in IB mode and the other port running in ETH mode, and then have both a 100 Gbps IB switch AND a 100 GbE switch as well. That way, if they don't want to run IB for everything (ethernet is a little easier to administer), they can run everything through RoCE instead.
     
    The possibilities are endless.
     
    It probably wouldn't be a bad idea for some of the members of the LTT staff to jump onto the Storagereview.com forums and start talking to IT storage professionals to figure out how to best set up the storage servers, drives, etc. that they DO have in order to optimise between R&P, SWaP, and TCO.
  11. Like
    alpha754293 got a reaction from Needfuldoer in My dream FINALLY came True   
    The building and usage of said hard drives for a new server doesn't preclude LTT from building and using a properly managed, operated, and maintained tape library system for said archival footage and/or using it to load balance the network server load for their editors.
     
    This is one of the things that I've come across with the push for more and more virtualisation: whilst yes, it is great that you don't have to own, maintain, and power so many baremetal boxes anymore, if you were running two AMD EPYC servers, each with 128 PCIe 4.0 lanes, you get a total of 256 PCIe 4.0 lanes combined between the two servers.

    Virtualisation isn't going to give you that.
     
    This is one of the things that you "sacrifice" with running more virtual machines vs. bare metal systems and it's the same for storage bandwidth bottleneck as well.
     
    If they have two petabyte servers like that (or really, 2.25 PB servers), at least two of them should be clustered together in an HA pair so that if one of the servers goes down for whatever reason, it isn't going to completely stop their editors from working.
     
    That's not how their plan (as described) has been laid out.

    With that many hard drives given to them (and also with the up to 4% AFR for Seagate drives, which means they should really set a bunch of those up as hot spares), there are a lot better ways to deploy the hardware that they've got that better protect their business.

    (i.e. they're also not the plucky home lab users anymore who started their YouTube channel over 10 years ago, when they were doing a lot of their early stuff from their house.)

    I have, probably, all told, anywhere between 1/20th and 1/10th of their total raw storage capacity and yet, I have more LTO-8 tapes than they do, as a home user. (I'm up to a total of 40 LTO-8 12 TB tapes now.)

    I don't understand what Linus' excuse is.
  12. Like
    alpha754293 reacted to Needfuldoer in My dream FINALLY came True   
    Which is exactly why a dense, shelf-stable tape library makes the most sense as the deep archive. It's a "just in case we might need it" they keep to occasionally dip into, which could be expanded at relatively little cost.
     
    But even that's a hard value proposition to argue when a hard drive manufacturer can hook you up with tens of thousands of dollars worth of hard drives for free (or a dramatically reduced cost).
  13. Agree
    alpha754293 got a reaction from WhitetailAni in Our data is GONE... Again   
    This is absolutely ridiculous.

    You guys have a video where Linus bought a $5500 tape drive
    And you aren't even using it.

    At 1-2 PB of data, there is absolutely NO reason for you NOT to be using an LTO-8 tape library at this point, with anywhere between 2-4 tape drives in order to handle multiple backup and retrieval operations simultaneously.

    The video above makes a patently INCORRECT/false statement:

    "Backing up over a petabyte of data is really expensive. Either we would need to build a duplicate server array to backup to or we could backup to the cloud. But even using the economical option, Backblaze B2, it would cost us somewhere between five and ten-thousand US dollars per month."

    The part where this statement is incorrect is that Linus (and the team who wrote this) left out, crucially, the option for using local (and/or offsite) tape backup storage solutions.

    This is stupid. (I was going to try and write something nicer, but on second thought, no, this is just really stupid.)

    For trying to backup 2 PB of data, you can buy one hundred and sixty-seven (167) 12 TB uncompressed/30 TB compressed LTO-8 tapes for $66.25 each from https://ltoworld.com/collections/lto-media/products/quantum-lto-8 which would total up to $11063.75 USD. (Double it if you want a father-son backup topology. Triple it if you want a grandfather-father-son backup topology.)

    As of the time of writing, a dual magazine, 16-slot LTO-8 tape autoloader from Quantum is $4798 USD (https://www.backupworks.com/Quantum-superloader-3-LTO-8-16-Slot.aspx). The ATTO ExpressSAS host bus adapter is $497 (https://www.backupworks.com/atto-expressSAS-12GB-ESAH-1280-GT0.aspx) and the external mini SAS cable (SFF-8644 to SFF-8088) is $79.86 (https://www.backupworks.com/HD-Mini-SAS-to-Mini-SAS-2m.aspx).
     
    All said and done, that totals up to $16438.61 USD -- about a month and a half of what the cloud option would cost -- to back up 2 PB of data.

    And you can throw that into basically ANY plain old system that you want; as long as the system has enough PCIe lanes, ANY system that is used to run the tape autoloader will work for you.

    (I have an Intel Core i7-3930K (6 cores, 3.2 GHz) system with 4x Crucial 8 GB DDR3-1600 unbuffered, non-ECC RAM on an Asus X79 Sabertooth motherboard, and a Mellanox 100 Gbps Infiniband card (MCX456A-ECAT) that runs my tape backup system at home, albeit I'm only using a single, manual drive (no autoloader).)

    Compare and contrast this to the fact that to buy one HUNDRED (100) Seagate Exos 20 TB drives from Newegg.com, that's $524.99 USD per drive * 100 = $52499.00 USD just in the hard drives alone, without the host system. (https://www.newegg.com/seagate-exos-x20-st20000nm007d-20tb/p/N82E16822185011?quicklink=true).
     
    Therefore, like I said, the statement that backing up 2 PB of data means needing to build a duplicate server and/or backing up to the cloud is a patently false statement, because it leaves out the local tape backup option, which is CLEARLY the cheaper option. Even if you TRIPLED the number of tapes from 167 to 501 (for a grandfather-father-son backup topology), you'd STILL only be out $38566.11 USD.

    And I'm sure that Evan Sackstein from Backupworks.com would be able to put together a quote for whatever your needs and budget are going to be, along with all of the hardware that you are going to need to get you up and running quickly (i.e. whether you actually WANT all 167 tapes to be ready to be accessed at ANY moment, or whether you would want to save a bit of money (or quite a lot of money), skip the autoloader, and just manually manage it yourself.)

    I mean, that depends on how much you are willing to spend on it.

    I have said this to people over and over and over again - ZFS is NOT a backup.

    This video just proves/shows this point.

    The exact failure mode that I have talked about (which is supposed to be rare in probability) is EXACTLY what happened here.

    And for an operation like Linus Tech Tips, there is absolutely NO reason why they aren't running a LTO-8 tape backup.

    Sure, it's not as fast as being able to have 2 PB of data, ALWAYS ON, and ALWAYS live, but the question is "do you REALLY need all of that data to be live, all of the time, when even, in the video, Linus admits that the team RARELY touches or needs some footage from the archive?"

    (And yes, doing the initial backup of 2 PB of data is going to suck big time, but that is the result of NOT having the tape backup system up and running all along and waiting until you have experienced a catastrophic failure like this to try and recover your data, instead of having deployed said tape backup library system when you got to around the 100-200 TB mark, so that you would have implemented the best practices for data management, archiving, backing your data up, and just general management.)

    (Sidebar: I used Ubuntu to compile the LTFS that's needed to run my tape backup system, so if you need help with that (instead of connecting your tape drive over Thunderbolt to a Mac), I'm sure that you have enough parts to be able to piece together a system that just runs the tape backup drive/autoloader/tape library, and you can run up to 8 drives on a single SAS host bus adapter card. (Again, it's up to you how much you would want to spend in initial capital expenditure/investment cost, because you only really need that many drives for your initial backup run of trying to save 2 PB of data all in one go; after that initial backup, you might only need a handful of drives if you have multiple people trying to pull footage from your archive at the same time.))

    But yeah, this is otherwise, stupid.

    And there is NO reason why you guys aren't rocking LTO-8 tape backups already, locally, on site.

    *edit*
    If you ONLY want to backup around 1.2 PB of data, at 12 TB per LTO-8 tape, you would only need 100 tapes. At $66.25 USD/tape, that would work out to be $6625 USD for the tapes alone.

    The cost of the autoloader, the SAS HBA, and the external SAS cable remains the same, so your total would come out to $11999.86 USD.

    By comparison, to backup 1.2 PB using the Seagate Exos 20 TB drives at $524.99 USD/drive, and for sixty (60) drives, that would still run you $31499.40.

    In other words, the tape solution is STILL the more cost effective solution.
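    For anyone who wants to sanity-check the arithmetic above, here is the same cost comparison as a quick script (the prices are the ones quoted above, as of this time of writing):

    ```python
    # Sanity check of the tape-vs-HDD cost figures quoted above (USD, prices as quoted).
    import math

    TAPE_PRICE = 66.25     # LTO-8 tape, 12 TB uncompressed
    AUTOLOADER = 4798.00   # Quantum SuperLoader 3, 16-slot
    SAS_HBA    = 497.00    # ATTO ExpressSAS host bus adapter
    SAS_CABLE  = 79.86     # SFF-8644 to SFF-8088 external mini SAS cable
    HDD_PRICE  = 524.99    # Seagate Exos 20 TB

    def tape_cost(data_tb, copies=1):
        """One-time cost to back up data_tb terabytes onto 12 TB LTO-8 tapes."""
        tapes = math.ceil(data_tb / 12) * copies
        return tapes, round(tapes * TAPE_PRICE + AUTOLOADER + SAS_HBA + SAS_CABLE, 2)

    def hdd_cost(data_tb):
        """Drive-only cost to hold the same capacity on 20 TB HDDs."""
        drives = math.ceil(data_tb / 20)
        return drives, round(drives * HDD_PRICE, 2)

    print(tape_cost(2000))      # (167, 16438.61) -- single copy of 2 PB
    print(tape_cost(2000, 3))   # (501, 38566.11) -- grandfather-father-son
    print(tape_cost(1200))      # (100, 11999.86) -- 1.2 PB
    print(hdd_cost(2000))       # (100, 52499.0)  -- 2 PB on 20 TB drives
    print(hdd_cost(1200))       # (60, 31499.4)   -- 1.2 PB on 20 TB drives
    ```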
  14. Agree
    alpha754293 got a reaction from Needfuldoer in My dream FINALLY came True   
    Yeah...when it comes to the cost optimisation calculus, it depends on whether they want to perpetually pay (i.e. little to no money up front in terms of capital expenditures, but you are perpetually renting someone else's stuff), or to bite the bullet, spend the capital expenditure (once), and go with the "ownership" model instead of the "renting" model.
     
    I worked out the math in this thread here:
     
    With one hundred and sixty-seven (167) 12 TB tapes, which can store 2 PB of data, plus the 16-slot tape library, plus the SAS card and the cables, they'd be out about $16-17k USD. And that would be a one-time cost.

    If they only wanted 1.2 PB, then that worked out to be precisely $11999.86 USD.

    (which would be about what the annual cost would be for the Amazon AWS S3 Deep Glacier cloud-based storage, except that with the AWS solution, they'd be paying that year over year over year, whereas with the LTO-8 tape backup solution, they'd only have to pay for that once.)

    If they want to have a father-son type backup topology, then add another 100 * $66.25 USD = $6625 USD to that. If they want a grandfather-father-son backup topology, then they can add yet another 100 tapes at another $6625 USD to that. In other words, the cost for more tapes is roughly about HALF the price of the AWS S3 Deep Glacier.

    On top of that, you would have to also consider time that it would take to send 1.2 PETABYTES of data to the cloud (1 Gbps fiber connection = around 100 MB/s). 1.2 PB is about 1.2e9 MB. Assuming that you can sustain 100 MB/s for the entire duration of the transfer, trying to upload 1.2 PB of data to the cloud, at 100 MB/s would take you 12 MILLION seconds or 3333.33 hours or 138.89 days.

    Conversely, with my LTO-8 tape system that I have at home, I can send data to the tapes at anywhere between 200-300 MB/s. Therefore, even on the low end, it would take HALF of that time. And that's with one drive.

    If LTT bites the bullet even further and, say, they buy a system with FOUR drives instead of just one (which ups their initial capex costs), that will cut their time down to a QUARTER (or less) of what it would take to upload 1.2 PB of data to AWS. (And AWS charges a little bit for the data transfer as well.)
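    Putting the time math above into a form you can play with (a rough sketch; it assumes the sustained rates quoted above hold for the whole transfer and that multiple tape drives can actually be fed at full speed):

    ```python
    # Back-of-the-envelope transfer times for 1.2 PB at various sustained throughputs.
    DATA_MB = 1.2e9   # 1.2 PB expressed in MB (decimal units)

    def days(throughput_mb_s):
        """Days to move DATA_MB at a given sustained throughput in MB/s."""
        return DATA_MB / throughput_mb_s / 86400

    print(f"Upload at 100 MB/s (1 Gbps fibre):  {days(100):6.1f} days")      # ~138.9 days
    print(f"One LTO-8 drive at 200 MB/s:        {days(200):6.1f} days")      # ~69.4 days
    print(f"One LTO-8 drive at 300 MB/s:        {days(300):6.1f} days")      # ~46.3 days
    print(f"Four LTO-8 drives at 250 MB/s each: {days(4 * 250):6.1f} days")  # ~13.9 days
    ```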

    My point is that there are LOTS of options available at different price points and the optimization calculus can be done in order to figure out what would be the better solution for what their needs are.

    (Personally, I would get a system with 4 tape drives or multiple 16-slot, single (or dual) drive autoloaders (which again, increases the initial capex costs), but then you'd be able to read and write from multiple tapes at a time, and if it means that your workers will sometimes have to pull out some tapes from the magazine to put other tapes in so that you would be able to pull archive footage from it, then so be it. Or you can make one person be the gate keeper of it instead of spending a LOT more money on a robotic tape library that would do that all for you. That way, that person would know what tapes are currently loaded into which magazine, and if they need to swap tapes in the magazines, they would know which magazine they would have to swap it out from. It's amazing to me that you can get a 16-slot, single drive tape autoloader for less money than what I paid for my single drive that I had to load manually. (Source: https://www.backupworks.com/Quantum-superloader-3-LTO-8-16-Slot.aspx) )
     
    And if they want an offsite backup, they can literally just buy another batch of tapes, and put them in an environmentally controlled, storage facility.
     
    If AWS offered this as an option, it would actually be faster for LTT to buy four tape drives, write the data to the drives TWICE and then ship a set of the tapes to AWS to be the cloud host (rather than uploading it), and they'd STILL come out ahead, in terms of time (because they can then write the third set, on prem, themselves).
     
    The AWS Deep Glacier is just using AWS' tapes rather than LTT's own.
     
    And I'm the literal proof that it's not that difficult to set up because I'm an idiot by many accounts and if I, as an idiot, can set it up, I see little to no reason as to why the team over at LTT, can't do the same.
     
    And I am sure that there are other people who have more knowledge and experience in this area that would be able to help the LTT team out if they need additional help in getting their tape library up and running for their needs, that way, they will actually learn from this (finally, this time around) rather than being doomed to repeat the same mistakes over and over and over again.
  15. Like
    alpha754293 got a reaction from Quackers101 in My dream FINALLY came True   
    And this is one of the reasons why I DON'T use Seagate drives.

    I stay away from them like the plague.

    From the Backblaze 2021 hard drive reliability report (https://www.backblaze.com/blog/backblaze-drive-stats-for-2021/)
     
    Where there are multiple manufacturers, Seagate leads in the annualised failure rate for the 8 TB, 12 TB, 14 TB, and 16 TB capacities. They were the only manufacturer that Backblaze has data for in the 6 TB and 10 TB capacities, so there is insufficient data for a comparative analysis there. And the only exception where Seagate WASN'T the leader in the annualised failure rate is the 4 TB capacity. That honour goes to Toshiba's MD04ABA400V.
     
    But for all of the other capacities, Seagate leads in the annualised failure rate.
     
    It's no wonder why Seagate sponsored this video because they are LITERALLY having to give the hard drives away and/or sell them at a discount, if they are selling them to Linus Tech Tips at all.

    And once again, for about $35k in hard drives, you could've gotten your LTO-8 tape backup system for like HALF that price (which you should TOTALLY still do, BTW).
     
    If you don't learn from this experience, then you will be doomed to repeat it, even with the "better" safeguards that you've put in place this time that you didn't put in place last time.
  16. Like
    alpha754293 got a reaction from Needfuldoer in My dream FINALLY came True   
    And this is one of the reasons why I DON'T use Seagate drives.

    I stay away from them like the plague.

    From the Backblaze 2021 hard drive reliability report (https://www.backblaze.com/blog/backblaze-drive-stats-for-2021/)
     
    Where there are multiple manufacturers, Seagate leads in the annualised failure rate for the 8 TB, 12 TB, 14 TB, and 16 TB capacities. They were the only manufacturer that Backblaze has data for in the 6 TB and 10 TB capacities, so there is insufficient data for a comparative analysis there. And the only exception where Seagate WASN'T the leader in the annualised failure rate is the 4 TB capacity. That honour goes to Toshiba's MD04ABA400V.
     
    But for all of the other capacities, Seagate leads in the annualised failure rate.
     
    It's no wonder why Seagate sponsored this video because they are LITERALLY having to give the hard drives away and/or sell them at a discount, if they are selling them to Linus Tech Tips at all.

    And once again, for about $35k in hard drives, you could've gotten your LTO-8 tape backup system for like HALF that price (which you should TOTALLY still do, BTW).
     
    If you don't learn from this experience, then you will be doomed to repeat it, even with the "better" safeguards that you've put in place this time that you didn't put in place last time.
  17. Like
    alpha754293 got a reaction from Lurick in My dream FINALLY came True   
    And this is one of the reasons why I DON'T use Seagate drives.

    I stay away from them like the plague.

    From the Backblaze 2021 hard drive reliability report (https://www.backblaze.com/blog/backblaze-drive-stats-for-2021/)
     
    Where there are multiple manufacturers, Seagate leads in the annualised failure rate for the 8 TB, 12 TB, 14 TB, and 16 TB capacities. They were the only manufacturer that Backblaze has data for in the 6 TB and 10 TB capacities, so there is insufficient data for a comparative analysis there. And the only exception where Seagate WASN'T the leader in the annualised failure rate is the 4 TB capacity. That honour goes to Toshiba's MD04ABA400V.
     
    But for all of the other capacities, Seagate leads in the annualised failure rate.
     
    It's no wonder why Seagate sponsored this video because they are LITERALLY having to give the hard drives away and/or sell them at a discount, if they are selling them to Linus Tech Tips at all.

    And once again, for about $35k in hard drives, you could've gotten your LTO-8 tape backup system for like HALF that price (which you should TOTALLY still do, BTW).
     
    If you don't learn from this experience, then you will be doomed to repeat it, even with the "better" safeguards that you've put in place this time that you didn't put in place last time.
  18. Like
    alpha754293 got a reaction from jde3 in Our data is GONE... Again   
    15 years ago. 20 years ago. Close enough, right?

    Yeah, the first time that I used ZFS in a serious fashion was on a home-built server with an AMD Opteron processor, I think, an Adaptec 16-port SATA RAID HBA, and sixteen (16) 500 GB drives for a total of 8 TB of storage, which, back in 2006/2007, was a lot.
     
    I finally jumped off the Solaris train when my Solaris VM died and I couldn't bring it back up (or not without spending a LOT more time than what I would like to spend trying to spin it back up), so the dumb web server hosting tasks have now moved onto SLES12 SP1, and I now use ZFS in a TrueNAS Core 12.0-U1.1 server at home.
     
    It's interesting to me whenever I am talking to people about ZFS and they are trying to "educate" me about said ZFS because they don't know about my history in regards to working with ZFS.
     
    (And my most recent attempt at deploying ZFS on Solaris was just prior to the holidays, before I had installed TrueNAS Core 12 on my server because I wanted to see if I could get Solaris going on the system. (It had mixed results.) And the performance wasn't that much better in Solaris vs. TrueNAS, so I switched over to TrueNAS instead.)
     
    If my Solaris VM didn't die recently, I would still be able to claim that I'm STILL using ZFS on Solaris (in some form), but pity that's not the case. (I actually had to set up a Solaris 10 PXE and DHCP server in order to be able to jumpstart the Solaris installation on my dual Xeon E5310 server. THAT was an adventure unto itself.)
  19. Agree
    alpha754293 got a reaction from Needfuldoer in Our data is GONE... Again   
    This is absolutely ridiculous.

    You guys have a video where Linus bought a $5500 tape drive
    And you aren't even using it.

    At 1-2 PB of data, there is absolutely NO reason for you NOT to be using an LTO-8 tape library at this point, with anywhere between 2-4 tape drives in order to handle multiple backup and retrieval operations simultaneously.

    The video above makes a patently INCORRECT/false statement:

    "Backing up over a petabyte of data is really expensive. Either we would need to build a duplicate server array to backup to or we could backup to the cloud. But even using the economical option, Backblaze B2, it would cost us somewhere between five and ten-thousand US dollars per month."

    The part where this statement is incorrect is that Linus (and the team who wrote this) left out, crucially, the option for using local (and/or offsite) tape backup storage solutions.

    This is stupid. (I was going to try and write something nicer, but on second thought, no, this is just really stupid.)

    For trying to backup 2 PB of data, you can buy one hundred and sixty-seven (167) 12 TB uncompressed/30 TB compressed LTO-8 tapes for $66.25 each from https://ltoworld.com/collections/lto-media/products/quantum-lto-8 which would total up to $11063.75 USD. (Double it if you want a father-son backup topology. Triple it if you want a grandfather-father-son backup topology.)

    As of the time of writing, a dual magazine, 16-slot LTO-8 tape autoloader from Quantum is $4798 USD (https://www.backupworks.com/Quantum-superloader-3-LTO-8-16-Slot.aspx). The ATTO ExpressSAS host bus adapter is $497 (https://www.backupworks.com/atto-expressSAS-12GB-ESAH-1280-GT0.aspx) and the external mini SAS cable (SFF-8644 to SFF-8088) is $79.86 (https://www.backupworks.com/HD-Mini-SAS-to-Mini-SAS-2m.aspx).
     
    All said and done, that totals up to $16438.61 USD -- about a month and a half of what the cloud option would cost -- to back up 2 PB of data.

    And you can throw that into basically ANY plain old system that you want; as long as the system has enough PCIe lanes, ANY system that is used to run the tape autoloader will work for you.

    (I have an Intel Core i7-3930K (6 cores, 3.2 GHz) system with 4x Crucial 8 GB DDR3-1600 unbuffered, non-ECC RAM on an Asus X79 Sabertooth motherboard, and a Mellanox 100 Gbps Infiniband card (MCX456A-ECAT) that runs my tape backup system at home, albeit I'm only using a single, manual drive (no autoloader).)

    Compare and contrast this to the fact that to buy one HUNDRED (100) Seagate Exos 20 TB drives from Newegg.com, that's $524.99 USD per drive * 100 = $52499.00 USD just in the hard drives alone, without the host system. (https://www.newegg.com/seagate-exos-x20-st20000nm007d-20tb/p/N82E16822185011?quicklink=true).
     
    Therefore, like I said, the statement that backing up 2 PB of data means needing to build a duplicate server and/or backing up to the cloud is a patently false statement, because it leaves out the local tape backup option, which is CLEARLY the cheaper option. Even if you TRIPLED the number of tapes from 167 to 501 (for a grandfather-father-son backup topology), you'd STILL only be out $38566.11 USD.

    And I'm sure that Evan Sackstein from Backupworks.com would be able to put together a quote for whatever your needs and budget are going to be, along with all of the hardware that you are going to need to get you up and running quickly (i.e. whether you actually WANT all 167 tapes to be ready to be accessed at ANY moment, or whether you would want to save a bit of money (or quite a lot of money), skip the autoloader, and just manually manage it yourself.)

    I mean, that depends on how much you are willing to spend on it.

    I have said this to people over and over and over again - ZFS is NOT a backup.

    This video just proves/shows this point.

    The exact failure mode that I have talked about (which is supposed to be rare in probability) is EXACTLY what happened here.

    And for an operation like Linus Tech Tips, there is absolutely NO reason why they aren't running a LTO-8 tape backup.

    Sure, it's not as fast as being able to have 2 PB of data, ALWAYS ON, and ALWAYS live, but the question is "do you REALLY need all of that data to be live, all of the time, when even, in the video, Linus admits that the team RARELY touches or needs some footage from the archive?"

    (And yes, doing the initial backup of 2 PB of data is going to suck big time, but that is the result of NOT having the tape backup system up and running all along and waiting until you have experienced a catastrophic failure like this to try and recover your data, instead of having deployed said tape backup library system when you got to around the 100-200 TB mark, so that you would have implemented the best practices for data management, archiving, backing your data up, and just general management.)

    (Sidebar: I used Ubuntu to compile the LTFS that's needed to run my tape backup system, so if you need help with that (instead of connecting your tape drive over Thunderbolt to a Mac), I'm sure that you have enough parts to be able to piece together a system that just runs the tape backup drive/autoloader/tape library, and you can run up to 8 drives on a single SAS host bus adapter card. (Again, it's up to you how much you would want to spend in initial capital expenditure/investment cost, because you only really need that many drives for your initial backup run of trying to save 2 PB of data all in one go; after that initial backup, you might only need a handful of drives if you have multiple people trying to pull footage from your archive at the same time.))

    But yeah, this is otherwise, stupid.

    And there is NO reason why you guys aren't rocking LTO-8 tape backups already, locally, on site.

    *edit*
    If you ONLY want to backup around 1.2 PB of data, at 12 TB per LTO-8 tape, you would only need 100 tapes. At $66.25 USD/tape, that would work out to be $6625 USD for the tapes alone.

    The cost of the autoloader, the SAS HBA, and the external SAS cable remains the same, so your total would come out to $11999.86 USD.

    By comparison, to backup 1.2 PB using the Seagate Exos 20 TB drives at $524.99 USD/drive, and for sixty (60) drives, that would still run you $31499.40.

    In other words, the tape solution is STILL the more cost effective solution.
  20. Like
    alpha754293 reacted to Kilrah in Is there an easy way to move 4TB of data from a ZFS Volume to a Btrfs Volume?   
    I did for the lulz.
    76000 files totalling 3.5GB.
     
    Direct copy over network: 11min
    Uncompressed archive, transfer, unarchive: 10min. Transfer takes 30 secs, but the drives on both ends take ages to gather and create the small files, as predicted.
     
    Aka useless. And I'm on RAID0'd recent HDDs on both sides. With just a single older drive like most people will be using for a NAS the drive speed will have even more comparative impact than the network transfer, reducing the difference or tipping it the other way.
  21. Like
    alpha754293 reacted to LinusTech in This Server Deployment was HORRIBLE   
    These SSDs are like 20W max. That's like 500W for SSDs, another 200 for CPU and maybe 100 for RAM. Still VERY comfortable lol.
  22. Funny
    alpha754293 got a reaction from GDRRiley in Deploying ANOTHER PETABYTE of Storage!   
    The text was copied and pasted from another location.
     
    Auto is black.
     
    1. See above.
     
    2. It looks like they're manually load balancing it.
     
    Although, if they wanted to automate that, they just have to run it as either a distributed volume or a distributed striped volume.
     
    I've run GlusterFS with RAM drives (tmpfs) as a distributed volume before. Somewhat interestingly -- and it just might be how GlusterFS works with tmpfs -- it didn't really allow for parallel writes to the distributed volume, and therefore the maximum throughput that I could get over 100 Gbps was the maximum speed at which it could write any single file to the tmpfs volume.
     
    That's probably different when you have actual drives (rather than tmpfs), because it also wouldn't let me create a distributed striped volume with tmpfs.
     
    So at "worst" (if the size of the writes have a significant size difference) then if they're running a distributed volume, one will be loaded up significantly moreso than the other. But if they're running a distributed stripped volume, then they should be able to better load balance both GlusterFS DS nodes better (via striping) so that it shouldn't have load balancing issues.
     
    But again, that also means that if one of the DS nodes goes down, it won't matter that the underlying FS/LVM is ZFS because the striped nature of GlusterFS would automatically mean that they're missing half of the data, which will render the entire array/volume useless to them unless they can get to the other half of the data, WITHOUT GlusterFS -- which, good luck with that one.
     
    (I'm on the Gluster FS mailing list -- have been since August of last year (2019), so I see the emails that come through when and if GlusterFS should fail.)
     
    Without HA and/or their tape autoloader/library, I would NOT have recommended deploying GlusterFS based on the email/mailing list traffic that I see.
  23. Funny
    alpha754293 got a reaction from kirashi in Deploying ANOTHER PETABYTE of Storage!   
    The text was copied and pasted from another location.
     
    Auto is black.
     
    1. See above.
     
    2. It looks like they're manually load balancing it.
     
    Although, if they wanted to automate that, they just have to run it as either a distributed volume or a distributed striped volume.
     
    I've run GlusterFS with RAM drives (tmpfs) as a distributed volume before. Somewhat interestingly -- and it just might be how GlusterFS works with tmpfs -- it didn't really allow for parallel writes to the distributed volume, and therefore the maximum throughput that I could get over 100 Gbps was the maximum speed at which it could write any single file to the tmpfs volume.
     
    That's probably different when you have actual drives (rather than tmpfs), because it also wouldn't let me create a distributed striped volume with tmpfs.
     
    So at "worst" (if the size of the writes have a significant size difference) then if they're running a distributed volume, one will be loaded up significantly moreso than the other. But if they're running a distributed stripped volume, then they should be able to better load balance both GlusterFS DS nodes better (via striping) so that it shouldn't have load balancing issues.
     
    But again, that also means that if one of the DS nodes goes down, it won't matter that the underlying FS/LVM is ZFS because the striped nature of GlusterFS would automatically mean that they're missing half of the data, which will render the entire array/volume useless to them unless they can get to the other half of the data, WITHOUT GlusterFS -- which, good luck with that one.
     
    (I'm on the Gluster FS mailing list -- have been since August of last year (2019), so I see the emails that come through when and if GlusterFS should fail.)
     
    Without HA and/or their tape autoloader/library, I would NOT have recommended deploying GlusterFS based on the email/mailing list traffic that I see.
  24. Funny
    alpha754293 got a reaction from greenmax in Deploying ANOTHER PETABYTE of Storage!   
    GlusterFS on ZFS is like... you're just begging/asking for data failures to occur.
     
    There are so many points of failure between the two that unless you have someone who has a LOT of experience recovering data from either of those filesystems, what you're telling me is that the data is not that important.
     
    Seriously.
     
    Write some data to both of the servers and then disconnect one of them.
     
    And while you're at it, yank some vdevs, take down two pools, and purposely corrupt some data on the array, and then connect the other Gluster node.
     
    Good luck!
  25. Funny
    alpha754293 got a reaction from greenmax in Deploying ANOTHER PETABYTE of Storage!   
    I agree with what GDRRiley wrote below - ZFS is the local LVM/FS. GlusterFS is just installed on top of it so that they can cluster two servers together to get more total storage space across two servers.
     
    GlusterFS works but there are many, many, many caveats to that.
     
    Like so many things Linux, when it fails, it fails in some of the most spectacular ways possible.
     
    Good luck backing up 1.2 PB to tape.