
Magnetic Tape Storage is making a comeback and could replace hard drives in Enterprise and business storage

Forgotten_Fox

Can someone who knows more than I do about tape storage explain why they have such a focus on compression?  It states compressed or uncompressed size, they often seem (from what I can tell) to have compression built-in in some manner, etc.  This is unlike any other storage format I'm familiar with (optical, HDDs, USB flash drives, floppies, SSDs, etc.).  Those simply store what you ask and state what they have, plain and simple.  If you want compression, the assumption is you'll do it yourself, and they (from what I can tell) never quote how much compressed data they can store, I assume for a combination of reasons including but not limited to:

  • The vastly different compressibility of different types of data makes this impossible to state with any meaningful accuracy
  • It's a niche thing, at least for their target markets, which I understand differ quite greatly from the target market of magnetic tape
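As a rough illustration of the first bullet, here is a minimal sketch that uses Python's `zlib` as a stand-in compressor. This is an assumption made for convenience: tape drives use their own hardware algorithm, not zlib. The sample buffers and the `ratio` helper are made up for the example.

```python
import os
import zlib


def ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, 6))


# Hypothetical sample data: log-like text that repeats, and random bytes
# standing in for already-compressed or encrypted content.
repetitive = b"2020-03-12 INFO backup job finished OK\n" * 5000
random_like = os.urandom(len(repetitive))

if __name__ == "__main__":
    print(f"repetitive text: {ratio(repetitive):.1f}:1")
    print(f"random bytes:    {ratio(random_like):.2f}:1")
```

The two buffers are the same size, yet one shrinks dramatically and the other not at all, which is why a single "compressed capacity" number can only ever be a rated or nominal figure.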

Solve your own audio issues  |  First Steps with RPi 3  |  Humidity & Condensation  |  Sleep & Hibernation  |  Overclocking RAM  |  Making Backups  |  Displays  |  4K / 8K / 16K / etc.  |  Do I need 80+ Platinum?

If you can read this you're using the wrong theme.  You can change it at the bottom.


20 minutes ago, Ryan_Vickers said:

Can someone who knows more than I do about tape storage explain why they have such a focus on compression?  It states compressed or uncompressed size, they often seem (from what I can tell) to have compression built-in in some manner, etc.  This is unlike any other storage format I'm familiar with (optical, HDDs, USB flash drives, floppies, SSDs, etc.).  Those simply store what you ask and state what they have, plain and simple.  If you want compression, the assumption is you'll do it yourself, and they (from what I can tell) never quote how much compressed data they can store, I assume for a combination of reasons including but not limited to:

  • The vastly different compressibility of different types of data makes this impossible to state with any meaningful accuracy
  • It's a niche thing, at least for their target markets, which I understand differ quite greatly from the target market of magnetic tape

This is almost all speculation, so take with appropriate salinity.

 

I expect the archival aspect is by far the most important reason why compression is more important to tapes than other formats. But I don't actually think I agree with your assessment here.

 

Enterprise drives in general (of any technology) are much more likely to comment on compression though. 

 

The entire premise, and downfall, of the Sandforce controller was (drive-level) hardware compression.

 

Also, optical formats (and VHS before them) talked a crazy amount about compression in practice. Whether a Blu-ray stored compressed or uncompressed audio was a real public controversy with some movies.

 

These days however, the link interfaces tend to be the bottleneck for most SSDs. If the link is the bottleneck, it's blatantly pointless (if not detrimental) to do hardware compression on the drive itself, rather than in software at the OS.

 

Compression also works worse for small files and for many files at once (higher overhead), which tapes are already bad at, so it doesn't matter as much.
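The overhead point can be sketched quickly. The example below again uses `zlib` purely as a stand-in (an assumption, not what a tape drive runs), and the record format is hypothetical. It compares compressing many small records one at a time against compressing them as a single stream.

```python
import zlib

# Hypothetical workload: 500 small records with a similar structure.
records = [
    f"user={i:05d};status=ok;region=eu-west;tier=standard\n".encode()
    for i in range(500)
]

# Compress each record on its own (many small files)...
separate = sum(len(zlib.compress(r)) for r in records)
# ...versus compressing them as one continuous stream (one large file).
together = len(zlib.compress(b"".join(records)))

if __name__ == "__main__":
    raw = sum(len(r) for r in records)
    print(f"raw bytes:             {raw}")
    print(f"compressed one by one: {separate}")
    print(f"compressed as stream:  {together}")
```

Each small input pays the fixed header cost and cannot share redundancy with its neighbours, so the one-by-one total comes out larger than the streamed total.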

 

Additionally, modern server processors have so many more resources available that it isn't really necessary to implement compression in hardware when a 10000x faster chip is sitting a few ms away when you want it. Tape deployments are probably dramatically more likely to connect to machines that are old enough that this isn't a good assumption, and thus dedicated compression is worthwhile.

LINK-> Kurald Galain:  The Night Eternal 

Top 5820k, 980ti SLI Build in the World*

CPU: i7-5820k // GPU: SLI MSI 980ti Gaming 6G // Cooling: Full Custom WC //  Mobo: ASUS X99 Sabertooth // Ram: 32GB Crucial Ballistic Sport // Boot SSD: Samsung 850 EVO 500GB

Mass SSD: Crucial M500 960GB  // PSU: EVGA Supernova 850G2 // Case: Fractal Design Define S Windowed // OS: Windows 10 // Mouse: Razer Naga Chroma // Keyboard: Corsair k70 Cherry MX Reds

Headset: Senn RS185 // Monitor: ASUS PG348Q // Devices: Note 10+ - Surface Book 2 15"

LINK-> Ainulindale: Music of the Ainur 

Prosumer DYI FreeNAS

CPU: Xeon E3-1231v3  // Cooling: Noctua L9x65 //  Mobo: AsRock E3C224D2I // Ram: 16GB Kingston ECC DDR3-1333

HDDs: 4x HGST Deskstar NAS 3TB  // PSU: EVGA 650GQ // Case: Fractal Design Node 304 // OS: FreeNAS

 

 

 


3 hours ago, straight_stewie said:

I wasn't calculating for volume. I was calculating for areal density.

And my thought experiment was specifically to improve the random read/write performance of tape drives, with no regard for the archival performance. Random read/write performance does not care about volume density, it only cares about how quickly you can get to the data you want, hence, we can't run the tape to-and-fro from reel-to-reel, we must continuously run the entire tape past the head, as disk drives do.

That really has nothing to do with areal density. It won't matter how high it gets, as tape random I/O performance will not increase because of it. Tape is I/O bound because it's not a flat rigid surface with a read head that can be moved to any position in milliseconds. Unless you can also run the tape end to end in milliseconds, access times will not improve with areal density.

 

This is why I pointed it out: in terms of space efficiency, hard disks are not more efficient than tapes unless you are comparing I/O performance rather than areal or volume density for storage capacity. Even if you have 20 tape drives, a single HDD is still going to give better access times for a near-random access pattern.


5 minutes ago, leadeater said:

That really has nothing to do with areal density. It won't matter how high it gets, as tape random I/O performance will not increase because of it. Tape is I/O bound because it's not a flat rigid surface with a read head that can be moved to any position in milliseconds. Unless you can also run the tape end to end in milliseconds, access times will not improve with areal density.

The purpose of my thought experiment had absolutely nothing to do with any real use of a tape deck, and was just an exercise to discover if it might be theoretically possible to get hard disk level random read/write performance from a tape deck with no other considerations. I believe that I outlined a way in which it could be done using multiple heads and a continuous tape rather than a reel to reel tape. You do not want to view the exercise as valid. So, we must agree to disagree and leave it at that.

ENCRYPTION IS NOT A CRIME


15 minutes ago, straight_stewie said:

You do not want to view the exercise as valid. So, we must agree to disagree and leave it at that.

It's not so much that it's not valid as that it's comparing metrics that don't relate to the question being pondered, so it can't highlight anything very useful.

 

Do it again using I/O performance metrics and it would make a lot more sense. A typical HDD gives about 80-120 IOPS and a tape, at a pure terrible guess, is something like 0.1 IOPS (it's really hard to give a number if the next chunk of data is on the other end of the tape).
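To put those figures on the same scale, here is a minimal sketch that converts IOPS into average time per random operation. The HDD value is simply the midpoint of the quoted range and the tape value is the guess from the post, so the result is only as good as those inputs.

```python
def seconds_per_op(iops: float) -> float:
    """Average time per random I/O operation, given operations per second."""
    return 1.0 / iops


HDD_IOPS = 100.0   # midpoint of the 80-120 range quoted in the post
TAPE_IOPS = 0.1    # the post's own rough guess, not a measured value

if __name__ == "__main__":
    hdd = seconds_per_op(HDD_IOPS)
    tape = seconds_per_op(TAPE_IOPS)
    print(f"HDD:  {hdd * 1000:.0f} ms per op")
    print(f"tape: {tape:.0f} s per op")
    print(f"tape is roughly {tape / hdd:.0f}x slower per random op")
```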

 

I was questioning it because you are using storage capacity metrics to answer a storage performance question, which doesn't make sense.

 

Edit:

Just for clarity, an LTO-8 tape is 960m long and requires 208 end-to-end passes to write the entire capacity of the tape. Reading isn't as bad, as you can go to the start position directly and then run from that point. To make any meaningful improvement to tape access times you'd need 100m of read/write heads, if not more.
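The arithmetic behind those two numbers can be sketched as follows. The tape length and pass count come from the post; the tape speed is a made-up placeholder (an assumption, not a spec value) used only to show how the fill time would follow from it.

```python
TAPE_LENGTH_M = 960   # LTO-8 tape length, from the post
PASSES = 208          # end-to-end passes to fill the tape, from the post


def total_travel_km(length_m: float, passes: int) -> float:
    """Distance the tape moves past the head when written full, in km."""
    return length_m * passes / 1000.0


def hours_at_speed(distance_km: float, speed_m_per_s: float) -> float:
    """Time to cover a distance at a constant tape speed, in hours."""
    return distance_km * 1000.0 / speed_m_per_s / 3600.0


if __name__ == "__main__":
    travel = total_travel_km(TAPE_LENGTH_M, PASSES)
    assumed_speed = 5.0  # m/s, hypothetical value for illustration only
    print(f"tape travel to fill the cartridge: {travel:.2f} km")
    print(f"at {assumed_speed} m/s that takes {hours_at_speed(travel, assumed_speed):.1f} h")
```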


4 minutes ago, leadeater said:

I was questioning it because you are using storage capacity metrics to answer a storage performance question, which doesn't make sense.

I used the storage capacity metric to identify the physical characteristics of the tape, in other words: Whether or not the tape would be so long as to make it theoretically impossible to build a continuous tape machine. Then, we can additionally find out if there is sufficient space for multiple read/write heads.

In a continuous tape machine, distance to the desired sector means exactly the same thing as it does for a disk drive, because the two are identical in that regard: A hard disk is a continuous loop of stored data in the same way that a continuous tape is a continuous loop of stored data. In other words, there is no scrolling to the desired sector, there is only waiting until it comes back around.

The multiple heads are derived from the fact that the tape is much longer (and presumably moves much slower) than the equivalent disk and therefore, we need multiple locations at which to interact with the tape. In other words, having two read/write heads that are equidistant from each other along both sides of the tape effectively halves the sector wait time.

 

Given this, real numbers are not necessary. Given a long enough tape, there is sufficient room for enough read/write heads along the tape's length such that the wait time for the desired sector is exactly equivalent to the wait time for the desired sector of a disk drive. The assumption is that given a single data storage location already under the head, the read/write time is equivalent between a disk and a tape.
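The argument above can be turned into a back-of-the-envelope sketch. Everything numeric here except the 960m tape length mentioned elsewhere in the thread is an assumption: the 7200 rpm disk, the 5 m/s loop speed, and the simplification that only rotational latency matters (no seek time, one track). The function names are hypothetical.

```python
import math


def disk_avg_rotational_latency_s(rpm: float) -> float:
    """Average wait for a sector on a spinning disk: half a revolution."""
    return 0.5 * 60.0 / rpm


def loop_avg_wait_s(length_m: float, speed_m_per_s: float, heads: int) -> float:
    """Average wait on a continuous loop with evenly spaced heads.

    On average the data is half an inter-head gap away from the next head.
    """
    gap_m = length_m / heads
    return 0.5 * gap_m / speed_m_per_s


def heads_needed(length_m: float, speed_m_per_s: float, target_wait_s: float) -> int:
    """Smallest head count whose average wait is at or below the target."""
    return math.ceil(length_m / (2.0 * speed_m_per_s * target_wait_s))


if __name__ == "__main__":
    target = disk_avg_rotational_latency_s(7200)   # assumed 7200 rpm disk
    n = heads_needed(960.0, 5.0, target)           # assumed 5 m/s loop speed
    print(f"disk average rotational latency: {target * 1000:.2f} ms")
    print(f"heads needed on a 960 m loop at 5 m/s: {n}")
    print(f"spacing between heads: {960.0 / n * 100:.1f} cm")
```

Doubling the head count halves the average wait, as the post says; the sketch just makes it easy to see how many heads a given loop length and speed would imply.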


WHY IS EVERYONE USING A BOBBING FLASHING HEAD AVATAR!?!  WHAT'S GOING ON!?!

AMD Ryzen 5800X  |  Fractal Design S36 360 AIO w/6 Corsair SP120L fans  |  Asus Crosshair VII WiFi X470  |  G.SKILL TridentZ 4400CL19 2x8GB @ 3800MHz 14-14-14-14-30  |  EVGA 3080 FTW3 Hybrid  |  Samsung 970 EVO M.2 NVMe 500GB - Boot Drive  |  Samsung 850 EVO SSD 1TB - Game Drive  |  Seagate 1TB HDD - Media Drive  |  EVGA 650 G3 PSU | Thermaltake Core P3 Case


4 hours ago, Curufinwe_wins said:

This is almost all speculation, so take with appropriate salinity.

 

I expect the archival aspect is by far the most important reason why compression is more important to tapes than other formats. [...*]

The archival factor makes sense, it's just odd to me regardless, since they're marketed at a knowledgeable, professional audience (IT employees for a company) who would know exactly what behaviour they could expect from their data and would want to know raw figures rather than having a company guess and tell them how much they think their stuff will compress.  Such figures, based frankly on irrelevant speculation, would be nothing more than a distraction to the typical tape customer.  It seems more likely that such a thing would be done if aimed at the home consumer market, the way SD cards, etc. often state how many videos or pictures you can store in addition to the actual byte count.

4 hours ago, Curufinwe_wins said:

Enterprise drives in general (of any technology) are much more likely to comment on compression though. 

Interesting, didn't know this.

4 hours ago, Curufinwe_wins said:

[*But I don't actually think I agree with your assessment here.]

 

The entire premise, and downfall, of the Sandforce controller was (drive-level) hardware compression.

 

Also, optical formats (and VHS before them) talked a crazy amount about compression in practice. Whether a Blu-ray stored compressed or uncompressed audio was a real public controversy with some movies.

I think there's been a slight misunderstanding here.  I'm talking about formats that compress by their very nature - built-in hardware compression, like (I assume) tape does (if it does not then that is a mistake on my part and an important revelation).  To be extra clear, I'd like to differentiate between a HDD, DVD, etc. that has no such built in compression, but may be used to store compressed data like a jpg, mp3, zip, mpeg, etc. and a format that, seamlessly to the user, compresses (or at least tries to) everything that's written, and thus might have a reason to state "on the box" that it (for example) has a 1 TB true capacity and "2 TB compressed capacity" or something like that.

4 hours ago, Curufinwe_wins said:

These days however, the link interfaces tend to be the bottleneck for most SSDs. If the link is the bottleneck, it's blatantly pointless (if not detrimental) to do hardware compression on the drive itself, rather than in software at the OS.

 

Compression also works worse for small files and for many files at once (higher overhead), which tapes are already bad at, so it doesn't matter as much.

 

Additionally, modern server processors have so many more resources available that it isn't really necessary to implement compression in hardware when a 10000x faster chip is sitting a few ms away when you want it. Tape deployments are probably dramatically more likely to connect to machines that are old enough that this isn't a good assumption, and thus dedicated compression is worthwhile.

Good points, although I'm not sure it helps convince me why tapes would report a compressed capacity.


1 hour ago, straight_stewie said:

A hard disk is a continuous loop of stored data in the same way that a continuous tape is a continuous loop of stored data. In other words, there is no scrolling to the desired sector, there is only waiting until it comes back around.

Ah, so not reel to reel then? I guess you could zig zag the tape up and down the cartridge to get a longish length, but as you pointed out you would need way higher density.

 

1 hour ago, straight_stewie said:

The multiple heads are derived from the fact that the tape is much longer (and presumably moves much slower) than the equivalent disk and therefore, we need multiple locations at which to interact with the tape. In other words, having two read/write heads that are equidistant from each other along both sides of the tape effectively halves the sector wait time.

You would have to change the way tapes are actually written and read for it to work. Right now the data is written sequentially and therefore read sequentially, so multiple read/write heads in different locations wouldn't actually work, as you must write sequentially. Either a very large buffer would be required to hold the data covering the extra distance between each head, or each head would have to write a different track on the tape.

 

Problem with tapes is if you write to segment 1 of the tape you can't write to segment 10 without first writing 2,3,4,5,6,7,8 and 9. If the next read/write head is 10 segments away it'll only actually be able to read and never write.
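The constraint described here can be modelled with a toy class. This is only a sketch of the append-only behaviour as the post describes it; `AppendOnlyTape` and its methods are hypothetical names, and a real drive's positioning and block handling are considerably more involved.

```python
class AppendOnlyTape:
    """Toy model: segments can only be written in order, one after another."""

    def __init__(self) -> None:
        self.segments = {}       # segment number -> payload
        self.next_segment = 1    # numbering starts at 1, as in the post

    def write(self, segment: int, payload: str) -> None:
        """Write a segment, but only if it is the next one in sequence."""
        if segment != self.next_segment:
            raise ValueError(
                f"cannot write segment {segment}: "
                f"next writable segment is {self.next_segment}"
            )
        self.segments[segment] = payload
        self.next_segment += 1

    def read(self, segment: int) -> str:
        """Read any segment that has already been written."""
        if segment not in self.segments:
            raise KeyError(f"segment {segment} has not been written yet")
        return self.segments[segment]


if __name__ == "__main__":
    tape = AppendOnlyTape()
    tape.write(1, "first block")
    try:
        tape.write(10, "far-away block")   # a head 10 segments along
    except ValueError as err:
        print(f"rejected: {err}")
    print(tape.read(1))
```

In this model a second head parked ten segments further along can read what has already passed under the first head, but it can never be the one that writes.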

 

The way tapes work would pretty much have to be entirely changed, but the problem with that is they are the way they are because it's a length of tape; being reel to reel isn't the only reason why. A flexible tape that can move around a bend has physical limitations just like disk platters do, so I doubt the density will ever get high enough for the drawbacks that would still exist to be outweighed by the capacity of the cartridge and the physical volume of space.


3 minutes ago, leadeater said:

Problem with tapes is if you write to segment 1 of the tape you can't write to segment 10 without first writing 2,3,4,5,6,7,8 and 9. If the next read/write head is 10 segments away it'll only actually be able to read and never write.

There might be something we could do about it involving synchronizing the activity of the heads. I don't know the algorithm well enough to really think about it at that level. Do you have any good technical or technicalish sources easily on hand?

Or it might be the case that we don't actually have to do sequential writes like that at all, it just works out to be the best solution for the way that we actually use tapes.

It looks like I've got some research to do.


23 minutes ago, Ryan_Vickers said:

The archival factor makes sense, it's just odd to me though regardless since they're marketed at a knowledgeable, professional audience (IT employees for a company) who would know exactly what behaviour they could expect from their data and would want to know raw figures rather than having a company guess and tell them how much they think their stuff will compress.

The compression figure of a tape isn't actually an indicator of how much your data will compress. Different tape generations support different maximum compression ratios, so if you are doing hardware compression at the tape drive that figure is the maximum achievable; as seen on TV, "your results will vary".

 

LTO-7 6TB raw/15TB compressed

[Image: image.png]

Not a single tape has gotten 15TB. That's not all our tapes, but you don't want the world's longest single post lol.
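For anyone wanting to check their own tapes against the rated figure, here is a minimal sketch. The 6TB native and 2.5:1 rated values are the LTO-7 numbers from the post; the 9TB written in the example is invented for illustration and is not taken from the screenshot.

```python
def achieved_ratio(written_tb: float, native_tb: float) -> float:
    """Effective compression ratio: logical data stored per native TB."""
    if native_tb <= 0:
        raise ValueError("native capacity must be positive")
    return written_tb / native_tb


def fraction_of_rated(written_tb: float, native_tb: float, rated: float) -> float:
    """How close a tape came to its rated compressed capacity (1.0 = reached)."""
    return achieved_ratio(written_tb, native_tb) / rated


if __name__ == "__main__":
    native, rated = 6.0, 2.5   # LTO-7 figures from the post
    written = 9.0              # hypothetical amount written to one full tape
    print(f"achieved {achieved_ratio(written, native):.2f}:1 "
          f"({fraction_of_rated(written, native, rated):.0%} of rated)")
```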

 

 


11 minutes ago, leadeater said:

The compression figure of a tape isn't actually an indicator of how much your data will compress. Different tape generations support different maximum compression ratios, so if you are doing hardware compression at the tape drive that figure is the maximum achievable; as seen on TV, "your results will vary".

 

LTO-7 6TB raw/15TB compressed

 

Not a single tape has gotten 15TB. That's not all our tapes, but you don't want the world's longest single post lol.

Oh interesting, so it's a maximum rather than an expected.  That solves some of the mystery for me, but also creates additional questions.  Why would there be a maximum?  Also, it's thus referring to the algorithm and the tape drive rather than the tape itself, correct? (although indirectly also the tape since they must match the drives, this much I know).


 

59 minutes ago, Ryan_Vickers said:

snip

Yes, so having been marketed at knowledgeable people, a higher rated compression ratio can generally be understood to mean the drive is better at compressing data than the lower rated ones (official compression ratings are not generally listed for video codecs, but every new one rates itself against the others using mostly arbitrary samples, like how at the same "quality" HEVC is ~3x higher compression than H264). [The video codec example isn't a great one as modern codecs are almost exclusively lossy, but still.]

 

But Sandforce drives were using built-in hardware compression just like these tapes. Which was my point.

Quote

SandForce controllers did not use DRAM for caching[2] which reduces cost and complexity compared to other SSD controllers. SandForce controllers also use a proprietary compression system to minimize the amount of data actually written to non-volatile memory (the "write amplification") which increases speed and lifetime for most data (known as "DuraWrite").[6] SandForce claims to have reduced write amplification to 0.5 on a typical workload.[13] As a byproduct, data that cannot readily be compressed (for example random data, encrypted files or partitions, compressed files, or many common audio and video file formats) is slower to write... Data is encrypted even if there is no password which makes data recovery problematic; however, hardware encryption (which encrypts the user data as physically stored to flash without any significant performance loss[13]) doesn't replace, but rather complements, the drive lock feature and software-based encryption, which prevent unauthorized access to the drive's contents over the host interface.

The compression ideology was part of the reason Sandforce failures became notorious and (relatively speaking) common.
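The 0.5 figure in the quote follows directly from the definition of write amplification (bytes written to flash divided by bytes written by the host). A minimal sketch, assuming compression is the only effect and ignoring garbage collection and other controller overhead; the function names are hypothetical:

```python
def flash_bytes_after_compression(host_bytes: float, compression_ratio: float) -> float:
    """Bytes that reach flash if the controller compresses at the given ratio."""
    return host_bytes / compression_ratio


def write_amplification(host_bytes: float, flash_bytes: float) -> float:
    """Bytes written to flash per byte written by the host."""
    return flash_bytes / host_bytes


if __name__ == "__main__":
    host = 100.0  # GB written by the host, arbitrary example value
    for label, cr in (("2:1 compressible data", 2.0), ("incompressible data", 1.0)):
        wa = write_amplification(host, flash_bytes_after_compression(host, cr))
        print(f"{label}: write amplification {wa:.1f}")
```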


1 hour ago, Ryan_Vickers said:

Oh interesting, so it's a maximum rather than an expected.  That solves some of the mystery for me, but also creates additional questions.  Why would there be a maximum?  Also, it's thus referring to the algorithm and the tape drive rather than the tape itself, correct? (although indirectly also the tape since they must match the drives, this much I know).

So some additional information about tapes and drives. Tape drives support reading of the current and past 2 generations and writing of current and past generation. So an LTO-7 drive can read LTO-5,6,7 tapes and write LTO-6,7 tapes.

 

LTO-1,2,3,4,5 support 2:1 compression and LTO-6,7,8 support 2.5:1. There's that weird crossover where a drive can write to tape media that supports a different compression ratio. You can't, for example, use an LTO-6 drive to write an LTO-5 tape at 2.5:1 compression even though the LTO-6 drive supports that; backwards compatibility, yada yada. Because what if you put that LTO-5 tape, written with the 2.5:1 compression algorithm, into an actual LTO-5 drive? That ain't gonna work.
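The rules in this post can be written down as a small lookup. This is a sketch of the rules as stated above, not of any vendor tool: the native capacities are the publicly listed LTO figures rather than numbers from the thread, the function names are hypothetical, and the read rule is applied exactly as the post gives it (as far as I know LTO-8 drives are an exception and read only LTO-7 and LTO-8, which the sketch does not model).

```python
# Native capacities in TB: public LTO figures, not taken from the thread.
NATIVE_TB = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.8, 5: 1.5, 6: 2.5, 7: 6.0, 8: 12.0}


def rated_ratio(generation: int) -> float:
    """Rated compression ratio per the post: 2:1 up to LTO-5, 2.5:1 after."""
    return 2.0 if generation <= 5 else 2.5


def advertised_compressed_tb(generation: int) -> float:
    """The 'compressed capacity' printed on the box."""
    return NATIVE_TB[generation] * rated_ratio(generation)


def can_read(drive_gen: int, tape_gen: int) -> bool:
    """Rule as stated in the post: current plus two previous generations."""
    return drive_gen - 2 <= tape_gen <= drive_gen


def can_write(drive_gen: int, tape_gen: int) -> bool:
    """Rule as stated in the post: current plus one previous generation."""
    return drive_gen - 1 <= tape_gen <= drive_gen


def write_ratio(drive_gen: int, tape_gen: int) -> float:
    """Ratio used when writing follows the tape's generation, not the drive's."""
    if not can_write(drive_gen, tape_gen):
        raise ValueError(f"LTO-{drive_gen} drive cannot write LTO-{tape_gen} media")
    return rated_ratio(tape_gen)


if __name__ == "__main__":
    print(f"LTO-7: {NATIVE_TB[7]} TB raw / {advertised_compressed_tb(7)} TB compressed")
    print(f"LTO-6 drive writing LTO-5 media uses {write_ratio(6, 5)}:1")
```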


Can't wait for Gigabyte speed tapes.

Specs: Motherboard: Asus X470-PLUS TUF gaming (Yes I know it's poor but I wasn't informed) RAM: Corsair VENGEANCE® LPX DDR4 3200Mhz CL16-18-18-36 2x8GB

            CPU: Ryzen 9 5900X          Case: Antec P8     PSU: Corsair RM850x                        Cooler: Antec K240 with two Noctura Industrial PPC 3000 PWM

            Drives: Samsung 970 EVO plus 250GB, Micron 1100 2TB, Seagate ST4000DM000/1F2168 GPU: EVGA RTX 2080 ti Black edition


9 hours ago, comander said:

Fine for large scale backups. 

 

Though HDDs cost very very close to tape these days. 

While that is sort of true, the hardware becomes a problem. You cannot put too many drives in an enclosure, for reasons such as heat and vibration. It is also not easy to extend arrays beyond a certain number of trays. When we are talking about a box that costs £4 million, you can get a lot more storage per £ when you use tape. In addition, when adding extra bays to a tape library (bays the size of a rack or more) you do not increase the thermal load on your data centre, and you only increase power draw by a nominal amount. This alone makes the choice easy in some centres. This is also where having an HDD cache can be a huge advantage, as you get the speed benefit of HDDs and the archival, power and heat benefits of tape. You will also find most tape library vendors license the tape slots, so if you have a library with 4000 slots installed you can start off small and expand as required, only paying for the slots you use.


14 hours ago, comander said:

Fine for large scale backups. 

tape's been used for large scale backups since the 90s, there's no "comeback" to be made there.


1 hour ago, manikyath said:

tape's been used for large scale backups since the 90s, there's no "comeback" to be made there.

At one data centre I know of, a new build, the management made an edict that there would be no tape allowed on site, as it was classed as "removable media" according to their security. Obviously written by a muppet. Any of the HDDs, the thousands of hot-swappable ones that are on site, can be removed far quicker than getting a tape out of a library, and are probably easier to recover data from, depending on the RAID method.


On 3/12/2020 at 4:53 PM, Ryan_Vickers said:

Can someone who knows more than I do about tape storage explain why they have such a focus on compression?  It states compressed or uncompressed size, they often seem (from what I can tell) to have compression built-in in some manner, etc.  This is unlike any other storage format I'm familiar with (optical, HDDs, USB flash drives, floppies, SSDs, etc.).  Those simply store what you ask and state what they have, plain and simple.  If you want compression, the assumption is you'll do it yourself, and they (from what I can tell) never quote how much compressed data they can store, I assume for a combination of reasons including but not limited to:

  • The vastly different compressibility of different types of data makes this impossible to state with any meaningful accuracy
  • It's a niche thing, at least for their target markets, which I understand differ quite greatly from the target market of magnetic tape

"Because no one would buy the lower number".

Same with electric vs petrol engines (I noticed in a marine/boat setting recently). An old petrol outboard is around 3-30 hp depending on the size. The oldest, smallest outboard would still be 3hp. But a tiny electric is only like 1/3rd a hp. So no one would ever ever think to buy it (even though it's perfectly fine for an inflatable/fishing boat at times depending on use case). So instead they put "55lbs thrust!!!!!!@@@@@!!!!buyagainandlikeonfacebook!!!!" in the description. ;)

 


23 hours ago, straight_stewie said:

The purpose of my thought experiment had absolutely nothing to do with any real use of a tape deck, and was just an exercise to discover if it might be theoretically possible to get hard disk level random read/write performance from a tape deck with no other considerations. I believe that I outlined a way in which it could be done using multiple heads and a continuous tape rather than a reel to reel tape. You do not want to view the exercise as valid. So, we must agree to disagree and leave it at that.

It still fails as a thought experiment. You still cannot run that tape through the reader to get the random ops. You may be able to using other methods, but density does not help as a standalone metric. Thus the thought experiment makes the "I can stand on an air mattress, thus I can stand on air!" error.


21 minutes ago, TechyBen said:

It still fails as a thought experiment. You still cannot run that tape through the reader to get the random ops. You may be able to using other methods, but density does not help as a standalone metric. Thus the thought experiment makes the "I can stand on an air mattress, thus I can stand on air!" error.

The reason it fails as a thought experiment is because of the sequential write complication of tapes. Which might not even be a fundamental complication.

People seem to think the only way you can use a tape is reel to reel. This is false, a tape can be a continuous loop just as well as the surface of a disc can be a continuous loop.

As I already explained, density informs the length of the tape, which tells us the feasibility of running magnetic tape in a continuous loop as opposed to running a discontinuous roll of tape reel to reel.


45 minutes ago, Phill104 said:

At one data centre I know of, a new build, the management made an edict that there would be no tape allowed on site, as it was classed as "removable media" according to their security. Obviously written by a muppet. Any of the HDDs, the thousands of hot-swappable ones that are on site, can be removed far quicker than getting a tape out of a library, and are probably easier to recover data from, depending on the RAID method.

you're right, but also wrong.

 

in a sense, tape is comparable to a pile of DVDs. there's a number (up to hundreds) of tapes in a library, and only a small number (maybe 8 tops?) of readers.

so essentially, you can remove a tape from the system, and it would only be noted the next time the tape is required.

if someone pulls a hard drive from a server, alarms should go off at the IT department.

 

as for data recovery from a single drive... tape is pretty much written in series (one tape full before it moves to the next), where HDD data is usually on a raid 5/6, which would mean you'd need the majority of drives to be able to recover sensible data from a stolen array.

 

it may be easier to steal data by taking an entire server, than to take only the drives out the front of the server. either way IT is gonna notice immediately, and you'd need to have the majority of the hardware to have the actual data.

 

also, depending on the size of the operation, the preferred way of handling tape would be cycling out tape sets, which means they come in and out the library pretty regularly.

 

as for the "written by a muppet" aspect...

- tape *IS* removable media. it is supposed to be that way.

- while HDD hotswap is a thing, removing a drive creates an error condition. they *can* come out, but they are not meant to come out unless it's for a failure replacement.

 

where i presume management is coming from... is the aspect of what a "man in the field" can interact with. it would be easy for a single engineer to stealth-tuck a tape in his pocket to take data out of the datacenter, but it would be relatively hard for a single engineer in the field to suppress the actions following a removed drive. (should note with this one: usually datacenters have monitoring on their raid arrays, which raises an alarm if there is a drive error. a healthy drive removed without prior warning would presumably initiate a lockdown state.)

 

in short, it's not so much about how "accessible" a means of storage is, it's a matter of how that plays into the bigger picture of an entire datacenter.

 

or to put it another way: i have all the intel required to theoretically bust into 3 HIGHLY critical data storage facilities, and make off with sensitive customer data by stealing hard drives. i know where to go, which doors to shim open, where said doors' weak spots are, and which drives contain the nice bits. the complete picture however... is that security would be right there long before you'd make it out the yard, there'd be dozens of cameras recording you along the way, and IT would be alerted immediately.

 

on the flip side... i could BS my way in, and BS my way back out with a stack of tapes in my pocket, and it'd probably only come out a week or so later, with no paper trail.


42 minutes ago, manikyath said:

you're right, but also wrong.

 

in a sense, tape is comparable to a pile of DVD's. there's an amount (up to hundreds) of tapes in a library, and only a small number (maybe 8 tops?) of readers.

so essentially, you can remove a tape from the system, and it would only be noted the next time the tape is required.

if someone pulls a hard drive from a server, alarms should go off at the IT department.

That assumes a lot of things. You can pull an HDD from a server and no alarms will go off if SNMP is not set up, or the agents are not installed, etc. There are plenty of reasons alarms may not go off. I have seen, for instance, disks stolen during maintenance windows because a server was shut down. The biggest risk is when Site A is powered down for essential maintenance, such as when power or chillers need repair, so the system is running from Site B. Often at that point security is at its weakest. So you see where I am coming from: HDD is often less secure, as it takes seconds to remove a disk. Removing a tape from a library is actually more complex. The doors are locked, and if they are opened the system has to go into a different mode, at the very least offline. Or you can remove tapes through the CAP, or whatever the model in particular calls it. Again, those are password protected and require the robot to go offline.

 

42 minutes ago, manikyath said:

as for data recovery from a single drive... tape is pretty much written in series (one tape full before it moves to the next), where HDD data is usually on a raid 5/6, which would mean you'd need the majority of drives to be able to recover sensible data from a stolen array.

HDDs are sometimes in R5/6, other times in a mirror. It is quite common, for instance, to mirror the OS disks and R5/6 the data set. It is not as simple as it seems.
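
To make the R5 point concrete, here is a minimal Python sketch of striping with rotating XOR parity. It is a toy model under stated assumptions, not how any particular controller lays out data: the function names `raid5_stripe` and `rebuild_disk`, the chunk size and the parity rotation are all made up for illustration. It shows that a single disk holds only scattered chunks plus parity, while any n-1 disks are enough to rebuild the missing one.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_stripe(data: bytes, n_disks: int, chunk: int) -> list[list[bytes]]:
    """Split data into stripes of (n_disks - 1) data chunks plus one parity chunk.

    Hypothetical layout: the parity chunk of stripe s lives on disk s % n_disks.
    Returns one list of chunks per disk.
    """
    per_stripe = (n_disks - 1) * chunk
    if len(data) % per_stripe:
        # Zero-pad so the last stripe is complete.
        data += b"\x00" * (per_stripe - len(data) % per_stripe)

    disks: list[list[bytes]] = [[] for _ in range(n_disks)]
    for s, off in enumerate(range(0, len(data), per_stripe)):
        chunks = [data[off + i * chunk: off + (i + 1) * chunk]
                  for i in range(n_disks - 1)]
        parity = reduce(xor_bytes, chunks)
        parity_disk = s % n_disks
        remaining = iter(chunks)
        for d in range(n_disks):
            disks[d].append(parity if d == parity_disk else next(remaining))
    return disks


def rebuild_disk(disks: list[list[bytes]], missing: int) -> list[bytes]:
    """Rebuild one missing disk by XOR-ing the surviving disks, stripe by stripe."""
    survivors = [d for i, d in enumerate(disks) if i != missing]
    return [reduce(xor_bytes, row) for row in zip(*survivors)]


if __name__ == "__main__":
    disks = raid5_stripe(b"ABCDEFGHIJKLMNOPQRSTUVWX", n_disks=4, chunk=2)
    print(disks[0])                               # what one stolen disk contains
    print(rebuild_disk(disks, 2) == disks[2])     # n-1 disks recover the missing one
```

A mirror behaves differently: each disk in a mirrored pair carries a full copy, which is why the RAID level matters when judging what a single stolen disk gives away.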

 

As for tapes being serial in nature, that is not always the case. With systems such as Fujitsu Eternus CS, you would actually require the whole system, including the database, to recover any of the data on the tapes. You cannot simply read strings of data from the tape and expect them to make sense, as they will not. Moreover, the new file system with multiple tapes for speed is akin to an R5/6 setup, so the same limitations apply.

 

42 minutes ago, manikyath said:

it may be easier to steal data by taking an entire server, than to take only the drives out the front of the server. either way IT is gonna notice immediately, and you'd need to have the majority of the hardware to have the actual data.

See above.

42 minutes ago, manikyath said:

 

also, depending on the size of the operation, the preferred way of handling tape would be cycling out tape sets, which means they come in and out the library pretty regularly.

I am only talking about large libraries here where tapes are never removed from the system unless they go faulty, which is surprisingly rare. Gone are the old days where you kept a set in a fire safe offsite.

42 minutes ago, manikyath said:

 

as for the "written by a muppet" aspect...

- tape *IS* removable media. it is supposed to be that way.

- while HDD hotswap is a thing, removing a drive creates an error condition. they *can* come out, but they are not meant to come out unless it's for a failure replacement.

Yes, they can and do come out. Usually a hot spare kicks in. Still does not mean an alarm will go off. Even if it does, on a dark site that often results in a call being raised, an acknowledgement by an on-call team then a phone call or two before anything is done. In that time the perpetrator is down the pub on his third pint.

 

42 minutes ago, manikyath said:

where i presume management is coming from.. is from the aspect of what a "man in the field" can interact with. it would be easy for a single engineer to stealth-tuck a tape in his pocket to take data out of the datacenter, but it would be relatively hard for a single engineer in the field to suppress the actions following a removed drive. (should note with this one: usually datacenters have monitoring on their raid arrays, which raises an alarm if there is a drive error. a healthy drive removed without prior warning would presumably initiate a lockdown state.)

 

in short, it's not so much about how "accessible" a means of storage is, it's a matter of how that plays into the bigger picture of an entire datacenter.

 

or to put it another way: i have all the intel required to theoretically bust into 3 HIGHLY critical data storage facilities, and make off with sensitive customer data by stealing hard drives. i know where to go, which doors to shim open, where said doors' weak spots are, and which drives contain the nice bits. the complete picture however.. is that security would be right there long before you'd make it out the yard, there'd be dozens of cameras recording you along the way, and IT would be alerted immediately.

 

on the flip side... i could BS my way in, and BS my way back out with a stack of tapes in my pocket, and it'd probably only come out a week or so later, with no paper trail.

Really, there is little difference between tapes and HDDs. There are always times when security is not up to scratch. Obviously it depends on the data centre, their practices and how well they are enforced. Usually data breaches are done in other ways; stealing physical media is rare. So really, tape is no less secure than any other media in a data centre, and it should not be discounted for reasons of data security.


1 hour ago, manikyath said:

in a sense, tape is comparable to a pile of DVD's. there's an amount (up to hundreds) of tapes in a library, and only a small number (maybe 8 tops?) of readers.

so essentially, you can remove a tape from the system, and it would only be noted the next time the tape is required.

if someone pulls a hard drive from a server, alarms should go off at the IT department.

There are tape libraries with thousands of slots. Want one that is a rack itself and can span multiple racks? Yep, those exist. Want one that is a purpose-built vault with purpose-made robotic arms that run the length of the room? Yep, those exist.

 

The tape libraries I use are HPE MSL6480s, which support 80 tapes and 6 drives per module with a maximum of 7 modules, so 560 tapes and 42 drives if I wanted to (not that I would use 42 drives).
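
The arithmetic behind those totals, as a small Python sketch. The per-module figures are the ones quoted above; the `ModularLibrary` class and its field names are just illustrative, not anything from HPE tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModularLibrary:
    """Hypothetical model of a modular tape library (names are illustrative)."""
    slots_per_module: int
    drives_per_module: int
    max_modules: int

    def max_slots(self) -> int:
        # Total tape slots when every module is installed.
        return self.slots_per_module * self.max_modules

    def max_drives(self) -> int:
        # Total drive bays when every module is installed.
        return self.drives_per_module * self.max_modules


# Figures as stated in the post: 80 tapes and 6 drives per module, 7 modules max.
msl6480 = ModularLibrary(slots_per_module=80, drives_per_module=6, max_modules=7)
print(msl6480.max_slots(), msl6480.max_drives())  # 560 42
```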

 

A step up from what I use are systems like the HPE T950 and TFinity ExaScale tape libraries, where you are talking about thousands of tapes, well into the tens of thousands, and hundreds of drives.

 

1 hour ago, manikyath said:

so essentially, you can remove a tape from the system, and it would only be noted the next time the tape is required.

Well, you'd have to get past a locked fire door into a concrete room, break open the rack door lock, input the library password to issue a move operation, get out before anyone reads the library notification email saying a tape has been ejected, and then get past probably 4 security guards now on their way, with one in a comms room watching you on hundreds of cameras across the campus.

 

1 hour ago, manikyath said:

it may be easier to steal data by taking an entire server, than to take only the drives out the front of the server. either way IT is gonna notice immediately, and you'd need to have the majority of the hardware to have the actual data.

Most servers do not have locking HDD bays: flip the release lever and walk away, which is so much easier than trying to get a tape out of a library. Also, all my tape media is software-encrypted before being written to tape and multiplexed, so you aren't getting anything with a single tape.


Not gonna happen. Tape is offline storage, and its lifetime is questionable at best. At least you can scrub a disk pretty easily. When was the last time you scrubbed your tape vault to see what you could read and what you couldn't?

