IBM's monster tape will take three days to fill
35TB cartridge poses whole new set of problems
IBM Research has devised technology with FujiFilm to create a 35TB capacity tape, but it will take 3 days to write the data at LTO5 speeds.
The new hyper-capacity half-inch tape technology has been successfully read and written at an areal density of 29.5bn bits/sq in, which means a tape capacity of 35TB according to the researchers. This is said to be 44 times the 800GB raw capacity of LTO4 tape. From a technological point of view the gee whiz factor is impressive.
The media is FujiFilm's Nanocubic tape, with an ultra-fine, perpendicularly-oriented barium-ferrite magnetic medium that apparently does not use expensive metal sputtering or evaporation coating methods. IBM has developed new servo control technologies enabling a 25X increase in the number of parallel tracks on half inch tape, with a track width of less than 0.45 micrometers.
There is an ultra-narrow 0.2 micrometer data reader head and a data read channel based on a data-dependent noise-predictive, maximum-likelihood (DD-NPML) detection scheme developed at IBM Research in Zurich. IBM Research at Almaden developed a reduced-friction head assembly allowing the use of smoother magnetic tapes and an advanced GMR (Giant Magneto-Resistive) head module incorporating optimised servo readers.
The capacity can be increased to the 100bn bit/sq in level according to the IBM researchers. However, one issue that IBM and FujiFilm do not discuss is the time to read or write 35TB of tape data. Using LTO5's tape transfer speed of 140MB/sec it would take 2.89 days (69.44 hours) to write the full 35TB. To write 35TB in the same time that LTO5 takes to write its 1.5TB of raw data, which is 2.98 hours, the tape speed would have to increase 23.33 times, and that assumes the read/write heads can process the signals passing to and from the tape that quickly.
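The arithmetic behind those figures can be checked with a short sketch. It assumes decimal units (1TB = 1,000,000MB) and a sustained 140MB/sec with no stops or repositioning; the function name `transfer_hours` is ours, purely for illustration.

```python
def transfer_hours(capacity_tb, rate_mb_per_s):
    """Hours needed to stream capacity_tb at a constant rate (decimal units assumed)."""
    seconds = capacity_tb * 1_000_000 / rate_mb_per_s
    return seconds / 3600

LTO5_RATE = 140  # MB/sec, the LTO5 transfer speed quoted in the article

full_35tb = transfer_hours(35, LTO5_RATE)    # time to fill the 35TB cartridge
full_lto5 = transfer_hours(1.5, LTO5_RATE)   # time to fill a 1.5TB LTO5 cartridge
speedup = full_35tb / full_lto5              # speed increase needed to match LTO5's fill time

print(f"35TB at LTO5 speed: {full_35tb:.2f} hours ({full_35tb / 24:.2f} days)")
print(f"1.5TB at LTO5 speed: {full_lto5:.2f} hours")
print(f"Required speed-up: {speedup:.2f}x")
```

Since the transfer rate cancels out, the required speed-up is simply the capacity ratio, 35 / 1.5.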
Accelerating tape speed 23.33X would also increase the risk of tape deformation or breakage and require more electricity for the drive. It seems likely that either multiple-head tape drives or a much greater number of tracks readable by a single head would have to be developed to cut tape read/write times down to more practicable levels. A back-of-an-envelope calculation suggests that a 4-head drive, or a drive that read 4 times as many tracks, would cut the 35TB read/write time to 17.36 hours. Another possibility would be to stripe the data across two or more tape drives. A 4-drive setup using such heads would deal with 35TB in 4.34 hours and that starts looking reasonable.
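That back-of-an-envelope sum can be written out as a rough sketch. It assumes perfectly linear scaling, so every extra head or drive adds a full 140MB/sec with no striping or load overhead, which real hardware is unlikely to achieve. The helper `striped_hours` is a hypothetical name, not anything from IBM or the LTO specification.

```python
# Baseline: one head, one drive, 35TB at 140MB/sec (decimal units assumed)
BASE_HOURS = 35 * 1_000_000 / 140 / 3600

def striped_hours(base_hours, heads_per_drive=1, drives=1):
    """Idealised transfer time when data is split evenly across heads and drives."""
    return base_hours / (heads_per_drive * drives)

four_heads = striped_hours(BASE_HOURS, heads_per_drive=4)
four_by_four = striped_hours(BASE_HOURS, heads_per_drive=4, drives=4)

print(f"Single drive, single head: {BASE_HOURS:.2f} hours")
print(f"One 4-head drive:          {four_heads:.2f} hours")
print(f"Four 4-head drives:        {four_by_four:.2f} hours")
```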
Such striping across multi-headed drives implies a tape library using 35TB cartridges would need more drives and more robotic capability to move cartridges between slots and drives, such that, for example, four cartridges could be delivered to four drives simultaneously. If tape libraries are forecast to sustain their usability because tape storage economics are going to outstrip those of disk for many more years, then changes to allow tape cartridge striping, multi-headed drives, and multiple simultaneous cartridge loading into drives look necessary. ®
COMMENTS
A few answers to some of the questions
<puts on fireproof suit>
@Adam Wheeler - LTO4 stores 800GB, natively, before compression. Don't believe me? - Go and look at http://lto.org/technology/uformat.php?section=0&subsec=uformat for information on LTO3 and 4. If your data can be compressed at 2:1, you will be able to put 1600GB of data on a single LTO4 tape. The native capacities for all the LTO Ultrium generations so far are 100,200,400,800 GB, and you add on compression to those native sizes if your data can compress.
As for the discussion about the number of heads required, versus the speed etc - it may be useful to know that current tape tech routinely writes multiple streams of data to a single tape head anyway, on separate tracks. LTO4 and various other drives write data to the tape in a longitudinal serpentine interleave pattern. What this means is that the drive writes a track from start to end of tape, then moves the head very slightly to one side, and writes a new track from end to start, moves the head a little again, and so on. From memory, LTO4 actually writes 16 tracks at a time, and the tape will go from start to end quite a few times (56 passes, so 28 times start-end-start?). By the time the tape is full there are hundreds of tracks (896 tracks in total for LTO4). As a handy side-effect, the end point of the tape is the same physical wind point as the start, so there's not much in the way of rewind time - just the couple of metres of leader to unspool (takes a couple of seconds at most).
In order to be able to write the tracks close together with the accuracy required, there are special hard-coded tracks put on the tapes at the factory (these are called servo tracks - because the tape head servo mechanism follows them) - degaussing an LTO tape may/would destroy these special tracks and render the cartridge useless.
In terms of speed, when we double the capacity of either HDDs or tape, we don't necessarily double the speed - in fact, the speed gain is usually only a fraction of that. A 2TB hard drive isn't 4x the speed of a 500GB one, is it?
I'd be shocked if the 35TB prototype got anywhere near 44x the speed of a current-gen drive... in fact I'd expect it's only a few times quicker. IBM did a prototype of a 1TB drive in 2002, so this sort of thing isn't unprecedented.
Q:Who uses tape any more?
A:People who want to offsite data to a vault, and people with lots of data. One of the largest generators of data I know of is CERN - they generate about 15PB of data a year at the present time, and yes, the majority of their data is stored on tape (~45PB at the moment?); it's not historical either as they've recently been adding new kit. Tape doesn't use power when you're not actually accessing it, and doesn't usually require the same level of raised flooring or expensive cooling systems to run. Over a lifecycle, it can be cheaper - which is why CERN are proponents of it. They also use disk, because tape isn't good at everything.
I hope this has been helpful for anyone who can be bothered to look past the rubbish people spout (and if I got anything wrong... let me know and I'll correct it if I can)
CM
29.5bn bits/sq in
Yuck. Can't you express that in Bibles per nanoWales or something?
I make it about 150 nm square per bit.
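For anyone who wants to check that figure, here is a quick sketch of the conversion. It assumes each bit occupies a square cell, which is a simplification, since real bit cells on tape are much wider than they are long.

```python
NM_PER_INCH = 25.4e6              # 1 inch = 25.4 mm = 25.4 million nm
SQ_INCH_IN_NM2 = NM_PER_INCH ** 2

density_bits_per_sq_in = 29.5e9   # areal density quoted in the article

area_per_bit_nm2 = SQ_INCH_IN_NM2 / density_bits_per_sq_in
side_nm = area_per_bit_nm2 ** 0.5  # side of an equivalent square bit cell

print(f"Area per bit: {area_per_bit_nm2:.0f} nm^2")
print(f"Equivalent square: {side_nm:.0f} nm on a side")
```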
Err...
What sort of disk array are you using? I have never come across bandwidth as a problem, or re-creation time for a raid set as a problem.
Need more bandwidth? Slap in another fibrechannel or two.
Need your RAID sets to re-create faster? Make multiple smaller RAID sets rather than single monolithic sets.