Plane or train? Tape or disk? Reg readers speak
Disk speed versus tape economy and removability
You the expert Plane or train? We asked four Reg readers with storage smarts to say where and when we should use disk-based data protection and where we should cross the line and use tape. Three did just that. The fourth identified a fourth use case for tape and added a salutary reminder that it has to be managed; it is absolutely not a start-backup-run-and-forget option. The consensus was that neither disk nor tape is sufficient on its own.
Disk is in because it's faster to back up to disk and restore from it, but tape is not out, not at all. It has substantial cost advantages, holding much more data for less money, and it can be stored off-line, even off-premises, making it a better insurance against disaster striking a data centre. In general disk has not replaced tape, and probably won't.
H Wertz - Freelance system administrator
I think the availability of disk-to-disk backup and virtual tape library (VTL) systems has reduced the need for tape, but tape still has an important role for archival purposes.
There are uses where a disk-to-disk backup or a VTL excels. One of these is in cases where frequent retrievals are expected, such as users frequently deleting or overwriting files. (On a side note here, VMS had/has a versioning file system, which would inherently keep older copies of files available for easy retrieval, and just remove the oldest ones as the disk filled up. But this is a very unusual feature. I have not heard of another system that has this.) When I was a student in the late 90s, it'd take me about 30 minutes to pull a file off the departmental DDS-3, and it would take closer to 4 hours (and a hefty fee) for ITS to pull a file off the backup of main university systems.
Retrieval from a VTL would have taken a minute or two tops to find and retrieve a file. The big disadvantages of VTL? Since it is really an array of disks, software faults, hardware faults, or administrative faults could all render the library useless. In addition, the VTL would ordinarily be on-site.
Tape is still quite important for archival purposes, both for compliance and especially for disaster recovery purposes. A tape is written, then packed away and stored, so, unless tapes are re-used, those tapes provide an immutable record of what is on the system up to that point. Once it's ejected it won't be accidentally overwritten, erased, or modified. In addition, the tapes can be stored off-site, so in case of disaster the tapes won't be destroyed as the VTL could be. There are a few disadvantages, primarily "bit rot", the obvious retrieval speed disadvantage, and the cost of moving and storing tapes.
There are several technologies that could reduce the role of tape. First, internet backup allows for off-site storage without having to physically transport anything. However, like VTL it could allow for backups to be modified or deleted. Additionally, if you're using a service provider (instead of your own second site), it'd be a very good idea to verify if they have a robust set-up. One or two in the last 10 years have had a single hardware failure knock them off the face of the earth. Bandwidth is also a big issue: how long will it take to restore the entire system?
Second, systems that use removable hard disks. These provide the advantages of tape (a disk can't be accidentally modified if it's not plugged in, can be stored off-site, and so on). In addition, they speed up file retrieval; someone still has to insert the right disk, but then retrieval is nearly instant.
The disadvantages? Potentially reliability (although I've had great luck hooking up disks that have sat doing nothing for years). The big one? Price; tapes cost about 1/10th the cost per byte of hard disks.
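The bandwidth question raised above for internet backup can be put into rough numbers. The following is a minimal sketch, not a sizing tool: the function name `restore_hours`, the 80 per cent link efficiency and the example link speeds are all assumptions made for illustration, and real restores will also be limited by the provider's end and by protocol overhead.

```python
def restore_hours(data_tb, link_mbit_s, efficiency=0.8):
    """Estimate the hours needed to pull a full restore over a network link.

    data_tb      -- size of the restore in decimal terabytes (10**12 bytes)
    link_mbit_s  -- nominal link speed in megabits per second
    efficiency   -- assumed fraction of the nominal speed actually achieved
    """
    if data_tb < 0 or link_mbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("sizes and speeds must be positive, efficiency in (0, 1]")
    total_bits = data_tb * 1e12 * 8
    usable_bits_per_second = link_mbit_s * 1e6 * efficiency
    return total_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical link speeds; substitute your own line and data set size.
    for link in (10, 100, 1000):
        print(f"1 TB over {link} Mbit/s: {restore_hours(1, link):.1f} hours")
```

Under these assumptions a single terabyte takes more than a day to come back over a 100 Mbit/s line, which is the point of the question: the restore time, not the backup time, is what decides whether internet backup is workable.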
Mainframes are a special case. In terms of disaster recovery, they have supported synchronisation of services and storage between multiple sites for over 10 years, so in case of a disaster a backup system can be ready to go. Also, mainframes have extensive journaling, so in case of a problem the system can be rolled back to an earlier state. However, if the goal is to eliminate the potentially thousands of tapes, a VTL is also required, as mainframe software and procedures assume at least one tape drive.
In conclusion, tapes may not have the high profile they used to, but they are still important for archival and disaster recovery purposes.
Henry Wertz graduated from the University of Iowa in 2000. He has been a Linux user since 1994 (loading off floppies makes a CD install seem like luxury!), and into cars (both fast ones, and ones that are highly efficient). He is currently doing freelance computer work. He is also a regular commenter on Reg stories.
A few things
First interviewee Henry Wertz: "The big one? Price; tapes cost about 1/10th the cost per byte of hard disks."
1.5TB Maxell LTO5 tape from Amazon: $67.95. 1.5TB Western Digital Elements external HDD from Amazon: $78.62. If you want to get really technical, you can get one of those HDD docking stations (similar to requiring a tape drive for tapes [which run about $2,600 for LTO5 btw]) and buy raw disk drives: Western Digital Caviar Green 1.5TB from NewEgg for $59.99. If you want to get really picky, you can assume no compression on the hard disk and an optimal 2:1 compression for the tape to achieve the 1.5/3.0TB capacity, then you have to get a HITACHI Deskstar 3TB (NewEgg $139.99), but mind you, compression on disk is quite easy and 2:1 is by no means difficult to achieve using even low on-the-fly streaming compression. Back up a video or JPEG library and you'll only see 1.5TB out of the tape. Therefore, even the worst case for disk (no compression for disk and optimal 2:1 for tape) lands at 2.06x the cost of tape. Best case is only 88% of the cost of the LTO5 tape for like capacity. So no, not 1/10th the cost. Sorry. Especially when you factor in the $2,600 tape drive vs a moderately priced Cavalry EN-CAHDD2BU3-ZB disk dock (for instance) at $64.99 @NewEgg.
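The arithmetic in this comment can be checked with a few lines of Python. This is only a sketch using the prices quoted above; the function names are made up for illustration, and the break-even calculation is an extension of the comment's point about the drive and dock prices rather than a figure the commenter gave.

```python
import math

# Prices in US dollars, as quoted in the comment above.
TAPE_LTO5 = 67.95        # 1.5TB native, 3.0TB at an assumed 2:1 compression
HDD_1_5TB_BARE = 59.99   # Western Digital Caviar Green 1.5TB
HDD_3TB_BARE = 139.99    # Hitachi Deskstar 3TB
TAPE_DRIVE = 2600.00     # approximate LTO5 drive price
DISK_DOCK = 64.99        # Cavalry disk dock


def price_ratio(disk_price, tape_price):
    """Cost of one disk expressed as a multiple of one tape of like capacity."""
    return disk_price / tape_price


def break_even_media(disk_price, tape_price, dock_price, drive_price):
    """Smallest number of media at which tape plus drive is cheaper than
    disk plus dock, or None if tape never catches up."""
    saving_per_medium = disk_price - tape_price
    extra_fixed_cost = drive_price - dock_price
    if saving_per_medium <= 0:
        return None if extra_fixed_cost >= 0 else 1
    if extra_fixed_cost < 0:
        return 1
    return math.floor(extra_fixed_cost / saving_per_medium) + 1


if __name__ == "__main__":
    worst = price_ratio(HDD_3TB_BARE, TAPE_LTO5)
    best = price_ratio(HDD_1_5TB_BARE, TAPE_LTO5)
    print(f"worst case for disk: {worst:.2f}x the tape price")
    print(f"best case for disk:  {best:.0%} of the tape price")
    print("break-even, worst case:",
          break_even_media(HDD_3TB_BARE, TAPE_LTO5, DISK_DOCK, TAPE_DRIVE))
    print("break-even, best case: ",
          break_even_media(HDD_1_5TB_BARE, TAPE_LTO5, DISK_DOCK, TAPE_DRIVE))
```

The two ratios reproduce the 2.06x and 88% figures above. On these prices the tape drive only pays for itself in the worst case for disk, and then only after a few dozen cartridges; in the best case for disk it never does.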
Second interviewee Evan Unrue: "but also, disks keep spinning, so doing this comes with a larger physical footprint in the datacenter and a larger power bill. Tape scales by adding cartridges which don’t spin when not being used and don’t take up space in the IT room as they scale"
Why is it that everyone assumes that a disk-based solution mandates the drives are always on? Sure, the first target in the D2D2T or D2D2D will be required to spin, but not the last stage. Disks would work as a removable medium just as effectively as tapes in this regard. I would suggest that disks are less vulnerable to environmentally-caused "bit rot" as well, due to the platters not being prone to going brittle as tape has a tendency to do (at the very least a disk can withstand being in a less-than-ideal storage location better [think attic of IT Director's house or the like] if necessary).
I applaud the third interviewee Chris Evans for pointing out some of the shortcomings of tape solutions. Granted, disk has disadvantages too, and as Chris said, it comes down to finding a balance between the two based on your RTO/RPO requirements. The key is finding the best spot to use the appropriate medium. For enterprise environments with hundreds (or even tens) of TBs to back up, you can't beat a tape library for convenience. For anyone with 3-6TB or less for a full backup set, anything more than a tape drive or external HDD is likely overkill, especially for the sub-1TB market.
As always, check your logs on your backup jobs frequently. If that's too much of a pain, find a way to have the results emailed (same as paged nowadays) to you upon completion/failure. For those willing to roll up their sleeves (such as the ZFS/CopyFS commenter above), there are plenty of methods you could employ to produce a better setup for your organization than BackupExec or the like could provide, and using HDDs just makes that solution even easier and more feature-full.
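For anyone wanting to have results emailed on completion or failure, here is a minimal sketch in Python. It assumes a plain-text job log in which problem lines contain the words ERROR or FAILED; that log format, the function names and the addresses are all hypothetical, and a real backup product will have its own log layout and probably its own notification options.

```python
import smtplib
from email.message import EmailMessage

# Assumed markers of a problem line; adjust to whatever your backup tool writes.
FAILURE_MARKERS = ("ERROR", "FAILED")


def find_problems(log_text):
    """Return the log lines that contain any of the failure markers."""
    return [line for line in log_text.splitlines()
            if any(marker in line.upper() for marker in FAILURE_MARKERS)]


def build_report(job_name, log_text, sender, recipient):
    """Build an email summarising one backup job; the status goes in the subject."""
    problems = find_problems(log_text)
    status = "FAILED" if problems else "OK"
    message = EmailMessage()
    message["Subject"] = f"[backup] {job_name}: {status}"
    message["From"] = sender
    message["To"] = recipient
    if problems:
        message.set_content("Problem lines:\n" + "\n".join(problems))
    else:
        message.set_content("No problems found in the log.")
    return message


def send_report(message, smtp_host="localhost"):
    """Hand the report to an SMTP server (assumed to accept mail unauthenticated)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(message)
```

Run from cron or a scheduled task after each job, something like this puts OK or FAILED straight into the subject line, so a missing mail is as noticeable as a failed one.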
Tape is more cost effective for Small Business users? erm no.
Weirdly, I have been looking at backup solutions today for a friend's start-up company of around 15 users, with one mail, one file and one SQL server. They would like to remove the backup and take it off-site, and also recover deleted files quickly.
They can meet their requirements for around £1,500 for a completely new system, but it will actually cost them only around £600, as they only need to pay for Symantec Backup Exec; they have the rest lying around as spares.
Server or PC
Windows Server OS
2x icybox removable storage drives
2x 1TB HDD or 2TB HDD as they grow
Backup with remote agents from the other servers to removable HDDs
One of them for off-site backup that they can take out every day, the other for incremental archiving.
This offers a fast and reliable backup solution.
As they aren't a big company they won't use more than 1TB of data for the foreseeable future, and no staff are waiting too long for someone to restore data for them.
Yes, it's not as cheap as tape drives, but time is money, right? If you can save time you earn more money! I can see bigger companies will still use tape drives for archiving huge amounts of data, but small businesses... really?
Versioning File Systems
From the article: "On a side note here, VMS had/has a versioning file system, which would inherently keep older copies of files available for easy retrieval, and just remove the oldest ones as the disk filled up. But this is a very unusual feature. I have not heard of another system that has this."
The author clearly hasn't looked very hard or he would have found CopyFS - a FUSE-based copy-on-write versioning file system. Stack this with a deduplicating file system like ZFS, and you have a very efficient continuously versioning file system. Combine that with lsyncd to acquire on-close remote copy capability and snapshotting on the remote copy, and you have a solution that alleviates the requirement for any tape solution, except in cases where external bandwidth is insufficient.
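For readers who have not met a versioning file system, the behaviour quoted from the article can be imitated at application level in a few lines of Python. This is a toy illustration only and is not how VMS or CopyFS work internally: the function names are hypothetical, the `name;N` naming borrows the VMS convention of a semicolon and version number, and pruning to a fixed number of versions (`keep`) is an assumed simplification of "remove the oldest ones as the disk filled up".

```python
from pathlib import Path


def existing_versions(directory, name):
    """Return the sorted version numbers already stored for `name`."""
    prefix = name + ";"
    versions = []
    for entry in Path(directory).iterdir():
        if entry.name.startswith(prefix):
            suffix = entry.name[len(prefix):]
            if suffix.isdigit():
                versions.append(int(suffix))
    return sorted(versions)


def save_version(directory, name, data, keep=3):
    """Write `data` as a new version of `name` and prune all but the newest `keep`."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    directory = Path(directory)
    versions = existing_versions(directory, name)
    next_version = versions[-1] + 1 if versions else 1
    target = directory / f"{name};{next_version}"
    target.write_bytes(data)
    versions.append(next_version)
    # Everything except the last `keep` entries is an old version to remove.
    for old in versions[:-keep]:
        (directory / f"{name};{old}").unlink()
    return target
```

Every save creates a new file rather than overwriting the last one, so an accidental overwrite is undone by reading the previous version number. A real versioning file system does this transparently for every program, which is the appeal.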