Got it taped: The business of tape-based disaster recovery

Taking a risky - or risk free? - hike up Iron Mountain

The quick and the undead

Not all of Hal’s clients can afford to implement a mirrored SAN in a different physical location with the comms links, and all that that protocol entails – SRDF runs across pretty expensive hardware. If you’re prepared to wait longer than a few milliseconds, then just have your target servers and storage sitting there ready, waiting. When the time comes, send tapes over via courier, restore them and you should be up and running within as many hours or days as is quoted in the Service Level Agreement (SLA).

A lot of companies will consider this approach as it's certainly cost-effective. After all, DR is just an insurance policy; even though there are risks involved, disaster definitely doesn’t strike every company. Tape might not be the quickest option to get you up and running again, but for a policy you might never claim on, it’s excellent value. Not everyone needs the courtesy car after a crash.

Iron Mountain security camera

Keeping an eye on things at Iron Mountain

Regarding those SLAs, while you could be back in business in hours with tape, those quick turnaround deals can be scuppered on rare occasions, as Hal explains.

“If the tape is at Iron Mountain and we need an emergency recall, it should be with us in two hours. The only time that that falls down is if the tape has just left our site and is on its way up to Bristol. You can’t just recall the van. That doesn’t happen as other client tapes obviously need to get back to the vault.”

You can’t insist the van driver puts his foot down either, as Iron Mountain has speed limiters on its fleet that won’t permit driving above the UK maximum speed limit of 70mph. The vans are tracked too, with a Green Roads sensor system. It’s the type of tracking system that’s been tested on young motorists as a way of proving they're not boy racers, thus lowering their insurance premiums in the process. Given their precious payloads, Iron Mountain van drivers need to be saints on the road. Lloyd explains how the monitoring is utilised.

“It’s live tracking and sends back information on whether the driver is speeding, sharply turning corners, changing lanes and if the vehicle is accelerating or decelerating quickly. If the safety score is too high we re-educate them. Our in-house driver trainer will go over the safety score, discuss any issues and refocus attention. Overall the system reduces wear and tear, increases MPG and decreases accidents. We don’t want accidents – it impairs our image.”

Not that everyone would know they’d collided with an Iron Mountain van. The folks at High Fibre insist on unmarked vehicles to ship their precious tapes from site to site. It's a continuous cycle of daily and weekly refreshes along with the monthly archives. Iron Mountain manages the physical handling of this routine, providing new tapes along with off-site security for the company data.

Fabric design

In general, Windows Server 2003 or 2008 deals with the client software side of things that Hal encounters. He describes the data centre set-up:

"On the bigger libraries such as the Quantum Scalar i500 that we use for larger installations, you might have two, six, eight or more LTO-4 fibre tape drives that are hot swappable, each with a 4Gb/s fibre channel connection at the back. They will be connected with a fibre cable into the SAN ‘fabric’ for that particular client. So the storage is in the SAN for that client (it might be dedicated or shared) and those SAN switches, typically referred to as the fabric, are the centrepoint for all the fibre cabling. This doesn’t change once it’s in place; it normally just works.

The backup server would have HBAs (host bus adapters) with fibre cables going into the SAN switches. The client’s main servers (database and application servers) would similarly have fibre connections to the SAN switches. So they’re all interconnected with fibre which typically runs at 4Gb/s – more modern ones run at 8Gb/s, possibly higher – but we generally have everything at 4Gb/s. So that obviously optimises the flow of data from the SAN storage arrays through the SAN switches to the tape drives.
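As a rough illustration of the numbers in play here, the arithmetic can be sketched in a few lines of Python. The drive and link figures are assumptions brought in for the example, not taken from Hal: an LTO-4 cartridge holding 800GB native and streaming at 120MB/s native, and a 4Gb/s fibre channel link carrying roughly 400MB/s of payload. Compression and real-world workloads will move these numbers around.

```python
# Back-of-envelope throughput figures for LTO-4 drives on 4Gb/s Fibre Channel.
# Assumed values (not from the article): LTO-4 native capacity 800 GB,
# native streaming rate 120 MB/s, about 400 MB/s usable on a 4Gb/s FC link.

LTO4_NATIVE_CAPACITY_GB = 800
LTO4_NATIVE_RATE_MB_S = 120
FC_4G_USABLE_MB_S = 400


def hours_to_stream(data_gb: float, rate_mb_s: float) -> float:
    """Hours needed to move data_gb at a sustained rate (decimal units)."""
    seconds = (data_gb * 1000) / rate_mb_s
    return seconds / 3600


def drives_per_link(link_mb_s: float, drive_mb_s: float) -> int:
    """How many drives streaming at drive_mb_s a single link could feed."""
    return int(link_mb_s // drive_mb_s)


def backup_window_hours(data_gb: float, drives: int, rate_mb_s: float) -> float:
    """Hours to back up data_gb spread evenly over several drives in parallel."""
    return hours_to_stream(data_gb, rate_mb_s * drives)


if __name__ == "__main__":
    one_tape = hours_to_stream(LTO4_NATIVE_CAPACITY_GB, LTO4_NATIVE_RATE_MB_S)
    print(f"Filling one tape at native rate: {one_tape:.2f} hours")
    print("Drives one link could feed:",
          drives_per_link(FC_4G_USABLE_MB_S, LTO4_NATIVE_RATE_MB_S))
```

On those assumed figures, a single drive at native speed uses under a third of its own 4Gb/s link, which fits with the fabric being left at 4Gb/s: the tape drive, rather than the fibre, is likely to be the limiting factor.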

Big business: Quantum Scalar i500 tape library

For people like me, it’s the shared SAN, but we formally refer to it as the combined array and fabric. The servers connect to it, the tape drives connect to it and it’s managed by a different group: the SAN team. So they create storage areas in the arrays, then present them to individual servers and tape drives. The way that’s done is through a unique number on the fibre channel interface, which is known as the World Wide Name (WWN) – it’s like a MAC address. So every fibre channel adapter, for instance the HBAs in the server, the tape drives, the connections in the SAN switches, they all have unique WWNs.

Typically, there’s nothing more an operator has to do unless a storage area needs to be increased or reduced or, in certain SAN systems, copied to another area in the SAN. An administrator would allocate a particular area and present it and generally that’s their job done. SAN guys are not involved with the tape library.
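The idea of presenting a storage area to particular WWNs can be pictured with a toy model. This is only a sketch: the class, the storage area names and the WWN values are hypothetical, and it assumes the common 64-bit WWN written as eight hex bytes. Real arrays and switches do this through their own management tools.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


def normalise_wwn(wwn: str) -> str:
    """Return a 64-bit WWN as eight colon-separated lowercase hex bytes."""
    digits = wwn.replace(":", "").replace("-", "").lower()
    if len(digits) != 16 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 64-bit WWN: {wwn!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))


@dataclass
class Fabric:
    """Toy record of which WWNs each storage area has been presented to."""
    presentations: Dict[str, Set[str]] = field(default_factory=dict)

    def present(self, area: str, wwn: str) -> None:
        """Allow the adapter with this WWN to see the named storage area."""
        self.presentations.setdefault(area, set()).add(normalise_wwn(wwn))

    def visible_areas(self, wwn: str) -> List[str]:
        """List the storage areas an adapter can see, in name order."""
        key = normalise_wwn(wwn)
        return sorted(a for a, wwns in self.presentations.items() if key in wwns)


if __name__ == "__main__":
    fabric = Fabric()
    # Hypothetical WWNs for a database server HBA and a tape drive.
    fabric.present("db_vol", "10:00:00:00:C9:12:34:56")
    fabric.present("db_vol", "500104f000aabbcc")
    fabric.present("app_vol", "10:00:00:00:c9:12:34:56")
    print(fabric.visible_areas("10000000c9123456"))
```

Keying everything on a normalised WWN mirrors the MAC address comparison: the adapter's identity, not the cable or port it happens to be plugged into, decides what storage it can see.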

On a normal backup the libraries are just pulling data off the SAN and putting it onto tape. For a restore they’re writing it back to SAN. These operations go on within the data centre as you’d need a very fat pipe if done remotely. However, we do have replication between data centres down dedicated links where that data is, for example, copied from a production SAN to a DR SAN.

We run our client backups at night because there are fewer users on the system, so you can expect a bandwidth improvement, especially if going on a main public interface, although there might be a separate backup interface. By mid-morning the backup administrators in India identify the tapes that need to come out of the libraries. They’ve been monitoring them and can see if they’ve all finished. They list those tapes and send that info off to our smart hands people. These guys are in the same building as me and they’ll go around all the libraries, take out the specified tapes for secure storage and put fresh ones in.
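The morning hand-off described here, where finished tapes are listed for the on-site team, could look something like the following. It is a minimal sketch under assumed names: the `TapeJob` record, its status values, the barcodes and the library labels are all invented for illustration and are not taken from any particular backup product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TapeJob:
    barcode: str   # tape label (hypothetical format)
    library: str   # which library holds the tape
    slot: int      # slot number inside that library
    status: str    # "complete", "running" or "failed"


def eject_list(jobs: List[TapeJob]) -> List[str]:
    """Finished tapes only, grouped by library and ordered by slot."""
    done = sorted(
        (job for job in jobs if job.status == "complete"),
        key=lambda job: (job.library, job.slot),
    )
    return [f"{job.library} slot {job.slot}: {job.barcode}" for job in done]


if __name__ == "__main__":
    overnight = [
        TapeJob("A00012L4", "i500-1", 7, "complete"),
        TapeJob("A00013L4", "i500-1", 3, "complete"),
        TapeJob("B00001L4", "i40-2", 1, "running"),
        TapeJob("B00002L4", "i40-2", 2, "complete"),
    ]
    for line in eject_list(overnight):
        print(line)
```

Sorting by library and then slot is a convenience for whoever walks the floor: each library is visited once, and any tape still being written stays off the list.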

Quantum Scalar i40 tape library software shows the drives and all slots including IE and cleaning status

Library machines don’t need routine maintenance and, in my experience, the Quantum ones are very reliable compared to others that I’ve dealt with in previous companies. The worst was an HP StorageWorks MSL6060 (a model dating back to 2003). Mechanically it was horrendously unreliable and the robotics were a nightmare. The arm got stuck, the tapes got stuck, and we'd be calling for an engineer every week. Working with Quantum libraries is a different world. That said, we've just had a Scalar i40 become faulty. It doesn't get beyond the start-up tests and the LCD reports that initialisation failed.

Occasionally, a drive calls for cleaning. That’s obviously a function of the library software that can be relayed out to the backup software. It could be via a GUI message, an error code in the library or a physical alert such as a little amber light. We would ask the backup admins to look into it, and they might reply, requesting we put a cleaning tape in. In some of the larger libraries you might keep one in there in a specific slot; with smaller ones, typically you don’t."
