Mirror, mirror on the wall, who has the best TSM backend of all?
Isilon or GPFS for TSM scale-out beasts
A respected EMC blogger and IBM are arguing over who has the best scale-out backend storage repository for Big Blue's TSM backup and archive software: EMC's Isilon or Big Blue's GPFS.
IBM says that TSM – Tivoli Storage Manager – products "provide backup, archive, recovery, space management, database and application protection, and bare machine recovery and disaster recovery capabilities."
Stefan Radtke's blog says "Isilon is a perfect target for TSM backups."
He says he checked this out with a customer deployment, starting with four TSM instances running on Windows 2012 backing up to a pair of 18TB NetApp arrays and a couple of TS3500 tape libraries, each with eight LTO4 drives, across a SAN.
This was changed so that the same TSM instances backed up across a LAN to a 3-node Isilon NL400 cluster with 432TB raw capacity (260TiB usable) and a single TS3500 tape library with eight LTO4 drives.
Radtke's TSM and Isilon configuration
With the NetApp setup, Radtke says: "Backup, archive and migration jobs ran at 100-150MB/s throughput until next day, sometimes until noon ... [and] archive jobs ... ran between 8 and 16 hours."
After the switchover to Isilon, and looking at just one TSM instance: "Throughput increased to ~400 MB/s ... the archive throughput ... as well as the backup and restore throughput ... already increased and as a result finished several hours earlier."
TSM was then modified to use more threads and "throughput increased to 800MB/s [with archive] throughput increased from ~150MB/s to ~750MB/s and the runtime went down from ~16 hours to ~2,5 hours."
More data was stored on the Isilon arrays so only one tape library was needed.
Radtke's summary says that, with Isilon:
- Average backup and archive throughputs have been increased by a factor of ~5.
- Runtimes have been reduced by the same factor (12 hours to 2.5 hours).
- Complexity has been reduced since all TSM servers share a single file system.
- No more SAN components between TSM servers and Isilon (so no more volumes, LUN-masking, SAN-Zoning, device class definition changes…)
- Restores will be much faster in general.
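The factors in that summary can be checked against the figures quoted above. A minimal Python sketch, assuming the ~150MB/s and ~750MB/s archive throughputs and the 12-hour and 2.5-hour runtimes are representative; the variable names and the `speedup` helper are ours, not Radtke's:

```python
# Figures as quoted from Radtke's blog post; all are approximate.
archive_before_mb_s = 150   # ~150MB/s archive throughput on the NetApp setup
archive_after_mb_s = 750    # ~750MB/s on Isilon after adding TSM threads
runtime_before_h = 12       # runtime before, from the summary
runtime_after_h = 2.5       # runtime after, from the summary


def speedup(smaller: float, larger: float) -> float:
    """Return how many times bigger `larger` is than `smaller`."""
    return larger / smaller


# Throughput went up, so the "after" value is the larger one.
throughput_factor = speedup(archive_before_mb_s, archive_after_mb_s)
# Runtime went down, so the "before" value is the larger one.
runtime_factor = speedup(runtime_after_h, runtime_before_h)

print(f"Archive throughput gain: {throughput_factor:.1f}x")
print(f"Runtime reduction: {runtime_factor:.1f}x")
```

Both numbers land close to the "factor of ~5" in the summary.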
Over to IBM's Andre Gaschler and Nils Haustein, who discuss running TSM servers on GPFS storage.
They explain that GPFS is "a high-performance cluster file system optimised to provide concurrent high-speed file access to applications executing on multiple nodes in [a] cluster."
The two "conducted a series of tests with TSM on IBM System x GPFS Storage Server (GSS). The GSS system provides standard GPFS file systems which are configured on GPFS native RAID devices (GNR). The TSM server software runs on servers connected to the GSS files system via high speed network connections." The link between the GSS box and the two TSM servers was 56Gbit/s InfiniBand.
The two write:
- Peak backup performance using multiple sessions for a single TSM server is 5.4GB/sec.
- Peak backup performance using multiple sessions for two TSM servers is 4.5GB/sec per server or 9 GB/sec in total.
- Peak restore performance using multiple sessions for a single TSM server is 6.5 GB/sec.
- Peak backup performance using a single session for a single TSM server is 2.5 GB/sec.
They say: "These performance measurements clearly indicate that the TSM server performance scales linearly on GSS [GPFS Storage Server]."
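For readers who want to put the single-server and two-server peak backup figures side by side, here is a rough Python sketch. It uses only the numbers listed above; the variable names are ours, and the calculation says nothing about how IBM ran or interpreted its tests:

```python
# Peak multi-session backup figures as listed above, in GB/sec.
single_server_gb_s = 5.4        # one TSM server
per_server_with_two_gb_s = 4.5  # each server when two run together
servers = 2

# Aggregate throughput with both servers running.
aggregate_gb_s = per_server_with_two_gb_s * servers

# How much the aggregate grew when going from one server to two.
aggregate_gain = aggregate_gb_s / single_server_gb_s

# Each server's throughput relative to the single-server peak.
per_server_ratio = per_server_with_two_gb_s / single_server_gb_s

print(f"Aggregate with {servers} servers: {aggregate_gb_s:.1f} GB/sec")
print(f"Aggregate gain over one server: {aggregate_gain:.2f}x")
print(f"Per-server throughput vs single-server peak: {per_server_ratio:.0%}")
```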
The two GPFSers conclude: "The superior GSS performance combined with operational simplification represents a perfect storage environment for TSM. Multiple TSM instances can scale-out in multiple dimensions in an elastic – GPFS based – storage cloud."
The peak TSM/Isilon throughput was 800MB/sec while the TSM/GPFS throughput was 5.4GB/sec (5,400MB/sec) – almost seven times faster. It's not an apples for apples comparison, but it clearly shows that Isilon is not the only fruit and GPFS could be a more flavoursome fruitstuff. ®