New IBM storage chief Ambuj Goyal: I like all-flash and I cannot lie
We'll use mutant hybrids when it's not urgent, says new broom
Just two months into the job and IBM's newest storage general manager Ambuj Goyal is putting his stamp on the business.
He told El Reg that Big Blue plans to move all transaction data away from disk to all-flash arrays; that he's not that keen on object storage; and that he envisages an IBM that sells "less storage".
He gave Vulture Central's storage desk the run-down in an interview which covered flash, storage consolidation and object storage. Goyal has been making an introductory round of meetings with customers, business partners, analysts and hacks since his appointment.
Returning to the data centre flash storage topic, flash is not the death knell for IBM's primary data-storing disk drive arrays that it might seem, said the new storage chief.
Goyal said that applications running in data centre servers - be they mainframes, PowerPC or Intel servers - should not have to change to accommodate a new storage medium like flash. Instead the new medium should be accessed behind the familiar controllers so the applications see nothing different, except faster data access, said the storage chief.
Flash for transaction data
Goyal has a clear focus when it comes to the IBM data centre heartland of transaction processing. The new storage chief says, "unless you understand that, you can't design the best flash systems." He added: "Our play is the all-flash array for transaction processing, not hybrids (arrays combining disk and solid state storage)." But the flash array has to fit in seamlessly from the application's viewpoint.

Ambuj Goyal, general manager IBM storage.
Goyal painted a picture of a high-end Unix server, a mainframe and a VMware server - each of which may connect to an IBM disk drive array such as a DS8000 or Storwize V7000. He said: "The flash array has to fit in transparently. It has to be behind the DS8000 controller," or the V7000 one come to that - meaning, in effect, an SVC (SAN Volume Controller), a SAN storage virtualising front end. We might envisage the flash array being a set of TMS RamSans.
He also suggested another use case involving a SharePoint server accessing a NAS array for files, and says IBM can put a decent controller - the SVC again - with an all-flash array behind it and by so doing eliminate the need to make changes to the application. He said: "Anything that's rip-and-replace takes a long time to be successful in the market place."
IBM will continue to provide traditional storage arrays, like the DS8000, and XIV we suppose, with all-disk or hybrid disk and solid state drives, for non-transaction data (data which requires less urgent access).
Three steps to selling less storage
Goyal says IBM wants to enable customers to get more value from their storage. The storage market in general is too focused on storage itself and generally wants to sell more of it. He counters this: "We are going to build our strategy on selling less storage and delivering more value. We'll make a storage consolidation play." For example, IBM will talk to a customer with a 3PB storage estate and cut it in half.
He envisages a three-step process:
- Understand what storage a client has and whether it can be handled more efficiently. This will be done using Butterfly software.
Big Blue bought UK-based Butterfly Software in September last year and the price wasn't disclosed. When we wrote about Butterfly back in March 2011 it had a product to convert backup data from one format to another. Since then Butterfly has developed storage planning software and migration tools for data centre infrastructures to help businesses to use their storage more efficiently, saving data centre power, floor space and management time - meaning saved cost.
A key Butterfly expertise area is the ability to analyse storage use and efficiency; it has an analysis engine with agentless discovery and analysis software. According to IBM, businesses can use the software to discover and understand storage infrastructure attributes like retention policies and backup processes far more quickly than the months that might be taken by a manual audit. The other key focus centres around its Storage Migration Engine which can move data automatically. The acquired company is now part of IBM's software group and Goyal wants to make use of it to aid customers' storage estate consolidation.
- Apply non-disruptive consolidation so the applications don't have to change. A key way of doing that is to insert an SVC controller transparently into data paths and have block and file storage mediated through it.
- Manage the transition by using a product and portal to understand what is happening - the Virtual Storage Centre (VSC). IBM has published a VSC white paper (registration required). VSC is about moving to a virtual storage environment, in the same way, roughly speaking, as servers have moved to a virtual server environment.
The white paper says:
The IBM SmartCloud Virtual Storage Center solution comprises a storage virtualisation platform, a comprehensive set of storage virtualisation management tools, and application-aware snapshot backup and restore capabilities.
Our old friend the SVC is the virtualisation platform. Tivoli Storage Productivity Center is the management component. Goyal says this three-step process works both with IBM and third-party storage.
File storage and objects
The storage chief says customers have "lots of file storage all over the place with different strategies for archive, backup and recovery." The files typically do not hold transaction data.
Customers need help to make the management easier and a rip-and-replace cure strategy won't work, he maintains. Goyal went on to explain his trickle-feed idea: "What if local application use of filers doesn't change but data is trickle-fed into a place where you can centrally manage it?"
The central data would be analysed so the overall risk of the file estate could be reduced along with its cost and size, and the trickle feed would be two-way. Goyal said: "We will build a strategy on that basis."
This will not involve file virtualisation, he insisted.
Goyal was not enthusiastic about object storage. He said he likes the idea of unified storage with both a block and file interface but can't see a role, it appears, for object storage.
We're left thinking that, in his view, transaction data should move to flash but be accessed through familiar controllers, like the SVC. File data storage is a mess and needs managing better to bring discipline and rigour and efficiency to it. Object storage has little or no role.
He said: "We have a disk business, a tape business and a flash business. ... I don't care about the media."
The new storage GM said clients need service level agreements, performance, capacity, management, etc from their storage facilities and: "I'll use the right media and technology for it." The choice is defined by the desired business outcomes and not by a desire to sell more storage product, he added. ®
COMMENTS
Re: So long, SAN
I thought a SAN was there to consolidate storage -- DAS has traditionally carried with it very low utilization rates.
On top of that the emergence of virtualization and the ability to move VMs between physical servers online has required shared storage of some kind for a while. Relatively recently some folks have come up with ways to do that now with DAS -- though you still have the availability problem -- you can't migrate the VM in the event the host that VM is on is down. So you're back to good HA shared storage (of course not all shared storage is created equal) for the more mission critical things at least. Not only that but there is a massive amount of overhead if you're having to transmit the data of a VM from one host to another.
SANs also offer things like snapshots -- for me I use this feature a lot - snapshot a LUN (e.g. database), present a writable version of that LUN to another host (or another VM - as opposed to say VMware's own snapshots which I find mostly useless) so it can do things with it. The snapshot consumes almost no space (only deltas of course), so provides a valuable means to improve data flexibility vs doing full on replication. I can wipe the snapshot out and refresh it from the master LUN in a matter of say 90 seconds (95% of that time spent doing host-based tasks to ensure consistency).
Being able to re-stripe existing data online (w/o application impact) over new resources is also a feature that is not found in the vast majority of DAS platforms. Also not available in say Linux LVM either (last I checked).
Certainly the idea of DAS has returned to some extent especially in the crappy cloud providers, the whole concept of everything is throwaway has come back for some. But for many others their applications are not designed to handle that (and I don't see that changing for the vast majority of cases - most of the time it is vastly cheaper and much simpler to solve a problem in infrastructure vs application architecture) so they need higher reliability and that often means SAN.
Some folks can even turn DAS into a makeshift SAN, though at the end of the day I'd still consider that a SAN - even if it's a shitty one.
There is certainly a place for DAS - it is most useful in situations where you have good knowledge of what the workload is, and have a predictable growth pattern. Or if you are using leading edge applications that handle fault tolerance and the like at the application layer.
For the rest of folks - the more traditional HA storage arrays are here to stay for some time to come (at least a decade I'd wager).
Re: Mainframes again?
Check the IBM share price vs, say, HP. The z Series has worked out amazingly well ever since the death of the mainframe was proclaimed.
Re: So long, SAN
"...massive amount of overhead if you're having to transmit the data of a VM from one host to another." - replication of storage and servers is still required when you deploy a SAN. That's when you buy a second SAN and put it in a second data room/centre and replicate using Dataguard or SRDF. The overhead is already there, I just don't see the need for a second fibre network when we have 10GEth already.
"crappy cloud providers" - I agree with you here, the cloud providers, albeit hideously expensive, have their business success only because departments are not getting any new IT resources from their CIO. Now, the business has to continue, so the stationery budget is burned for buying some cloud servers.
Yes, SAN had its advantages 10 years ago. But when I see the internal cost charging, like, £1/day for 1GB of Tier1 storage, this simply cannot stand. A 300MB MS Exchange inbox is just pathetic. Today, we have 2/3/4TB hard drives, it is easier than ever to buy high density, modular systems. And if you compare the costs of a £500K SAN system with just a £60K piggyback to add HDD/SSD storage to the existing Xeon servers, then SAN had its day.
So, btw, has the network, but you don't want to say this too loud or you'll get 500 downvotes: when you have 60 - 200 VMs in one physical server, you have a great deal of network in there. And it's faster, too, as it transmits packets via the memory. So, the router is only between the physical machines, for which you use your 10GEth or Etherchannel. And with that you have enough bandwidth for your DR replication.
