SwiftStack CPO: 'If you take a filesystem and bolt on an object API'... it's upside down
Joe Arnold on Ethernet-accessed object drives and more
Interview Open source, OpenStack-focused object storage startup SwiftStack has had early involvement with Seagate’s object storage-focussed Kinetic disk drives, the ones needing server-resident software to manage their IO.
We had an email conversation with Joe Arnold, the founder, president and chief product officer of SwiftStack, about this and other matters.
For users, Kinetic drive use is indistinguishable from normal disk drive use, according to Joe. Here are his views explaining that:
El Reg I’d like to understand how SwiftStack users can use Kinetic (and other direct-addressed key:value disk drives).
Joe Arnold To the user of SwiftStack object storage, using either standard HDDs or Kinetic drives is virtually the same experience. SwiftStack was the first software company involved with Seagate specific to writing to the Kinetic API before it was launched in October 2013 and made open source in May 2014. The SwiftStack product is the only software today that supports management of Kinetic drives in a scale-out storage cluster.
We just got back from the Linux Foundation Kinetic PlugFest. SwiftStack was one of the few that are up and running with Kinetic API drives from Seagate, Toshiba and WD. So now that the device compatibility has been worked through, the next stage is to deliver full solutions. To this end, Seagate and SwiftStack have commercial opportunities, which are in progress.
El Reg Does SwiftStack address raw (non-Kinetic) disk drives directly, or talk to them via a file system? And will Kinetic drive access by SwiftStack be faster than standard HDD access?
Joe Arnold SwiftStack consumes raw block devices (HDDs) and creates a storage system around the pool of storage. Individual Kinetic drives respond to many API commands from Swift in much the same way a SwiftStack node with non-Kinetic disk drives does.
El Reg Why should SwiftStack users choose Kinetic drives rather than standard HDDs?
Joe Arnold SwiftStack believes that users should have freedom of choice. This includes servers, drives, and even drive technologies. In fact, we architected SwiftStack so that Kinetic and standard drives can be used within the same cluster and namespace. This is a huge enabler for adoption because it means that users can implement this new technology incrementally.
I have written about and presented on Kinetic before – about how the technology makes new storage topologies possible. There are near-term efficiencies which allow the necessary compute and networking to be much more easily tuned for the storage workload. Instead of running on a standard server filled with HDDs, the storage devices can be connected to a network via an enclosure with an embedded switch. This means a smaller footprint of servers can be used to run the storage services.
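To picture what "Kinetic and standard drives within the same cluster and namespace" could mean in practice, here is a minimal Python sketch. It is not SwiftStack's or Swift's actual placement code: the device names, the `backend` labels and the `place` function are all made up for illustration. It only shows the general idea that placement can depend on a hash of the object's path rather than on the kind of drive behind it.

```python
import hashlib

# Hypothetical device list: names and backend labels are illustrative only.
DEVICES = [
    {"name": "hdd-0", "backend": "standard"},
    {"name": "hdd-1", "backend": "standard"},
    {"name": "kinetic-0", "backend": "kinetic"},
    {"name": "kinetic-1", "backend": "kinetic"},
]


def place(object_path, devices=DEVICES, replicas=2):
    """Pick `replicas` distinct devices for an object by hashing its path.

    The drive type plays no part in the choice, so both kinds of drive
    share one namespace. Assumes replicas <= len(devices).
    """
    digest = hashlib.md5(object_path.encode("utf-8")).digest()
    start = int.from_bytes(digest[:4], "big") % len(devices)
    return [devices[(start + i) % len(devices)] for i in range(replicas)]


if __name__ == "__main__":
    for dev in place("/account/container/photo.jpg"):
        print(dev["name"], dev["backend"])
```

Because the mapping is a pure function of the path, adding a few Kinetic drives to the device list is, in this toy model, just another entry in the pool, which is the incremental adoption Arnold describes.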
El Reg What do you think of the idea that object storage is a feature and not a product?
Joe Arnold If you take a filesystem and bolt on an object API, you have the architecture upside down!
Object storage with a unified namespace can serve as a foundation for filesystem access, but not the other way around; filesystems do not infinitely scale. It reminds me of the saying – you can’t turn a sausage into a pig.
Sure, an object gateway can be layered on top of a filesystem to provide an object API. But the API isn’t the point. Object storage is an architecture – not just an access method. The architecture enables high-throughput scale-out, high-capacity workloads with a unified namespace, and the leveraging of metadata. In this way, traditional file access via CIFS or NFS is an access feature for scale-out storage with a unified namespace.
2015 proved that many traditional backup applications have been refactored to support object APIs (e.g. Veritas NetBackup 7.7) for both public cloud storage and scalable on-premises storage under a unified namespace. Like us, other object storage vendors provide filesystem access as a feature, but the industry at large has not been able to find a category name for what we do other than “object storage”. Maybe El Reg can coin a better term for all of us!
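Arnold's argument that filesystem-style access can sit on top of an object namespace can be illustrated with a small Python sketch. It assumes a flat set of object keys and emulates a directory listing with a prefix and a delimiter, a convention familiar from object storage listing APIs. The `list_dir` helper and the sample keys are hypothetical and are not taken from any SwiftStack product.

```python
def list_dir(keys, prefix="", delimiter="/"):
    """Emulate a directory listing over a flat set of object keys.

    Returns (subdirs, files): keys directly under `prefix` are files,
    and deeper keys are rolled up into pseudo-directories.
    """
    files, subdirs = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the prefix itself, e.g. a directory marker
        head, sep, _ = rest.partition(delimiter)
        if sep:
            subdirs.add(prefix + head + delimiter)
        else:
            files.append(key)
    return sorted(subdirs), files


if __name__ == "__main__":
    keys = [
        "photos/2015/a.jpg",
        "photos/2015/b.jpg",
        "photos/readme.txt",
        "backups/db.dump",
    ]
    print(list_dir(keys))             # top-level view
    print(list_dir(keys, "photos/"))  # view inside "photos/"
```

The directories here are only a view computed from key names. The namespace itself stays flat, which is one way to read the claim that file access is a feature layered on object storage rather than the reverse.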
El Reg Aren't Amazon S3 and file-level access driving object storage development more than open source?
Joe Arnold Application workloads are driving development. Object storage wouldn’t exist if there wasn’t a need for it – applications would still be using filers! File-level access is for the applications of yesterday and today, and object API access is for the applications of today and tomorrow.
OpenStack contributor numbers growth
Yes, consumption at scale did start with Amazon with their launch of S3. And there have been a whole host of applications that have been built to support S3. In the early days of SwiftStack, we had “web” customers who wanted alternatives to S3 (eBay/Paypal, Ancestry.com, etc.), but wanted on-premises storage in a pay-as-you-grow way on standard hardware that other object storage solutions could not offer.
Now we’re starting to see other industries need the same ease of consumption, pay-as-you-grow, and scale of public cloud. Some applications can’t move all their data to public cloud. Media & entertainment, healthcare, life sciences, government, financial services, etc. It’s in these industries where filesystem access comes into play. The new workloads are using object, but they still need to interop with their existing applications that were built assuming a filesystem API.
(By the way, SwiftStack loves the S3 API. As you have previously written, SwiftStack has very complete compatibility and is compatible with applications like Veritas NetBackup, CommVault Software, Avere, etc., all through the S3 API.)
El Reg Isn't open source object storage very much a minority interest when compared to Caringo, Cleversafe (IBM), Cloudian, EMC (Atmos, ECS), HGST (Amplidata), HDS, NetApp (StorageGRID), Scality and other proprietary object storage products? Why should that status change?
Joe Arnold Don’t confuse a development process with implementation. Swift, the open source community we are a part of, is enormous and continues to grow with hundreds of active developers contributing. This eclipses the development teams of any other proprietary object storage product (if not many of them combined!), so by this measure, any single proprietary product is a minority interest.
Supplier involvement with Swift
Why open source? Once dismissed by their proprietary competitors as “immature,” open source operating systems, middleware, application frameworks, and databases are now standards in enterprise and Web infrastructure. In fact, the open source model has so completely and fundamentally transformed the infrastructure tier in the data centre that not many proprietary infrastructure platform technologies have a sustainable advantage any longer.
Analysts agree that AWS S3 is the de facto standard API for object storage, and the OpenStack Swift API is the open standard with governance. The products listed above have all but abandoned their own proprietary APIs in favour of the APIs that developers and ISVs actually support or want to use. Many of them have added Swift API support to their products. In this way, the status has already changed.
It seems good sense for a startup like SwiftStack to jump on a new disk drive technology early on, so that, as its use grows, SwiftStack will receive an encouraging tailwind.
The Ethernet-accessed disk drive vendors will also be encouraged by OpenStack-related interest. Enterprise users, though, will, we think, want to see data showing that a SwiftStack data storage system, including software, servers, enclosures and disk drives, performs as well as, if not better than, an equivalent traditional storage array-based system, and costs less.
Perhaps a disk drive and array vendor like Seagate or WDC will be able to show this. ®
You can view an OpenStack infographic here.