Reg Vulture grills Xen hypervisor daddy on latest storage upstart
Coho Data techie gets technical on flash-disk rivals
Interview Like other storage startups, Coho Data has had to swim upstream* against prevailing storage orthodoxy. Its CTO and co-founder, Andrew Warfield, was one of the original authors of the Xen hypervisor, and we recently took the opportunity to ask him some questions about the firm's technology, product positioning and progress.
Some background: Warfield worked on Xen as a PhD student at the University of Cambridge, and has since done research in virtualisation and high availability. At XenSource and Citrix Systems, he was the technical director for storage and emerging technologies.
Now he has a startup to develop and new technology to push. Let's dive in with some tough questions:
El Reg: What does Coho Data sell?
AW: Coho currently sells a scalable NFS target. Our initial release was aimed at VMware in particular, with both UI and performance focus on virtual environments. We've since added support for OpenStack and generalised the implementation of NFS to applications other than VM image hosting.
We have supported files (over NFS) since the beginning, and increasingly see customers using our storage for non-virtual workloads.
El Reg: Coho Data is one of several hybrid flash/disk array suppliers such as Nimble Storage, Tegile and Tintri. These include late-stage startups and post-IPO businesses (eg, Nimble). What marks Coho out as having a better product than these suppliers and why should customers risk their business by buying kit from an immature startup?
AW: This is a spectacularly loaded question. Please bear with me as I try to tease it apart just a little bit.
Coho Data co-founder and CTO Andrew Warfield
AW: Coho builds a storage system that uses a mix of media in order to get good performance per cost. We happened to use flash/disk in our first generation products because that combination achieved an economic advantage for workloads with data that is not uniformly hot (in other words, pretty much all applications I've ever worked with).
While the role of spinning disks is likely to shift more toward a capacity/archive role over the next few years, building systems out of a mix of media is not going away: enterprise SAS flash is about 50 cents a gig now, NVMe is closer to two dollars, and NVDIMMs are still in the 40 dollar range.
New storage system designs have more tools at their disposal than old ones did, and their designers are taking advantage of them. The "hybrid flash/disk" categorisation of storage products strikes me as similarly suspect [as] Gartner's characterisation of AFAs as a separate category.
Two of the three hybrid suppliers you mention have announced all-flash offerings, and we will do the same shortly. Our all-flash offering will continue to be hybrid: It will combine NVMe flash to absorb overwrites and avoid the need for NVRAM-based write buffers, and SAS SSDs for storing colder data at a lower cost per capacity.
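To put rough numbers on the mixed-media argument, here is a back-of-envelope sketch using the per-gigabyte prices Warfield quotes above. The 10/90 capacity split between NVMe and SAS flash is a hypothetical illustration, not a Coho configuration, and the function name is ours.

```python
# Prices per GB (US dollars) are the rough figures quoted in the interview.
PRICE_PER_GB = {"sas_flash": 0.50, "nvme": 2.00, "nvdimm": 40.00}


def blended_cost_per_gb(mix):
    """Return the capacity-weighted cost per GB of a media mix.

    `mix` maps a tier name to its fraction of total capacity.
    The fractions must sum to 1.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("capacity fractions must sum to 1")
    return sum(PRICE_PER_GB[tier] * fraction for tier, fraction in mix.items())


# Hypothetical split: a small NVMe tier absorbing writes in front of SAS SSDs.
all_nvme = blended_cost_per_gb({"nvme": 1.0})
hybrid = blended_cost_per_gb({"nvme": 0.10, "sas_flash": 0.90})

print(f"all-NVMe: ${all_nvme:.2f}/GB, hybrid: ${hybrid:.2f}/GB")
```

Under that assumed split the blended figure comes to roughly 65 cents a gig against two dollars for NVMe alone, which is the economic case for tiering when most data is cold.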
[On the issue of existing vendors being late-stage startups or post-IPO businesses.]
AW: I think that there are (at least) two macro factors driving innovation in storage: (1) technology and data centre architecture are changing really fast right now, and (2) so are the application use cases that people are buying storage for.
I'm not trying to be lofty or pedantic with these observations, I just want to point out that as long as these environmental parameters keep changing, there will be room for new companies to build different and interesting storage products. I do not believe that the current batch of "late-stage" startups that you mention arrived during the one golden window to supplant the previous incumbents.
[On the question of what marks Coho out as having a better product than other suppliers]
AW: This invites a long marketing-y answer that I'm not going to indulge in, so let me be as brief as I can: Our product is easy as heck to install, grow, manage, and evolve across heterogeneous hardware versions. That fact, more than any of the gory technical stuff that I usually talk about, resonates with our customers a lot.
Now regarding that gory technical stuff... I think that we are frankly just doing a lot more interesting innovation than any of the other companies that you are asking me to compare to. We use SDN switching in the product to help achieve scale. We've also written two research papers that talk about other detailed technical aspects of the stuff that we are building.
The bottom line with these papers is that building storage for emerging high performance nonvolatile memories is different. Software needs to get out of the way more than it ever has. The IO path is getting fast enough that storage systems are starting to face problems from other domains: efficient data movement between PCIe devices at these rates is traditionally the remit of software routers, not storage systems, and so we are borrowing tools and techniques from network data paths (Intel's DPDK, for example) to build storage.
Similarly, managing allocations of fast NVMe or NVDIMM memory looks more like managing RAM than it does like disks. We've made some significant algorithmic contributions to efficiently tracking application miss ratio curves to help efficiently address these issues on storage class memories.
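For readers unfamiliar with miss ratio curves: a curve plots the fraction of requests that would miss a cache against the size of that cache, which tells you how much fast media a workload actually needs. The sketch below uses the textbook LRU stack-distance method (usually credited to Mattson and colleagues). It is not Coho's algorithm, which the interview does not describe, and the function name and sample trace are ours. This naive version does a linear scan per access, which is exactly the kind of cost that more efficient techniques aim to avoid.

```python
from collections import Counter


def miss_ratio_curve(trace, max_cache_size):
    """Return LRU miss ratios for cache sizes 1..max_cache_size.

    `trace` is a sequence of block identifiers in access order.
    """
    if not trace:
        raise ValueError("trace must not be empty")

    stack = []             # LRU stack, most recently used block at the end
    distances = Counter()  # stack distance -> number of accesses seen at it

    for block in trace:
        if block in stack:
            # Stack distance 1 means the block was the most recently used.
            depth = len(stack) - stack.index(block)
            distances[depth] += 1
            stack.remove(block)
        # First-time accesses miss at every cache size, so nothing is counted.
        stack.append(block)

    total = len(trace)
    curve = []
    hits = 0
    for size in range(1, max_cache_size + 1):
        # An access at stack distance d hits in any LRU cache of size >= d.
        hits += distances[size]
        curve.append((total - hits) / total)
    return curve


curve = miss_ratio_curve(list("abacba"), 4)
```

For the six-access sample trace, the curve falls from 1.0 at one cache slot to 0.5 at three slots and stays there, since the remaining three misses are first-time accesses that no cache size can avoid.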
El Reg: Why should customers risk their business by buying kit from an immature startup?
AW: Clearly, some customers feel that we offer a benefit that makes this risk worthwhile. Conservative customers that want storage systems that have been proven over more years than we have been in business have a valid and defensible position. They are, by that definition, not part of our addressable market right now. However, from this perspective, my addressable market is growing year on year.
This aspect of growing a storage company is hard. It's not just technology, it's relationships and trust. To be very candid with you, finding customers who believe in our story and who are willing to place a bet on us is one of the things that I have enjoyed the most about building this company so far. Those customers bet on us because they believe in our technical vision, because we are hungry, and because our survival as a business depends on us not letting a single one of those customers fail to successfully deploy and use our product.
El Reg: Coho Data positions its MicroArray against VMAX, which is the acme of the enterprise-class array, with a solid set of proven data management and protection services. What level of data management does Coho have in comparison and what data protection services does Coho offer?
AW: With a specific mind to the VMware use case, which has been our (pragmatic) focus to date, we have: thin provisioning, auto-tiering, scheduled snapshots at a per-VM granularity, completely automated cluster scale-out (with data and connection rebalancing), node eviction, and dynamic load balancing of both active client connections and stored capacity.
Getting into a race for feature parity with VMAX is a fool's errand
We integrate with vSphere and OpenStack for per-VM workload reporting, provisioning, snapshot operations, and storage APIs such as VAAI.
Regarding data protection, we currently offer configurable n-way replication. Erasure coding will follow; more on this in a sec.
We don't currently implement deduplication or compression. As with erasure coding, we elected to delay building features that risked compromising the levels of PCIe/NVMe (and soon to be NVDIMM) performance that we wanted the architecture to be able to achieve. We do a good job in terms of the economics of data reduction in the current system by merit of both auto-tiering cold data to cheaper media, and by supporting efficient data sharing on snapshot/clone operations. That said, richer data reduction will appear with the release of our initial all-flash product.