Startup grind is over: Now Primary Data must compete with storage giants

Might sound crazy but it ain't no lie... they need 'em to buy buy buy

Analysis Startups arrive with fanfares of new tech and product surprise and then face the long grind to grow their business to newsworthy market status while adding bells and whistles to the basic product.

Getting the first product out of the door is a validation of all the technology trend analysis and development struggle involved in getting from concept to company formation, funding and then product development, alpha and beta testing.

But now you have to compete for sales against established vendors, with no stealth startup glamour and no company founder attention to alpha and beta customers. And every time a potential customer says no, those doubts locked away in the past might return. Did we read the technology trends correctly? Did we make the right technology choices? Is the product good enough?

Take Primary Data and its primary and secondary data silo converging DataSphere storage software, which unites file, block and object storage. Part of the reason for its inception is that operating and managing multiple primary and secondary data silos across file, block and object storage is horribly complex and difficult. There must be a better way.

Similar reasoning was involved in the inception of database virtualiser Delphix, copy data manager Actifio, and secondary data silo convergers and protectors Cohesity and Rubrik. Catalogic is also active in the copy data management sphere. There's also Axcient converging silos in the cloud.

Which one of these suppliers has the silver bullet to meet customers' needs in the data and storage sprawl area? Has Delphix got the right answer for primary data sprawl? Will Actifio walk away with the secondary data sprawl market? Or Cohesity or Rubrik? Has Axcient got it right with its cloud focus? Or will Primary Data, erecting its DataSphere abstraction layer umbrella over both primary and secondary data, win the prize?

[Diagram: DataSphere]

All of these companies are now actively selling product and it’s becoming a bit of a race to see which can grow fastest and outpace the others.

How is Primary Data doing so far?

At a press briefing this month in Silicon Valley, we were told it is making substantial progress, with some two dozen large enterprise customers, and is demonstrating savings of more than a million dollars a year for them.

The DataSphere product went GA at VMworld. CEO Lance Smith told us sales cycles are taking three or more months initially and the company is going after large accounts in the media and entertainment (M&E) vertical, also in the oil and gas area. Target customers have a petabyte or more of data, often several petabytes. They also have excessive storage costs, particularly with primary, tier one data storage, and access problems due to finding it hard or practically impossible to locate and/or fix access bottlenecks.

DataSphere, he said, provides the visibility and control to fix these problems and enable substantial savings.

Media and entertainment

One example customer was a global M&E leader. Its calculations showed that, without DataSphere, its storage costs would be $13,280,000. With DataSphere they are $4,831,567, a 64 per cent saving. This saving came from:

  • Reduced over-provisioning of tier one storage by moving cold data to second tier
  • Implementation of a global name space and the capability to move movie assets to a long-term storage tier, such as disk, tape or the cloud
  • Reducing the cost of adopting new storage

Because of DataSphere it was able to load-balance across the servers and enable 600-700 artists to work on rendering their parts of a movie at the same time, using high-performance NVMe white box storage hardware.

One aspect of its business is tie-in movie-related products. When a movie can be refreshed in a year, as with a prequel/sequel, then existing product-related data, like movie clips, design stuff, etc, needs to be brought back from cold storage for re-work. Primary said DataSphere made this practicable and easier.

Primary Data CEO Lance Smith says the company is talking to the leaders in M&E. It doesn't run into Quantum and its StorNext offering in its accounts. Perhaps it's a broad vertical with plenty of accounts, or perhaps Primary Data is hitting larger players than Quantum and there is no overlap in their respective markets.

Smith says the company has contacts in virtually every M&E customer, as most prospective customers still have Fusion-io kit; the Primary Data founders, of course, founded and/or were involved with Fusion-io and its flash card hardware and software. Primary Data also has channel partners active in the M&E vertical.

A second example is described as a global travel leader. Its costs without Primary Data were $5,073,771 and, with Primary Data, $2,576,800, meaning a saving of about 49 per cent. The savings came from:

  • Reduction of tier 1 storage (again)
  • Moving old VMs (and data) to cloud storage
  • Reduced over-provisioning by removing storage silos
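The percentages quoted for the two customers can be checked with a few lines of arithmetic. This is only a sketch: the dollar figures are the ones given in the briefing, while the function and variable names are made up for illustration. The results work out to roughly 64 and 49 per cent.

```python
def saving_pct(before: float, after: float) -> float:
    """Percentage saved when a cost drops from `before` to `after`."""
    return (before - after) / before * 100


# (cost without DataSphere, cost with DataSphere), as quoted in the briefing
cases = {
    "global M&E leader": (13_280_000, 4_831_567),
    "global travel leader": (5_073_771, 2_576_800),
}

for name, (before, after) in cases.items():
    saved = before - after
    print(f"{name}: saved ${saved:,} ({saving_pct(before, after):.1f} per cent)")
```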

There were other customer stories showing the same sort of savings; Primary Data saves enterprises big bucks - that's the message.

Customers and markets

Smith says some customers are fast adopters, others fast followers while others want everything nailed down and will talk to their peers. Primary Data wants to leverage off early customers to expand into new ones. It gets good life-time recurring revenue from its customers. The annuity return is fantastic and Primary Data's growth path this way is quicker than if it were to hire a larger sales force, build a larger channel, and go after volume sales faster.

He mentioned four verticals that were attractive to Primary Data: M&E, oil and gas, service providers and finance. Companies in finance are said to be generally slower to move into decision-making and production. But generally, Smith says, customers trying the software out in test/dev are pleased and hurry it into production. “We're building up allocations in budgets and 2017 is looking good for us.”

He said: “We can double the performance of existing clusters; we see every node and have telemetry in them. … We can do it better than Isilon. … We can distribute load between nodes and redistribute it dynamically based on telemetry coming back.” Operational speed increases come from putting metadata in NVMe flash, optimising the metadata look-up protocol and having pre-created objects on the storage pool, ready to be allocated.
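To make the idea of telemetry-driven load distribution concrete, here is a deliberately simple sketch. It is not Primary Data's algorithm, which was not described at the briefing; the `NodeStats` class, the utilisation metric and the "send work to the least-utilised node" rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical per-node telemetry sample."""
    name: str
    ops_per_sec: float   # load currently observed on the node
    capacity_ops: float  # load the node is assumed able to sustain

    @property
    def utilisation(self) -> float:
        return self.ops_per_sec / self.capacity_ops


def pick_node(nodes):
    """Return the node with the lowest utilisation."""
    return min(nodes, key=lambda n: n.utilisation)


def assign(costs, nodes):
    """Place each request on the least-utilised node, updating its load."""
    placement = []
    for cost in costs:
        node = pick_node(nodes)
        node.ops_per_sec += cost  # stand-in for fresh telemetry coming back
        placement.append(node.name)
    return placement


nodes = [
    NodeStats("a", 900, 1000),
    NodeStats("b", 300, 1000),
    NodeStats("c", 500, 2000),
]
print(assign([200, 200], nodes))
```

A real system would work from measured latency and throughput rather than a single counter, and would also move existing data, not just place new requests.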

NFS v4.2

Primary Data's Parallel NFS (pNFS) contributions have been accepted into the NFS v4.2 standard, and DataSphere provides native NFS 4.2 support.

These contributions, developed as part of its DataSphere product development, include pNFS Flex File layout enhancements so clients can provide stats on how data is being used and the performance of the storage resource serving the data. These things have been integrated into the Linux kernel by Trond Myklebust, a Linux NFS client maintainer and Principal System Architect at Primary Data.

These contributions mean Primary Data customers don't have to modify thousands of clients for every new bit of hardware and software they have to deal with. Instead the providers of these make changes compliant with what's now in the kernel.

Parallel NFS introduced control and data plane splitting. Primary Data says it introduced extensions to support synchronous mirroring. “When you split control and data planes you can't tell what's going on in a telemetry sense on the data plane. Now you can as every IO is monitored and counts towards aggregate stats.”

Red Hat Enterprise Linux v7.3, the latest release, has Flex Files support "to simplify management of pNFS clusters" by providing support for clustered server deployments. Primary Data says it enables scalable, parallel access to files distributed across multiple servers. There is native support for a global namespace, data mobility and management across different storage types - file, block and object - which helps meet protection, cost and performance objectives.

Other NFS 4.2 enhancements include:

  • Server-side clone and copy - file cloning and snapshots by storage server
  • Application IO - apps can tell storage server about expected IO behaviour
  • Sparse Files - space-efficient files with placeholders rather than zeros
  • Space Reservation - storage servers can reserve space without writing data to protect against unexpectedly running out of capacity
  • Application Data Block - enables block on file storage management implementations
  • Labelled NFS - so clients can enforce access rights
  • Layout enhancements - to better support data mobility and IO stats per file
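Three of the items above (sparse files, space reservation and server-side copy) have rough equivalents in ordinary POSIX/Linux file calls, which the sketch below exercises from Python. The assumption, not confirmed by the briefing, is that on a Linux client with an NFS 4.2 mount such calls may be handed to the storage server; on a local filesystem they simply run locally. The function names are made up for illustration, and each call falls back gracefully where the platform or filesystem does not support it.

```python
import os
import shutil
import tempfile


def make_sparse(path, size):
    """Create a file of `size` bytes without writing any data (one big hole)."""
    with open(path, "wb") as f:
        f.truncate(size)


def reserve(path, size):
    """Ask the filesystem to reserve `size` bytes; return True if it agreed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
        return True
    except (OSError, AttributeError):
        return False  # unsupported platform or filesystem
    finally:
        os.close(fd)


def copy_file(src, dst):
    """Copy src to dst, preferring copy_file_range; return the method used."""
    size = os.path.getsize(src)
    copy_range = getattr(os, "copy_file_range", None)  # Linux, Python 3.8+
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        if copy_range is not None:
            try:
                remaining = size
                while remaining > 0:
                    copied = copy_range(fin.fileno(), fout.fileno(), remaining)
                    if copied == 0:
                        break
                    remaining -= copied
                if remaining == 0:
                    return "copy_file_range"
            except OSError:
                pass  # fall through to a plain userspace copy
        fin.seek(0)
        fout.seek(0)
        fout.truncate(0)
        shutil.copyfileobj(fin, fout)
    return "fallback"


with tempfile.TemporaryDirectory() as tmp:
    sparse = os.path.join(tmp, "sparse.bin")
    make_sparse(sparse, 1 << 20)
    print("sparse size:", os.path.getsize(sparse))
    print("reserved:", reserve(os.path.join(tmp, "reserved.bin"), 1 << 20))
    print("copied via:", copy_file(sparse, os.path.join(tmp, "copy.bin")))
```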

Comment

On the technology front Primary Data can wow CIOs and their tech staff with data/control plane divisions, data virtualisation, data and metadata flows and operations, and use white board wizardry to explain its tech prowess.

This is working. We can say that, basically, Primary Data has accumulated 24 enterprise-class customers since beta testing and then GA in August. Let's say, then, 14-18 new customers in three months. That's not bad, healthy we might say, and Smith says the pipeline to 2017 is looking good.

So ... Primary Data has emerged from the product gate and is progressing well. Its product needs a CIO-level sell, not a line of business one, and so far, given that it's early days, its diagnosis of CIO-level storage problems is looking realistic. But it will probably take at least 12 and possibly 24 months before we see if Primary Data is outstripping its general competitors, or being outstripped. ®

Biting the hand that feeds IT © 1998–2017