Evolving elephants: Hortonworks trumpets its '3.0 vision' of global data management

CTO Scott Gnau on open source, partnerships and simplifying Hadoop

Hortonworks – once known simply as a Hadoop-flinger – is these days pushing itself as a modern data architecture company.

At its annual gabfest, this year held in Berlin, the message was that the biz had reached its 3.0 phase – but that's an evolution, not a reboot, said CTO Scott Gnau.

"3.0 isn't like a pivot," he told The Register at the event yesterday. "It's the next phase that's built on the foundation of what's come before."

The first phase was investing in the core Hadoop aspects to make it enterprise-ready, he said, while the second was adding the notion of data-in-motion.

"In moving into 2.0, we moved from being a Hadoop vendor into managing the whole life cycle of data," Gnau said.

Based on that "really great foundational substrate", Gnau said the third phase was to "make the modern data architecture function by offering a common set of services that enable customers to understand their entire data landscape".

The concept is pegged on the seemingly endless stream of data being generated – execs at the event emphasised the growing interest in connected devices and sensors – and impending regulations that will require companies to have a better grip on their data.

Gnau used his own keynote to sell the idea that every company's business strategy should be its data strategy, which should also be its cloud strategy – in the Internet of Things, the three are inextricably linked.

This is also a handy way of making Hortonworks' offering appear more relevant to the C-suite, which has traditionally been harder to reach due to the complexity (perceived or otherwise) of the underlying Hadoop technology.

This subtle transition away from Hadoop is also apparent in the name of the conference, which used to be the Hadoop Summit and is now DataWorks, and in the PR pitches of Hortonworks' competitors.

Asked about the switch, Gnau linked back to the change in the amount of data generated and what people want to do with it.

"When I think about the growth in data volume and analytics, it's going to be an IoT world, and in that world, storing and analysing data is really only half the problem; capturing it and moving it is the other half," he said. "So it felt like a much more complete footprint for us to expand into that space."

This is central to the firm's latest big release – DataPlane Service, launched last September. The data management platform gives customers a single view of all their data, whether in motion or at rest, and regardless of where it's stored, be it on-prem, in the cloud, or in a hybrid or multi-cloud environment.

On top of this platform, the biz is adding a series of what Gnau describes as widgets, which will offer specific functionalities.

"I almost think of it like the OS on your phone; it becomes ubiquitous to what you're doing and then you plug in more widgets over time," he said.

One such widget – Data Steward Studio for data identification, security and governance – was launched this week, and Gnau said he expects a new one to be rolled out about once a quarter.

However, it's too early to tell whether the messaging is resonating with customers: none of the four companies represented on a customer panel during a press and analyst day before the main conference were using any of the DataPlane services.

Gnau said that about a dozen beta customers signed up to the DataPlane Service at the end of last year "because they had some very discrete needs and requirements, and were very interested in the tech".

He couldn't offer any more specific use cases or case studies – on the basis of NDAs having been signed – but said that of the admittedly small sample, "it's not just one industry", which is encouraging "because it's a very generic market need".

It's still Hadoop underneath

Good marketing aside, there's no escaping the fact that the underlying technology is Hadoop, and despite having been around for a decade or so, the problems that plagued it from the outset have yet to disappear. Using the open-source framework in the enterprise requires a lot of skill and it can be difficult to explain or sell to management.

Gnau acknowledged this is still the case. "As with anything, [you need] simplification through bundled solutions, quick implementation, referenceable solutions," he said.

But he argued that Hortonworks has made strides in making it more approachable.

"If you think about the core Hadoop assets, it's not one thing, it's 26 different projects – and it looked like 26 different projects," he said.

"In our last two major releases we've made a great deal of progress in improving the user interface; making it consistent, with the same look and feel, the same layout. So at least traversing from one project to another project, you don't have to learn a different way to interact."

At the same time, though, the company has had to up its support for companies that don't have the skills. Initially, it offered one-off help to get set up, but as of last month it has begun an operational services programme.

"It's targeted at medium-sized companies – although I have to say we have some very large customers – that may not have really large IT infrastructure or data centre operations," Gnau said.

"We removed the need for them to have any specialised knowledge of how to manage, operate or keep healthy a cluster. We do all the software management behind the scenes."

Gnau added that, because there's a lot of automation built in, the more customers that use the service, the better it will get, as Hortonworks will be able to apply a fix for an issue on one customer's cluster to the clusters of others.

Proving open source is profitable

Steps like these will be important if the company wants to make a success of – and a profit from – open-source technology. Does Hortonworks have something to prove, compared with vendors who are edging away from 100 per cent open-source?

"Yes, and we're doing it," said Gnau. "We're definitely proving a new model. I really believe in the purity of the model. When you think about making open-source models work, you require reasonable scale and volume of opportunity, because you're foregoing software licences and only monetising the support. You're choosing to take less money."

That means the community has to be very active, he said, driving innovation in a consistent way, while companies need to put back in to ensure that continues.

"I think over time, the advantages will continue to be more and more clear in the model we've chosen, versus a hybrid model that frankly just creates friction inside of a business," Gnau said.

"It creates the same vendor lock-in that the open-source community basically tells everyone to avoid. 'Don't do a hybrid model, except mine' doesn't seem very credible."

The biz has also inked a partnership with IBM – one of the more interesting moves in the Hadoop space in 2017 – that has the potential to boost revenues with access to a bigger user base, especially since the deal will see IBM's BigInsight users migrate to Hortonworks.

But Gnau stressed that this move was going to be "very gradual".

"We don't want to downplay the relationship; it is important, it is strategic," he said. "But the flip side is that it's not a step function where all of a sudden we've acquired X number of customers and there's a step function in revenue... they're not all going to behave in the same way; some of the conversations are going to take time."

Gnau also emphasised the business's partnerships with other firms: while vendors like Amazon, Google and Microsoft are selling self-service offerings that could compete, Hortonworks doesn't want to see it that way.

The CTO argued that his firm's business model is "more pure than others' out there" – and that this means "we're not competing with our partners".

It's important to realise there's going to be a partner ecosystem, Gnau said, because "all our stuff is open... we collaborate with Microsoft, Teradata, IBM – anything we do in our code is open".

"But we also know there's going to be other models for software out there, and some instances where non-open is required – if you're doing fraud detection – so it's not a one-size-fits-all thing," Gnau said.

From his perspective, the firm needs to remove the friction that stops customers from using its products and then get "pull from the larger market as part of an open ecosystem, even though that larger market includes non-open-source software".

The firm seems confident of its tech – emphasising the 100 per cent open-source image and commitment to the community, which will win brownie points if nothing else – and execs had their macromessaging down pat.

That confidence appears to have grown after a year that saw two COOs come and go, amid growing pressure to get spending under control. The firm hit its long-standing target of being cash-flow positive in the last quarter, four years after going public.

However, that doesn't mean it's profitable. In the quarter ended 31 December 2017, it had a positive operating cash-flow figure of $6.4m, but operating losses of $48.5m. Although this was $50m less than the previous year, there's plenty more to be done. ®


Biting the hand that feeds IT © 1998–2018