ClearStory uncloaks with big data visualization vision

Teaching elephants to draw pictures for CEOs

Yet another big-data company has come out of stealth mode, and this one, called ClearStory Data, has its sights set on making it easier for companies to mash up and visualize data sets rather than just focusing on data-munching itself.

The founders of ClearStory – Sharmila Mulligan, John Cieslewicz, and Vaibhav Nivargi – all worked together at Aster Data Systems, the columnar clustered database maker that was acquired by data warehousing pioneer Teradata for $263m a little more than a year ago. They are all well aware of the big data challenges facing companies in terms of the volume and types of data that they want to chew on and mix up to gain insight into their businesses and customers.

But rather than come up with a cleverly named Hadoop distribution or yet another kind of data warehouse, ClearStory is taking on the data integration and visualization jobs with its forthcoming ClearStory Data Service. As the name suggests, it will be cloudy and offered as a service.

ClearStory founders John Cieslewicz, Sharmila Mulligan, and Vaibhav Nivargi

The idea behind the ClearStory Data Service, CEO Mulligan explained to El Reg, is to create what she calls a "self-driven data exploration tool," something that knows how to integrate with various public data sets – such as DataSift, a partner of Twitter's that gives data analysts access to the full Tweet feed, Microsoft's Dallas data service for its Azure public clouds, or various government data repositories – and that can also be used to merge information with multiple data sets generated by transaction processing and web-log systems.

The problem that ClearStory will attack first is the integration of these disparate data sources. "These services all expose their own data APIs, but they are not for your average user," says Mulligan.

Each data set has its own eccentricities and APIs, and that can make mashing them up difficult. So the first thing that the ClearStory Data Service will do is mask the differences from those who want to mix data sets, and try to find correlations across many layers of data. You can think of it as a universal API-translator of sorts.
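ClearStory has not said how its translation layer works, so what follows is only a minimal sketch of the general idea, under loud assumptions: the `Record` shape, the adapter functions, and every field name (`created_at`, `count`, `date`, `hits`) are invented for illustration and do not reflect any real service's API. Each adapter hides one source's quirks behind a common record, and a small `blend` step then lines the sources up by day.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    """Common shape that every source is translated into (hypothetical)."""
    source: str
    day: str      # ISO date, e.g. "2012-03-01"
    metric: str
    value: float


# Each adapter masks one source's eccentricities. The input field names are
# assumptions made up for this sketch, not any vendor's actual schema.
def from_social_feed(rows: Iterable[dict]) -> List[Record]:
    # This invented feed reports a full timestamp and a string count.
    return [Record("social", r["created_at"][:10], "mentions", float(r["count"]))
            for r in rows]


def from_web_log(rows: Iterable[dict]) -> List[Record]:
    # This invented log already reports a plain date and a numeric hit count.
    return [Record("weblog", r["date"], "visits", float(r["hits"]))
            for r in rows]


def blend(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Group normalized records by day so metrics from different sources line up."""
    by_day: Dict[str, Dict[str, float]] = {}
    for rec in records:
        day = by_day.setdefault(rec.day, {})
        day[rec.metric] = day.get(rec.metric, 0.0) + rec.value
    return by_day


social = [{"created_at": "2012-03-01T09:30:00Z", "count": "12"}]
weblog = [{"date": "2012-03-01", "hits": 340}]
blended = blend(from_social_feed(social) + from_web_log(weblog))
```

Once both sources speak the same `Record` dialect, whoever is mixing the data never has to see the per-source field names, which is the "masking" the service is said to be aiming for.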

The second task that the ClearStory service will tackle is creating the data layers and presenting them graphically to data analysts. This visualization part of the service uses various – and unspecified – tools to display overlaid data sets inside of a normal web browser. (Well, if there were such a thing as a "normal" web browser.)

Cieslewicz, whose title was not revealed, said that the company is not yet revealing exactly what technology it will use, but that the visualization is being done through a combination of HTML5, cascading style sheets, and JavaScript, and that the service will rely on Hadoop and other open source technologies to build the back end of the ClearStory data exploration cloud. (So maybe it really is another funky Hadoop distribution after all, eh?)

"Our big focus is on how you do the blending of the data," says Mulligan.

The idea, says Nivargi, the third cofounder, is to allow the ClearStory Data Service to run on public clouds and to interface with public data sets. Nivargi says that Amazon EC2, Eucalyptus, OpenStack, Microsoft Azure, and VMware vCloud public clouds are all moving towards compatibility. (El Reg is ever supportive of standards, but disappointed with the level of standardization for Unix, Linux, and blade servers, just to name three.) Hope for future compatibility springs eternal, however, and ClearStory intends to get its service running on multiple clouds.

And for those companies that are anxious about letting their data outside of the firewall, ClearStory will also peddle a private cloud version of the data-exploration stack that can be run internally.

ClearStory was founded last September after a few months of kicking around some ideas among the three cofounders. The company is located in Palo Alto, California, and has just signed up Google Ventures, Andreessen Horowitz, and Khosla Ventures for its first round of venture capital funding, an amount it did not disclose.

Some of that first-round dough came from private investors, including Andy Rachleff, founder of Benchmark Capital; Anand Rajaraman and Venky Harinarayan, who are SVPs at Wal-mart Global e-Commerce and cofounders of both Junglee and Kosmix; Tim Howes, cofounder of Rockmelt and ex-CTO at Netscape Communications; and Nitin Donde, a former executive at EMC, 3PAR, and Aster Data.

Mulligan says that ClearStory is signing up customers for early access to the product in late summer, and expects it to be generally available by the end of the year. The company currently has ten employees, but now that it has cash, it is hiring. ®
