HP buys Vertica for data analytics

Apotheker starts making software moves

Hewlett-Packard's new president and chief executive officer, Leo Apotheker, has made his first move in software by having HP acquire an in-database analytics startup called Vertica Systems for an undisclosed sum.

Vertica is located in Billerica, Massachusetts, and was founded in 2005 by database giant Michael Stonebraker, who was behind the Ingres, PostgreSQL, Aurora, C-Store, Morpheus, and H-Store databases. The first two are early and popular relational databases, while Aurora was commercialized in 2003 as the StreamBase database for streaming data (such as sensor data or stock market data feeds). Morpheus is a data-integration platform that has been commercialized as Goby, while H-Store – which aimed to make an in-memory, distributed database for high-speed transaction processing – was commercialized as VoltDB in 2009.

The C-Store project founded by Stonebraker was a column-oriented database running on parallel-linked servers created for data warehousing. Its aim was to cut down on the amount of I/O that takes place as SQL queries chew through rows of data in a traditional data warehouse. Because it thinks in columns, it can chew through data in a serial fashion quite fast, but it is worse than a traditional database for doing random transactions that often update as well as read data in a row.
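To make the I/O argument concrete, here is a minimal sketch in Python. The toy table, the field names, and both functions are hypothetical illustrations, not Vertica or C-Store code. It counts how many values each layout has to touch to total a single column.

```python
# Hypothetical toy table: each row is (order_id, customer, region, amount).
rows = [
    (1, "alice", "east", 120.0),
    (2, "bob", "west", 75.5),
    (3, "carol", "east", 42.0),
    (4, "dave", "north", 310.25),
]

# Column-oriented copy of the same data: one list per column.
columns = {
    "order_id": [r[0] for r in rows],
    "customer": [r[1] for r in rows],
    "region": [r[2] for r in rows],
    "amount": [r[3] for r in rows],
}


def sum_amount_row_store(table):
    """Sum the amount field, reading every full row to reach it."""
    total = 0.0
    values_read = 0
    for row in table:
        values_read += len(row)  # the whole row is fetched for one field
        total += row[3]
    return total, values_read


def sum_amount_column_store(cols):
    """Sum the amount field, reading only the amount column."""
    amounts = cols["amount"]
    return sum(amounts), len(amounts)


row_total, row_reads = sum_amount_row_store(rows)
col_total, col_reads = sum_amount_column_store(columns)
print(row_total, row_reads)
print(col_total, col_reads)
```

Both layouts return the same total, but the row layout touches every field of every row, while the column layout touches only the one column the query asks for. The trade-off runs the other way for an update to a single row, which in this sketch would have to modify all four column lists.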

The database runs in parallel across multiple machines, but has a shared-nothing architecture, so the query is routed to the data and runs locally. And the data for each column is stored in main memory, so a query can run anywhere from 50 to 1,000 times faster than a traditional data warehouse and its disk-based I/O – according to Vertica.
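The shared-nothing idea, in which the query travels to the data rather than the other way around, can be sketched as follows. This is an illustrative toy under stated assumptions: the modulo partitioning scheme and the function names are hypothetical and say nothing about how Vertica actually distributes data.

```python
def partition(rows, node_count):
    """Assign each (key, value) row to exactly one node by key."""
    nodes = [[] for _ in range(node_count)]
    for key, value in rows:
        nodes[key % node_count].append((key, value))
    return nodes


def local_sum(node_rows):
    """Work done on one node, using only the rows that node owns."""
    return sum(value for _, value in node_rows)


def run_query(nodes):
    """Send the query to every node, then combine the small partial results."""
    partials = [local_sum(node_rows) for node_rows in nodes]
    return sum(partials)


rows = [(i, i * 10) for i in range(1, 9)]  # keys 1..8, values 10..80
nodes = partition(rows, 3)
total = run_query(nodes)
print(total)
```

Each node works only on its own partition, and only the per-node partial sums cross the network, which is the property that lets this style of system scale by adding machines.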

The Vertica Analytics Database went from project to commercial status very quickly – in under a year – and has been available for more than five years. In addition to real-time query functions, the Vertica product continuously loads data from production databases, so any queries done on the data sets are up to date. The data chunks are also replicated around the x64-based cluster for high availability and load balancing for queries. Data compression is heavily used to speed up data transfers and reduce the footprint of a relational database, something on the order of a 5X to 10X compression.
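Column layouts tend to compress well because similar values sit next to each other. The sketch below uses run-length encoding as one common example of the idea. Run-length encoding is an assumption chosen for illustration, and the sample column is made up; neither is a statement about the compression schemes Vertica actually ships.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of repeated values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical sorted "region" column with long runs of repeated values.
region = ["east"] * 6 + ["north"] * 3 + ["west"] * 5
encoded = rle_encode(region)
print(encoded)
print(len(region), "values stored as", len(encoded), "pairs")
```

The same values interleaved with other fields in a row layout would form no long runs, so this kind of encoding would gain little there.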

While Stonebraker is a serial inventor and entrepreneur (you'd expect that from a database guy, I suppose), he doesn't usually run the companies he seeds. Christopher Lynch, formerly of networking companies F5 Networks and Cisco Systems, was tapped to run Vertica as president and CEO. In January, Vertica announced that it had more than doubled its customer count in 2010, to 328 customers, more than doubled its revenues, and doubled its headcount, to over 100 employees. Being privately held, Vertica did not divulge its revenues or reveal whether or not it was profitable.

Twitter, Zynga, Verizon, AOL, Comcast, Mozilla, Bank of America, and Sunoco are among Vertica's brand-name customers. Last month, coupon giant Groupon said it would be using the Vertica Analytics Platform (the in-memory database plus tools for managing data and applications that hook into it) to analyze Groupon subscriber behavior. Groupon plans to run its Vertica databases out on Amazon's EC2 compute cloud.

Bessemer Venture Partners, Highland Capital Partners, Kleiner Perkins Caufield & Byers, and New Enterprise Associates have kicked $23.5m in two rounds of venture funding into Vertica.

HP did not say what it paid for Vertica, but did say it expected the deal to close in the second quarter of its fiscal 2011 year, which ends in April. ®
