LexisNexis open sources Hadoop challenger

Behold! Thor and Roxie

A super-computer architecture that crunches big data for banks, police, and spooks will soon be open sourced as a super-fast alternative to the Googlesque Hadoop.

LexisNexis Risk Solutions is opening up its High Performance Computing Cluster (HPCC), a system written in C++ that it claims is four times faster than Hadoop when running data-intensive queries on ordinary Linux servers.

LexisNexis will release a virtual machine for testing, full binaries, and the source code in the next few weeks, the company announced Wednesday.

The company has not yet announced which open-source license it will use, but said it will be a copyleft license permitting derivations and improvements bearing the HPCC name.

LexisNexis is in talks with Amazon to make HPCC available on the etailer's cloud while also planning to offer its own cloud to customers.

The company – better known for its media database and medical data services – will offer the HPCC code in two flavors: a free Community Edition that comes with the free platform software, and an Enterprise Edition with support and access to "more advanced" modules and features.

HPCC uses LexisNexis' own data-centric declarative programming language, known as ECL. Developed 10 years ago, it compiles to C++. HPCC includes two data-crunching platforms: the Thor Data Refinery Cluster and the Roxie Rapid Data Delivery Cluster.

LexisNexis senior vice president and chief technology officer Armando Escalante says Thor is analogous to Hadoop, while Roxie is the component that Hadoop is currently missing. Since it's written in C++, he says, the system is also faster than Hadoop, which is written in Java.

"We've been 10 years perfecting it," Escalante said, "and we tweaked it up the wazoo to get all the performance we can. We can add more use cases and make it better."

"We are four times faster than Hadoop on the Thor side. If Hadoop needs 1,000 nodes we can do it with 250 – that means less cooling and data center space."

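The node arithmetic behind that claim can be sketched in a few lines. This is only an illustration of the vendor's stated ratio, assuming throughput scales linearly with node count; the function name `nodes_needed` is hypothetical and nothing here is measured.

```python
import math


def nodes_needed(baseline_nodes: int, speedup: float) -> int:
    """Nodes required if each node does `speedup` times the work of a
    baseline node, assuming perfectly linear scaling (an assumption)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return math.ceil(baseline_nodes / speedup)


# Figures quoted by Escalante: a 1,000-node Hadoop job and a claimed 4x speedup.
hadoop_nodes = 1000
hpcc_nodes = nodes_needed(hadoop_nodes, 4.0)

# Fraction of servers (and, roughly, rack space) saved under the same assumption.
saved_fraction = 1 - hpcc_nodes / hadoop_nodes

print(hpcc_nodes, saved_fraction)
```

Under the linear-scaling assumption the claimed fourfold speedup is exactly what turns 1,000 nodes into 250, a 75 per cent reduction in servers.
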
According to Escalante, HPCC is also more tightly coupled than Hadoop, further boosting the performance of complex queries. Nodes talk to each other individually and via Thor, using one master switch that supports 1,500 Ethernet ports without blocking – opening up the full bandwidth available to large data packets.

This means there's no single choke point for data, Escalante claimed. The architecture can run queries in memory, on disk, or concurrently on both for speed. Queries that are written in ECL and compile to C++ can be front-ended with JSON and SOAP.

"Hadoop has more of a concept of the racks. It's a little more loosely coupled and you need lots of nodes. When we first saw Hadoop we liked it but you lose a lot of performance because the nodes are connected to distributed switches that then connect to a central switch. We looked at that and said: 'That's a lot of congestion'."

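The congestion Escalante describes is what network engineers call oversubscription: rack switches whose uplink to the central switch carries less than the combined bandwidth of the nodes behind it. A minimal sketch follows, with purely hypothetical figures (40 nodes per rack, 1 Gbit/s ports, a 10 Gbit/s uplink); none of these numbers or function names come from LexisNexis or Hadoop.

```python
def oversubscription(nodes_per_rack: int, port_gbps: float, uplink_gbps: float) -> float:
    """Ratio of the bandwidth the rack's nodes could offer to what the uplink can carry."""
    return (nodes_per_rack * port_gbps) / uplink_gbps


def per_node_cross_rack_gbps(nodes_per_rack: int, port_gbps: float, uplink_gbps: float) -> float:
    """Worst-case bandwidth per node when every node in the rack sends
    off-rack at once: the uplink is shared equally, and no node can
    exceed its own port speed."""
    if nodes_per_rack <= 0:
        raise ValueError("nodes_per_rack must be positive")
    return min(port_gbps, uplink_gbps / nodes_per_rack)


# Hypothetical two-tier layout: rack switches feeding a central switch.
tiered_ratio = oversubscription(40, 1.0, 10.0)
tiered_per_node = per_node_cross_rack_gbps(40, 1.0, 10.0)

# A non-blocking switch is modeled here as an uplink equal to the
# aggregate port bandwidth, so each node keeps its full port speed.
flat_per_node = per_node_cross_rack_gbps(40, 1.0, 40.0)

print(tiered_ratio, tiered_per_node, flat_per_node)
```

With these assumed figures the tiered layout is oversubscribed 4:1, leaving each node a quarter of its port speed for cross-rack traffic in the worst case, while the non-blocking model keeps the full 1 Gbit/s.
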
When LexisNexis offers its own service, Escalante said, his company will target ordinary business customers – not the kind of super data users that have been LexisNexis Risk Solutions customers until now. LexisNexis has built, delivered, and supported Thor and Roxie systems for telcos that check customers' credit histories to see what service plan they can afford, and for law enforcement officers trying to track down a criminal's network of assets as part of an investigation. It also works with the investigation units of insurance giants that are investigating customers' claims. The Thor and Roxie part of the risk business is worth $10m a year. ®
