The Register® — Biting the hand that feeds IT

Apache Foundation embraces real time big data cruncher 'Storm'

Does for real time processing what Hadoop did for batch processing

The Apache Foundation has voted to accept the “Storm” real time data processing tool into its incubator program, the first step towards making it an official part of the Foundation's open source offerings.

Storm aims to do for real time data processing what Hadoop did for batch processing: queue jobs and send them off to a cluster of computers, then pull everything back together into usable form. Nathan Marz, poster of the Storm GitHub repository, believes “The lack of a 'Hadoop of real time' has become the biggest hole in the data processing ecosystem.”

Storm tries to fill that hole with software that “... exposes a set of primitives for doing real time computation. Like how MapReduce greatly eases the writing of parallel batch processing, Storm's primitives greatly ease the writing of parallel real time computation.”

Without Storm, Marz writes, one would have to “manually build a network of queues and workers to do real time processing.” Storm automates that stuff, which should mean better scaling: Marz already claims “one of Storm's initial applications processed 1,000,000 messages per second on a 10 node cluster, including hundreds of database calls per second as part of the topology.”
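
To make the “network of queues and workers” concrete, here is a minimal single-machine sketch in Python of the hand-built approach Marz describes. It is not Storm's API and does not come from the Storm project: the two stages (a sentence splitter and a word counter), the function names and the `SENTINEL` shutdown marker are all assumptions made for illustration. A real deployment would also have to handle distribution across machines, retries and failures, which is the plumbing Storm is said to automate.

```python
import queue
import threading

# Hypothetical end-of-stream marker; an assumption of this sketch.
SENTINEL = None


def splitter(inbox, outbox):
    """Stage 1 worker: split each incoming sentence into lowercase words."""
    while True:
        sentence = inbox.get()
        if sentence is SENTINEL:
            outbox.put(SENTINEL)  # pass the shutdown signal downstream
            break
        for word in sentence.split():
            outbox.put(word.lower())


def counter(inbox, counts):
    """Stage 2 worker: keep a running count for every word seen."""
    while True:
        word = inbox.get()
        if word is SENTINEL:
            break
        counts[word] = counts.get(word, 0) + 1


def run_pipeline(sentences):
    """Wire the two stages together by hand with queues and threads."""
    sentence_q = queue.Queue()
    word_q = queue.Queue()
    counts = {}
    workers = [
        threading.Thread(target=splitter, args=(sentence_q, word_q)),
        threading.Thread(target=counter, args=(word_q, counts)),
    ]
    for worker in workers:
        worker.start()
    for sentence in sentences:
        sentence_q.put(sentence)
    sentence_q.put(SENTINEL)  # no more input
    for worker in workers:
        worker.join()
    return counts


if __name__ == "__main__":
    print(run_pipeline(["the storm is coming", "the storm has passed"]))
```

Even in this toy version, every extra stage means another queue, another worker loop and more shutdown logic to get right, which is the kind of manual wiring the quote refers to.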

All of which should get high performance computing folks excited.

The Apache Foundation's incubation process isn't technical. One goal is to ensure any software offered with its feathered logo conforms to its preferred license, which should not prove problematic as Storm is currently offered under the Eclipse Public License. The Foundation also likes to ensure proper communities nourish software it offers, and again that should not be a struggle given Storm already has enthusiastic users including Yahoo!, Twitter and business-to-business tat bazaar Alibaba.

Once the Foundation adds its imprimatur to the list of testimonials from current Storm users, that community will doubtless grow. It will also be joined by Big Data marketers who have run out of things to say about Hadoop and will doubtless soon assert that Storm makes sizzling business insights magically available in real time. Expect just as little justification for that assertion as for the oft-repeated proposition that Hadoop+data=highly profitable insights in your inbox every afternoon. ®
