Feeds

HPC 2.0: The monster mash-up

When Big Data gets big, data centres should get nervous

Bridging the IT gap between rising business demands and ageing tools

Blog This is the second of a three-part series on the convergence of HPC and business analytics, and the implications for data centers. The first article is here; you’re reading the second one; and the third story is coming soon.

The genesis of this set of articles was a recent IBM analyst conference during which the company laid out their HPC strategy. Much of the material and ensuing discussion was about the worlds of HPC and business analytics coming together and what this means for citizens of both worlds, particularly when it comes to dealing with the explosive growth of data. Big data is – well – damned big, as it turns out.

IBM’s Dave Turek took us through the process of analyzing large data sets and the challenges it will present. Not surprisingly, there are a lot of factors to take into account when building or adapting an existing infrastructure to support enterprise analytics.

First, it’s important to realize that the most time-consuming task in processing big data is simply moving the data around. This means getting it onto storage arrays where it can be read by systems, processed, and then the output is stored back onto the arrays.

This looms large when you consider that most analytic processes aren’t just a single workload where data flows in and answers flow out; there are steps performed by different applications on separate systems.

Some will say that this is the case for many business applications already, and our fast networking and fast storage arrays work fine – so what’s the big deal? The big deal is big data and the need for speed.

Data sets range from hundreds of terabytes into the petabyte range – and are growing fast. This isn’t data that’s just going to be sorted and used to build reports; this data needs to be analyzed in near real- time in order to guide decision making.

The weak link is bulk transfers from spinning drives, which are limited to about 1Gb/s or 128MB/sec real-world speed, at best, per spindle. Moving 250TB of data will take almost 5.69 hours using 100 drive spindles or about 40 minutes using 1,000 spindles. The time it takes to move this amount of data multiple times from storage to system, then system to storage adds up – even with thousands of spindles working in concert.

One way to get around this problem is to have data directly transferred from one system to another, which will eliminate the multiple loads and saves from disk storage. With this kind of solution, your overall performance will be limited to the speed of your network – which is probably around 1Gb/s (about the same as a single drive) or maybe 10Gb/s. With large datasets, this is still slower than it could and should be.

So what’s the right answer? We’ll talk about that in Part 3 of this series ... ®

Build a business case: developing custom apps

More from The Register

next story
BBC goes offline in MASSIVE COCKUP: Stephen Fry partly muzzled
Auntie tight-lipped as major outage rolls on
iPad? More like iFAD: We reveal why Apple ran off to IBM
But never fear fanbois, you're still lapping up iPhones, Macs
Nadella: Apps must run on ALL WINDOWS – PCs, slabs and mobes
Phone egg, meet desktop chicken - your mother
HP, Microsoft prove it again: Big Business doesn't create jobs
SMEs get lip service - what they need is dinner at the Club
ITC: Seagate and LSI can infringe Realtek patents because Realtek isn't in the US
Land of the (get off scot) free, when it's a foreign owner
Samsung threatens to cut ties with supplier over child labour allegations
Vows to uphold 'zero tolerance' policy on underage workers
Dude, you're getting a Dell – with BITCOIN: IT giant slurps cryptocash
1. Buy PC with Bitcoin. 2. Mine more coins. 3. Goto step 1
There's NOTHING on TV in Europe – American video DOMINATES
Even France's mega subsidies don't stop US content onslaught
You! Pirate! Stop pirating, or we shall admonish you politely. Repeatedly, if necessary
And we shall go about telling people you smell. No, not really
prev story

Whitepapers

Seven Steps to Software Security
Seven practical steps you can begin to take today to secure your applications and prevent the damages a successful cyber-attack can cause.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
Designing a Defense for Mobile Applications
Learn about the various considerations for defending mobile applications - from the application architecture itself to the myriad testing technologies.
Build a business case: developing custom apps
Learn how to maximize the value of custom applications by accelerating and simplifying their development.
Consolidation: the foundation for IT and business transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.