There’s gold in them there data mountains

IBM on Big Data

Bridging the IT gap between rising business demands and ageing tools

Interview Enterprises have a problem with data: its volume, velocity and variety are growing at an alarming rate.

Analyst IDC says that there is already more data in the world than there is space to store it, and, according to IBM, every two days we create as much data as was created in total before 2003.

According to The Economist, the US retail giant Wal-Mart, for example, handles more than “one million customer transactions every hour, feeding databases estimated at more than 2.5 petabytes – the equivalent of 167 times the number of books in America’s Library of Congress.”

If enterprises have a problem coping with their own structured data, then the huge volumes of unstructured data they receive have the potential to cause an even bigger headache.

This could be video, or audio, email correspondence, internet traffic, diagnostic or spatial data – a myriad of data types that do not arise from transactional activities or fit into a relational database in a data warehouse.

Mark Thomas, data warehousing Infosphere specialist at IBM Software Group, Australia/NZ, comments: “Organisations know they have to do something with it, but no one is quite sure what they should do or how they should handle it.”

Another fine mess

Earlier this year a MarkLogic survey found that 86 per cent of respondents said that unstructured data is important to their organisation, but only 11 per cent had clear procedures and policies in place for managing unstructured data.

In addition, 80 per cent of respondents know that the amount of unstructured data will rise in the next three years, but only 24 per cent of respondents believe their current infrastructure will be able to manage it.

According to Thomas, IBM’s customers say they often don’t know what should be analysed, and find it prohibitively expensive to integrate large volumes of unstructured data.

“An awful lot of unstructured data will prove to be of no value whatsoever – the proverbial red herring. But there is also the odd salmon in there that makes the fishing worthwhile,” he says.

It is unrealistic for organisations to expect to be able to develop fixed infrastructure around unstructured data that doesn’t conform to business requirements and whose business case value has yet to be determined, he says.

Rather, what is needed is a flexible information management foundation that allows organisations to combine their structured and unstructured data, and that enables intuitive and statistical analysis to create competitive advantage from a bigger range of data attributes.

The bigger picture

IBM calls this approach Big Insights. It makes use of open-source tools such as Hadoop, as well as IBM technology such as the newly acquired data warehouse appliance Netezza, to deliver a highly robust and flexible data warehouse.

The company aims to create a Big Data platform which can handle the volume of data being created.

“It’s all about extracting value and insight from the vast amount of data available,” says Thomas.

In contrast with the traditional approach, in which business users determine what questions to ask and IT structures the data to answer them, IBM’s Big Insights allows for iterative and exploratory analysis on structured and unstructured data.

The idea is for IT to enable creative discovery and for business users to explore what questions could be asked, according to Thomas.

“You can’t create something that is based on a business requirement because the business requirement is either unknown or changes so quickly. Twitter or Facebook weren’t really well known even a couple of years ago.

“Our customers tell us they don’t know what they want from unstructured data – and it might not even be the ability to gather and store it for long, but they want the capability to extract value and insight from it and add it to what they already have.”

Winning combination

Most companies still struggle to get the most out of their internal structured data, but there are some shining exceptions, says Thomas, such as government agencies and telcos.

He also cites instances where marrying structured and unstructured data can deliver an additional edge.

For example, it can be used to combat customer churn: a business can use text analytics to identify word patterns or sentiments from unstructured data to give additional insight to predictive churn modelling.
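As a rough illustration of that idea, the sketch below scores free-text customer messages against small word lists and attaches the result to a structured customer record as an extra feature for a churn model. It is not IBM's method: the word lists, the field names and the functions `sentiment_score` and `churn_features` are all hypothetical, and a real deployment would use a trained text analytics model rather than keyword counting.

```python
import re

# Hypothetical word lists, chosen for illustration only.
NEGATIVE = {"cancel", "slow", "expensive", "unhappy", "switch"}
POSITIVE = {"great", "happy", "fast", "helpful", "thanks"}


def sentiment_score(text: str) -> int:
    """Return positive word count minus negative word count for free text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def churn_features(customer: dict, messages: list[str]) -> dict:
    """Add a text-derived sentiment feature to a structured customer record."""
    total = sum(sentiment_score(m) for m in messages)
    return {**customer, "sentiment": total}


# Structured data: an assumed customer record from a transactional system.
customer = {"id": 42, "months_active": 18, "monthly_spend": 59.0}

# Unstructured data: assumed emails or call-centre notes for that customer.
messages = [
    "The service is slow and too expensive.",
    "I want to cancel and switch provider.",
    "Thanks, the agent was helpful.",
]

features = churn_features(customer, messages)
print(features)
```

The combined record could then be fed to whatever predictive churn model the business already runs, with the sentiment value sitting alongside the existing structured attributes.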

And real-time sentiment analysis of political speeches through Twitter, currently a hot area of research, could allow a speech to flow in a number of different ways. ®
