Digging into the future of data mining

Crystal ball gazing

Comment The first thing to appreciate about data mining is that it should be thought of as R&D. That is, you do a body of research, only some of which is then deployable in the business. Moreover, some of it becomes so well established that it turns into a mass-market product. For example, market basket analysis (which identifies products whose sales are related to those of other products) was once regarded as being as esoteric as anything else in data mining, but is now so mainstream that it is embedded in all sorts of other environments. This is a trend that will continue, with techniques moving out of data mining R&D and into conventional deployment.

Historically, this move out of data mining has been into call centres, CRM, fraud detection and other standard applications. However, as complex event processing (CEP) engines take greater market share, we are likely to see increasing synergy between CEP and data mining. After all, CEP is essentially about identifying patterns and then detecting anomalies, which is exactly what data mining does.

There is a lot of hype about predictive analytics as opposed to data mining. Take the case of market basket analysis: once we have identified that the sale of nappies is associated with beer sales (even if that is an urban myth), we can make predictions about one based on the other. That is useful, and certainly an increasing focus, but not significantly different from what data mining has always been about.
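To make the nappies-and-beer idea concrete, here is a minimal sketch of how an association between two products might be measured. It is an illustration only, not any vendor's method: the function name `association_rules`, the `min_support` parameter and the basket data are all invented for this example. It computes the three standard measures for each pair of items: support (how often the pair occurs together), confidence (how often the second item appears given the first) and lift (how much more often than chance).

```python
from collections import Counter
from itertools import combinations


def association_rules(transactions, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs.

    Hypothetical helper for illustration; real tools handle larger item sets.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))          # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                  # share of baskets containing both
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):        # evaluate the rule in both directions
            confidence = count / item_counts[x]        # P(y | x)
            lift = confidence / (item_counts[y] / n)   # P(y | x) / P(y)
            rules.append((x, y, support, confidence, lift))
    return rules


# Invented baskets, purely for illustration.
baskets = [
    ["nappies", "beer", "crisps"],
    ["nappies", "beer"],
    ["nappies", "milk"],
    ["beer", "crisps"],
    ["milk", "bread"],
]

for x, y, support, confidence, lift in association_rules(baskets):
    print(f"{x} -> {y}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests that the two products sell together more often than chance would predict, which is the sense in which an identified association becomes a "prediction". With data this small the numbers mean nothing in themselves; the point is only that the predictive step is the same arithmetic as the mining step.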

Of course, there is also a trend to make data mining easier (and less costly) to do, but that is hardly surprising: it is common across the whole IT sector.

In my view, perhaps the most important trend is towards the integration of text mining and data mining. As yet, this is a relatively immature market, but the fact is that most information held within businesses today is in unstructured format. Most of the discussion so far has been about search, but search is simply about finding things related to a particular topic, whereas text mining is about finding patterns of information within text, which, in the right context, is much more valuable. Moreover, with the advent of DB2 Viper we are likely to see increased use of applications that employ both relational and XML-based information, in which case a combination of data and text mining makes sense.

While SPSS is one of the two major players in the data mining market, it is the clear leader in the text mining space, not least because it is the dominant provider of market research software, and doing text mining on the back of the results of market research makes obvious sense. However, it is probably also the leading provider of combined text and data mining outside this environment, so if I am right about the future of data mining, and its increased use with text capabilities, then SPSS is very well placed.

SPSS is also in a good position because IBM has withdrawn the client component of its Intelligent Miner product, and its users will be looking for a replacement. SPSS has a much closer relationship with IBM than its major competitors do, and it is looking to capitalise on that relationship by picking up these users.

Copyright © 2006, IT-Analysis.com
