Sorting the ETL men from the boys

Diverging paths

Comment The ETL (extract, transform and load) market, far from commoditising, is diverging. To begin with, ETL is no longer an appropriate term, both because operations are no longer limited to the order indicated and because the technology encompasses far more than just moving data into a warehouse. Still, I don't much like alternatives such as "data movement" and "data transfer", while "data integration" is too broad, so I guess we are stuck with ETL. The name, however, is by no means the only area of divergence.

Perhaps the most obvious change in the market is the growth in code-generating products, and there is now a clear split between black-box solutions and code-generating approaches. While the former saw off the previous generation of code-based products a decade ago, it is by no means clear that they will do so again: SQL and Java are much more portable than the Cobol-based products of the early nineties.

Code-based approaches are also helped by the many ISVs that want the ability to embed specific ETL capabilities within their own products, and there are a number of newer ETL suppliers specifically targeting this market either directly or in a complementary fashion. For example, Baycastle focuses on doing things like moving data into contact management systems.

Another major change has been the advent of open source products (Clover and Kinetic Networks' KETL) and even shareware products (DB Software), which should help to drive user acceptance of the "don't hand code" message, and that can only benefit everybody.

However, returning to the discussion of established players versus new entrants, the big advantage the former have is that they provide a lot of complementary functionality, notably data quality, enterprise information integration and application integration, though this is not limited to black-box solutions (witness Sunopsis).

Finally, the latest area of divergence is in the ability to support the extraction, transformation and loading of unstructured and semi-structured content. Of course, the concept of unstructured content is a nonsense – if it were really unstructured it would collapse into a heap – but for the purposes of this discussion I mean Word and PDF documents and the like on the one hand (unstructured) and HIPAA, EDIFACT, SWIFT and similar documents on the other (semi-structured).

Of course, this is not entirely new: Ascential has had abilities in the area of semi-structured data ever since it bought Mercator (now DataStage TX), while Hummingbird has offered the ability to extract unstructured content for some time, largely because it is the only ETL vendor that is also a major content/document management provider. However, Informatica has now added this capability as generic functionality and other vendors are likely to follow suit.

If the ability to build applications that combine content and data is to be the major growth area that many suspect it will be, then the ability to support ETL functions against content, as opposed to data, is likely to be a defining factor and will sort out the ETL men from the boys.

Copyright © 2005, IT-Analysis.com
