Sorting the ETL men from the boys

Diverging paths

Comment The ETL (extract, transform and load) market, far from commoditising, is diverging. To begin with, ETL is no longer an appropriate term to use, both because operations are no longer limited to the order indicated and because the technology encompasses far more than just moving data into a warehouse. Still, I don't much like alternatives such as "data movement" and "data transfer", while "data integration" is too broad, so I guess we are stuck with ETL. Terminology, though, is by no means the only area of divergence.

Perhaps the most obvious change in the market is the growth in code-generating products, and there is now a clear split between black-box solutions and code-generating approaches. While the former saw off the previous generation of code-based products a decade ago, it is by no means clear that they will do so again: SQL and Java are much more portable than the Cobol-based products of the early nineties.

Code-based approaches are also helped by the many ISVs that want the ability to embed specific ETL capabilities within their own products, and there are a number of newer ETL suppliers specifically targeting this market either directly or in a complementary fashion. For example, Baycastle focuses on doing things like moving data into contact management systems.

Another major change has been the advent of open source products (Clover and Kinetic Networks' KETL) and even shareware products (DB Software), which should help to drive user acceptance of the "don't hand code" message and which can only benefit everybody.

However, returning to the established players versus the new entrants discussion, the big advantage that the former have is that they provide lots of complementary functionality, notably data quality, enterprise information integration, application integration and so on, though this is not limited to black-box solutions (witness Sunopsis).

Finally, the latest area of divergence is in the ability to support the extraction, transformation and loading of unstructured and semi-structured content. Of course, the concept of unstructured content is a nonsense – if it were really unstructured it would collapse into a heap – but for the purposes of this discussion I mean Word and PDF documents and the like on the one hand (unstructured) and HIPAA, EDIFACT, SWIFT and similar documents on the other (semi-structured).

Of course, this is not entirely new: Ascential has had abilities in the area of semi-structured data ever since it bought Mercator (now DataStage TX), while Hummingbird has offered the ability to extract unstructured content for some time, largely because it is the only ETL vendor that is also a major content/document management provider. However, Informatica has now added this capability as generic functionality and other vendors are likely to follow suit.

If the ability to build applications that combine content and data is to be the major growth area that many suspect it will be, then the ability to support ETL functions against content as opposed to data is likely to be a defining factor and will sort out the ETL men from the boys.

Copyright © 2005, IT-Analysis.com
