Sorting the ETL men from the boys

Diverging paths

Comment: The ETL (extract, transform and load) market, far from commoditising, is diverging. To begin with, ETL is no longer an appropriate term, both because operations are no longer limited to the order indicated and because the technology encompasses far more than just moving data into a warehouse. I don't much like alternatives such as "data movement" and "data transfer", though, and "data integration" is too broad, so I guess we are stuck with ETL. In any case, terminology is by no means the only area of divergence.

Perhaps the most obvious change is the growth in code-generating products: there is now a clear split in the market between black-box solutions and code-generating approaches. While the former saw off the previous generation of code-based products a decade ago, it is by no means clear that they will do so again: SQL and Java are much more portable than the Cobol on which the products of the early nineties were based.
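To make the distinction concrete, here is a minimal sketch of what a code-generating tool does in principle: it takes a declarative column mapping and emits plain SQL that can then run on any database that accepts it. The function, table and column names are all hypothetical, invented for illustration, and do not reflect any vendor's product.

```python
def generate_sql(target, source, mapping):
    """Return an INSERT ... SELECT statement for a column mapping.

    `mapping` maps each target column to a SQL expression over the source
    table. All names here are hypothetical and for illustration only.
    """
    columns = ", ".join(mapping)
    expressions = ", ".join(mapping.values())
    return f"INSERT INTO {target} ({columns}) SELECT {expressions} FROM {source}"


# Hypothetical mapping from a staging table to a warehouse table.
customer_mapping = {
    "customer_id": "id",
    "full_name": "TRIM(first_name) || ' ' || TRIM(last_name)",
    "country": "UPPER(country_code)",
}

sql = generate_sql("dw_customer", "staging_customer", customer_mapping)
print(sql)
```

The output is ordinary, inspectable SQL rather than instructions for a proprietary engine, which is the portability argument in a nutshell.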

Code-based approaches are also helped by the many ISVs that want the ability to embed specific ETL capabilities within their own products, and there are a number of newer ETL suppliers specifically targeting this market either directly or in a complementary fashion. For example, Baycastle focuses on doing things like moving data into contact management systems.

Another major change has been the advent of open source products (Clover and Kinetic Networks' KETL) and even shareware products (DB Software), which should help to drive user acceptance of the "don't hand code" message and which can only benefit everybody.

However, returning to the established players versus the new entrants discussion, the big advantage that the former have is that they provide lots of complementary functionality, notably with data quality, enterprise information and application integration and so on, though this is not limited to black-box solutions (witness Sunopsis).

Finally, the latest area of divergence is in the ability to support the extraction, transformation and loading of unstructured and semi-structured content. Of course, the concept of unstructured content is a nonsense (if it were really unstructured it would collapse into a heap) but, for the purposes of this discussion, I mean Word and PDF documents and the like on the one hand (unstructured) and HIPAA, EDIFACT, SWIFT and similar documents on the other (semi-structured).
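As a rough illustration of why such documents count as semi-structured, the sketch below splits an EDIFACT-style message into segments and data elements. It assumes the default separators (an apostrophe ends a segment, "+" separates data elements, ":" separates components) and deliberately ignores release characters and service segments; the sample message is made up and is not a valid interchange.

```python
def parse_segments(message):
    """Split an EDIFACT-style message into (tag, elements) pairs.

    Assumes default separators: "'" ends a segment, "+" separates data
    elements and ":" separates components. Release characters ("?") and
    UNA service strings are not handled in this simplified sketch.
    """
    segments = []
    for raw in message.split("'"):
        raw = raw.strip()
        if not raw:
            continue  # skip the empty piece after the final terminator
        tag, *elements = raw.split("+")
        segments.append((tag, [element.split(":") for element in elements]))
    return segments


# Made-up fragment for illustration only.
sample = "BGM+220+PO12345+9'DTM+137:20050101:102'"
parsed = parse_segments(sample)
print(parsed)
```

The structure is there, but it has to be recovered from delimiters and positions rather than read from named columns, which is the extra work an ETL tool must take on.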

Of course, this is not entirely new: Ascential has had abilities in the area of semi-structured data ever since it bought Mercator (now DataStage TX), while Hummingbird has offered the ability to extract unstructured content for some time, largely because it is the only ETL vendor that is also a major content/document management provider. However, Informatica has now added this capability as generic functionality and other vendors are likely to follow suit.

If the ability to build applications that combine content and data is to be the major growth area that many suspect it will be, then the ability to support ETL functions against content, as opposed to data, is likely to be a defining factor and will sort out the ETL men from the boys.

Copyright © 2005, IT-Analysis.com
