The case for open source ETL

What you see is what you get

Comment: As far as I have been able to discover, there are four open source ETL (extract, transform and load) tools on the market. Somewhat surprisingly, two of them are homonyms: KETL and Kettle, the other two being Enhydra Octopus and CloverETL.

Kettle is based on an ETTL paradigm, the extra ‘T’ standing for transport (which seems an unnecessary complication) and wins the prize for the product with the most sense of humour as it has four components that are variously named Spoon, Pan, Chef and Kitchen.

The most interesting question is where the market for open source ETL is. Looking at the products one would have to assume that they are mostly in the same space as the Sesame Software product that I discussed recently. That is, they are aimed at developers who know what they are doing and do not need (and do not get) a graphical drag-and-drop style product. The exception is Kettle, which looks much more like a PowerCenter or DataStage.

Another difference between open source products can be in implementation. Kinetic Networks, the developer of KETL, for example, reckons that you may need some implementation assistance with its product. In part, this is a result of the product’s origins: it was originally developed for in-house use and in conjunction with professional services engagements, so it is not really surprising if there are aspects of the product that have not yet been automated for the open source market.

In general, what you see is what you get with open source products, though there are add-on products for both of the homonymous products. In the case of Kettle one of the partners offers an SAP connector while Kinetic Networks has a number of options that it offers in conjunction with KETL, notably an MPP (massively parallel processing) option for improved performance, a data profiling extension and a clickstream capability.

As far as I can see there are no options (apart from support) available with either Enhydra Octopus or CloverETL, the latter being a product that generates Java. This, despite its attractions (especially for ISVs), is still a relatively rare capability: ETL Solutions has a product that generates Java and ETI plans to, but otherwise this is not generally available, so it represents a potential market for CloverETL that is not available to its open source counterparts.

Enhydra Octopus is distinguished by the fact that it has different companies offering support for the product in Europe, Japan and the United States, whereas the other products have only limited support options (USA for KETL, Austria/Belgium for Kettle and the Czech Republic for CloverETL).

In other words, each of the products has something different going for it, though none of them will trouble the likes of Informatica, IBM or Ab Initio.

Copyright © 2005, IT-Analysis.com
