Feeds

Hands on with Java XML filter pipelines

Ignored by many

Boost IT visibility and business value

In this tutorial we will step through this little understood area of XML processing. Java XML Filters are part of the SAX API for XML processing.

Java XML filters are objects that can filter (or process) data within a pipeline of filters, wrapped around a core XML processing application. The SAX API is a standard part of the JAXP (Java API for XML Processing), which has itself been provided as part of the Java environment since Java 2 SDK 1.4.0 and can be used in a wide range of XML processing applications.

In the remainder of this tutorial we will first introduce the concept of Java XML filters and examine how a generic Java XML Filter can be written.

The XMLFilter interface

Although the XMLFilter interface is part of the SAX API for processing XML documents, it's ignored by many Java XML tutorials and overlooked by most Java developers. An XMLFilter is a sub-interface of the XMLReader class; as such it is very like the XMLReader except that it obtains its events from another XML reader rather than a primary source like an XML document, file or database. As such, it is a primary component in the JAXP pipeline architecture. That is, it can sit within a pipeline, receiving XML data from another XML processing element and passing the results of its processing onto another XML processing element if required.

Assuming you have a distribution of SAX, or a version of Java containing the XML APIs, you can look at the included classes; the one you want is org.xml.sax.XMLFilter. You should also ensure that you have the SAX helper classes, found in the org.xml.sax.helpers; package. In that package, you will want to focus on the org.xml.sax.helpers.XMLFilterImpl class.

The XMLFilter interface methods

If you examine the XMLFilter, you'll find that it extends the org.xml.sax.XMLReader interface, and adds two new methods:

1. public void setParent(XMLReader parent); this method allows the application to link the filter to a parent reader (which may be another filter). The argument may not be null.

2. public XMLReader getParent(); this method allows the application to query the parent reader (which may be another filter). It is generally a bad idea to perform any operations on the parent reader directly: they should all pass through this filter.

This probably doesn't look like much; however it is very significant as the ability to link one filter to another (via the setParent method) allows the "pipeline" to be constructed. Of course, you also get all the other XMLReader methods such as startElement(), endElement(), etc. In each of these methods, you can operate upon the input XML data before an application, or the next filter in the pipeline, gets to it.

This means that in the following diagram you get to modify data before an application gets that data using a standard XML framework. Note that as you have all of the SAX callback methods that an XMLReader does, you can work with the elements, the attributes, the prefix mappings, and anything else that SAX can work with.

Filter data for an XML Application

The key here is that the XML form the source, can be filtered (pre-processed) before being accessed by the application, without modification to that application, and in a standard manner. Essentially, the first filter reads the XML data, processes it if necessary and pushes it onto the next filter. Filter 3 then does the same and pushes it onto the XML application. Of course you could do the same thing by writing your own custom XML pre-processors, the point here is that a standard framework is available within which to do this.

Build a business case: developing custom apps

More from The Register

next story
KDE releases ice-cream coloured Plasma 5 just in time for summer
Melty but refreshing - popular rival to Mint's Cinnamon's still a work in progress
Leaked Windows Phone 8.1 Update specs tease details of Nokia's next mobes
New screen sizes, dual SIMs, voice over LTE, and more
Mozilla keeps its Beard, hopes anti-gay marriage troubles are now over
Plenty on new CEO's todo list – starting with Firefox's slipping grasp
Apple: We'll unleash OS X Yosemite beta on the MASSES on 24 July
Starting today, regular fanbois will be guinea pigs, it tells Reg
PEAK LANDFILL: Why tablet gloom is good news for Windows users
Sinofsky's hybrid strategy looks dafter than ever
Another day, another Firefox: Version 31 is upon us ALREADY
Web devs, Mozilla really wants you to like this one
Secure microkernel that uses maths to be 'bug free' goes open source
Hacker-repelling, drone-protecting code will soon be yours to tweak as you see fit
prev story

Whitepapers

Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Boost IT visibility and business value
How building a great service catalog relieves pressure points and demonstrates the value of IT service management.
Why and how to choose the right cloud vendor
The benefits of cloud-based storage in your processes. Eliminate onsite, disk-based backup and archiving in favor of cloud-based data protection.
The Essential Guide to IT Transformation
ServiceNow discusses three IT transformations that can help CIO's automate IT services to transform IT and the enterprise.
Maximize storage efficiency across the enterprise
The HP StoreOnce backup solution offers highly flexible, centrally managed, and highly efficient data protection for any enterprise.