Feeds

Hands on with Java XML filter pipelines

Ignored by many

Intelligent flash storage arrays

In this tutorial we will step through this little understood area of XML processing. Java XML Filters are part of the SAX API for XML processing.

Java XML filters are objects that can filter (or process) data within a pipeline of filters, wrapped around a core XML processing application. The SAX API is a standard part of the JAXP (Java API for XML Processing), which has itself been provided as part of the Java environment since Java 2 SDK 1.4.0 and can be used in a wide range of XML processing applications.

In the remainder of this tutorial we will first introduce the concept of Java XML filters and examine how a generic Java XML Filter can be written.

The XMLFilter interface

Although the XMLFilter interface is part of the SAX API for processing XML documents, it's ignored by many Java XML tutorials and overlooked by most Java developers. An XMLFilter is a sub-interface of the XMLReader class; as such it is very like the XMLReader except that it obtains its events from another XML reader rather than a primary source like an XML document, file or database. As such, it is a primary component in the JAXP pipeline architecture. That is, it can sit within a pipeline, receiving XML data from another XML processing element and passing the results of its processing onto another XML processing element if required.

Assuming you have a distribution of SAX, or a version of Java containing the XML APIs, you can look at the included classes; the one you want is org.xml.sax.XMLFilter. You should also ensure that you have the SAX helper classes, found in the org.xml.sax.helpers; package. In that package, you will want to focus on the org.xml.sax.helpers.XMLFilterImpl class.

The XMLFilter interface methods

If you examine the XMLFilter, you'll find that it extends the org.xml.sax.XMLReader interface, and adds two new methods:

1. public void setParent(XMLReader parent); this method allows the application to link the filter to a parent reader (which may be another filter). The argument may not be null.

2. public XMLReader getParent(); this method allows the application to query the parent reader (which may be another filter). It is generally a bad idea to perform any operations on the parent reader directly: they should all pass through this filter.

This probably doesn't look like much; however it is very significant as the ability to link one filter to another (via the setParent method) allows the "pipeline" to be constructed. Of course, you also get all the other XMLReader methods such as startElement(), endElement(), etc. In each of these methods, you can operate upon the input XML data before an application, or the next filter in the pipeline, gets to it.

This means that in the following diagram you get to modify data before an application gets that data using a standard XML framework. Note that as you have all of the SAX callback methods that an XMLReader does, you can work with the elements, the attributes, the prefix mappings, and anything else that SAX can work with.

Filter data for an XML Application

The key here is that the XML form the source, can be filtered (pre-processed) before being accessed by the application, without modification to that application, and in a standard manner. Essentially, the first filter reads the XML data, processes it if necessary and pushes it onto the next filter. Filter 3 then does the same and pushes it onto the XML application. Of course you could do the same thing by writing your own custom XML pre-processors, the point here is that a standard framework is available within which to do this.

Top 5 reasons to deploy VMware with Tegile

More from The Register

next story
Preview redux: Microsoft ships new Windows 10 build with 7,000 changes
Latest bleeding-edge bits borrow Action Center from Windows Phone
Google opens Inbox – email for people too thick to handle email
Print this article out and give it to someone tech-y if you get stuck
Microsoft promises Windows 10 will mean two-factor auth for all
Sneak peek at security features Redmond's baking into new OS
UNIX greybeards threaten Debian fork over systemd plan
'Veteran Unix Admins' fear desktop emphasis is betraying open source
Entity Framework goes 'code first' as Microsoft pulls visual design tool
Visual Studio database diagramming's out the window
Google+ goes TITSUP. But WHO knew? How long? Anyone ... Hello ...
Wobbly Gmail, Contacts, Calendar on the other hand ...
DEATH by PowerPoint: Microsoft warns of 0-day attack hidden in slides
Might put out patch in update, might chuck it out sooner
Ubuntu 14.10 tries pulling a Steve Ballmer on cloudy offerings
Oi, Windows, centOS and openSUSE – behave, we're all friends here
prev story

Whitepapers

Choosing cloud Backup services
Demystify how you can address your data protection needs in your small- to medium-sized business and select the best online backup service to meet your needs.
Forging a new future with identity relationship management
Learn about ForgeRock's next generation IRM platform and how it is designed to empower CEOS's and enterprises to engage with consumers.
Security for virtualized datacentres
Legacy security solutions are inefficient due to the architectural differences between physical and virtual environments.
Reg Reader Research: SaaS based Email and Office Productivity Tools
Read this Reg reader report which provides advice and guidance for SMBs towards the use of SaaS based email and Office productivity tools.
Storage capacity and performance optimization at Mizuno USA
Mizuno USA turn to Tegile storage technology to solve both their SAN and backup issues.