Hands on with Java XML filter pipelines

Ignored by many

Typical uses of Filters

Some typical uses of filters include:

• Normalization of whitespace in which contiguous whitespace PCDATA is replaced by a single space.

• Ignoring information in the originating XML document.

• Modifying elements in the original XML document.

• Adding data to the elements in the original XML document.

The following example illustrates how a filter can be used to modify an element so that the element passed to the processing application differs from the one defined in the original XML document.

A simple filter

The following listing shows a very simple SAX filter that changes all postal code elements into postcode elements. This effectively pre-processes the input document to modify one element, while allowing all other elements to pass through unchanged.

A Simple XMLFilter class

To be a Java XML filter, a class must either implement the XMLFilter interface or extend the XMLFilterImpl class. In our case we extend the XMLFilterImpl class from the org.xml.sax.helpers package, which means that we only need to implement the methods that will actually do something (all other methods required by the XMLFilter interface are inherited from XMLFilterImpl). This keeps our code cleaner and requires less work on our part.

We are now free to implement only the startElement and endElement methods. Be careful to make sure that the method signatures are the same as those declared in the ContentHandler interface, which XMLFilterImpl implements; otherwise you will be overloading the methods rather than overriding them, and your code will not be called.

Our startElement and endElement methods change the localName and the qName (qualified name) of the element from the American postal code to the British postcode if the postal code element is found. Otherwise they just pass the data through unaltered.

Be careful to call the inherited superclass methods once you have finished processing your data. These calls pass the event on to the next handler, which is what enables the pipelining behaviour.
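As a rough illustration of the filter described above, the following sketch shows one way such a class might look. It is not the article's original listing: the element names postalcode and postcode, and the class names PostcodeFilter and PostcodeFilterDemo, are assumptions made for this example. It also only matches unprefixed element names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Renames one element and passes every other event through unchanged. */
class PostcodeFilter extends XMLFilterImpl {
    // Assumed element names; the article's own listing may use different ones.
    private static final String OLD_NAME = "postalcode";
    private static final String NEW_NAME = "postcode";

    PostcodeFilter(XMLReader parent) {
        super(parent);
    }

    // @Override makes the compiler reject a signature that would only overload.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (isOldName(localName, qName)) {
            // Set both names, then hand the event to the next stage.
            super.startElement(uri, NEW_NAME, NEW_NAME, atts);
        } else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (isOldName(localName, qName)) {
            super.endElement(uri, NEW_NAME, NEW_NAME);
        } else {
            super.endElement(uri, localName, qName);
        }
    }

    // Test both names, because either one may be empty depending on parser settings.
    private static boolean isOldName(String localName, String qName) {
        return OLD_NAME.equals(localName) || OLD_NAME.equals(qName);
    }
}

/** Hypothetical driver: parser -> PostcodeFilter -> handler that records element names. */
class PostcodeFilterDemo {
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is always reported
            XMLReader parser = factory.newSAXParser().getXMLReader();
            PostcodeFilter filter = new PostcodeFilter(parser);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    names.add(localName);
                }
            });
            // Parsing through the filter drives the underlying parser.
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames(
            "<address><postalcode>AB1 2CD</postalcode><city>Leeds</city></address>"));
    }
}
```

The application's handler is attached to the filter rather than to the parser, so it only ever sees the renamed element. Further filters could be chained in the same way by passing one filter as the parent of the next.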

SAX Processing Rules

You may wonder why we set both values. This has to do with rules regarding SAX processing and the state of the following two SAXParserFactory properties:

• http://xml.org/sax/features/namespaces

• http://xml.org/sax/features/namespace-prefixes

Essentially, these rules say that:

1. the Namespace URI and local name are required when the namespaces property is true (the default), and are optional when the namespaces property is false (if one is specified, both must be);

2. the qualified name is required when the namespace-prefixes property is true, and is optional when the namespace-prefixes property is false (the default).

To handle these situations we are setting both parameters to the new element name. This is also why we test both parameters.
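To see how these settings affect what a handler receives, a small probe along the following lines can help. This is only a sketch: the class name SaxNameProbe and the sample document are made up for illustration, and the exact values reported for the optional names depend on the parser implementation. Note also that a JAXP SAXParserFactory is not namespace-aware unless setNamespaceAware(true) is called, so the defaults you see through the factory may differ from the SAX defaults listed above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper that records the three names passed to startElement. */
class SaxNameProbe {
    static final String NAMESPACE_PREFIXES =
        "http://xml.org/sax/features/namespace-prefixes";

    static List<String> probe(boolean namespaces, boolean namespacePrefixes, String xml) {
        List<String> seen = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Controls the http://xml.org/sax/features/namespaces feature.
            factory.setNamespaceAware(namespaces);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature(NAMESPACE_PREFIXES, namespacePrefixes);
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    // Record the values as "uri|localName|qName".
                    seen.add(uri + "|" + localName + "|" + qName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<a:postalcode xmlns:a='urn:example'>AB1 2CD</a:postalcode>";
        System.out.println("namespaces only:     " + probe(true, false, xml));
        System.out.println("both features:       " + probe(true, true, xml));
        System.out.println("prefixes only:       " + probe(false, true, xml));
    }
}
```

Running the probe against your own parser shows which of the names can be relied on under each combination, which is the reason a filter that must work in all of them tests and sets both the local name and the qualified name.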
