XML anti-discrimination plan hits hurdle

Cherokee Nation awaits

A well-intentioned attempt to make XML less exclusive to certain ethnic groups actually risks causing breakage for those it's intended to help.

XML co-inventor Tim Bray and others have raised a last-minute objection to the planned XML 1.0 Fifth Edition working its way through the World Wide Web Consortium (W3C). They say it could make it harder to program with or parse some legacy XML documents.

Bray, Sun Microsystems' director of web technologies, said the W3C should completely revise both XML 1.0 and Namespaces 1.0 together or stick with the existing workarounds.

"The change introduces an inconsistency between XML 1.0 and XML Namespaces 1.0, which is intolerable," Bray blogged. Bray subsequently told The Reg that XML documents which use the proposed new rules would "probably break existing XML 1.0 software".

XML 1.0, which Bray worked on, only allows markup to use characters that Unicode had defined by 1998, when XML 1.0 was created. That means programmers writing in scripts such as Amharic or Cherokee, which have been added since then, can't use their characters in tag or attribute names.

The XML 1.0 Fifth Edition proposes that newly defined character sets can be used in tag and attribute names.
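To make the relaxed rule concrete, here is a minimal sketch of a name check in Python. It assumes the code point ranges below match the NameStartChar and NameChar productions in the Fifth Edition text; they were transcribed by hand and should be checked against the specification before anyone relies on them. The function name is_fifth_edition_name is hypothetical, and the sketch says nothing about what Namespaces 1.0 accepts.

```python
# Assumption: ranges transcribed from the NameStartChar production of
# XML 1.0 Fifth Edition. Verify against the spec before relying on them.
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF),   # includes Ethiopic and Cherokee
    (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]

# Assumption: extra characters allowed after the first position (NameChar).
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),        # "-" and "."
    (0x30, 0x39),        # 0-9
    (0xB7, 0xB7),        # middle dot
    (0x300, 0x36F),      # combining diacritical marks
    (0x203F, 0x2040),
]


def _in_ranges(ch, ranges):
    """Return True if the code point of ch falls in any (lo, hi) range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_fifth_edition_name(name):
    """Hypothetical helper: check a tag or attribute name against the ranges above."""
    if not name:
        return False
    if not _in_ranges(name[0], NAME_START_RANGES):
        return False
    return all(
        _in_ranges(c, NAME_START_RANGES) or _in_ranges(c, NAME_EXTRA_RANGES)
        for c in name[1:]
    )


if __name__ == "__main__":
    cherokee = "\u13A0\u13A1"   # two letters from the Cherokee block
    print(is_fifth_edition_name(cherokee))
    print(is_fifth_edition_name("1abc"))
```

Under these ranges a name made of Cherokee or Ethiopic letters passes, while a name that starts with a digit still fails. A parser built to the earlier editions, which enumerated the permitted characters explicitly, would be expected to reject the Cherokee name, and that gap is the breakage Bray is worried about.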

Problem is, Namespaces 1.0 still uses the existing XML 1.0 rules - so you can't have Amharic tags. "That makes things tough for developers; how do you enforce both the rules of Namespaces 1.0 and XML 1.0 Fifth Edition?" Bray told us.

An attempt was made to remedy this in 2006 with XML 1.1 - but this proved controversial for other reasons. According to Bray, IBM pushed through features to suit mainframe programmers - although this claim is rejected by XML expert John Cowan, who worked on the XML 1.1 specification. Cowan also said that the W3C XML working group is in the process of fixing the XML Namespace inconsistency.®

Additional reporting by Gavin Clarke
