Groovy: XML without the bloat

Original URL: https://www.theregister.com/2007/12/14/groovy_xml_part_one/

Flexible strings

Posted in Channel, 14th December 2007 16:44 GMT

Hands on, part 1: You may find this hard to believe, but there was a time before XML hell, when the idea was that XML was going to solve just about every tricky problem in software development. From swapping data between applications or platforms, to storing complex data structures in a portable format - XML was the answer.

Growing up around the same time as XML was Java, which - like XML - has since gone on to shape an industry. If we’re honest, though, it’s never been an easy relationship between these two. Processing XML in Java is a common enough requirement, but it can be extremely verbose, requiring plenty of boilerplate code and all kinds of extraneous scaffolding that obscures what are often very straightforward functions.

That’s where Groovy comes in. As befits a young scripting language for the Java Virtual Machine, Groovy can do XML in a way that is relatively free of bloat and allows the developer to focus on the real problem at hand. In this two-part tutorial I shall look at how Groovy can help in reading and writing XML documents. You can find instructions on installing and running Groovy here, or you can refer back to my earlier article.

At some time or other we’ve probably all written code that just uses strings to write out a fragment of XML. It’s cheap and clunky, but it saves having to instantiate a DOM tree or any of that fiddly stuff. Groovy’s flexible strings allow us to do this very, very simply.

Here’s how we create a fragment of XML that contains a list of names each in an element called "hello", and each with an order attribute:

names=['john','bill','ted']
x=0
frag=''
names.each {
  x++
  frag+="""<hello order="$x">$it</hello>\n"""
}
// print the accumulated fragment
println frag

Running this code in the GroovyConsole, or saving it to a file called xml.groovy and running it from the command line, produces the following output:

<hello order="1">john</hello>
<hello order="2">bill</hello>
<hello order="3">ted</hello>
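The same fragment can be built without a hand-maintained counter. This is a minimal sketch, assuming nothing beyond the standard eachWithIndex and join methods; the variable name lines is ours, not part of the original listing:

```groovy
def names = ['john', 'bill', 'ted']
def lines = []

// eachWithIndex supplies a zero-based index, so add 1 for the order attribute
names.eachWithIndex { name, i ->
    lines << "<hello order=\"${i + 1}\">${name}</hello>"
}

// one element per line, with a trailing newline as in the original fragment
def frag = lines.join('\n') + '\n'
print frag
```

Like the original, this does no escaping, so it is only safe when the names contain no XML special characters.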

But that’s only a bunch of strings: there’s no XML declaration, no single root element, and certainly no way it can be processed as XML. If we do need to turn this into a DOM tree, we can do that easily enough too. Let’s make our hello elements children of a root element called "names" and then turn it into a DOM tree that we can interrogate:

def s=new StringReader("<names>\n" + frag + "</names>")
def xmldoc=groovy.xml.DOMBuilder.parse(s)
def names=xmldoc.documentElement

println names 

use (groovy.xml.dom.DOMCategory){
    println names.hello[0].text()
    println names.hello.size()
    println names.hello[1].attributes.item(0).text()
}

This code spools out our well-formed XML and the results of some simple tree walking:

<?xml version="1.0" encoding="UTF-8"?>
<names>
<hello order="1">john</hello>
<hello order="2">bill</hello>
<hello order="3">ted</hello>
</names>

john
3
2
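For reference, DOMCategory also lets us read an attribute by name rather than by position. The following is a sketch only: the XML is inlined so that it runs on its own, and root and byOrder are names we have made up for illustration:

```groovy
import groovy.xml.DOMBuilder
import groovy.xml.dom.DOMCategory

def xml = '''<names>
<hello order="1">john</hello>
<hello order="2">bill</hello>
<hello order="3">ted</hello>
</names>'''

def root = DOMBuilder.parse(new StringReader(xml)).documentElement

// map each order attribute to the element's text
def byOrder = [:]
use (DOMCategory) {
    root.hello.each { el ->
        // '@order' reads the attribute called order as a String
        byOrder[el.'@order'] = el.text()
    }
}
println byOrder
```

Looking the attribute up by name avoids relying on item(0) happening to be the attribute we want.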

It should be clear by now that the combination of Groovy’s convenience methods (from the groovy.xml.* libraries), iterators and powerful string handling capabilities makes for very succinct code for creating XML from relatively straightforward data. The equivalent Java code would take a lot more typing and wouldn’t be half as readable or easy to follow, but so far there’s nothing tremendously taxing here.

Groovy also includes a very handy feature for handling tree structures: builders. In the same way that Groovy has excellent built-in support for collections - lists and maps - it also comes with support for trees. Builders are perfect for all kinds of tree structures, from HTML to GUI elements to XML.

For a real-world example let’s say we have a nested structure that we want to export to XML. It could be the results of a query to a MySQL database, the data from a class hierarchy or some other source. In our example the data relates to a simple personnel database, with a record for each person. We store this as follows:

pers=["john":[surname:"smith",age:37,gender:'m',children:2],
      "jill":[surname:"jones",age:28,gender:'f',children:0]
      ]

Before we dive into the builder, let’s remind ourselves of what we can do with Groovy’s iterators and closures:

pers.each {name, data ->
  println name + ' ' + data['surname'] + ' is ' + data['age'] + ' years old'
}

That single statement iterates through each person’s record, mapping the key to the name variable, and the map of data to the variable we’ve cleverly labeled data. We can then address the contents of the data map directly by name. Running it gives the following on the command line:

john smith is 37 years old
jill jones is 28 years old
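The same iteration can be written with property-style map access and string interpolation. This is a sketch along the same lines, with lines as an illustrative variable name of our own:

```groovy
def pers = [john: [surname: 'smith', age: 37, gender: 'm', children: 2],
            jill: [surname: 'jones', age: 28, gender: 'f', children: 0]]

// collect hands each entry's key and value to the two-parameter closure
def lines = pers.collect { name, data ->
    "${name} ${data.surname} is ${data.age} years old".toString()
}
lines.each { println it }
```

Here data.surname is simply shorthand for data['surname'].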

We’re going to do something similar using a groovy.xml.MarkupBuilder object, as shown below:

s_xml=new StringWriter()
builder=new groovy.xml.MarkupBuilder(s_xml)
people=builder.people{
  pers.each{ name, data ->
    person(first_name:name, surname:data['surname']){
      age(data['age']){}
      gender(data['gender']){}
      children('count':data['children']){}
    }
  }
}

println s_xml

This clever little bit of code creates a builder object that writes its output to the StringWriter called s_xml. The closure passed to builder.people iterates over our data source, pers, with the "each" iterator, as in the previous example. The magic is in the pers.each closure. Here we call a set of pseudo-methods called person, age, gender and children. These are all turned into XML elements: a plain argument becomes the element’s text content, while named arguments such as first_name:name or 'count':data['children'] become attributes. If we run the above code we can see the results clearly enough:

<people>
  <person first_name='john' surname='smith'>
    <age>37</age>
    <gender>m</gender>
    <children count='2' />
  </person>
  <person first_name='jill' surname='jones'>
    <age>28</age>
    <gender>f</gender>
    <children count='0' />
  </person>
</people>
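To make the mapping from pseudo-method arguments to XML explicit, here is a small standalone sketch. The note element is hypothetical and exists only to show named and plain arguments used together; the trailing empty closures of the listing above are left out, as they are not needed when an element has no children:

```groovy
import groovy.xml.MarkupBuilder

def out = new StringWriter()
def b = new MarkupBuilder(out)

b.person(first_name: 'john') {   // named argument -> attribute on <person>
    age(37)                      // plain argument -> <age>37</age>
    children(count: 2)           // named argument only -> <children count='2' />
    note(lang: 'en', 'hello')    // both -> <note lang='en'>hello</note>
}

println out
```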

No wonder the language is called Groovy! We can even spool that out to a file in a few lines of code as well:

str=s_xml.toString()
// withWriter closes the file for us, even if a write fails
new File('pers.xml').withWriter { fw ->
  fw.write('<?xml version="1.0"?>\n')
  fw.write(str)
}
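An alternative is to point the builder straight at a file writer and let it emit the XML declaration itself. This is a sketch under some assumptions: it relies on the mkp.xmlDeclaration helper found in current Groovy releases, which may not be present in older versions, and it writes to a temporary file whose name is generated purely for illustration:

```groovy
import groovy.xml.MarkupBuilder

def pers = [john: [surname: 'smith', age: 37]]

// temporary file so the sketch leaves nothing behind in the working directory
def file = File.createTempFile('pers', '.xml')
file.deleteOnExit()

file.withWriter('UTF-8') { w ->
    def builder = new MarkupBuilder(w)
    // assumption: the mkp helper provides xmlDeclaration in this Groovy version
    builder.mkp.xmlDeclaration(version: '1.0', encoding: 'UTF-8')
    builder.people {
        pers.each { name, data ->
            person(first_name: name, surname: data.surname) {
                age(data.age)
            }
        }
    }
}

println file.text
```

This skips the intermediate StringWriter, which may matter for large documents.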

Anyone who’s ever had to write code to get complex, hierarchical data out into XML will recognize that this is a very easy and natural way to go about navigating through the data and organizing it into the required format.

In the second part of this series, I shall turn to the other side of this programming equation - reading in XML and querying or transforming it. ®
