Practical XML Manipulation with Groovy: Common Examples
Introduction
To parse xml in groovy there are two options available
XmlParser
which returns aNode
objects when parsing XMLXmlSlurper
which returnsGPathResult
instances when parsing XML. It evaluates the structure lazily. So if you update the xml you’ll have to evaluate the whole tree again.
When to use which one:
If you want to transform an existing document to another then
XmlSlurper
will be the choiceIf you want to update and read at the same time then
XmlParser
is the choice.
Read more here - https://groovy-lang.org/processing-xml.html
Examples
Copying a complete XML inside another XML node
- You have a xml and you want a part (it may consist nested nodes) of that xml inside some other xml structure you created.
You can get that part of xml using XmlParser
as a GPath
and then use the code given below. The main concept is to use mkp.yieldUnescaped
where mkp
is MarkupBuilderHelper
instance for the xmlBuilder
object.
Here is link for more details about MarkupBuilderHelper
- https://groovy-lang.org/processing-xml.html#_markupbuilderhelper
import groovy.xml.MarkupBuilder
import groovy.xml.XmlUtil
xml = '''
<records>
<car name='HSV Maloo' make='Holden' year='2006'>
<country>Australia</country>
<record type='speed'>Production Pickup Truck with speed of 271kph</record>
</car>
<car name='P50' make='Peel' year='1962'>
<country>Isle of Man</country>
<record type='size'>Smallest Street-Legal Car at 99cm wide and 59 kg in weight</record>
</car>
<car name='Royale' make='Bugatti' year='1931'>
<country>France</country>
<record type='price'>Most Valuable Car at $15 million</record>
</car>
</records>
'''
// records is a groovy.util.NodeList
def records = new XmlParser(false, false).parseText(xml).car
def writer = new StringWriter()
def xmlBuilder = new MarkupBuilder(writer)
xmlBuilder.mkp.xmlDeclaration(version: '1.0', encoding: 'UTF-8') // add xml declaration
xmlBuilder.'car-records'('id': '1') {
// since we are using NodeList we have to iterate
// in case it is Node you can directly use, mkp.yieldUnescaped()
records.forEach { child ->
mkp.yieldUnescaped(XmlUtil.serialize(child).replaceFirst(/<\?xml.*\?>/, ''))
}
}
println(writer.toString())
In case there are you don't want to insert one but child elements of a particular element you can utilize this code snippet:
import groovy.xml.MarkupBuilder
import groovy.xml.XmlUtil
xml = '''
<records>
<car name='HSV Maloo' make='Holden' year='2006'>
<country>Australia</country>
<record type='speed'>Production Pickup Truck with speed of 271kph</record>
</car>
<car name='P50' make='Peel' year='1962'>
<country>Isle of Man</country>
<record type='size'>Smallest Street-Legal Car at 99cm wide and 59 kg in weight</record>
</car>
<car name='Royale' make='Bugatti' year='1931'>
<country>France</country>
<record type='price'>Most Valuable Car at $15 million</record>
</car>
</records>
'''
def records = new XmlSlurper(false, false).parseText(xml).car
def writer = new StringWriter()
def xmlBuilder = new MarkupBuilder(writer)
xmlBuilder.mkp.xmlDeclaration(version: '1.0', encoding: 'UTF-8') // add xml declaration
xmlBuilder.'car-records'('id': '1') {
records.forEach { child ->
mkp.yieldUnescaped(XmlUtil.serialize(child).replaceFirst(/<\?xml.*\?>/, ''))
}
}
println(writer.toString())
Perform some operation (sort, unique, custom, ...) on some items in XML based on some criteria
Suppose you want perform some operation on some items inside XML, like sorting or keeping only unique items, here is a common outline on how you can do that.
First query for all the items, use
GPath
to access them.After extracting all the items remove them from the original XML.
Perform manipulation on them, you can use criteria in form of closures. Also if you want to multiple manipulations use chaining of functions.
Add the manipulated items back to the XML.
Use
XmlUtil
to serialize the output XML.
import groovy.xml.XmlUtil
xml = """<vehicles>
<cars>
<car>
<id>3</id>
<name>Toyota Camry</name>
<color>Silver</color>
</car>
<car>
<id>1</id>
<name>BMW 3 Series</name>
<color>Navy Blue</color>
</car>
<car>
<id>2</id>
<name>Audi A4</name>
<color>Black</color>
</car>
<car>
<id>5</id>
<name>Honda Accord</name>
<color>White</color>
</car>
<car>
<id>4</id>
<name>Mercedes-Benz C-Class</name>
<color>Dark Grey</color>
</car>
</cars>
</vehicles>
"""
def parser = new XmlParser(false, false)
def vehicles = parser.parseText(xml)
// Extract the items we need segments
def cars = vehicles.cars.car
// remove those items from xml
cars.each { item ->
def parent = item.parent()
parent.remove(item)
}
// Perform the operation you want on those items, like sorting in this case
// the criteria will be passed as a closure
def sortedCars = cars.sort { it.id.text() }
// or you can get all unique items
def uniqueCars = cars.unique { it.id.text() }
// or both unique soorted cars
def uniqueSortedCars = cars.unique { it.id.text() }.sort { it.id.text() }
// Add those modified items to the xml
// note that I'm getting the first root element at 0, if there were
// multiple roots then you have to use multiple each
sortedCars.each { vehicles.cars[0].append(it) }
// Convert the modified XML back to string
def outputXml = XmlUtil.serialize(vehicles)
println(outputXml)
Perform some operation on items with text values
Suppose you want to perform some operation like capitalization, removing leading zeros, changing date format, etc. on items which have text values.
Here is outline on how to do that:
first query for all the elements for which you want to do the transformation
use some closure or function to perform the transformation
def xml = "" // use same as the example above
def parser = new XmlParser()
def vehicles = parser.parseText(xml)
// Find all the elements which you want for transformation
vehicles.'**'.findAll { it.name() == 'name' }.each { element ->
element.value = element.text().toUpperCase()
}
// Convert the modified XML back to a string
def outputXml = XmlUtil.serialize(vehicles)
println outputXml
...more examples to come.
Subscribe to my newsletter
Read articles from Shivanshu Semwal directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Shivanshu Semwal
Shivanshu Semwal
Software Developer. Interested in finding innovative solutions to problems.