XML Tree Pruning

A little while back I had an XML document that I needed to prune down. Tools like XPath make it easy to pick certain bits out of an XML tree, but what I needed was more or less the complement: keep the document intact, but zap specific bits of it. It turns out that libxml2 provides an easy way to do this. For instance, in Python:

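import libxml2
doc = libxml2.parseFile("file.xml")
xpc = doc.xpathNewContext()
# Select every item, plus every folder whose title is not "Keeper".
for n in xpc.xpathEval('/root/item|/root/folder[./title/text()!="Keeper"]'):
    n.unlinkNode()  # detach the node from the tree
    n.freeNode()    # unlinkNode() does not free the node
doc.saveFormatFile("pruned.xml", True)  # True = indent the output
# The libxml2 bindings do not free these for us.
xpc.xpathFreeContext()
doc.freeDoc()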

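That is, if I have an XML tree whose root element contains a bunch of item and folder nodes, this will toss out all of them except the folder I want to keep (here, the one with title "Keeper"). One caveat: because of how XPath compares node sets, a folder with no title text at all is not matched by the != test, so it survives too. The procedure should be about the same from any language that can use libxml2.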
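If the libxml2 bindings aren't handy, the same pruning idea can be sketched with Python's standard-library xml.etree.ElementTree. To be clear, this swaps in a different library rather than libxml2, and the sample document and the prune helper are made up for illustration. It walks the root's direct children instead of using an XPath union, and unlike the XPath != test it also drops folders that have no title at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, shaped like the one described above.
SAMPLE = """<root>
  <item><title>Loose item</title></item>
  <folder><title>Junk</title></folder>
  <folder><title>Keeper</title><item><title>Nested</title></item></folder>
</root>"""


def prune(root, keep_title="Keeper"):
    """Remove top-level items and every folder not titled keep_title."""
    # Iterate over a copy: removing children from the live element
    # while looping over it would skip siblings.
    for child in list(root):
        if child.tag == "item":
            root.remove(child)
        elif child.tag == "folder" and child.findtext("title") != keep_title:
            # findtext() returns None for a folder with no title,
            # so untitled folders are removed here as well.
            root.remove(child)
    return root


root = prune(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding="unicode"))
```

Only direct children of the root are examined, so the item nested inside the "Keeper" folder is left alone, just as /root/item leaves it alone in the XPath version.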

About this Entry

This page contains a single entry by Milligan published on December 3, 2007 2:32 PM.
