Gentlemen, indent your XML!
When pushing a lot of XML around, something like this may come in handy. This is my ~/bin/xmlindent.py:
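#!/usr/bin/env python
import sys

from lxml import etree


def indent(elem, level=0):
    # Newline plus the indentation for this nesting level.
    i = "\n" + level*" "
    if len(elem):
        # Element has children: start the first child on its own, deeper line.
        if not elem.text or not elem.text.strip():
            elem.text = i + " "
        for e in elem:
            indent(e, level+1)
            if not e.tail or not e.tail.strip():
                e.tail = i + " "
        # The last child's tail lines up the parent's closing tag.
        if not e.tail or not e.tail.strip():
            e.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i


if len(sys.argv) > 1:
    src = sys.argv[1]
else:
    # Prefer the binary stream where one exists (Python 3).
    src = getattr(sys.stdin, "buffer", sys.stdin)
tree = etree.parse(src)
indent(tree.getroot())
# lxml writes bytes here, so hand it a binary stream as well.
tree.write(getattr(sys.stdout, "buffer", sys.stdout), encoding="utf-8")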
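To see what the function does to the text and tail whitespace, here is a minimal sketch that runs it on a tiny in-memory tree. It uses the standard library's xml.etree.ElementTree rather than lxml so it runs without extra installs (that swap is an assumption: the function only touches text and tail, which both libraries expose the same way), and the sample document is made up.

```python
import xml.etree.ElementTree as ET


def indent(elem, level=0):
    # Same logic as in the script: rewrite whitespace-only text/tail values.
    i = "\n" + level * " "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + " "
        for e in elem:
            indent(e, level + 1)
            if not e.tail or not e.tail.strip():
                e.tail = i + " "
        if not e.tail or not e.tail.strip():
            e.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i


# Hypothetical sample input, all on one line.
root = ET.fromstring("<root><a><b/></a><c/></root>")
indent(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

Each nesting level ends up one step deeper than its parent, and every closing tag lines up with its opening tag. Elements that already contain real text are left alone, since only whitespace-only text and tail values get replaced.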
The indent function is a variant of the one in Fredrik Lundh's effbotlib, and I'm using lxml instead of cElementTree because it gives cleaner, more human-friendly output when there are namespaces involved.
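A rough illustration of the namespace point, using the standard library's xml.etree.ElementTree as a stand-in for cElementTree (an assumption: the two serialize namespaces the same way). The urn:example:feed namespace and the element names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a default namespace.
doc = ET.fromstring('<feed xmlns="urn:example:feed"><title>hi</title></feed>')

# ElementTree stores tags as "{uri}name" and makes up prefixes when writing.
out = ET.tostring(doc, encoding="unicode")
print(out)
```

The default namespace comes back as a generated ns0: prefix on every element, which is valid XML but noisy to read. lxml keeps the prefixes (or the lack of them) from the input, so the output stays close to what went in.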
Oh, and there was a bug, I guess, in lxml 1.0 that made it barf on parse(sys.stdin). Upgrading to 1.1.2 fixed that, though.
(As a bonus, that made me get easy_install working properly; one of those "nah, some other time" procrastination tarpits. It's a nice tool. easy_install, I mean. Not the tarpit.)
Enjoy.