Stream Processing XML in IronPython
Harry Pierson likes the xml.dom.pulldom API from the Python standard library, but it doesn't work with IronPython because it requires the pyexpat C extension module. (There is, however, an IronPython-compatible version of pyexpat in FePy.)
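For readers who haven't used it, here is a minimal sketch of the pulldom style on CPython. The sample document, the tag names and the book_titles helper are made up for illustration; only the xml.dom.pulldom calls come from the standard library.

```python
from xml.dom import pulldom

# Hypothetical sample document, used only for illustration.
SAMPLE = (
    "<catalog>"
    "<book id='1'><title>First</title></book>"
    "<book id='2'><title>Second</title></book>"
    "</catalog>"
)

def book_titles(xml_text):
    """Return (id, title) pairs, pulling one <book> subtree at a time."""
    titles = []
    events = pulldom.parseString(xml_text)
    for event, node in events:
        # Only react to the start of each <book>; skip everything else.
        if event == pulldom.START_ELEMENT and node.tagName == "book":
            # Build a small DOM for just this element and its children.
            events.expandNode(node)
            title = node.getElementsByTagName("title")[0]
            titles.append((node.getAttribute("id"), title.firstChild.data))
    return titles

print(book_titles(SAMPLE))
```

The appeal is that you stream through the document as events, then switch to DOM-style access only for the subtrees you care about.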
To get a similar API, Harry has written a module called 'ipypulldom' that wraps .NET functionality.
Outstanding, but I have a little bone to pick with the XML methodology used.
My understanding is that IronPython has constantly improving support for ElementTree (and lxml). That is what should be used.
In the Python world, if you are processing XML, don't think DOM or SAX. Think ElementTree.
Use the built-in ElementTree, unless you are doing heavy HTML. lxml.html is what to use for heavy HTML. XPath is very nice, another reason to use lxml.
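As a rough illustration of the XPath point: the built-in ElementTree understands only a limited subset of XPath in find() and findall(), while lxml offers full XPath 1.0 through its xpath() method. The sketch below sticks to the standard library subset; the sample document and the titles_by_id helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = """<catalog>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
</catalog>"""

def titles_by_id(xml_text, book_id):
    """Find the titles of books whose id attribute matches book_id."""
    root = ET.fromstring(xml_text)
    # Attribute predicates such as [@id='2'] are part of the supported subset.
    path = ".//book[@id='{}']/title".format(book_id)
    return [title.text for title in root.findall(path)]

print(titles_by_id(SAMPLE, "2"))
```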
You can treat these ElementTree API data structures as a very, very lightweight DOM, and practically ignore SAX altogether.
You can get almost the full benefit of SAX, but still with the massive convenience of DOM, by using the iterator protocol hooks in the ElementTree API.
See "getiterator": http://effbot.org/zone/element.htm
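A minimal sketch of what that looks like with the standard library's xml.etree.ElementTree on CPython, assuming a made-up sample document. Note that getiterator() has since been removed from the standard library (Python 3.9) in favour of iter(), and that iterparse() is the hook that parses incrementally. The function names here are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = (
    b"<catalog>"
    b"<book id='1'><title>First</title></book>"
    b"<book id='2'><title>Second</title></book>"
    b"</catalog>"
)

def stream_titles(source):
    """SAX-like streaming: handle each <book> as soon as it is complete."""
    titles = []
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "book":
            titles.append(elem.findtext("title"))
            # Drop the children we no longer need to keep memory use low.
            elem.clear()
    return titles

def all_tags(xml_bytes):
    """DOM-like traversal: walk the whole tree in document order."""
    root = ET.fromstring(xml_bytes)
    return [elem.tag for elem in root.iter()]

print(stream_titles(io.BytesIO(SAMPLE)))
print(all_tags(SAMPLE))
```

The first function is the closest match to SAX or pulldom: elements arrive as the parser finishes them, and you decide what to keep.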