untangle
Partially preserving XML?
Hi Chris,
Awesome work! I just have a quick question. I'm currently looking at using this module to pull in data from multiple XML documents that hold the same data in different formats, and it looks like exactly what I need. There's just one issue. Given the following snippet:
<titles>
<title>Orbital anisotropy and low-energy excitations of the
quasi-one-dimensional conductor
<math display="inline">
<mi>β</mi></math>-Sr
<math display="inline">
<msub>
<mrow/>
<mrow>
<mn>0.17</mn></mrow>
</msub>
</math>V
<math display="inline">
<msub>
<mrow/>
<mn>2</mn></msub>
</math>O<title></title>
<math display="inline">
<msub>
<mrow/>
<mn>5</mn></msub>
</math>
</title>
</titles>
I want to get everything within the title tags, including the various math tags and their children. This doesn't seem to be possible currently: I can either access title.cdata and get the stripped-down string, or loop through the title.math list. Is this something you can see being supported in the future?
Thanks, Sam
Hi Sam,
Thank you for your comments. As for your problem, I'm afraid it is quite specific, and I currently can't think of a way to implement it without drastically changing the code and the API. I'd like to keep this ticket open and think about it some more, though. If you have any more thoughts on this, feel free to add them here.
Cheers, Chris
I have the same problem. Can't we simply add a tostring method to the Element class that iterates over the attributes and children of the element and returns the XML string corresponding to the element?
(It may not print the attributes of the element in the same order as the original XML, though.)
It would work just like https://docs.python.org/2/library/xml.etree.elementtree.html#xml.etree.ElementTree.tostring
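For reference, here is a minimal sketch of how the standard-library function linked above handles this kind of mixed content. It uses xml.etree.ElementTree directly rather than untangle, on a shortened version of Sam's snippet. The helper name inner_xml is made up for this illustration and is not part of either library.

```python
import xml.etree.ElementTree as ET

# Shortened version of the snippet from the original question.
SNIPPET = """<titles>
<title>Orbital anisotropy of
<math display="inline"><mi>β</mi></math>-Sr
<math display="inline"><msub><mrow/><mn>2</mn></msub></math>O
</title>
</titles>"""


def inner_xml(element):
    """Return the text and child markup inside `element`, without its own tags.

    Hypothetical helper for illustration only.
    """
    # Text that appears before the first child element.
    parts = [element.text or ""]
    for child in element:
        # tostring() serializes the child subtree and, by default,
        # the tail text that follows the child's closing tag.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


root = ET.fromstring(SNIPPET)
title = root.find("title")
print(inner_xml(title))
```

The result keeps the math elements interleaved with the surrounding text, which is what title.cdata alone cannot provide. The serializer may normalize some details (for example, how empty elements are written), so the output should not be expected to match the input character for character.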