pyquery
replace_with doesn't work for top level nodes
For example:
> print pq("<span/><div>foo</div>").filter("div").replace_with("<span>bah</span>")
<div><span/><div>foo</div></div>
The reason seems to be that replace_with expects a parent, which exists, but is not part of the original document.
remove doesn't work either:
> pq("<span/><div>foo</div>",parser='html_fragments').filter("div").remove()
[<div>]
I think there are two things going on.
- remove and replace_with don't change what's in self.elements. So even though they remove the node from some parent node somewhere, the pq object still includes the old set of nodes. That seems wrong.
- When using html_fragments parsing, a new root node is created but not included in the pq object. remove and replace_with alter this parent. If you print out the children of this parent, you get the correct result.
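Both points can be illustrated without pyquery. Below is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in for lxml. The names hidden_root and held_elements are made up for illustration; this is not pyquery's actual code, only an assumption about the shape of the problem.

```python
import xml.etree.ElementTree as ET

# Assumption: a wrapper element plays the role of the hidden root
# that fragment parsing creates around the top-level nodes.
hidden_root = ET.fromstring("<root><span/><div>foo</div></root>")

# Stand-in for the element list a pq object keeps: the top-level
# nodes only, not the wrapper itself.
held_elements = list(hidden_root)

# Removing the <div> detaches it from the wrapper...
div = held_elements[1]
hidden_root.remove(div)

# ...but the separately held list is not updated.
remaining_tags = [child.tag for child in hidden_root]
held_tags = [el.tag for el in held_elements]

print(remaining_tags)  # ['span']
print(held_tags)       # ['span', 'div']
```

The wrapper reflects the removal while the held list still contains the old node, which matches what the pq object shows after remove().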
For example, this will get you the desired result:
> from lxml.etree import tostring
> doc = pq("<span/><div>foo</div>",parser='html_fragments')
> root = doc[0].getparent()
> print doc.filter("div").remove()
<div>foo</div>
> print (root.text or '') + ''.join([tostring(child) for child in root.iterchildren()])
<span/>