nokogiri
XPath queries in fragments do not properly find root nodes
Summary (from conversation below):
- In a fragment, `foo` and `./foo` match nodes at the top of the fragment, but `/foo` does not.
- In a fragment, neither `//foo` nor `.//foo` matches items at the top of the fragment.
- In a fragment, `//foo` does not match any descendant items, but `.//foo` does.
Original Issue
require 'nokogiri'
xml = '<?foo?><root/>'
xpath = './/processing-instruction()'
frag = Nokogiri::XML::DocumentFragment.parse(xml)
# The PI is part of the fragment
p frag.children, frag.to_s
#=> [#<Nokogiri::XML::ProcessingInstruction:0x17550b4 name="foo">,
#=> #<Nokogiri::XML::Element:0x175509c name="root">]
#=> "<?foo?><root/>"
# ...but cannot be found via XPath
p frag.xpath(xpath)
#=> []
# ...but can be found in a document
p Nokogiri.XML(xml).xpath(xpath)
#=> [#<Nokogiri::XML::ProcessingInstruction:0x190e7ec name="foo">]
The use case here is to remove all PIs from a fragment via:
frag.xpath('.//processing-instruction()').remove
The simple workaround (for PIs only at the root) is to find these nodes using Ruby:
frag.children.select{ |n| n.node_type==Nokogiri::XML::Node::PI_NODE }.each(&:remove)
Greetings!
Thanks for asking this question. It's not clear to me whether a fragment should have processing instructions, but in any case, this appears to be related to #370.
That is, if you search using "./processing-instruction()" you'll get a match. Hope that workaround helps you until we fix the real underlying issue with fragment searches.
Ah, thanks, that helps. It does appear to be the same core issue as #370. I've changed the title to match. In summary:
- In a fragment, `foo` and `./foo` match nodes at the top of the fragment, but `/foo` does not.
- In a fragment, neither `//foo` nor `.//foo` matches items at the top of the fragment.
- In a fragment, `//foo` does not match any descendant items, but `.//foo` does.
The proper workaround for my case is:
frag.xpath('processing-instruction()|.//processing-instruction()')
If you feel that this is exactly the same as #370, close away!
I've folded issues #454, #370, and #213 into this one -- they're all reporting the same underlying issue.
Note that I've committed tests that reproduce the issue to branch issue-572-tests-for-xpath-bug-on-fragment-roots. There are some deep issues here that we should re-examine.
Also #1438.