XPath queries in fragments do not properly find root nodes

Open Phrogz opened this issue 13 years ago • 4 comments

Summary (from conversation below):

  • In a fragment, foo and ./foo match nodes at the top of the fragment, but /foo does not.
  • In a fragment, neither //foo nor .//foo match items at the top of the fragment.
  • In a fragment, //foo does not match any descendant items, but .//foo does.
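The three observations above can be checked with a small probe script. This is only a sketch and not part of the original report: `probe`, `EXPRESSIONS`, and the sample markup are hypothetical names chosen for illustration. The counts it prints depend on the installed Nokogiri and libxml2 versions, so no expected output is shown.

```ruby
# Hypothetical helper: maps each XPath expression to the number of nodes
# it matches when evaluated against the given node or fragment.
def probe(node, expressions)
  expressions.to_h { |expr| [expr, node.xpath(expr).length] }
end

# The five expression forms discussed in the summary.
EXPRESSIONS = ['foo', './foo', '/foo', '//foo', './/foo'].freeze

begin
  require 'nokogiri'
  # One <foo> at the top of the fragment, with one nested <foo> descendant.
  frag = Nokogiri::XML::DocumentFragment.parse('<foo><foo/></foo>')
  probe(frag, EXPRESSIONS).each do |expr, count|
    puts "#{expr.ljust(6)} matched #{count} node(s)"
  end
rescue LoadError
  warn 'nokogiri is not installed; skipping the probe'
end
```

Running the same probe against `Nokogiri.XML('<foo><foo/></foo>')` gives a side-by-side comparison with document behavior.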

Original Issue

require 'nokogiri'
xml   = '<?foo?><root/>'
xpath = './/processing-instruction()'
frag  = Nokogiri::XML::DocumentFragment.parse(xml)

# The PI is part of the fragment
p frag.children, frag.to_s
#=> [#<Nokogiri::XML::ProcessingInstruction:0x17550b4 name="foo">,
#=>  #<Nokogiri::XML::Element:0x175509c name="root">]
#=> "<?foo?><root/>"

# ...but cannot be found via XPath
p frag.xpath(xpath)
#=> []

# ...but can be found in a document
p Nokogiri.XML(xml).xpath(xpath)
#=> [#<Nokogiri::XML::ProcessingInstruction:0x190e7ec name="foo">]

The use case here is to remove all PIs from a fragment via:

frag.xpath('.//processing-instruction()').remove

The simple workaround (for PIs only at the root) is to find these nodes using Ruby:

frag.children.select{ |n| n.node_type==Nokogiri::XML::Node::PI_NODE }.each(&:remove)
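If PIs may also appear below the root, the same pure-Ruby idea can be extended with a recursive walk. A minimal sketch, assuming `Nokogiri::XML::Node#processing_instruction?` is available in your Nokogiri version; `collect_pis` is a hypothetical helper, not part of Nokogiri:

```ruby
# Hypothetical helper: walks the tree under +node+ and returns every
# processing instruction found, at any depth, in document order.
def collect_pis(node, found = [])
  node.children.each do |child|
    found << child if child.processing_instruction?
    collect_pis(child, found)
  end
  found
end

begin
  require 'nokogiri'
  frag = Nokogiri::XML::DocumentFragment.parse('<?foo?><root><?bar?></root>')
  # Collect first, then remove, so the tree is not modified while it is walked.
  collect_pis(frag).each(&:remove)
  puts frag.to_s
rescue LoadError
  warn 'nokogiri is not installed; skipping the demo'
end
```

This avoids XPath entirely, so it does not depend on how fragment searches are resolved.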

Phrogz • Nov 21 '11 22:11

Greetings!

Thanks for asking this question. It's not clear to me whether a fragment should have processing instructions, but in any case, this appears to be related to #370.

That is, if you search using "./processing-instruction()" you'll get a match. Hope that workaround helps you until we fix the real underlying issue with fragment searches.

flavorjones • Nov 22 '11 16:11

Ah, thanks, that helps. It does appear to be the same core issue as #370. I've renamed the title to match. In summary:

  • In a fragment, foo and ./foo match nodes at the top of the fragment, but /foo does not.
  • In a fragment, neither //foo nor .//foo match items at the top of the fragment.
  • In a fragment, //foo does not match any descendant items, but .//foo does.

The proper workaround for my case is:

frag.xpath('processing-instruction()|.//processing-instruction()')
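The union expression above generalizes to other node tests. A minimal sketch, assuming the node test contains no top-level `|` of its own; `fragment_descendant_xpath` is a hypothetical helper name, not a Nokogiri API:

```ruby
# Hypothetical helper: builds the "top level OR any descendant" union
# expression used in the workaround above, for an arbitrary node test.
def fragment_descendant_xpath(node_test)
  "#{node_test}|.//#{node_test}"
end

expr = fragment_descendant_xpath('processing-instruction()')

begin
  require 'nokogiri'
  frag = Nokogiri::XML::DocumentFragment.parse('<?foo?><root><?bar?></root>')
  # Remove every node matched by the union expression.
  frag.xpath(expr).remove
  puts frag.to_s
rescue LoadError
  warn 'nokogiri is not installed; skipping the demo'
end
```

The first half of the union covers nodes at the top of the fragment, and the second half covers descendants, matching the two cases listed in the summary.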

If you feel that this is exactly the same as #370, close away!

Phrogz • Nov 22 '11 17:11

I've folded issues #454, #370, and #213 into this one -- they're all reporting the same underlying issue.

Note that I've committed tests that reproduce the issue to branch issue-572-tests-for-xpath-bug-on-fragment-roots. There are some deep issues here that we should re-examine.

flavorjones • Jan 02 '15 14:01

Also #1438.

flavorjones • Feb 10 '17 09:02