xml2

xml_find doesn't see added attributes with namespaces

Open ostroganov opened this issue 3 years ago • 1 comments

I'm trying to modify an XML document by adding child nodes with attributes, and then searching for the added nodes by the attributes I just set:

library(xml2)
doc <- read_xml('<?xml version="1.0" encoding="UTF-8"?><bookstore xmlns:xxx="https://www.w3.org/"><book name="Harry Potter 1"></book></bookstore>')

node <- xml_find_first(doc, "book")
xml_add_child(node, .value="link", "xxx:resource"="http://harry_potter.com")

# Now try to find the node just added:
xpath <- "//book/link[@xxx:resource=\"http://harry_potter.com\"]"

# results 1:
xml_find_all(doc, xpath)
# nothing found: {xml_nodeset (0)}
# that's wrong

# saving to temporary file and loading from it:
write_xml(doc, "temp.xml")
doc <- read_xml('temp.xml')

# using same query again, results 2:
xml_find_all(doc, xpath)
# the results are correct and link is found

The bug is related to namespaces. If the added attribute and the query don't use the xxx: namespace prefix, the output is correct, and results 1 contains the node that was added.
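
For comparison, here is a minimal sketch of the control case described above: the same steps, but with a plain `resource` attribute instead of `xxx:resource`. The variable name `plain_hits` is only illustrative. On the in-memory document, with no write/read round trip, the query is expected to find the new node.

```r
library(xml2)

doc <- read_xml('<bookstore xmlns:xxx="https://www.w3.org/"><book name="Harry Potter 1"></book></bookstore>')
node <- xml_find_first(doc, "book")

# Same call as in the report, but the attribute name has no "xxx:" prefix.
xml_add_child(node, .value = "link", resource = "http://harry_potter.com")

# Query the in-memory document directly, again without the prefix.
plain_hits <- xml_find_all(doc, "//book/link[@resource=\"http://harry_potter.com\"]")
length(plain_hits)
```

If this finds the node while the prefixed version does not, the difference is isolated to how the prefixed attribute name is handled when the child is created.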

ostroganov, May 23 '21 01:05

Looks like this works if you manually specify the namespace when setting the attribute; I suspect xml_add_child isn't correctly passing the ns info around.

library(xml2)
doc <- read_xml('<?xml version="1.0" encoding="UTF-8"?><bookstore xmlns:xxx="https://www.w3.org/"><book name="Harry Potter 1"></book></bookstore>')
xml_ns(doc)
#> xxx <-> https://www.w3.org/

node <- xml_find_first(doc, "book")
link <- xml_add_child(node, .value = "link")
xml_set_attr(link, "xxx:resource", "http://harry_potter.com", ns = xml_ns(doc))

xml_find_all(doc, "//link[@xxx:resource=\"http://harry_potter.com\"]")
#> {xml_nodeset (1)}
#> [1] <link xxx:resource="http://harry_potter.com"/>

Created on 2022-02-28 by the reprex package (v2.0.1)

hadley, Feb 28 '22 20:02
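
If several prefixed attributes need to be added, the workaround above could be wrapped in a small helper. This is only a sketch: `add_child_with_ns_attrs` is a hypothetical name, not part of xml2, and it assumes the prefixes used in `attrs` are already declared in the namespace map passed as `ns`.

```r
library(xml2)

# Hypothetical helper (not an xml2 function): create the child first, then set
# each attribute with an explicit namespace map, as in the workaround above.
add_child_with_ns_attrs <- function(parent, name, attrs, ns) {
  child <- xml_add_child(parent, .value = name)
  for (attr_name in names(attrs)) {
    xml_set_attr(child, attr_name, attrs[[attr_name]], ns = ns)
  }
  child
}

doc <- read_xml('<bookstore xmlns:xxx="https://www.w3.org/"><book name="Harry Potter 1"></book></bookstore>')
node <- xml_find_first(doc, "book")

link <- add_child_with_ns_attrs(
  node, "link",
  attrs = c("xxx:resource" = "http://harry_potter.com"),
  ns = xml_ns(doc)
)

found <- xml_find_all(doc, "//link[@xxx:resource=\"http://harry_potter.com\"]")
length(found)
```

Setting the attributes in a second step keeps the namespace lookup explicit, which is the part `xml_add_child` appears to skip.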