xml2
xml_find doesn't see added attributes with namespaces
I'm trying to modify an XML document by adding child nodes with attributes, and then querying for the nodes I just added by those attributes:
library(xml2)
doc <- read_xml('<?xml version="1.0" encoding="UTF-8"?><bookstore xmlns:xxx="https://www.w3.org/"><book name="Harry Potter 1"></book></bookstore>')
node <- xml_find_first(doc, "book")
xml_add_child(node, .value="link", "xxx:resource"="http://harry_potter.com")
# Now, will try to find the node just added:
xpath <- "//book/link[@xxx:resource=\"http://harry_potter.com\"]"
# results 1:
xml_find_all(doc, xpath)
# nothing found: {xml_nodeset (0)}
# that's wrong
# saving to temporary file and loading from it:
write_xml(doc, "temp.xml")
doc <- read_xml('temp.xml')
# using same query again, results 2:
xml_find_all(doc, xpath)
# the results are correct and link is found
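The same round trip can probably be done in memory instead of through a temporary file. The following is only a sketch: it assumes that serializing with as.character() and re-parsing with read_xml() behaves like write_xml() followed by read_xml(), and the names doc2, node2, reparsed and hits are made up for illustration.

```r
library(xml2)

# Same document and namespaced attribute as in the report above.
doc2 <- read_xml('<?xml version="1.0" encoding="UTF-8"?><bookstore xmlns:xxx="https://www.w3.org/"><book name="Harry Potter 1"></book></bookstore>')
node2 <- xml_find_first(doc2, "book")
xml_add_child(node2, .value = "link", "xxx:resource" = "http://harry_potter.com")

# Serialize to a string and parse it again, so the parser rebuilds the
# namespace bindings (assumed equivalent to the temp-file round trip).
reparsed <- read_xml(as.character(doc2))

hits <- xml_find_all(reparsed, "//book/link[@xxx:resource=\"http://harry_potter.com\"]")
length(hits)
```

This avoids leaving temp.xml on disk, but it is still a workaround rather than a fix.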
The bug is related to namespaces. If the added attribute and the query don't use the xxx: namespace prefix, the output is correct, and results 1 contains the node that was added.
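For comparison, here is a minimal sketch of the prefix-free case described above. The variable names (plain_doc, plain_node, plain_hits) are illustrative only, and the document is the same bookstore example with the namespace declaration and the xxx: prefix removed.

```r
library(xml2)

# Same structure as the original example, without any namespace.
plain_doc <- read_xml('<bookstore><book name="Harry Potter 1"></book></bookstore>')
plain_node <- xml_find_first(plain_doc, "book")

# Add the child with an unprefixed attribute.
xml_add_child(plain_node, .value = "link", resource = "http://harry_potter.com")

# Query immediately, with no save/reload in between.
plain_hits <- xml_find_all(plain_doc, "//book/link[@resource=\"http://harry_potter.com\"]")
length(plain_hits)
```

If this finds the node while the prefixed version does not, the difference comes down to how the namespaced attribute is attached to the new node.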
It looks like this works if you manually specify the namespace when setting the attribute; I suspect xml_add_child isn't correctly passing the namespace information along.
library(xml2)
doc <- read_xml('<?xml version="1.0" encoding="UTF-8"?><bookstore xmlns:xxx="https://www.w3.org/"><book name="Harry Potter 1"></book></bookstore>')
xml_ns(doc)
#> xxx <-> https://www.w3.org/
node <- xml_find_first(doc, "book")
link <- xml_add_child(node, .value = "link")
xml_set_attr(link, "xxx:resource", "http://harry_potter.com", ns = xml_ns(doc))
xml_find_all(doc, "//link[@xxx:resource=\"http://harry_potter.com\"]")
#> {xml_nodeset (1)}
#> [1] <link xxx:resource="http://harry_potter.com"/>
Created on 2022-02-28 by the reprex package (v2.0.1)