
xml_validate() fails for constructed XML

Open • bryanburke opened this issue 6 years ago • 1 comment

Issue Description and Expected Result

xml_validate() returns FALSE when given a newly constructed xml_document as input. When I convert that xml_document to a string and back using as.character() and read_xml(), xml_validate() returns TRUE as expected.

Reproducible Example

Example:

library(xml2)

xsd <- paste(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"',
  '           targetNamespace="http://www.example.com/Test"',
  '           xmlns:tns="http://www.example.com/Test"',
  '           attributeFormDefault="unqualified"',
  '           elementFormDefault="qualified">',
  '  <xs:element name="Root">',
  '    <xs:complexType>',
  '      <xs:sequence>',
  '        <xs:element name="Child" minOccurs="1" maxOccurs="unbounded">',
  '          <xs:complexType>',
  '            <xs:simpleContent>',
  '              <xs:extension base="xs:string">',
  '                <xs:attribute type="xs:string" name="name" use="required"/>',
  '              </xs:extension>',
  '            </xs:simpleContent>',
  '          </xs:complexType>',
  '        </xs:element>',
  '      </xs:sequence>',
  '    </xs:complexType>',
  '  </xs:element>',
  '</xs:schema>',
  sep = "\n"
)

xml <- xml_new_root(
  .value = "Root",
  xmlns = "http://www.example.com/Test"
)
xml_add_child(
  .x = xml,
  .value = "Child",
  name = "Alice",
  "11/14/1974"
)
xml_add_child(
  .x = xml,
  .value = "Child",
  name = "Bob",
  "06/28/1975"
)
xml_add_child(
  .x = xml,
  .value = "Child",
  name = "Charlie",
  "12/03/1978"
)

xml2 <- read_xml(as.character(xml))

cat(as.character(xml))
# <?xml version="1.0" encoding="UTF-8"?>
# <Root xmlns="http://www.example.com/Test">
#   <Child name="Alice">11/14/1974</Child>
#   <Child name="Bob">06/28/1975</Child>
#   <Child name="Charlie">12/03/1978</Child>
# </Root>
cat(as.character(xml2))
# <?xml version="1.0" encoding="UTF-8"?>
# <Root xmlns="http://www.example.com/Test">
#   <Child name="Alice">11/14/1974</Child>
#   <Child name="Bob">06/28/1975</Child>
#   <Child name="Charlie">12/03/1978</Child>
# </Root>
as.character(xml) == as.character(xml2)
# TRUE

xml_validate(x = xml, schema = read_xml(xsd))
# [1] FALSE
# attr(,"errors")
# [1] "Element 'Child': This element is not expected. Expected is ( {http://www.example.com/Test}Child )."
xml_validate(x = xml2, schema = read_xml(xsd))
# [1] TRUE
# attr(,"errors")
# character(0)

Session Info

devtools::session_info()
Session info -----------------------------------------------------------------------
 setting  value                       
 version  R version 3.4.1 (2017-06-30)
 system   x86_64, mingw32             
 ui       RStudio (1.0.153)           
 language (EN)                        
 collate  English_United States.1252  
 tz       America/Chicago             
 date     2017-09-22                  

Packages ---------------------------------------------------------------------------
 package   * version date       source        
 base      * 3.4.1   2017-06-30 local         
 compiler    3.4.1   2017-06-30 local         
 datasets  * 3.4.1   2017-06-30 local         
 devtools    1.13.3  2017-08-02 CRAN (R 3.4.1)
 digest      0.6.12  2017-01-27 CRAN (R 3.4.1)
 graphics  * 3.4.1   2017-06-30 local         
 grDevices * 3.4.1   2017-06-30 local         
 memoise     1.1.0   2017-04-21 CRAN (R 3.4.1)
 methods   * 3.4.1   2017-06-30 local         
 Rcpp        0.12.12 2017-07-15 CRAN (R 3.4.1)
 stats     * 3.4.1   2017-06-30 local         
 tools       3.4.1   2017-06-30 local         
 utils     * 3.4.1   2017-06-30 local         
 withr       2.0.0   2017-07-28 CRAN (R 3.4.1)
 xml2      * 1.1.1   2017-01-24 CRAN (R 3.4.1)

bryanburke • Sep 22 '17 18:09

I can confirm the issue, but I do not have a good solution. The problem seems to be that the added nodes do not have the default namespace attached. I tried using xmlReconciliateNs() to fix this, but without success.
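
One way to look at the namespace difference described above is to compare namespace-qualified child names in the constructed document and in a re-parsed copy. This is only a rough diagnostic sketch: the variable names are illustrative, the document is a cut-down version of the one in the report, and the printed names may well differ between xml2 versions, so no output is shown.

```r
library(xml2)

# Build a small document the same way as in the report.
constructed <- xml_new_root("Root", xmlns = "http://www.example.com/Test")
xml_add_child(constructed, "Child", name = "Alice", "11/14/1974")

# Round-trip through text so libxml2 parses the namespace declarations itself.
reparsed <- read_xml(as.character(constructed))

# Passing ns = xml_ns(...) asks xml_name() to qualify names with the prefix
# of the namespace each node belongs to. A child without a namespace would
# show up as a bare "Child".
constructed_names <- xml_name(xml_children(constructed), ns = xml_ns(constructed))
reparsed_names <- xml_name(xml_children(reparsed), ns = xml_ns(reparsed))

print(constructed_names)
print(reparsed_names)
```

If the two printed names differ, that would be consistent with the added nodes missing the default namespace.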

I would suggest using your workaround for this case:

xml_validate(read_xml(as.character(xml)), read_xml(xsd))
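
If the workaround is needed in several places, it could be wrapped in a small helper. The sketch below assumes a hypothetical function name, validate_roundtrip(), which is not part of xml2, and uses a simplified schema rather than the one from the report.

```r
library(xml2)

# Hypothetical helper (not an xml2 function): serialize the document to text
# and parse it again before validating, so the validator sees namespaces as
# libxml2 parsed them rather than as they were attached during construction.
validate_roundtrip <- function(doc, schema) {
  reparsed <- read_xml(as.character(doc))
  xml_validate(reparsed, schema)
}

# Simplified schema for illustration only.
schema <- read_xml(paste(
  '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"',
  '           targetNamespace="http://www.example.com/Test"',
  '           elementFormDefault="qualified">',
  '  <xs:element name="Root">',
  '    <xs:complexType>',
  '      <xs:sequence>',
  '        <xs:element name="Child" type="xs:string" maxOccurs="unbounded"/>',
  '      </xs:sequence>',
  '    </xs:complexType>',
  '  </xs:element>',
  '</xs:schema>',
  sep = "\n"
))

doc <- xml_new_root("Root", xmlns = "http://www.example.com/Test")
xml_add_child(doc, "Child", "11/14/1974")

result <- validate_roundtrip(doc, schema)
print(result)
```

The extra serialize and parse step costs time on large documents, so this is a stopgap rather than a fix for the underlying namespace handling.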

jimhester • Jan 04 '18 16:01