
Indent and CDATA blocks

Open etdsoft opened this issue 3 years ago • 0 comments

If you use:

xml_builder = Builder::XmlMarkup.new(indent: 2)
xml_builder.card do |card_builder|
  #...
  card_builder.description do
    card_builder.cdata!(description)
  end
end

and add CDATA blocks to the tree with cdata!, you end up with indented CDATA blocks:

    <card>
      ...
      <description>
        <![CDATA[Welcome to....
]]>
      </description>
    </card>
    </

The problem is that, upon parsing, the indentation (newline and leading whitespace) becomes part of the text() of the <description> node:

irb> xml.at_xpath('card/description').text
=> "\n        Welcome to

This is because XmlMarkup#cdata! calls #_special, which in turn calls #_indent.

I suspect indentation should be skipped for CDATA sections. See, for example: https://bugs.openjdk.java.net/browse/JDK-8223291 and https://github.com/nashwaan/xml-js/issues/14
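To make the mechanism concrete, here is a minimal sketch of a toy indenting builder. It is not Builder's actual source: the `ToyMarkup` class and its `cdata_tag!` method are hypothetical names made up for illustration. `cdata!` goes through an indenting `_special` helper, as described above, while `cdata_tag!` shows one way a CDATA-only element could be kept on a single line.

```ruby
# Toy stand-in for an indenting XML builder. NOT Builder's real code:
# ToyMarkup and cdata_tag! are hypothetical, for illustration only.
class ToyMarkup
  attr_reader :target

  def initialize(indent: 0)
    @indent = indent
    @level = 0
    @target = +""
  end

  # Emits <name>, the nested content one level deeper, then </name>.
  def tag(name)
    _indent
    @target << "<#{name}>"
    _newline
    @level += 1
    yield self
    @level -= 1
    _indent
    @target << "</#{name}>"
    _newline
  end

  # Indented CDATA: the section is routed through _special,
  # so it gets leading whitespace and a trailing newline.
  def cdata!(text)
    _special("<![CDATA[", "]]>", _escape_cdata(text))
  end

  # Hypothetical alternative: a CDATA-only element on one line, so no
  # whitespace ends up between the tags and the CDATA section.
  def cdata_tag!(name, text)
    _indent
    @target << "<#{name}><![CDATA[#{_escape_cdata(text)}]]></#{name}>"
    _newline
  end

  private

  def _special(open, close, data)
    _indent
    @target << open << data << close
    _newline
  end

  # Splits any "]]>" in the payload across two CDATA sections.
  def _escape_cdata(text)
    text.gsub("]]>", "]]]]><![CDATA[>")
  end

  def _indent
    return if @indent.zero? || @level.zero?
    @target << " " * (@indent * @level)
  end

  def _newline
    return if @indent.zero?
    @target << "\n"
  end
end

indented = ToyMarkup.new(indent: 2)
indented.tag("card") do |card|
  card.tag("description") { |d| d.cdata!("Welcome") }
end
puts indented.target

inline = ToyMarkup.new(indent: 2)
inline.tag("card") { |card| card.cdata_tag!("description", "Welcome") }
puts inline.target
```

In the first output the CDATA section sits on its own indented line, so a parser would see the surrounding newline and spaces as text of `<description>`. In the second the element contains only the CDATA section.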

In fact, Nokogiri does skip indentation for CDATA-only nodes:

irb> doc = Nokogiri::XML("<foo><bar><![CDATA[lorem ipsum]]></bar></foo>")
=> #<Nokogiri::XML::Document:0xe09c name="document" children=[#<Nokogiri::XML::Element:0xe088 name="foo" children=[#<Nokogiri::XML::Element:0xe074 name="bar" children=[#<Noko...
irb> puts doc
<?xml version="1.0"?>
<foo>
  <bar><![CDATA[lorem ipsum]]></bar>
</foo>

Is this a bug or a feature? 🤔

etdsoft, Jun 10 '21 14:06