scala-xml

CDATA in MarkupParser

Open mbeckerle opened this issue 3 years ago • 3 comments

I've had to override this method of MarkupParser due to what I think is a bug:

def xCharData: NodeSeq = {
  xToken("[CDATA[")
  def mkResult(pos: Int, s: String): NodeSeq = {
    handle.text(pos, s) // NOTE: handles the text...
    PCData(s)           // ...but ignores the result of handling and creates a PCData from the original string!
  }
  xTakeUntil(mkResult, () => pos, "]]>")
}

I put comments in there to illustrate what I think is the issue.

I think this is the fix:

  override def xCharData: NodeSeq = {
    xToken("[CDATA[")
    def mkResult(pos: Int, s: String): NodeSeq = {
      cdata(pos, s)
    }
    xTakeUntil(mkResult, () => pos, "]]>")
  }

  // The method below would get a declaration on MarkupHandler and is overridden here
  // with a default implementation that creates a PCData node.

  override def cdata(pos: Int, s: String): NodeSeq = {
    PCData(s) // by default, just create PCData with the string.
  }
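
To make the difference between the two versions concrete, here is a minimal, dependency-free sketch. It assumes simplified stand-ins: `Node`, `Text`, `PCData`, `Handler`, `current`, and `proposed` below are hypothetical models written for illustration, not the real scala-xml classes or signatures.

```scala
object CDataSketch {
  // Simplified stand-ins for illustration only; NOT the scala-xml types.
  sealed trait Node
  final case class Text(value: String) extends Node
  final case class PCData(value: String) extends Node

  trait Handler {
    // Hook for ordinary character data.
    def text(pos: Int, s: String): Node = Text(s)
    // Hypothetical hook for CDATA sections, as proposed above.
    def cdata(pos: Int, s: String): Node = PCData(s)
  }

  // Models the current behaviour: the handler is called, but its result is discarded.
  def current(h: Handler, pos: Int, s: String): Node = {
    h.text(pos, s)
    PCData(s)
  }

  // Models the proposed behaviour: whatever the handler returns is the result.
  def proposed(h: Handler, pos: Int, s: String): Node = h.cdata(pos, s)

  // A handler that customizes both hooks by upper-casing the content.
  val upper: Handler = new Handler {
    override def text(pos: Int, s: String): Node = Text(s.toUpperCase)
    override def cdata(pos: Int, s: String): Node = PCData(s.toUpperCase)
  }

  def main(args: Array[String]): Unit = {
    println(current(upper, 0, "a < b"))  // prints PCData(a < b): the customization is lost
    println(proposed(upper, 0, "a < b")) // prints PCData(A < B): the customization is kept
  }
}
```

In this model, a subclass can only influence CDATA content when the parser returns the value produced by the hook, which is the point of the proposed change.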

I am not sure whether that method should be named cdata or pcdata.

If this makes sense, I can create a PR for this. But I wanted to run it by you first.

mbeckerle, May 21 '21 15:05

Does @dubinsky's #558 fix this?

SethTisue, Sep 09 '21 13:09

I believe #558 only fixes the case where the XML is read by the Java Xerces parser.

I believe MarkupParser is part of the Scala-based ConstructingParser.

ashawley, Sep 09 '21 14:09

I see no reason to not have both XML facilities in the library support CDATA going forward, so a PR would be welcomed.

ashawley, Sep 09 '21 14:09