gofeed
Fails parsing cdata markup that is followed by escaped markup
When parsing this valid XML:
<rss version="2.0">
<channel>
<item>
<description>
<![CDATA[<a src="http://foo.com?foo&bar=baz"></a>]]>&
</description>
</item>
</channel>
</rss>
gofeed fails with the error message:
unknown predefined entity &bar=baz"></a>]]>&
I can confirm that this is not a problem of encoding/xml, which has no problems decoding this input. See https://play.golang.org/p/wWJicjEa-iv
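For readers who want to reproduce the encoding/xml side locally, here is a minimal sketch. It is not the code behind the playground link. The `rss` struct and `decodeDescription` helper are hypothetical names, and because the escaped markup after the CDATA section is not fully visible in the report above, the trailing `&lt;b&gt;` is an assumption chosen only to illustrate the "CDATA followed by escaped markup" case.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// rss is a hypothetical minimal mapping of the feed, just deep enough
// to reach the item description.
type rss struct {
	Description string `xml:"channel>item>description"`
}

// feed mirrors the report; the trailing &lt;b&gt; is an assumed example
// of escaped markup following the CDATA section.
const feed = `<rss version="2.0">
<channel>
<item>
<description>
<![CDATA[<a src="http://foo.com?foo&bar=baz"></a>]]>&lt;b&gt;
</description>
</item>
</channel>
</rss>`

// decodeDescription unmarshals the document with encoding/xml and
// returns the description text with surrounding whitespace trimmed.
func decodeDescription(doc string) (string, error) {
	var r rss
	if err := xml.Unmarshal([]byte(doc), &r); err != nil {
		return "", err
	}
	return strings.TrimSpace(r.Description), nil
}

func main() {
	desc, err := decodeDescription(feed)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(desc)
}
```

With this input, encoding/xml returns the CDATA content verbatim (including the raw `&bar=baz`) and the unescaped trailing text joined into a single string.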
@dy-dx @lutzhorn thank you for the report and confirmation.
CDATA parsing is currently a hack and needs to be rewritten. We aren't using encoding/xml's Unmarshal because it wasn't flexible enough for gofeed's requirements, so we don't get it for free.
I'll try to take a look at CDATA handling soon. Perhaps we can pull some code from encoding/xml itself.
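As a rough illustration of what token-level handling looks like in encoding/xml (this is not gofeed's parser and not the eventual fix), the sketch below collects an element's text with `Decoder.Token`. The `elementText` helper is a hypothetical name. It relies on the fact that the decoder reports both CDATA sections and unescaped entity text as `xml.CharData`, so the two can simply be concatenated.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// elementText is a hypothetical helper: it returns the concatenated
// character data of the first element called name, including text that
// came from CDATA sections and from escaped entities.
func elementText(doc, name string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var sb strings.Builder
	depth := 0 // greater than zero while inside the target element
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return "", fmt.Errorf("element %q not found", name)
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if depth > 0 || t.Name.Local == name {
				depth++
			}
		case xml.EndElement:
			if depth > 0 {
				depth--
				if depth == 0 {
					return strings.TrimSpace(sb.String()), nil
				}
			}
		case xml.CharData:
			// CharData is only valid until the next Token call,
			// so copy it into the builder immediately.
			if depth > 0 {
				sb.Write(t)
			}
		}
	}
}

func main() {
	doc := `<d><![CDATA[<a href="x?a&b=c"></a>]]>&lt;b&gt;</d>`
	text, err := elementText(doc, "d")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```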
I have opened a PR to address this issue: https://github.com/mmcdole/gofeed/pull/120 PTAL @mmcdole
This issue is fixed on the latest master. Here's the commit: https://github.com/mmcdole/gofeed/commit/22a67f9156f2a9c28d04dc012f5d24e1d7f2c49b