Characters like & are not escaped in XML attribute values

Open cloudrac3r opened this issue 4 years ago • 3 comments

Describe the bug

Often a URL must be put into an XML attribute value. URLs may contain & to separate query parameters. However, XML does not allow a bare & inside attribute values; it must be written as &amp;.
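
As a minimal illustration of the difference, here is the same attribute written both ways. The URL and the `enclosure` element are made-up examples, not output captured from feed.

```typescript
// Hypothetical URL with two query parameters.
const url = "https://example.com/image?id=1&size=large";

// Interpolated as-is, the bare & makes the attribute value not well-formed.
const invalid = `<enclosure url="${url}"/>`;

// Writing & as &amp; gives well-formed XML; a parser decodes it back to &.
const valid = `<enclosure url="${url.replace(/&/g, "&amp;")}"/>`;

console.log(invalid);
console.log(valid);
```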

This causes problems with RSS readers and the official W3C validator, since feed generates invalid XML:

Screenshot. "XML parsing error: not well-formed (invalid token)." An arrow points to the ampersand character.

It seems like this is a problem further up in xml-js: https://github.com/nashwaan/xml-js/issues/69
More info: https://github.com/nashwaan/xml-js/issues/26
A fork of xml-js changes this behaviour: https://github.com/nashwaan/xml-js/compare/master...tolbertam:sanitize-attribute

You can either fork xml-js and import that fork in package.json, or leave it for the user to solve by adding xml-js as a peer dependency, which allows the user to install their preferred xml-js version or fork.

You could also wait for it to be fixed upstream by somebody else, but the issue has sat untouched for 18 months, so it seems unlikely.

Versions (please complete the following information):

  • NodeJS: v12.13.0
  • npm/yarn: 6.12.0
  • feed: 4.1.0

cloudrac3r avatar Jan 14 '20 12:01 cloudrac3r

I ran into this too, and just ran encodeURIComponent on the affected elements.
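
A quick sketch of what this workaround produces, using a hypothetical URL. The result contains no & and so parses as XML, but it is percent-encoded rather than XML-escaped, so whatever consumes the feed would presumably need to decode it before treating it as a link.

```typescript
// Hypothetical URL with two query parameters.
const url = "https://example.com/?a=1&b=2";

// Percent-encoding removes the & but also encodes ':', '/', '?' and '='.
const percentEncoded = encodeURIComponent(url);
console.log(percentEncoded); // https%3A%2F%2Fexample.com%2F%3Fa%3D1%26b%3D2

// Decoding restores the original URL.
const roundTrip = decodeURIComponent(percentEncoded);
console.log(roundTrip === url); // true
```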

endquote avatar Jan 22 '20 20:01 endquote

May as well update this - I have a fork that encodes things the way I need them when generating feeds for Bibliogram. The fork has existed for a while, but I only just remembered to update this issue. https://git.sr.ht/~cadence/nodejs-feed

cloudrac3r avatar Aug 04 '20 12:08 cloudrac3r

For anyone searching for a quick solution, you can escape URLs before passing them to feed:

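// Escape the five characters that have predefined entities in XML
function escapeXmlAttr(unsafe: string): string {
    // Pass empty values through unchanged
    if (!unsafe) {
        return unsafe
    }
    return unsafe.replace(/[<>&'"]/g, function (c) {
        switch (c) {
            case '<':
                return '&lt;'
            case '>':
                return '&gt;'
            case '&':
                return '&amp;'
            case "'":
                return '&apos;'
            case '"':
                return '&quot;'
            default:
                // Unreachable given the regex, but keeps the return type a string
                return c
        }
    })
}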
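
A usage sketch, under some assumptions: the `ItemInput` shape and the `escapeAttrFields` helper are hypothetical, and `escapeAttr` is a compact stand-in for the function above so that the snippet runs on its own. It assumes `image` is the field that ends up in an attribute value. Fields that are emitted as element text may already be escaped by the serializer, in which case escaping them again would double-encode them.

```typescript
// Compact stand-in for escapeXmlAttr above, so this sketch is self-contained.
const XML_ENTITIES: Record<string, string> = {
  "<": "&lt;",
  ">": "&gt;",
  "&": "&amp;",
  "'": "&apos;",
  '"': "&quot;",
};
const escapeAttr = (s: string): string =>
  s.replace(/[<>&'"]/g, (c) => XML_ENTITIES[c] ?? c);

// Hypothetical item shape; not the feed library's own type.
interface ItemInput {
  title: string;
  link: string;
  image: string;
}

// Escape only the field assumed to be written into an attribute value.
function escapeAttrFields(item: ItemInput): ItemInput {
  return { ...item, image: escapeAttr(item.image) };
}

const item = escapeAttrFields({
  title: "Example post",
  link: "https://example.com/post?id=1&ref=feed",
  image: "https://example.com/img?id=1&size=large",
});

console.log(item.image); // https://example.com/img?id=1&amp;size=large
```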

remorses avatar Jul 19 '22 17:07 remorses