Bug: with lowerCaseTags: true, selectors are lowercased before being compared with mixed-case XML tag names

Open corinnaSchultz opened this issue 1 year ago • 0 comments

I'm using cheerio version 1.0.0-rc.12.

If you have an xml source document that contains tags with mixed case, such as:

<parentTag class="myClass">
  <firstTag> <child> blah </child> </firstTag>
  <secondTag> <child> blah </child> </secondTag>
</parentTag>

And you load this document with the following cheerio options:

const options = {
  xml: {
    xmlMode: true,
    decodeEntities: false,
    lowerCaseTags: true,
    lowerCaseAttributeNames: false,
    recognizeSelfClosing: true,
  },
};

And you use a selector like this:

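const cheerio = require('cheerio');

// myDoc is the XML source shown above; options is the options object shown above
const $ = cheerio.load(myDoc, options);
const node = $('.myClass');
node.find('firstTag > child'); // expected to match <child> inside <firstTag>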

This does not return the child element as expected. It used to work in 1.0.0-rc.6.

From tracing through the code, it appears that the selector is first changed to lowercase ("firsttag" in this example), before comparing it with the tag name ("firstTag" in this example). I've pasted the code below.
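
To make the mismatch concrete, here is a minimal standalone sketch of that comparison. It does not call cheerio or css-select: `compileTagMatcher` is a hypothetical stand-in for the compiled tag check, elements are modelled as plain `{ name }` objects, and it assumes the parsed element keeps its original casing ("firstTag"), as described above.

```javascript
// Hypothetical stand-in for the tag check: the selector name is
// lowercased, but the element name is compared as-is.
function compileTagMatcher(selectorName, options) {
  let name = selectorName;
  if (!options.xmlMode || options.lowerCaseTags) {
    name = name.toLowerCase();
  }
  return (elem) => elem.name === name;
}

const elem = { name: 'firstTag' }; // assumed: tag name keeps its casing

const withLowerCaseTags = { xmlMode: true, lowerCaseTags: true };
const withoutLowerCaseTags = { xmlMode: true, lowerCaseTags: false };

// 'firstTag' becomes 'firsttag', which is not equal to 'firstTag'.
const matchesMixedCase = compileTagMatcher('firstTag', withLowerCaseTags)(elem);
// Writing the selector in lowercase does not help either.
const matchesLowerCase = compileTagMatcher('firsttag', withLowerCaseTags)(elem);
// Without lowerCaseTags the selector is left alone and matches exactly.
const matchesWithoutOption = compileTagMatcher('firstTag', withoutLowerCaseTags)(elem);

console.log(matchesMixedCase, matchesLowerCase, matchesWithoutOption);
```

Under these assumptions, no spelling of the selector can match a mixed-case tag while lowerCaseTags is true, because only one side of the comparison is normalized.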

I think I saw a similar bug when using the is() function, so it appears that this is the general behavior for selectors.

My expectation is that the tag name would also be converted to lowercase before this comparison, so that the selector matches whether or not it is written in mixed case.
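
A minimal sketch of that expectation, again standalone rather than actual css-select code: `compileCaseInsensitiveTagMatcher` is a hypothetical name, and elements are modelled as plain `{ name }` objects. The only change from the current behavior is that both sides of the comparison are normalized when lowercasing applies.

```javascript
// Hypothetical matcher: when lowercasing applies, normalize BOTH the
// selector name and the element name before comparing them.
function compileCaseInsensitiveTagMatcher(selectorName, opts) {
  const fold = !opts.xmlMode || opts.lowerCaseTags;
  const normalize = (s) => (fold ? s.toLowerCase() : s);
  const wanted = normalize(selectorName);
  return (el) => normalize(el.name) === wanted;
}

const xmlOpts = { xmlMode: true, lowerCaseTags: true };
const firstTagElem = { name: 'firstTag' };

const mixedCaseResult = compileCaseInsensitiveTagMatcher('firstTag', xmlOpts)(firstTagElem);
const lowerCaseResult = compileCaseInsensitiveTagMatcher('firsttag', xmlOpts)(firstTagElem);
const otherTagResult = compileCaseInsensitiveTagMatcher('secondTag', xmlOpts)(firstTagElem);

console.log(mixedCaseResult, lowerCaseResult, otherTagResult);
```

The trade-off is an extra toLowerCase() call per element tested, which may be why a real implementation would prefer to normalize tag names once at parse time instead.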

Am I misunderstanding how lowerCaseTags is supposed to work?

Code (from css-select, so maybe it's their bug?), in css-select/lib/general.js:

        // Tags
        case css_what_1.SelectorType.Tag: {
            if (selector.namespace != null) {
                throw new Error("Namespaced tag names are not yet supported by css-select");
            }
            var name_1 = selector.name;
            if (!options.xmlMode || options.lowerCaseTags) {
                name_1 = name_1.toLowerCase();
            }
            return function tag(elem) {
                return adapter.getName(elem) === name_1 && next(elem);
            };
        }

corinnaSchultz, Nov 19 '23 02:11