Getting DOMException: Invalid character if an attribute starts with a "."

Open bebraw opened this issue 10 years ago • 6 comments

&lt;bottle.label&gt;<bottle .label=\"\">

yields

node_modules/to-markdown/node_modules/jsdom/lib/jsdom/living/helpers/validate-names.js:10
    throw new core.DOMException(core.DOMException.INVALID_CHARACTER_ERR,
          ^
DOMException: Invalid character: ".label" did not match the Name production: Expected ":", "_", [A-Z], [\u0370-\u037D], [\u037F-\u1FFF], [\u200C-\u200D], [\u2070-\u218F], [\u2C00-\u2FEF], [\u3001-\uD7FF], [\uD800-\uDB7F], [\uF900-\uFDCF], [\uFDF0-\uFFFD], [\xC0-\xD6], [\xD8-\xF6], [\xF8-\u02FF] or [a-z] but "." found.

Background:

I ran into this at blogger2ghost. It looks like a blog post of mine can contain attributes like this, and that makes to-markdown, and consequently my library and service, blow up.

bebraw, May 25 '15 14:05

Thanks for filing this.

I suspect that <bottle .label=\"\"> is not valid HTML, and jsdom is throwing an exception as a result.

Perhaps this is one for https://github.com/tmpvar/jsdom ?

domchristie, May 25 '15 16:05

Yeah, it's not valid as per specification.

If you want, I can push the issue upstream. Maybe there's some nice way around this. At worst we may need to do some preprocessing for attributes.
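To make the preprocessing idea concrete, here is a minimal sketch. It assumes it is acceptable to simply drop attributes whose names do not start with a letter, `_` or `:`. The function name `stripInvalidAttributes` is made up for illustration and is not part of to-markdown or jsdom. The name check is an ASCII-only simplification of the Name production quoted in the error above, it only looks at the first character, and regex-based HTML handling like this is fragile (comments and script contents are not treated specially).

```javascript
// Matches a start tag: "<", a tag name, then everything up to ">".
// Quoted attribute values may contain ">", so they are matched explicitly.
var START_TAG = /<([A-Za-z][^\s\/>]*)((?:"[^"]*"|'[^']*'|[^'">])*)>/g;

// Matches one attribute: a name, optionally followed by "=" and a value.
var ATTRIBUTE = /\s+([^\s"'<>\/=]+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'=<>]+))?/g;

// Assumption: ASCII-only check of the first character of the name.
// The real Name production also allows many non-ASCII ranges.
var VALID_NAME_START = /^[A-Za-z_:]/;

// Hypothetical helper: removes attributes with an invalid first character
// from every start tag and leaves everything else untouched.
function stripInvalidAttributes(html) {
  return html.replace(START_TAG, function (tag, name, rest) {
    var cleaned = rest.replace(ATTRIBUTE, function (attr, attrName) {
      return VALID_NAME_START.test(attrName) ? attr : '';
    });
    return '<' + name + cleaned + '>';
  });
}

stripInvalidAttributes('<bottle .label="">'); // "<bottle>"
```

The cleaned string would then be handed to the converter in place of the original input.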

FYI the problematic bit comes from my templating engine post. It's that pre that contains the problematic bit. I can of course just fix the post... :+1:

Thanks for getting back so fast.

bebraw, May 25 '15 16:05

> At worst we may need to do some preprocessing for attributes.

I’m not sure how easy this will be, given that as soon as the string is passed in to jsdom, it’s parsed (and therefore an error is thrown).

This probably is an issue for jsdom (though I’m not sure what the status is for v3, what with the io.js/Node.js merge?!). However, we could definitely handle this error better, perhaps by catching it and throwing a custom one to clear things up a bit.
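As a rough sketch of the catch-and-rethrow idea: `InvalidMarkupError` and `parseOrExplain` are hypothetical names, not anything to-markdown exposes, and the parser is passed in as a plain function so the example does not depend on jsdom.

```javascript
// Hypothetical custom error type; not part of to-markdown.
class InvalidMarkupError extends Error {
  constructor(message, cause) {
    super(message);
    this.name = 'InvalidMarkupError';
    this.cause = cause; // keep the original parser error for debugging
  }
}

// `parse` is whatever function turns an HTML string into a DOM tree
// (jsdom in to-markdown's case). It is a parameter here so the sketch
// stays self-contained.
function parseOrExplain(html, parse) {
  try {
    return parse(html);
  } catch (err) {
    throw new InvalidMarkupError(
      'Could not parse the input HTML: ' + err.message,
      err
    );
  }
}
```

A caller could then check `err instanceof InvalidMarkupError` to tell bad input apart from other failures, and still reach the underlying DOMException through `err.cause`.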

domchristie, May 27 '15 10:05

> I’m not sure how easy this will be, given that as soon as the string is passed in to jsdom, it’s parsed (and therefore an error is thrown).

Yeah, that's true. By the looks of it, the error is raised by xml-name-validator. Maybe they could accept a PR that allows us to inject a more lax alternative into jsdom? That would solve the problem effectively and give some flexibility for you.

> This probably is an issue for jsdom (though I’m not sure what the status is for v3, what with the io.js/Node.js merge?!). However, we could definitely handle this error better, perhaps by catching it and throwing a custom one to clear things up a bit.

That's a good idea.

bebraw, May 27 '15 14:05

> Maybe they could accept a PR that allows us to inject a more lax alternative into jsdom?

Or perhaps it could be configurable in toMarkdown?

toMarkdown('<h1>Hello world!</h1>', {
  parser: function() {
    // return DOM tree
  }
});
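A rough sketch of how such an option might be consumed internally. Everything here is hypothetical: `resolveParser` and `toMarkdownWith` are illustrative names, `convertTree` stands in for the existing DOM-to-Markdown step, and passing the HTML string to the parser function as its argument is an assumption the snippet above does not spell out.

```javascript
// Hypothetical: pick the user-supplied parser if there is one,
// otherwise fall back to the default (jsdom in to-markdown's case).
function resolveParser(options, defaultParser) {
  if (options && typeof options.parser === 'function') {
    return options.parser;
  }
  return defaultParser;
}

// Hypothetical entry point showing where the parser would be called.
// Assumption: the parser receives the HTML string and returns a DOM tree.
function toMarkdownWith(html, options, defaultParser, convertTree) {
  var parse = resolveParser(options, defaultParser);
  return convertTree(parse(html));
}
```

With something like this, a caller hitting the DOMException could supply a more forgiving parser without to-markdown having to change how it uses jsdom by default.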

domchristie, May 27 '15 22:05

@domchristie As long as that avoids the jsdom nastiness I'm fine with that.

bebraw, May 28 '15 04:05