markdownlint
MD033 flagging HTML tags in image alt text strings
https://dlaa.me/markdownlint/#%25m!%5BThe%20default%2C%20focused%2C%20and%20disabled%20%3Ctextarea%3E%20element%20in%20Firefox%2071%20and%20Safari%2013%20on%20Mac%20OSX%20and%20Edge%2018%2C%20Yandex%2014%2C%20Firefox%20and%20Chrome%20on%20Windows%2010.%5D(textarea_basic.png)
Since the alt text doesn't get parsed as HTML, there shouldn't be a need to escape these tags. I ran into this because I had been escaping the tags, but running Prettier then stripped the escaping because it was not needed.
A similar thing happens with link title strings, but that's probably a separate bug.
As your example shows, HTML content in the image alternate text region can be removed by the parser and so I think it is reasonable for markdownlint to warn about it.
Here is an example using markdown-it directly: http://markdown-it.github.io/#md3=%7B%22source%22%3A%22%23%20Issue%20579%5Cn%5Cn!%5Btext%20%3Ctextarea%3E%20text%5D%28image.png%29%5Cn%22%2C%22defaults%22%3A%7B%22html%22%3Atrue%2C%22xhtmlOut%22%3Afalse%2C%22breaks%22%3Afalse%2C%22langPrefix%22%3A%22language-%22%2C%22linkify%22%3Atrue%2C%22typographer%22%3Atrue%2C%22_highlight%22%3Atrue%2C%22_strict%22%3Afalse%2C%22_view%22%3A%22src%22%7D%7D
Hmm, I'm thinking it might be a Markdown-it bug then. If you run it through GitHub's parser or the remark parser that Prettier uses, it's not treated as a literal.

Toggling the "HTML" checkbox on that demo page opts into and out of this removal behavior.
Skimming the CommonMark specification, it's not clear to me that this scenario is directly addressed, so I think the parser is behaving consistently.
I filed an issue on Markdown-it. Looking at the spec https://spec.commonmark.org/0.30/#images, it is light on this point, but it does say:
Though this spec is concerned with parsing, not rendering, it is recommended that in rendering to HTML, only the plain string content of the image description be used. Note that in the above example, the alt attribute’s value is `foo bar`, not `foo [bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string content is rendered, without formatting.
Everything is parsed in alt, but only plain text is rendered. Consider ![foo *bar* baz]() - it's gonna lose asterisks (in cmark and in github version too).
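To make the "only plain text is rendered" point concrete, here is a rough Python sketch of how an image description might be flattened to its plain string content. The function name `plain_alt` is hypothetical, and the regular expressions are a deliberate simplification that only handles inline links and `*`/`_` emphasis. A real CommonMark parser works on a parsed tree, not on regexes.

```python
import re


def plain_alt(description: str) -> str:
    """Approximate the plain string content of an image description.

    Illustrative only: handles inline links and */_ emphasis markers,
    nothing else (no nested brackets, no code spans, no HTML).
    """
    # Inline links: keep the link text, drop the destination.
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", description)
    # Emphasis: drop runs of * and _ that are not backslash-escaped.
    text = re.sub(r"(?<!\\)[*_]+", "", text)
    return text


print(plain_alt("foo *bar* baz"))    # foo bar baz
print(plain_alt("foo [bar](/url)"))  # foo bar
```

The asterisks are parsed as emphasis and then dropped, which is the behavior described above for cmark and GitHub.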
I believe a linter for CommonMark syntax should flag anything inside image alt text that is not plain text or an escape, because parsers will just ignore it. HTML is no exception there.
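As an illustration of the kind of check being argued for, the following Python sketch flags unescaped HTML-like tags inside image alt text. It is not how markdownlint implements MD033. The name `find_html_in_alt` and both regular expressions are assumptions made for this example, and the image pattern does not handle nested brackets or images that span multiple lines.

```python
import re

# Simplified patterns for illustration; a real linter inspects parser tokens.
IMAGE = re.compile(r"!\[([^\]]*)\]\([^)]*\)")
# An HTML-like tag whose opening "<" is not preceded by a backslash.
TAG = re.compile(r"(?<!\\)</?([A-Za-z][A-Za-z0-9-]*)[^>]*>")


def find_html_in_alt(markdown: str) -> list[tuple[int, str]]:
    """Return (line_number, tag_name) for each unescaped tag in image alt text."""
    findings = []
    for line_number, line in enumerate(markdown.splitlines(), start=1):
        for image in IMAGE.finditer(line):
            for tag in TAG.finditer(image.group(1)):
                findings.append((line_number, tag.group(1)))
    return findings


sample = (
    "# Issue 579\n"
    "\n"
    "![text <textarea> text](image.png)\n"
    "\n"
    "![text \\<textarea> text](image.png)\n"
)
print(find_html_in_alt(sample))  # [(3, 'textarea')]
```

Only the unescaped tag on line 3 is reported. The escaped form on line 5 passes, which matches the workaround mentioned at the top of the thread.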
The CommonMark sample does remove asterisks, but doesn't remove tags https://spec.commonmark.org/dingus/?text=%23%20Issue%20579%0A%0A!%5Btext%20asterisks%20text%5D(image.png)%0A%0A!%5Btext%20%3Ctextarea%3E%20text%5D(image.png)%0A%0A
Found a relevant discussion https://github.com/commonmark/commonmark-spec/issues/716 but there is no resolution right now
Closing this based on my Sept 10 example and lack of agreement in the comments about whether this is reasonable.
OK, I'll ping this issue if there is a resolution on the CommonMark or Markdown-it issues