flexmark-java
Invalid nested links with autolink
When converting the Markdown [<https://www.example.org>](https://www.example.org) to HTML, we get nested links:
<p><a href="https://www.example.org"><a href="https://www.example.org">https://www.example.org</a></a></p>
I am not sure if it's a bug or a feature, but I can't find any config param to switch it off.
- [x] Parser
- [x] HtmlRenderer
- [ ] Formatter
- [ ] FlexmarkHtmlParser
- [ ] DocxRenderer
- [ ] PdfConverterExtension
- [ ] extension(s)
To Reproduce
Convert this to HTML
[<https://www.example.org>](https://www.example.org)
Options used to configure the parser and renderer:
set(Parser.PARSER_EMULATION_PROFILE, ParserEmulationProfile.FIXED_INDENT)
set(Parser.LISTS_ITEM_INDENT, 2)
// double blank should start a new list
set(Parser.LISTS_END_ON_DOUBLE_BLANK, true)
// do not parse indented code blocks at all, the editor doesn't support them
set(Parser.INDENTED_CODE_BLOCK_PARSER, false)
// do not allow re-numbering of numbered list items, the editor doesn't support it
set(Parser.LISTS_ORDERED_LIST_MANUAL_START, false)
// do not parse underscores at all, the editor uses asterisks only
set(Parser.UNDERSCORE_DELIMITER_PROCESSOR, false)
// use style tags instead of semantic ones (the default)
set(HtmlRenderer.EMPHASIS_STYLE_HTML_OPEN, "<i>")
set(HtmlRenderer.EMPHASIS_STYLE_HTML_CLOSE, "</i>")
set(HtmlRenderer.STRONG_EMPHASIS_STYLE_HTML_OPEN, "<b>")
set(HtmlRenderer.STRONG_EMPHASIS_STYLE_HTML_CLOSE, "</b>")
set(StrikethroughExtension.STRIKETHROUGH_STYLE_HTML_OPEN, "<s>")
set(StrikethroughExtension.STRIKETHROUGH_STYLE_HTML_CLOSE, "</s>")
set(InsExtension.INS_STYLE_HTML_OPEN, "<u>")
set(InsExtension.INS_STYLE_HTML_CLOSE, "</u>")
// make sure we always produce br instead of a newline
set(HtmlRenderer.SOFT_BREAK, "<br />")
// escape all HTML tags except for those explicitly handled in the parser
set(HtmlRenderer.ESCAPE_HTML, true)
Expected behavior
Do not nest the links:
<p><a href="https://www.example.org">https://www.example.org</a></p>
Resulting Output
<p><a href="https://www.example.org"><a href="https://www.example.org">https://www.example.org</a></a></p>
Additional context
I am not sure where this input comes from. My guess is that our users copy it from some other tool, but I do not know which one.
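Until there is an option for this, one possible workaround is to pre-process the Markdown before it reaches the parser. Below is a minimal sketch in plain Java that makes no flexmark calls. The class name AutolinkUnwrapper, the method unwrap, and the regex are my own assumptions and are not part of flexmark-java. It only covers the simple case where the link text is exactly one autolink, and it does not try to skip code spans or code blocks.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of flexmark-java.
class AutolinkUnwrapper {
    // Matches [<scheme:rest>](destination), i.e. an inline link whose text
    // consists of exactly one autolink. Group 1 is the autolink URL without
    // the angle brackets, group 2 is the link destination.
    private static final Pattern LINK_WRAPPING_AUTOLINK = Pattern.compile(
            "\\[<([A-Za-z][A-Za-z0-9+.-]*:[^\\s<>]*)>\\]\\(([^()\\s]*)\\)");

    // Rewrites [<url>](dest) to [url](dest) so that the link text is plain
    // text instead of a second link. All other input is left unchanged.
    static String unwrap(String markdown) {
        return LINK_WRAPPING_AUTOLINK.matcher(markdown).replaceAll("[$1]($2)");
    }

    public static void main(String[] args) {
        System.out.println(
                unwrap("[<https://www.example.org>](https://www.example.org)"));
    }
}
```

The string returned by unwrap can then be passed to the parser in place of the raw input. This only hides the symptom for this one input shape; whether the nested output should be produced at all is still the open question of this issue.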