flexmark-java

Invalid nested links with autolink

Open lukas-krecan opened this issue 3 years ago • 0 comments

When converting the Markdown [<https://www.example.org>](https://www.example.org) to HTML, we get nested links:

<p><a href="https://www.example.org"><a href="https://www.example.org">https://www.example.org</a></a></p>

I am not sure if it's a bug or a feature, but I can't find any config param to switch it off.
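Until it is clear whether such an option exists, one possible stopgap is to rewrite the input before it reaches the parser. The sketch below is plain Java with no flexmark-java calls; the class name `AutolinkUnwrapper` and the method `unwrap` are hypothetical helpers invented for illustration. It assumes the problematic input always has the exact shape `[<URL>](URL)` with identical URLs, and it does not try to skip code spans or code blocks.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical pre-processing helper; not part of flexmark-java. */
final class AutolinkUnwrapper {
    // Matches [<URL>](URL): an inline link whose text is an angle-bracket autolink.
    // Group 1 is the autolink URL, group 2 is the link destination.
    private static final Pattern WRAPPED_AUTOLINK =
            Pattern.compile("\\[<([^<>\\s]+)>\\]\\(([^()\\s]+)\\)");

    private AutolinkUnwrapper() {
    }

    /** Collapses [<URL>](URL) to <URL> when both URLs are identical. */
    static String unwrap(String markdown) {
        Matcher m = WRAPPED_AUTOLINK.matcher(markdown);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Leave the text untouched when the two URLs differ.
            String replacement = m.group(1).equals(m.group(2))
                    ? "<" + m.group(1) + ">"
                    : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling `AutolinkUnwrapper.unwrap(markdown)` before parsing leaves a single autolink in the input, so the renderer only has one link to emit. This hides the symptom for this one input shape and is not a fix for the underlying behavior.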

  • [x] Parser
  • [x] HtmlRenderer
  • [ ] Formatter
  • [ ] FlexmarkHtmlParser
  • [ ] DocxRenderer
  • [ ] PdfConverterExtension
  • [ ] extension(s)

To Reproduce

Convert this to HTML

 [<https://www.example.org>](https://www.example.org)

Options used to configure the parser and renderer:
set(Parser.PARSER_EMULATION_PROFILE, ParserEmulationProfile.FIXED_INDENT)
set(Parser.LISTS_ITEM_INDENT, 2)
// double blank should start a new list
set(Parser.LISTS_END_ON_DOUBLE_BLANK, true)
// do not parse indented code blocks at all, the editor doesn't support them
set(Parser.INDENTED_CODE_BLOCK_PARSER, false)
// do not allow re-numbering of numbered list items, the editor doesn't support it
set(Parser.LISTS_ORDERED_LIST_MANUAL_START, false)
// do not parse underscores at all, the editor uses asterisks only
set(Parser.UNDERSCORE_DELIMITER_PROCESSOR, false)


// use style tags instead of semantic ones (the default)
set(HtmlRenderer.EMPHASIS_STYLE_HTML_OPEN, "<i>")
set(HtmlRenderer.EMPHASIS_STYLE_HTML_CLOSE, "</i>")
set(HtmlRenderer.STRONG_EMPHASIS_STYLE_HTML_OPEN, "<b>")
set(HtmlRenderer.STRONG_EMPHASIS_STYLE_HTML_CLOSE, "</b>")
set(StrikethroughExtension.STRIKETHROUGH_STYLE_HTML_OPEN, "<s>")
set(StrikethroughExtension.STRIKETHROUGH_STYLE_HTML_CLOSE, "</s>")
set(InsExtension.INS_STYLE_HTML_OPEN, "<u>")
set(InsExtension.INS_STYLE_HTML_CLOSE, "</u>")
// make sure we always produce br instead of a newline
set(HtmlRenderer.SOFT_BREAK, "<br />")
// escape all HTML tags except for those explicitly handled in the parser
set(HtmlRenderer.ESCAPE_HTML, true)

Expected behavior

Do not nest the links:

<p><a href="https://www.example.org">https://www.example.org</a></p>

Resulting Output

<p><a href="https://www.example.org"><a href="https://www.example.org">https://www.example.org</a></a></p>
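To catch this output in a regression test, a small check on the rendered HTML may be enough. The sketch below uses only the Java standard library; `NestedAnchorCheck` and `hasNestedAnchor` are hypothetical names, and the regex-based tag scan is a simplification that assumes attribute values contain no `>` character. It is not an HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical test helper; not part of flexmark-java. */
final class NestedAnchorCheck {
    // Matches opening and closing <a> tags; group 1 is "/" for a closing tag.
    // The lookahead keeps tags such as <abbr> from matching.
    private static final Pattern ANCHOR_TAG =
            Pattern.compile("<(/?)a(?=[\\s>])[^>]*>", Pattern.CASE_INSENSITIVE);

    private NestedAnchorCheck() {
    }

    /** Returns true if an <a> element is opened while another one is still open. */
    static boolean hasNestedAnchor(String html) {
        int depth = 0;
        Matcher m = ANCHOR_TAG.matcher(html);
        while (m.find()) {
            if (m.group(1).isEmpty()) {
                depth++;
                if (depth > 1) {
                    return true;
                }
            } else if (depth > 0) {
                depth--;
            }
        }
        return false;
    }
}
```

A test could render the Markdown from this report and assert that `hasNestedAnchor` returns false for the result. It returns true for the output shown above and false for the expected output.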

Additional context

I am not sure where this input comes from. I guess our users are copying it from some tool, but I don't know which one.

lukas-krecan, Feb 22 '22 11:02