StackOverflowError with a long link URL enclosed in angle brackets in a reference link definition

Open yehara opened this issue 10 months ago • 0 comments

Describe the bug

In flexmark 0.64.8, the parser throws a StackOverflowError when a reference link definition of the form [definition name]: <link URL> encloses a long link URL in angle brackets. The error originates in the Parser.

One way to hit this is the "Copy as Markdown" feature for an image in Google Docs, which generates Markdown like the following:

**![][image1]**

[image1]: <data:image/png;base64,....>

To Reproduce

The issue can be reproduced with the following code:

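import com.vladsch.flexmark.parser.Parser;

// String.formatted needs Java 15+; String.repeat needs Java 11+.
Parser parser = Parser.builder().build();
String url = "https://example.com/" + "A".repeat(5000); // long link destination
String markdown = "[link]: <%s>".formatted(url);         // reference definition using <...>
parser.parse(markdown);                                  // throws StackOverflowError in 0.64.8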

Expected behavior

The parse method should complete successfully.

Resulting Output

java.lang.StackOverflowError
    at com.vladsch.flexmark.util.sequence.SubSequence.charAt(SubSequence.java:115)
    at java.base/java.lang.Character.codePointAt(Character.java:9320)
    at java.base/java.lang.Character.codePointAt(Character.java:9320)
    at java.base/java.util.regex.Pattern$CharProperty.match(Pattern.java:4106)
    at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4914)
    at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4969)
    at java.base/java.util.regex.Pattern$Loop.match(Pattern.java:5078)
    at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:5000)
    at java.base/java.util.regex.Pattern$BranchConn.match(Pattern.java:4878)
    at java.base/java.util.regex.Pattern$CharProperty.match(Pattern.java:4110)
    at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4914)
    ...

Additional context

The error occurs at the beginning of InlineParserImpl#parseLinkDestination(), on the line BasedSequence res = match(myParsing.LINK_DESTINATION_ANGLES);. Judging from the repeating java.util.regex.Pattern frames in the stack trace, the stack overflow appears to be caused by recursion in the regular expression matching.
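The regex behavior can be observed without flexmark. The following is a minimal sketch that assumes a simplified stand-in pattern (a greedy `*` over a group containing an alternation); it is not the actual LINK_DESTINATION_ANGLES pattern, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class GreedyGroupOverflowDemo {
    // Simplified stand-in for an angle-bracket destination pattern.
    // ASSUMPTION: this is NOT the pattern flexmark uses; it only shares the
    // "greedy quantifier over a group with alternation" shape.
    static final Pattern GREEDY = Pattern.compile("<(?:[^<>\\\\]|\\\\.)*>");

    // Returns true if matching the input overflowed the thread stack.
    static boolean overflows(Pattern pattern, String input) {
        try {
            pattern.matcher(input).matches();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    // Builds "<https://example.com/AAAA...>" with the requested number of 'A's.
    static String destination(int length) {
        return "<https://example.com/" + "A".repeat(length) + ">";
    }

    public static void main(String[] args) {
        System.out.println("short input overflows: " + overflows(GREEDY, destination(50)));
        System.out.println("long input overflows:  " + overflows(GREEDY, destination(1_000_000)));
    }
}
```

With java.util.regex, each iteration of such a loop adds several stack frames, so the input length at which the overflow starts depends on the JVM and its thread stack size (-Xss). The length of 1,000,000 here is chosen to be far above the 5000 characters that trigger the error inside flexmark.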

Possible Solution

Using a possessive quantifier *+ instead of the greedy quantifier * in Parsing.ST_LINK_DESTINATION_ANGLES_SPC and Parsing.ST_LINK_DESTINATION_ANGLES_NO_SPC might avoid the error. However, I am not certain that this change is correct.
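To illustrate both the idea and the reason for the doubt, here is a rough sketch using hypothetical, simplified patterns (not the ones in Parsing). It checks that a possessive loop handles a long input, and it compares greedy and possessive results on a few short inputs. A possessive loop never backtracks, so the two can differ when the alternatives inside the group overlap.

```java
import java.util.List;
import java.util.regex.Pattern;

class PossessiveQuantifierSketch {
    // Hypothetical, simplified patterns for illustration; NOT the flexmark patterns.
    // Two mutually exclusive alternatives: a plain character or a backslash escape.
    static final Pattern GREEDY = Pattern.compile("<(?:[^<>\\\\]|\\\\.)*>");
    static final Pattern POSSESSIVE = Pattern.compile("<(?:[^<>\\\\]|\\\\.)*+>");

    // Same shape plus an overlapping third alternative (a lone backslash).
    static final Pattern GREEDY_OVERLAP = Pattern.compile("<(?:[^<>\\\\]|\\\\.|\\\\)*>");
    static final Pattern POSSESSIVE_OVERLAP = Pattern.compile("<(?:[^<>\\\\]|\\\\.|\\\\)*+>");

    // True when both patterns give the same full-match result for the input.
    static boolean agree(Pattern a, Pattern b, String input) {
        return a.matcher(input).matches() == b.matcher(input).matches();
    }

    public static void main(String[] args) {
        String longInput = "<https://example.com/" + "A".repeat(200_000) + ">";
        System.out.println("possessive matches long input: "
                + POSSESSIVE.matcher(longInput).matches());

        for (String s : List.of("<a>", "<a\\>b>", "<a\\>", "<a b", "<>")) {
            System.out.println(s
                    + "  exclusive agree=" + agree(GREEDY, POSSESSIVE, s)
                    + "  overlap agree=" + agree(GREEDY_OVERLAP, POSSESSIVE_OVERLAP, s));
        }
    }
}
```

For the input `<a\>`, the overlapping pair disagrees: the greedy version can backtrack and treat the backslash as a lone character so that `>` closes the destination, while the possessive version has already consumed `\>` as an escape and fails. Whether the real patterns contain overlapping alternatives like this would need to be checked against the flexmark source and its spec tests before switching quantifiers.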

yehara • Feb 28 '25 09:02