kramdown icon indicating copy to clipboard operation
kramdown copied to clipboard

Kramdown freezes with specific input.

Open EnotPoloskun opened this issue 2 years ago • 3 comments

Hello. We have noticed that kramdown has hard time parsing some string.

Example:

text = "___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe ___QWE ___qwe ___qwe ___qwe ___qwe"

Kramdown::Document.new(text)

takes around 30 second on my machine. And if I double string with same pattern - it never finishes. Is there a workaround or any option which would help processing this markdown?

Kramdown version - 2.4.0

EnotPoloskun avatar May 25 '22 00:05 EnotPoloskun

The reason for this is backtracking. For the first three underscores kramdown tries to find the matching pair. When encountering the second three underscores, it sees that those are not the matching ones. So it starts with those three to find their matching counterpart. And on, and on, and on it goes... until the end of the text where we see that nothing matches. So all nested matching attempts are abandoned and other things tried. Until we come to the first three underscores again. Then we see that they should be handled as normal text. After that we find the second three underscores and everything starts again...

I will have a look at the code.

gettalong avatar May 25 '22 08:05 gettalong

When encountering the second three underscores, it sees that those are not the matching ones

Why second three underscores are considered not "matching"? Is it intentional?

I have tried https://markdowntohtml.com/ for this text, and it considers them as matching if I understand correctly.

So If I understand correctly

___1 ___QWE ___2 ___

In this example 1 and 2 should be formatted.

EnotPoloskun avatar May 25 '22 16:05 EnotPoloskun

Underscores only match at word boundaries and the stopping delimiter must not be preceded by a space (see https://kramdown.gettalong.org/syntax.html#emphasis). So yes, that is intentional.

gettalong avatar May 25 '22 18:05 gettalong