nim-markdown icon indicating copy to clipboard operation
nim-markdown copied to clipboard

Bad performance with larger documents

Open osmarks opened this issue 3 years ago • 4 comments

I didn't really want to just post this without also providing a good fix, but it's causing me problems, and I lack the understanding of the large amount of code in this necessary to do much to it.

It seems that this library parses and renders Markdown quite slowly, especially on larger documents. I noticed this while testing with ~10KB documents but have mostly tried testing it on ~100KB ones, for which rendering takes about 4 seconds, while a Python CommonMark implementation seems to manage 400ms and a JavaScript one ~50ms. This is proving problematic for an application I want to use this in, and while one fix might be to switch to cmark bindings, they don't provide options to customize parsing, so I would have to reimplement a lot of logic.

By switching most of the internals away from doubly linked lists, I was able to get a roughly 25% improvement (very rough, as I just ran time over it a few times with a big test file as input) but I think bigger changes would be needed to get a significant difference; I think it needs to avoid allocating as many objects (maybe use an ADT or something instead) and to handle chunks of unformatted text better. This also breaks quite a few of the test cases (mostly only spacing in lists as far as I can tell): https://gist.github.com/osmarks/c7d8db89896047368d6512f6284cd7c2

Here is a test input file (without any formatting) and profiling output from it:

out2.txt profile_results.txt

osmarks avatar Dec 30 '20 20:12 osmarks

Performance is indeed a problem. I intend to solve it in 2021. ⌨️

soasme avatar Dec 31 '20 04:12 soasme

@osmarks can you recheck performance after #50?

Yardanico avatar Mar 05 '21 23:03 Yardanico

This does seem to improve performance a bit on my test case, but it still takes 3 seconds to work for some reason. I'm just using cmark-gfm for now.

osmarks avatar Mar 07 '21 12:03 osmarks

@osmarks Although there are still plenty to optimize, I'm satisfied with the current result (as of v0.8.4):

$ nim c -d:release src/markdown.nim && time ./src/markdown < /tmp/out2.txt
real	0m1.390s
user	0m0.156s
sys	0m0.013s

Nimprof profile result also shows a significant reduce on the number of instruction calls, 3277 from the profile_results.txt you provided v/s 323 on the one run against v0.8.4:

total executions of each stack trace:
Entry: 1/131 Calls: 13/323 = 4.02% [sum: 13; 13/323 = 4.02%]

LMK if the performance improvement works on your side. Meanwhile, I'll keep working on improving the performance as tracked in #51.

Cheers.

soasme avatar Mar 19 '21 03:03 soasme