markup.ml

Consider switching to string streams instead of char streams

Open aantron opened this issue 9 years ago • 3 comments

The performance will probably be much better, but compare and measure first.

aantron avatar Mar 14 '16 23:03 aantron

What is your intuition about this for performance improvement?

xguerin avatar Jan 05 '18 19:01 xguerin

For consumers that allocate something per unit of parser output, receiving one byte at a time makes them allocate too much. Feeding them a string at a time (chunks, basically) amortizes the cost of those allocations. Likewise, if they do unbuffered I/O, it saves the overhead of a system call or other I/O operation per byte.

aantron avatar Jan 05 '18 19:01 aantron
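
To make the amortization argument above concrete, here is a minimal sketch that counts how often a consumer is invoked when the same input is delivered per character versus per chunk. The helpers `feed_chars` and `feed_chunks` are hypothetical names made up for illustration; they are not part of markup.ml's API, and the 4096-byte chunk size is an arbitrary assumption.

```ocaml
(* Hypothetical helpers for illustration only; not markup.ml's API. *)

(* Feed [s] to [consume] one character at a time.
   Returns the number of consumer calls made. *)
let feed_chars (consume : char -> unit) (s : string) : int =
  String.iter consume s;
  String.length s

(* Feed [s] to [consume] in chunks of at most [size] bytes.
   Returns the number of consumer calls made. *)
let feed_chunks ~(size : int) (consume : string -> unit) (s : string) : int =
  if size <= 0 then invalid_arg "feed_chunks: size must be positive";
  let len = String.length s in
  let rec loop pos calls =
    if pos >= len then calls
    else begin
      let n = min size (len - pos) in
      consume (String.sub s pos n);
      loop (pos + n) (calls + 1)
    end
  in
  loop 0 0

let () =
  let input = String.make 10_000 'x' in
  let out_chars = Buffer.create 16 and out_chunks = Buffer.create 16 in
  let char_calls = feed_chars (Buffer.add_char out_chars) input in
  let chunk_calls =
    feed_chunks ~size:4096 (Buffer.add_string out_chunks) input in
  Printf.printf "per-char calls: %d, per-chunk calls: %d\n"
    char_calls chunk_calls
```

Both runs produce the same output bytes; only the number of consumer invocations differs. Any fixed per-call cost (an allocation, a system call) is paid once per chunk instead of once per byte. Whether that translates into a real speedup for a given consumer still has to be measured.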

Makes sense. I was looking at the code, and since most of the internal logic is byte-oriented, I was wondering whether it makes sense to convert all functions to string streams. At the moment it is not clear to me that it does. Especially if the original intent is I/O optimization, the outer API can expose string streams while some of the logic retains its byte-stream interface.

That being said, a benefit of using strings more generally instead of bytes would be to reduce the number of continuation calls.

xguerin avatar Jan 05 '18 21:01 xguerin
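
The layering suggested in the last comment (string streams at the outer API, byte-oriented logic inside) could look roughly like the sketch below. It uses the standard library's `Seq` (OCaml 4.07 or later) as a stand-in for a stream type, which is an assumption: markup.ml's own stream types are different. All function names are hypothetical, and `Seq.map Char.uppercase_ascii` is only a placeholder for real per-byte parsing logic.

```ocaml
(* Sketch only: Stdlib.Seq stands in for the library's stream type,
   and all names here are hypothetical. *)

(* Flatten a stream of chunks into a stream of characters. *)
let chars_of_strings (chunks : string Seq.t) : char Seq.t =
  Seq.flat_map String.to_seq chunks

(* Regroup a stream of characters into chunks of at most [size] bytes. *)
let strings_of_chars ~(size : int) (chars : char Seq.t) : string Seq.t =
  if size <= 0 then invalid_arg "strings_of_chars: size must be positive";
  let rec next chars () =
    let buf = Buffer.create size in
    (* Pull characters until the buffer is full or the input ends;
       return the unconsumed remainder of the stream. *)
    let rec fill chars =
      if Buffer.length buf >= size then chars
      else
        match chars () with
        | Seq.Nil -> Seq.empty
        | Seq.Cons (c, rest) -> Buffer.add_char buf c; fill rest
    in
    let rest = fill chars in
    if Buffer.length buf = 0 then Seq.Nil
    else Seq.Cons (Buffer.contents buf, next rest)
  in
  next chars

(* Outer API works on chunks; the inner step stays byte-oriented. *)
let process_chunks ~(size : int) (chunks : string Seq.t) : string Seq.t =
  chunks
  |> chars_of_strings
  |> Seq.map Char.uppercase_ascii   (* placeholder for per-byte logic *)
  |> strings_of_chars ~size

let () =
  List.to_seq ["<a>he"; "llo</"; "a>"]
  |> process_chunks ~size:4
  |> Seq.iter print_endline
```

Note that this adapter only changes what callers see. Internally there is still one step per byte, so it addresses the consumer-side allocation and I/O cost but not the number of internal continuation calls; reducing those would require the inner logic itself to work on strings.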