syntect icon indicating copy to clipboard operation
syntect copied to clipboard

long-term feature request: ability to specify a timeout for highlighting

Open slimsag opened this issue 5 years ago • 4 comments

Sometimes (rarely), highlighting can take a very long time. This can be due to various things: a bad syntax regexp path, a bug in Syntect (see also #201), or maybe just a large file or incorrectly attempting to highlight a binary file.

It would be useful if a timeout duration could be specified to the highlighter that would say something like "If highlighting takes longer than X seconds, abort as soon as possible and just return an error".

One use case for this is when using syntect to highlight arbitrary user-uploaded source code files. For example, when running Syntect in a server setting via syntect_server. With this feature, it would be possible for syntect_server to abort highlighting if it takes longer than a few seconds and continue serving other requests rather than the server potentially being bogged down and/or needing to be killed.

I suspect implementing this could be tricky (e.g. maybe the regexp library in use doesn't support timeouts either), or perhaps it's just not advised to directly call into Syntect in this fashion and instead something like shelling out to a separate syntect binary + collecting results from it would be advised/simpler. I'd be okay with whatever answer here, and don't have any particular rush on getting this feature in. Just something I've thought of and would like to see eventually! :)

slimsag avatar Aug 27 '18 20:08 slimsag

Thank you for writing this up. I was thinking of the exact same thing when filing #201. I'm not sure if this should really be handled from within Syntect, but it would definitely be convenient for users.

sharkdp avatar Aug 27 '18 21:08 sharkdp

Yah I can see this being useful definitely. I would accept a PR that added a version of the parse function that takes a timeout as a std::time::Duration that just compares the elapsed time against that Duration every Nth parsing operation / regex match (not sure what N should be, depends on how much overhead getting the current time is).

The onig crate also added the ability to set a stack depth and retry limit so that individual regex matches won't take forever, which should allow this strategy to work. I thought that the latest release put reasonable limits on by default but it seems the default stack size is unlimited and the retries may be limited but I can't tell if the retry limit support is compiled in. @robinst you seem to have been involved with this do you know if it's enabled and working to stop individual regexes from taking too long?

I'm unlikely to find time to work on this myself, I may eventually but I'm pretty busy.

trishume avatar Aug 27 '18 22:08 trishume

@robinst you seem to have been involved with this do you know if it's enabled and working to stop individual regexes from taking too long?

Yeah, it's enabled by default in the recent onig releases, so that part should no longer be a problem. It was enabled by default in Oniguruma 6.8.0: https://github.com/kkos/oniguruma/#new-feature-of-version-680

robinst avatar Nov 08 '18 00:11 robinst

I'll try to gather some evidence & reproduction cases, but more recently I've seen what appears to be Syntect getting stuck entirely somewhere in the state machine (i.e. never completing or panicking no matter how long you wait, just stuck at 100% CPU forever). I know this comment isn't useful as-is so I'll try to grab some reproduction cases soon :)

slimsag avatar Mar 18 '20 14:03 slimsag