ocaml icon indicating copy to clipboard operation
ocaml copied to clipboard

stdlib audit: use DLS for parsing global state

Open Octachron opened this issue 3 years ago • 2 comments

The Parsing module which is used by every ocamlyacc generated parsers relies on a global mutable work state.

This means that currently no parsers generated by ocamlyacc should be run concurrently. This property does not seem that simple to satisfy in presence of multiple libraries that may use ocamlyacc parser internally.

This PR proposes thus to alleviate this issue by making the Parsing work state a domain local state.

This is not sufficient to make ocamlyacc thread-safe: users will still need to use locks to achieve concurrency safety, but they would no longer be required to use a single lock for all ocamlyacc generated parsers. I believe that this is an usability improvement. Moreover, this is the the best that we can do in backward compatible way since location functions like right_symbol depends on the implicit parser work state.

Octachron avatar Apr 28 '22 15:04 Octachron

What should we do with this change? Should we consider dropping it and seeing if users are impacted by the ocamlyacc parser shared mutable state?

Octachron avatar Jul 22 '22 13:07 Octachron

Running the ocaml 4.07 parser on the 353 compiler ml files after loading them in memory, yields a median increase of parsing time of 7% when using the DLS version.

More precisely, looking at percentile

percentile increase in parsing time
10% +3.2%
20% +4.6%
30% +5.5%
40% +6.4%
50% +6.9%
60% +7.4%
70% +8%
80% +8.6%
90% +9.6%

with the worst cases being:

rank percentile name increase in parsing time
349 98.5876% ocamldoc/odoc_module.ml 1.12673
350 98.8701% bytecomp/simplif.ml 1.12819
351 99.1525% tools/eqparsetree.ml 1.12993
352 99.435% typing/printtyped.ml 1.13111
353 99.7175% ocamldoc/odoc_texi.ml 1.15284

Consequently, it seems that there is some visible cost (at least when parsing from memory). Moreover, since the restriction on concurrent uses already exist with threads, I am starting to think that it might be better to keep the global state with some added warning in the documentation.

Octachron avatar Aug 22 '22 08:08 Octachron

Taking in account the facts that:

  • thread safety is already an issue in OCaml 4
  • menhir is a widely available upgrade
  • there are hacks that allow to decouple a specific parser from the global ocamlyacc state in case of imperious needs

I am closing this PR in favor of #11562 which documents the existing issue and points towards menhir as a thread-safe alternative to ocamlyacc.

Octachron avatar Sep 23 '22 15:09 Octachron