stdlib audit: use DLS for parsing global state
The Parsing module which is used by every ocamlyacc generated parsers relies on a global mutable work state.
This means that currently no parsers generated by ocamlyacc should be run concurrently. This property does not seem that simple to satisfy in presence of multiple libraries that may use ocamlyacc parser internally.
This PR proposes thus to alleviate this issue by making the Parsing work state a domain local state.
This is not sufficient to make ocamlyacc thread-safe: users will still need to use locks to achieve concurrency safety, but they would no longer be required to use a single lock for all ocamlyacc generated parsers. I believe that this is an usability improvement. Moreover, this is the the best that we can do in backward compatible way since location functions like right_symbol depends on the implicit parser work state.
What should we do with this change? Should we consider dropping it and seeing if users are impacted by the ocamlyacc parser shared mutable state?
Running the ocaml 4.07 parser on the 353 compiler ml files after loading them in memory, yields a median increase of parsing time of 7% when using the DLS version.
More precisely, looking at percentile
| percentile | increase in parsing time |
|---|---|
| 10% | +3.2% |
| 20% | +4.6% |
| 30% | +5.5% |
| 40% | +6.4% |
| 50% | +6.9% |
| 60% | +7.4% |
| 70% | +8% |
| 80% | +8.6% |
| 90% | +9.6% |
with the worst cases being:
| rank | percentile | name | increase in parsing time |
|---|---|---|---|
| 349 | 98.5876% | ocamldoc/odoc_module.ml | 1.12673 |
| 350 | 98.8701% | bytecomp/simplif.ml | 1.12819 |
| 351 | 99.1525% | tools/eqparsetree.ml | 1.12993 |
| 352 | 99.435% | typing/printtyped.ml | 1.13111 |
| 353 | 99.7175% | ocamldoc/odoc_texi.ml | 1.15284 |
Consequently, it seems that there is some visible cost (at least when parsing from memory). Moreover, since the restriction on concurrent uses already exist with threads, I am starting to think that it might be better to keep the global state with some added warning in the documentation.
Taking in account the facts that:
- thread safety is already an issue in OCaml 4
- menhir is a widely available upgrade
- there are hacks that allow to decouple a specific parser from the global ocamlyacc state in case of imperious needs
I am closing this PR in favor of #11562 which documents the existing issue and points towards menhir as a thread-safe alternative to ocamlyacc.