Parallel preprocessing and parsing
For interactive analysis, minimizing the wall-clock time of preprocessing and parsing large projects with many files would be beneficial. The same speedup would also help ordinary non-incremental runs.
Parallel preprocessing
Currently we use `Unix.system` to run the preprocessor, but this function waits until the subprocess terminates, so preprocessing of the next file cannot start before the previous one has finished. Such sequentialization is completely unnecessary: the preprocessor invocations are independent and could run in parallel instead.
Doing so doesn't require running any OCaml code in parallel! Instead we could use the high-level process and redirection management functions of the Unix module (https://ocaml.org/api/Unix.html#1_Highlevelprocessandredirectionmanagement). We could start all the preprocessor subprocesses first, without waiting for any of them to terminate in between, and only then wait for all of them. This is purely OS-level subprocess management, with no parallelism or even concurrency inside Goblint.
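The spawn-then-wait idea can be sketched roughly as follows. This is only an illustration, not Goblint's actual code: `run_all`, `preprocess_command` and `preprocess_all` are hypothetical names, and the `cpp FILE -o FILE.i` command line is an assumption standing in for whatever preprocessor command Goblint really builds. Only `Unix.create_process` and `Unix.waitpid` come from the standard Unix library.

```ocaml
(* Needs the unix library, e.g. ocamlfind ocamlopt -package unix -linkpkg *)

(* Start one subprocess per command without waiting in between, then wait
   for all of them. Each command is an argv array; argv.(0) is the program. *)
let run_all (commands : string array list) : (int * Unix.process_status) list =
  (* Phase 1: spawn everything. Unix.create_process returns the child's pid
     immediately and does not wait for the child to finish. *)
  let pids =
    List.map
      (fun argv ->
        Unix.create_process argv.(0) argv Unix.stdin Unix.stdout Unix.stderr)
      commands
  in
  (* Phase 2: collect the exit statuses, in the order the children were
     started. The children themselves run concurrently at the OS level. *)
  List.map (fun pid -> Unix.waitpid [] pid) pids

(* Hypothetical preprocessor command line (an assumption for illustration). *)
let preprocess_command (file : string) : string array =
  [| "cpp"; file; "-o"; file ^ ".i" |]

(* Preprocess all files in parallel and return each child's (pid, status). *)
let preprocess_all (files : string list) : (int * Unix.process_status) list =
  run_all (List.map preprocess_command files)
```

A caller would then check that every returned status is `Unix.WEXITED 0` before handing the preprocessed files to the parser. For very large projects it would probably make sense to cap the number of simultaneously running subprocesses, which this sketch does not do.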
Parallel parsing
CIL's parsing is implemented in OCaml, so it's not as simple to parallelize. But again, there might not be a need to wait for Multicore OCaml. Instead one of these libraries might be sufficient:
- https://opam.ocaml.org/packages/parmap/
- https://opam.ocaml.org/packages/parany/
Instead of relying on Multicore OCaml, they fork the Goblint process, open a communication channel between the parent and child process (for example a socket pair) and execute different (parsing) code in the forked child process. The resulting OCaml data structure is then sent back to the parent Goblint process through that channel (using `Marshal`). These libraries completely automate that process and make it transparent.
This should work because parsing each file is independent of parsing the others. No data needs to be continuously shared on the OCaml heap, so the concurrent GC of Multicore OCaml isn't needed either.
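To make the mechanism concrete, here is a minimal hand-written sketch of the fork-and-marshal pattern. It is not how Parmap or Parany are implemented internally, and `spawn` and `parallel_map` are hypothetical names. It assumes a Unix-like system, uses a plain pipe rather than a socket pair for brevity, and relies on `Unix._exit`, which is available since OCaml 4.12.

```ocaml
(* Needs the unix library, e.g. ocamlfind ocamlopt -package unix -linkpkg *)

(* Run [f x] in a forked child. The parent gets back the child's pid and a
   channel from which the marshalled result can be read. *)
let spawn (f : 'a -> 'b) (x : 'a) : int * in_channel =
  let (r, w) = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
    (* Child: compute, marshal the result into the pipe, then exit at once. *)
    Unix.close r;
    let status =
      try
        let oc = Unix.out_channel_of_descr w in
        Marshal.to_channel oc (f x) [];
        close_out oc;
        0
      with _ -> 1
    in
    (* _exit skips at_exit handlers, so buffers inherited from the parent
       are not flushed a second time. *)
    Unix._exit status
  | pid ->
    (* Parent: keep only the read end. *)
    Unix.close w;
    (pid, Unix.in_channel_of_descr r)

(* Apply [f] to every element in its own child process. The result type must
   be marshallable (no closures), which a parsed AST is assumed to be. *)
let parallel_map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  (* Fork all children first so that they run concurrently. *)
  let children = List.map (spawn f) xs in
  List.map
    (fun (pid, ic) ->
      (* Read before waiting: a child with a large result blocks on its
         write until the parent drains the pipe. *)
      let (result : 'b) = Marshal.from_channel ic in
      close_in ic;
      ignore (Unix.waitpid [] pid);
      result)
    children
```

With a hypothetical `parse_file : string -> ast` function, `parallel_map parse_file files` would return the parsed files in input order. If a child fails, `Marshal.from_channel` raises `End_of_file` in the parent; the libraries listed above handle such cases, along with limiting the number of worker processes, which is a good reason to prefer them over a hand-rolled version.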
TODO
- [x] Parallel preprocessing
- [ ] Parallel parsing