`learn-ocaml build -j 2` is broken
Bug description
It seems there's a race condition in learn-ocaml build that prevents us from safely using the --jobs option.
To Reproduce
Steps to reproduce the behavior:
- Clone learn-ocaml (
git clone https://github.com/ocaml-sf/learn-ocaml.git) - Run
cd learn-ocaml - Docker-pull learn-ocaml master (
docker pull ocamlsf/learn-ocaml:master) - Run
docker run --rm -it -v "$PWD/demo-repository:/repository" ocamlsf/learn-ocaml:master -j 2
Then, there is high probability that we get:
Learnocaml v.0.12 running.
Updating app at ./www
Cannot process exercises: Unix.Unix_error(Unix.EEXIST, "mkdir", "./www/static")
if the repository contains 2+ exercises (which is now the case of demo-repository).
Expected behavior
No error should happen.
As an aside, when this bug will be fixed:
we may want to add a -j 2 (or -j $(nproc)) flag in the biggest, learn-ocaml-corpus based, test of our CI to get a speed-up…
I have examined this a bit, and it attempts to use Lwt_unix.fork() which just can't be made to work reliably (see e.g. https://github.com/ocsigen/lwt/issues/812 and there are others like it) ; duplicating the Lwt state in the child process is just going to result in a mess, I am even puzzled that there appears to have been an Lwt version where it worked at some point.
So the option should just be disabled at this point. The more reasonable implementation could be:
- define a dedicated CLI command to compile a specific exercise
- run a whole process with that command inside the
Lwt_list.map_pon each exercise, using e.g.Lwt_process.exec. The difficulty is to make sure that all relevant options in the current context are properly passed to the child process.
Or, of course, we could wait for multicore which should make it fairly trivial :rocket:
Hi @AltGr, thanks a lot for investigating, and for these pointers.
Or, of course, we could wait for multicore which should make it fairly trivial
Yes, that sounds a good trade-off → and replacing the learn-ocaml build -j $n option with an error, or just with (a -j 1 fallback + a warning)… I don't know what'd be more sensible 🤔
BTW, I've just skimmed the open issues, and it seems this had been spotted some time ago; (just putting the URL here to link both issues: https://github.com/ocaml-sf/learn-ocaml/issues/158)