opam-repo-ci
opam-repo-ci copied to clipboard
Add ocaml-system to the build matrix
This more or less mirrors the INSTALL_LOCAL feature from ocaml-ci-scripts (https://github.com/ocaml/ocaml-ci-scripts/blob/9cb3224392a8a0593e62e002747831ebc25edcf3/.travis-ocaml.sh#L230) added by @dra27 in https://github.com/ocaml/ocaml-ci-scripts/pull/202
I had to change quite a bit of code because the compiler version wasn't available where it is needed in the Opam_build module but I did my best to avoid changing too much for review purpose.
Isn't it better to modify the Docker base images so that a system compiler is available there as a variant? That would avoid the need for the OCaml build instructions to be included in opam-repo-ci.
Isn't it better to modify the Docker base images so that a system compiler is available there as a variant? That would avoid the need for the OCaml build instructions to be included in opam-repo-ci.
I'm not sure, wouldn't that just move the build instructions to docker-base-images?
I'm not sure, wouldn't that just move the build instructions to docker-base-images?
Yes, but then the images could just be reused by ocaml-ci as well. There's one shortcut here: the Docker base images already record what the system compiler for a distribution is, so we could modify the builder to simply use that system image if the distro/version match.
For example, Fedora 33 has OCaml 4.11.1 already, and so we could automatically get system builds (and save some disk space) by modifying ocaml-dockerfile to simply use the yum installed version rather than compiling a local one.
Note the INSTALL_LOCAL feature tests any version of OCaml as a system compiler. My only concern is that it will increase the image size (having both a system compiler in /usr/local and ocaml-base-compiler built at once - or do the images automatically dedup?). IIRC it was a follow-up from the camlp5 debacle
The images do not automatically dedup per-file. So back to my question: do we need to test the system compiler with a particular compiler, or would it having it tested across distros as they happen to have coincident package manager versions be sufficient? We have all the information in dockerfile_distro: https://github.com/avsm/ocaml-dockerfile/blob/master/src-opam/dockerfile_distro.ml#L119
The main thing is not the specific version it's that it's a recent version (or at least that it can be made to be a recent version).
I'm not sure I'm following your question correctly: using Fedora's 4.11.1 compiler won't save any disk space - there will still be 4.11.1 installed in /home/opam/.opam/4.11 in the image and 4.11.1 installed by dnf in /usr (instead of /usr/local if we build it)?
The base images must have consistent opam switches (we definitely cannot at this stage have some images which use ocaml-system.4.11.1 and others which use ocaml-base-compiler.4.11.1). Even worse, I don't think updating the OS should be allowed to change the compiler package type: e.g. ocaml/opam:debian-ocaml-4.11.1 must always be ocaml-base-compiler.4.11.1, and not suddenly switch to ocaml-system.4.11.1 as Bullseye gets released and we move the debian tag forwards.
For opam-repo-ci, given that Fedora updates to new OCaml releases quickly, we could use Fedora for the system compiler test, which would have the benefit of simplifying the changes in opam-repo-ci since it then just becomes dnf install ocaml. That change could indeed be done in docker-base-images, but only by having a new image - for example ocaml/opam:fedora-ocaml-system (i.e. you don't specify the version of OCaml).
What are we trying to test here? I see a couple of possibilities:
-
That a package's install/uninstall scripts work with a non-opam ocaml. That seems reasonable for opam-repo-ci, but probably not worth it for ocaml-ci (which only supports dune projects, which should always install properly).
-
That a package will work with the distribution's version of OCaml. That could be useful for ocaml-ci too, but I suspect that for many packages the distribution's version will be too old to be useful. Probably only useful for people wanting to submit their packages to the distribution repository. In that case you really want to be testing against the distribution's dev version, though. But for that it would be better for us to check that things work with the latest OCaml and let the distributions worry about updating whatever packages are needed to copy opam-repository's working solution.
I also wonder how many bugs we're expecting to catch with this testing. Sounds like it won't find anything with modern (dune) software anyway.
It's the former, yes - it's that it works correctly with a compiler installed outside of .opam. It doesn't catch many bugs, but when this went wrong before it was - and continues to be - quite painful. The trigger for the original change was OCaml's former num package installing to the brew-installed OCaml on macOS (which is writeable by the user). Fixing the package was a nightmare as was fixing people's brew installations and we still get reports about it from time-to-time several years later! The issue would never have arisen with system-compiler testing because Linux (where the distribution is not writeable!) would have caught the error.
I agree it doesn't seem that useful for ocaml-ci - especially given Dune's trajectory towards ubiquity!
I also wonder how many bugs we're expecting to catch with this testing. Sounds like it won't find anything with modern (dune) software anyway.
Mh, it's not quite true in opam-repository. This test found a bug in some usage of dune install (some packages need that), which tries to install to /usr/ because this command of dune uses ocamlfind and the only ocamlfind it could find was the one installed by the system.
A fairly simple option to avoid yet another package variant is:
- make all base compiler images system compilers, by modifying ocaml-dockerfile to emit ocaml-system runes if there are no variants specified. It's easy to ./configure && make && make install outside of opam.
- all variant compilers (e.g. flambda or fp) are opam switches as currently used.
System compilers are strictly more pessimal than opam switches in terms of finding bugs (that is, we've not found a bug where something installed in a system compiler and it didnt work in an opam switch for some years -- at least since +camlp4 disappeared).
OK, we've stung ourselves by dropping this one. I like @avsm's idea, but possibly it would be less invasive to do it the other way around - i.e. keep the main compilers as they are, but do the flambda test with a "system" flambda compiler? Most users are using ocaml-base-compiler so it seems better that the main compiler test for a given version matches what most users are doing.
The question then is whether we should have the flambda docker-base-image always a system compiler or have two versions - one system compiler and one not. For the sake of consistency, I would prefer ocaml/opam:debian-10-ocaml-4.11-flambda et al to remain ocaml-base-compiler and add ocaml/opam:debian-10-ocaml-4.11-system-flambda for opam-repo-ci to use. However, that's just a docker-base-images question - I have no problem at all with the flambda test in opam-repo-ci also being a system compiler test, which adds the test without increasing the matrix size.
Given the state of the discussion here and the the fact that we've gotten along without this change for 4 years, I'm assuming that this changeset is no longer needed, but without doing the careful archaeology to be entirely sure.
If we still need work here or on ocaml-dockerfile to address a persistent problem, please open an issue explaining the need. Thanks!