RFC: DUNE_IGNORE variable to ignore some source files
The dune build system for github/ocaml is very painful to use for everyday compiler development. I would like to make it less painful to use, and I think a reasonable approach would be to implement a couple simple tweaks in dune to make it more pleasant. (The compiler is an unusual project building in a special environment, so maybe it deserves a couple specific tweaks in dune).
This PR is not intended to be merged as-is, but rather to serve as a starting point for discussion: I'm willing to do a bit of work to make dune less painful; what changes would you be willing to accept?
In the PR I add a DUNE_IGNORE environment variable to ask it to ignore some files in the source directories -- directed by a glob pattern.
Before:
$ dune build
Error: Multiple rules generated for _build/default/stdlib/std_exit.cmo:
- stdlib/dune:15
- file present in source tree
-> required by alias default
Hint: rm -f stdlib/std_exit.cmo
Error: Multiple rules generated for _build/default/otherlibs/str/str.cma:
- otherlibs/str/dune:15
- file present in source tree
-> required by alias default
Hint: rm -f otherlibs/str/str.cma
Error: Multiple rules generated for _build/default/otherlibs/unix/unix.cma:
- otherlibs/unix/dune:15
- file present in source tree
-> required by alias default
Hint: rm -f otherlibs/unix/unix.cma
After:
$ DUNE_IGNORE="*.{cmo,cma,cmx,cmxa,o,a}" dune build
# no errors
Much more pleasant!
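To see which files such a glob would hide, the stale Make outputs can be listed by hand. The following is a minimal shell sketch, not part of dune or of this PR: list_make_outputs is a hypothetical helper name, and it assumes the extensions from the glob above and that dune's own _build directory should be skipped.

```shell
# Hypothetical helper (not part of dune): list the Make build outputs under
# a directory that the glob *.{cmo,cma,cmx,cmxa,o,a} is meant to hide.
list_make_outputs() {
  # Prune _build so that dune's own outputs are never reported.
  find "${1:-.}" -name _build -prune -o -type f \
    \( -name '*.cmo' -o -name '*.cma' -o -name '*.cmx' \
       -o -name '*.cmxa' -o -name '*.o' -o -name '*.a' \) -print
}
```

Running list_make_outputs from the root of the compiler checkout after a make build should print the files that currently trigger the "Multiple rules generated" errors.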
Context
Currently the only way to get Merlin support for the compiler sources is to use its dune build system. See github/ocaml/HACKING.adoc#using-merlin.
One major limitation in my experience is that, because the dune build is incomplete, we need to keep building using the makefile for development, testing, etc. So currently the workflow is as follows:
- start hacking on the compiler using make
- suddenly you want Merlin type feedback, you need to make clean; dune build @libs
- continue hacking on the compiler using make
- oh no, you changed a .mli file somewhere and the types in _build are all wrong! you need to make clean; dune build @libs
The need to run make clean in between two dune invocations is a massive pain. This is mostly solved by the proposed feature.
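With the variable proposed in this PR, the loop above might shrink to the sketch below. DUNE_IGNORE exists only in this PR, not in released dune, and merlin_refresh is a hypothetical wrapper name chosen for illustration.

```shell
# Sketch only: DUNE_IGNORE is the variable proposed in this PR, and
# merlin_refresh is a hypothetical wrapper name.
merlin_refresh() {
  # Hide the Make outputs from dune instead of running "make clean" first.
  DUNE_IGNORE="*.{cmo,cma,cmx,cmxa,o,a}" dune build @libs
}

# Usage: keep building with make, and refresh _build whenever Merlin needs it:
#   make && merlin_refresh
```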
(Note: it's important to ignore the build outputs that are present in source directories, and not, for example, to use them as valid outputs instead of the other rules to build them, because they are not built by the same compiler as Dune. Dune uses the compiler from the opam switch, while Make uses the bootstrap or in-development compilers.)
What would be an acceptable interface?
I understand that such "ignore rules" are not well-regarded by Dune designers. I wish they were available, but they could be marked by a prefix to discourage their usage, or require a certain explicit marker in dune-project (for example a special_hacks_for_the_compiler stanza), or be hidden within an environment variable to discourage usage even more. What would be acceptable to you?
What would be an acceptable implementation?
I can see that other environment variables are surrounded by much more ceremony to guarantee that they are recorded as build dependencies and do not break the memoization machinery. I'm happy to try to implement this for DUNE_IGNORE as well, if there is a consensus that this is the preferred user interface for this feature. Any other implementation advice is welcome.
The feature request is reasonable, but I'd rather such a glob live in the dune file than in an environment variable.
We already have a dirs stanza for ignoring directories, so it seems like the most intuitive way to support this feature is through a files stanza? As a first step, I would add such a stanza, and then think about a way of making it the default for every directory in the project (perhaps in an ad-hoc way).
See this ticket for more discussion https://github.com/ocaml/dune/issues/7811
Thanks! I'm happy to have a look at implementing (files ...), but could there be a way to make this a recursive setting that applies to subdirectories as well? This could also be useful for (dirs ...), even though it is probably less important there.
Ideas:
- dirs* and files* could be interpreted as recursive/transitive -- applying to the whole filesystem tree reachable from the current point.
- there could be a stanza subtree: (subdir foo (dirs :standard \ bar)) removes foo/bar, and (subtree (dirs :standard \ bar)) removes **/bar.
- or maybe via glob patterns, (dirs :standard \ **/bar)?
(My preference would go to a subtree stanza -- or deep, transitive, etc.)
I was thinking of something a little simpler, perhaps:
(repeat_in_subdirs
...)
This would evaluate ... for every subdirectory, starting from the current directory. The evaluation of such a stanza would indeed be similar to subdir.
Your other suggestions seem fine as well. I'd say go for whatever you find simplest to implement. I'd suggest that you look over the implementation of subdir to get an idea of the kind of changes each option would take.
Alternatively, a (dirs) or (files) stanza could be accepted in dune-project and/or dune-workspace to set default rules.
Note: I have not had any time to look at this again, but this remains a feature that would be very useful for using dune in the compiler for Merlin support (in parallel to the existing, un-hygienic Make build). If someone wants to have a look at this, they would be warmly welcome, otherwise I will just wait to be annoyed by it again to give more work cycles in this direction.
I haven't had time to revisit this PR, but at the same time it would be very convenient if it (or something like it) were available in Dune today to work on the compiler with Merlin support.
I wonder if one of the following would be possible?
1. Could this PR maybe be merged as-is (with an environment variable), which is admittedly a hack, and then maybe refined in the future? (I promise to not complain if the environment variable gets removed when dune-project support becomes available.)
2. Could someone who is more familiar with Dune development, for whom the addition of new stuff in dune-project is easy work, do the work of tweaking the PR to use a (files <glob>) stanza in dune-project instead of an environment variable?
I'll try to give it a shot. But since I haven't actually looked at the code yet, I can't promise something unexpected won't come up :)
Thanks! (Indeed it is not clear to me that configuration information is available at the place I tweaked in the code, and I don't know how easy and elegant it would be to propagate configuration information until that point. It may be that the same filtering can be performed somewhere else, but I tried a couple things before finding a place that works as intended.)
See #12879 for my attempt.