dune icon indicating copy to clipboard operation
dune copied to clipboard

dune overcaching with local opam switch

Open ygrek opened this issue 3 years ago • 10 comments

Expected Behavior

dune to correctly rebuild artifacts with different ocaml versions (more general : when anything affecting compilation changes in outside environment, not sure how feasible)

Actual Behavior

dune doesn't rebuild object files when compiler version changes provided that all paths remain the same (which is achievable with local opam switch or with the same named switch)

Reproduction

# make sure cache is enabled
$ cat ~/.config/dune/config 
(lang dune 2.0)
(cache enabled)
(cache-trim-period 1h)
(cache-trim-size 20GB)
$ rm -rf _build/ _opam ~/.cache/dune
$ opam sw create . 4.09.0
$ dune build
$ ls -hlA _build/default/stubs.o 
-r--r--r-- 2 user user 5.1K May 12 15:28 _build/default/stubs.o
$ mv _opam _opam_09
# NB keeping cache and build dir intact
$ opam sw create . 4.11.2
$ dune build
    ocamlopt test.exe (exit 2)
(cd _build/default && /home/user/dune_cache/_opam/bin/ocamlopt.opt -w @[email protected]@30..39@[email protected]@[email protected] -strict-sequence -strict-formats -short-paths -keep-locs -g -o test.exe stubs.cmxa -I . .test.eobjs/native/dune__exe__Test.cmx)
/usr/bin/ld: ./libstubs_stubs.a(stubs.o): in function `ml_test':
/home/user/dune_cache/_build/default/stubs.c:6: undefined reference to `caml_local_roots'
/usr/bin/ld: /home/user/dune_cache/_build/default/stubs.c:6: undefined reference to `caml_local_roots'
/usr/bin/ld: /home/user/dune_cache/_build/default/stubs.c:6: undefined reference to `caml_local_roots'
/usr/bin/ld: /home/user/dune_cache/_build/default/stubs.c:7: undefined reference to `caml_local_roots'
/usr/bin/ld: /home/user/dune_cache/_build/default/stubs.c:7: undefined reference to `caml_local_roots'
/usr/bin/ld: ./libstubs_stubs.a(stubs.o):/home/user/dune_cache/_build/default/stubs.c:9: more undefined references to `caml_local_roots' follow
collect2: error: ld returned 1 exit status
File "caml_startup", line 1:
Error: Error during linking (exit code 1)
$ ls -hlA _build/default/stubs.o 
-r--r--r-- 2 user user 5.1K May 12 15:28 _build/default/stubs.o # NB stubs.o is same size as with 4.09.0
$ rm -rf _build # cleaning build dir
$ dune build
# same error as above
[..]
/home/user/dune_cache/_build/default/stubs.c:6: undefined reference to `caml_local_roots'
[..]
$ ls -hlA _build/default/stubs.o 
-r--r--r-- 2 user user 5.1K May 12 15:28 _build/default/stubs.o # still same
$ rm -rf _build/ ~/.cache/dune/ # cleaning build dir and cache dir
$ dune build # build successful
$ ls -hlA _build/default/stubs.o 
-r--r--r-- 2 user user 12K May 12 15:29 _build/default/stubs.o # stubs.o changed

My interpretation: dune caches stuff by path. In this case installation changes from 4.09 to 4.11 with paths remaining unchanged. This makes dune think that stubs.o file doesn't need to be rebuilt and it fails (because e.g. 4.11.2 doesn't have caml_local_roots symbol, it is only present in 4.09.0, among other possible reasons). I understand dune uses hash of binary running the command as a key in hash so this issue doesn't affect ocaml compiler invocations, only affects C stubs (gcc indeed remains the same), because from the stubs.o compilation pov only include headers change, which are not tracked by dune.

NB this issue affects both dune cache and regular build re importance: in our usecase it makes dune cache unusable when switching between compiler versions (or having branches with different versions of ocaml on the same CI), the only workaround we have in such situation is dune clean and remove dune cache before every build (which obv defeats the purpose of cache)

Specifications

  • Version of dune (output of dune --version): 2.8.5
  • Version of ocaml (output of ocamlc --version): 4.09.0 and 4.11.2
  • Operating system (distribution and version): debian 10.8

Additional information

dune

(library
  (name stubs)
  (modules ())
  (foreign_stubs
    (language c)
    (names stubs)
    (flags -g -fPIC -O0 -Wextra -Wstrict-overflow=5 -Wno-implicit-fallthrough) # -O0 is important otherwise in this synthetic example caml_local_roots usage gets optimized out
  )
)

(executable (name test) (libraries stubs))

dune-project

(lang dune 2.8)
(use_standard_c_and_cxx_flags true) # this is important for this reproduce case because otherwise a set of C compiler invocation flags changes slightly between chosen ocaml versions

stubs.c

#include <caml/memory.h>
#include <caml/mlvalues.h>

value ml_test(value v)
{
  CAMLparam1(v);
  CAMLlocal1(res);
  res = v;
  CAMLreturn(res);
}

test.ml

external test : unit -> unit = "ml_test"

let () = test ()

ygrek avatar May 12 '21 19:05 ygrek

That's likely due to a missing dependency declaration. You can check that by running dune rules stubs.o and looking at the deps field.

If all dependencies are declared, then Dune shouldn't over-cache. Indeed, it digests the contents of both local and external dependencies. FTR, I implemented that at the very beginning as I was fed up of constantly having to do ocamlbuild -clean every time I was re-installing a dependency.

By the way, when I say "all dependencies" it can never really be all dependencies. Every single command might go and read a bunch of files in /etc, /lib, ... and we can't track all of these in Dune. So it's a best effort.

ghost avatar May 13 '21 09:05 ghost

((deps
  ((File (External /usr/lib/ccache/gcc))
   (File (In_build_dir _build/default/.merlin-exist-lib-stubs))
   (File (In_build_dir _build/default/stubs.c))
   (Sandbox_config ((disallow symlink) (disallow copy)))))
 (targets ((In_build_dir _build/default/stubs.o)))
 (context default)
 (action
  (chdir
   _build/default
   (run
    /usr/lib/ccache/gcc
    -g
    -fPIC
    -O0
    -Wextra
    -Wstrict-overflow=5
    -Wno-implicit-fallthrough
    -g
    -I
    /home/user/dune_cache/_opam/lib/ocaml
    -o
    stubs.o
    -c
    stubs.c))))

the missing dependency in this case is caml/mlvalues.h (and technically speaking all other recursively included header files), so it should be specified by hand? Maybe dune can add something as a dependency automatically (e.g. always depend on ocaml binary?)

ygrek avatar May 13 '21 14:05 ygrek

By the way, when I say "all dependencies" it can never really be all dependencies. Every single command might go and read a bunch of files in /etc, /lib, ... and we can't track all of these in Dune. So it's a best effort.

In essence this is actually what this issue is about. Caching is hard, with obvious choice of performance (over-caching) vs safety (under-caching) in such situations. Maybe for external sources it is better to leave caching to external cache tools (e.g. for C code ccache is doing good job without mistakes), and let dune only cache ocaml artifacts?

ygrek avatar May 13 '21 15:05 ygrek

the missing dependency in this case is caml/mlvalues.h (and technically speaking all other recursively included header files), so it should be specified by hand?

It seems like Dune shoud do it in this case. In particular, we pass the -I <stdlib-dir> explicitly here. However, we just pass -I ... without actually recording dependencies. IIRC, we have a mechanism to scan include dirs for header files. @snowleopard or @nojb does that ring a bell?

Maybe for external sources it is better to leave caching to external cache tools (e.g. for C code ccache is doing good job without mistakes), and let dune only cache ocaml artifacts?

That's an interesting idea.

ghost avatar May 17 '21 09:05 ghost

It seems like Dune shoud do it in this case. In particular, we pass the -I <stdlib-dir> explicitly here. However, we just pass -I ... without actually recording dependencies. IIRC, we have a mechanism to scan include dirs for header files. @snowleopard or @nojb does that ring a bell?

I think external -I paths are currently tracked non-recursively. Here is the corresponding TODO in the sources:

https://github.com/ocaml/dune/blob/927d1dce42373cb2cad8feef9d14a8242fafc097/src/dune_rules/foreign_rules.ml#L37-L42

This may be the core of this issue.

snowleopard avatar May 18 '21 10:05 snowleopard

That makes sense. What you describe and replacing the explicit -I <stdlib-dir> in the bellow code should fix the current ticket.

https://github.com/ocaml/dune/blob/fa4b72a6db1fede07cadf3f106d8b3522e7fd404/src/dune_rules/foreign_rules.ml#L168-L172

ghost avatar May 18 '21 11:05 ghost

My 2 cents:

  • One can ask C compilers to output dependency informations — see the -M option in https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html#Preprocessor-Options. That's a standard approach.

Maybe for external sources it is better to leave caching to external cache tools (e.g. for C code ccache is doing good job without mistakes), and let dune only cache ocaml artifacts?

That's an interesting idea.

At least originally, the key trick for ccache is to not cache the preprocessing, but only content-based caching of the translation from the preprocessor output to the compiler output.

Blaisorblade avatar Dec 03 '21 02:12 Blaisorblade

One can ask C compilers to output dependency informations — see the -M option in https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html#Preprocessor-Options. That's a standard approach.

Just to make sure we are on the same page, I think the idea is as follows. Instead of depending on every file inside external directories like ctx.stdlib_dir, find out the actual set of files used by gcc via the -M flag, and record dependencies only on that set. That sounds like a good idea but it's probably not enough for reproducible builds: in addition to the files returned by gcc, I think we also need to depend on the listing ctx.stdlib_dir, because if a new file appears there, gcc might want to pick it up.

@Blaisorblade Do you agree?

Implementation wise, this may be non-trivial because I think Dune engine doesn't provide a mechanism to specify dependencies in this way at the moment. But I'm all for changing Dune to make expressing dependencies like this possible.

snowleopard avatar Dec 03 '21 13:12 snowleopard

Good point; preprocessing depends on the listing of the whole search path. Which is a good argument for ccache or its approach.

Blaisorblade avatar Dec 03 '21 15:12 Blaisorblade

FTR, this is still an issue with dune 3.4 even though the code paths mentioned in the todo comment have been moved.

emillon avatar Jul 28 '22 08:07 emillon