rules_haskell
.dump-simpl files from GHC are missing
Describe the bug
GHC has an option to dump various internal representations to files. However, when invoked through bazel/rules_haskell, these files are missing — not just in the source directory, but in the bazel directories as well.
To Reproduce
- Pass ["-ddump-simpl", "-ddump-to-file"] as GHC flags in bazel
- bazel build <target>
- find -L . ~/.cache/bazel* -name '*.dump-simpl'
Expected behavior
Expected to see the files written by GHC. Instead, find doesn't find anything.
Environment
- OS name + version: NixOS 20.09
- Bazel version: bazel 3.3.1- (@non-git)
- Version of the rules: aabeedc18f5e5db030ca1aa0c10a7dc14e4a4a55
Afaik, the files are written inside the Bazel sandbox's virtual filesystem, which does not map to the real filesystem. That's why you won't find them using find.
One quick and dirty solution may be to pass an absolute path to -ddump-to-file if that's possible. I've used that hack many times in other contexts.
A more robust solution may be to change the haskell_library/binary rules so that they generate an output with these files if they exist. A generic solution would be appreciated (for example, to dump the .mix files generated by coverage, or the stdout/stderr of the GHC processes).
I'd really like a generic Bazel solution for that: something that overrides a rule and specifies which files generated by the rule must be saved (though selecting the inner action will be difficult).
> A more robust solution may be to change the haskell_library/binary rules so that they generate an output with these files if they exist.
To clarify, Bazel has no mechanism for optional outputs. Undeclared outputs are always dropped for sandboxed builds; declared outputs are always expected, and it is a build error if they are not generated. We'd either need to parse the GHC flags to decide what additional outputs to expect, or make additional outputs user configurable, e.g. through a build setting. The latter seems preferable in terms of flexibility and maintenance cost. Though, as you say, it's difficult to know which action to attach the outputs to. An added difficulty is that the path of additional outputs often depends on the module name, which we don't know in rules_haskell (maybe another point in favor of https://github.com/tweag/rules_haskell/pull/1281). We could fall back to directory outputs, but I'm not sure whether that's applicable to all additional GHC outputs.
Relatedly, https://github.com/tweag/rules_haskell/issues/1415 is a very similar issue asking for .hie files.
I've recently needed access to the dump-hi files emitted from GHC and there were two workarounds I used to get what I wanted that might help others:
- You can pass --sandbox_debug to the bazel build command that will stop it from cleaning up any intermediate files in the sandbox when the build is complete. If you only need the dump files for a one-off process this might be suitable. Because none of the sandbox files are removed between runs, it's possible that the sandbox will have stale versions of the dump files you're after, so I'd first do a bazel clean --expunge before a single invocation of bazel build <some_target> --sandbox_debug to give you a clean set of files.
- You can pass the GHC compiler flag -dumpdir, which allows you to set an absolute directory as the location where the dump files will be placed. However, by default, the Bazel Linux sandbox will prevent writing to any location outside of the sandbox, so you'll have to combine this compiler flag with the bazel build flag --sandbox_writable_path (https://docs.bazel.build/versions/master/command-line-reference.html#flag--sandbox_writable_path) to allow GHC to dump the files outside of the sandbox.
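The first workaround (--sandbox_debug) could be scripted roughly as follows. This is only a sketch: the function names list_ghc_dumps and build_and_list_dumps are made up for illustration, it assumes the target already passes -ddump-to-file plus a dump flag to GHC, and it assumes the preserved sandbox lives under the sandbox directory of Bazel's output base, which may differ if --sandbox_base is set.

```shell
# List GHC dump files below a directory.
#   $1: directory to search
#   $2: dump suffix without the leading dot (defaults to dump-simpl)
# -L follows symlinks, like the find command in the issue description.
list_ghc_dumps() {
  dumps_root="$1"
  dumps_suffix="${2:-dump-simpl}"
  find -L "$dumps_root" -type f -name "*.${dumps_suffix}" 2>/dev/null
}

# Rebuild from a clean state with the sandbox preserved, then list the dumps.
# Assumption: bazel is on PATH and the sandbox is kept under
# <output_base>/sandbox (the default when --sandbox_base is not set).
build_and_list_dumps() {
  build_target="$1"
  bazel clean --expunge || return 1
  bazel build "$build_target" --sandbox_debug || return 1
  list_ghc_dumps "$(bazel info output_base)/sandbox" "${2:-dump-simpl}"
}
```

For example, build_and_list_dumps //some:target dump-hi would print the paths of any .dump-hi files left behind in the sandbox, where //some:target stands in for your own label.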
@crossleydominic great tips. Would be nice to have this in the use cases documentation.
> you'll have to combine this compiler flag with the bazel build flag --sandbox_writable_path
Yes. Although, I believe /tmp/ is mounted read-write inside the sandbox. So a -dumpdir under /tmp might not need --sandbox_writable_path.
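A possible way to wire up the -dumpdir workaround, as a sketch only: the helper names sandbox_flag_for and build_with_dumpdir are hypothetical, the target is assumed to already pass -ddump-to-file and -dumpdir <dir> to GHC with the same directory, and the special-casing of /tmp rests on the unverified belief above that /tmp is writable inside the sandbox.

```shell
# Print the extra bazel flag needed for GHC to write into the given directory.
# Prints nothing for paths under /tmp, on the assumption (not verified here)
# that /tmp is already mounted read-write inside the sandbox.
sandbox_flag_for() {
  case "$1" in
    /tmp|/tmp/*) ;;
    /*) printf '%s\n' "--sandbox_writable_path=$1" ;;
    *) echo "dump directory must be absolute: $1" >&2; return 1 ;;
  esac
}

# Build a target whose GHC flags already contain: -dumpdir <dumpdir>
#   $1: bazel target label
#   $2: absolute dump directory (same path as given to -dumpdir)
build_with_dumpdir() {
  dd_target="$1"
  dd_dir="$2"
  dd_flag="$(sandbox_flag_for "$dd_dir")" || return 1
  mkdir -p "$dd_dir" || return 1
  # dd_flag is left unquoted on purpose so that an empty value adds no
  # argument; a directory containing spaces would need different handling.
  bazel build "$dd_target" $dd_flag
}
```

With this, build_with_dumpdir //some:target /tmp/ghc-dumps would run a plain bazel build, while a directory outside /tmp would add --sandbox_writable_path for it. If the /tmp assumption does not hold on your setup, passing the flag unconditionally is the safer choice.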
This feels more like a limitation of Bazel that can be worked around than a property of rules_haskell. @aherrmann suggests some directions for implementation, but I agree with the assessment that the maintenance cost/usefulness trade-off pushes this more towards the "document workarounds" approach in the short term.