haskell_library may cause extra rebuilds when ABI is not changed
Describe the bug
Modification of Haskell files used in a haskell_library causes rebuild of dependent libraries, even though the library API/ABI is not changed.
To Reproduce
1. Run git clone https://github.com/tweag/rules_haskell.git
2. Go to the rules_haskell/tutorial directory.
3. Put the following files into the current directory (shown below): A.hs, B.hs, and BUILD.bazel.
4. Run bazel build :mylib2
5. Update the code in A.hs: change the definition aConst = 1 to aConst = 2.
6. Run bazel build :mylib2
Expected behavior
I expect that only :mylib1 is rebuilt. But both libraries are actually rebuilt: we see warnings from both Haskell modules. I suppose that step 5 does not change the ABI of mylib1.
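One way to see which inputs Bazel considers changed is to run the second build with Bazel's --explain and --verbose_explanations flags. The sketch below wraps steps 4-6 in a shell function. The function name repro_rebuild and the log file name explain.log are made up for illustration, and the sed call assumes GNU sed and an A.hs that contains the line aConst = 1 exactly as attached.

```shell
#!/usr/bin/env bash
# repro_rebuild: hypothetical helper, meant to be run from the directory that
# holds A.hs, B.hs and BUILD.bazel. It performs steps 4-6 and records why
# actions were re-executed during the second build.
repro_rebuild() {
    # Step 4: initial build, so that every action is cached.
    bazel build :mylib2 || return 1

    # Step 5: implementation-only change in A.hs (GNU sed in-place syntax).
    sed -i 's/^aConst = 1$/aConst = 2/' A.hs || return 1

    # Step 6: rebuild. --explain writes the reason for each executed action
    # to the given file; --verbose_explanations adds more detail.
    bazel build --explain=explain.log --verbose_explanations :mylib2
}
```

After calling repro_rebuild, explain.log should list the actions that were executed again, which may help to tell whether the mylib2 compile action was triggered by a changed output of mylib1.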
Environment
- OS name + version: Linux Ubuntu 20.04.1
- Bazel version: 5.1.1
- Version of the rules: 487535b8b10d496d8aa11aa4e9f1c91da476cf61 (current master)
Additional context
If we compare MD5 sums of build files, we will see that no files related to mylib2 have their content modified. All modified files relate to mylib1 only.
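The comparison can be done roughly as follows. This is only a sketch: the helper names snapshot_md5 and changed_files are made up, it assumes GNU find, md5sum, diff and awk are available (as on Ubuntu), and it assumes no file path contains whitespace.

```shell
#!/usr/bin/env bash
# snapshot_md5 DIR OUT: write one "md5  path" line per file under DIR into OUT,
# sorted by path. OUT should live outside DIR.
snapshot_md5() {
    local dir="$1" out="$2"
    # -L follows symlinks, since Bazel output trees are reached through symlinks.
    ( cd "$dir" && find -L . -type f -exec md5sum {} + | sort -k2 ) > "$out"
}

# changed_files BEFORE AFTER: print every path whose checksum differs between
# the two snapshots, or that is present in only one of them.
changed_files() {
    # diff exits with status 1 when the files differ; that is expected here.
    { diff "$1" "$2" || true; } | awk '/^[<>]/ { print $3 }' | sort -u
}

# Intended use around steps 4-6 (not executed here):
#   bazel build :mylib2 && snapshot_md5 bazel-bin /tmp/before.md5
#   ... edit A.hs ...
#   bazel build :mylib2 && snapshot_md5 bazel-bin /tmp/after.md5
#   changed_files /tmp/before.md5 /tmp/after.md5
```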
If we build the attached libraries using stack, we will see that no modules of mylib2 are rebuilt (although the configure stage is still performed). I guess that rules_haskell should behave in a similar way.
If we don't introduce implementation changes in A.hs, but only whitespace changes, only mylib1 is rebuilt. This is good, but not an interesting case, since no build products have their content changed.
My project suffers heavily from extra rebuilds by bazel, although most source code modifications relate to library implementations and do not change their ABI. Unfortunately, using haskell_module or gazelle is not an option for me.
Attached files are below.
A.hs:
{-# OPTIONS_GHC -Wmissing-signatures #-}
module A where
aConst = 1
B.hs:
{-# OPTIONS_GHC -Wmissing-signatures #-}
module B where
import A (aConst)
bConst = aConst + 1
BUILD.bazel:
load(
    "@rules_haskell//haskell:defs.bzl",
    "haskell_library",
    "haskell_toolchain_library",
)

haskell_toolchain_library(name = "base")

haskell_library(
    name = "mylib1",
    srcs = ["A.hs"],
    visibility = ["//visibility:public"],
    deps = [":base"],
)

haskell_library(
    name = "mylib2",
    srcs = ["B.hs"],
    visibility = ["//visibility:public"],
    deps = [":base", ":mylib1"],
)
Unfortunately, using haskell_module or gazelle is not an option for me.
What's preventing you from adopting these? As described in the corresponding blog post, haskell_module enables recompilation avoidance.
@aherrmann there are some points in the documentation of gazelle_haskell_modules that seem to be obstacles to adopting it:
1. gazelle_haskell_modules changes BUILD files, which are stored in git. Should we commit the changes?
   - Suppose we do. Keeping frequently modified auto-generated text in git results in annoying merge conflicts, cluttered diffs, and compilation errors caused by obsolete configuration when updates are left uncommitted by mistake. Our project is large and actively developed by many people, so these troubles look unavoidable.
   - If we don't commit the changes, we have to remove the updates made by gazelle from BUILD files before git commit. It's a distracting and error-prone activity, which will probably result in mistakenly committed data that shouldn't be committed and, again, merge conflicts and build failures.
     It would be fixed if gazelle could generate additional BUILD files instead of updating existing ones. We would add these auto-generated files to .gitignore. Is it possible?
2. It requires the developer to invoke bazel run //:gazelle_haskell_modules manually at all changes in module imports.
   - Is it fast enough to invoke frequently in a large project (~200 BUILD files, ~700 .hs files)?
   - It looks inconvenient to run this command manually, since module imports are modified very often. Developers may forget to run gazelle, resulting in a higher rate of build failures because of obsolete configuration, which is annoying. It would be easier to invoke gazelle before every ordinary bazel command, but it's cumbersome: the resulting command bazel run //:gazelle_haskell_modules && bazel build MYPACKAGE is rather long. I'd prefer to make //:gazelle_haskell_modules a dependency of all packages in our project, so that a developer wouldn't have to type it manually. Is it possible? I guess it's not.
3. We use custom bazel rules and macros, built on top of haskell_library, in order to keep common defaults (Haskell extensions, ghc options) and add auto-generated hlint tests, REPL targets etc. Is it possible to let gazelle_haskell_modules know about these rules, so that it could read and update them? It would be extremely inconvenient to migrate back to plain haskell_library.
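For the long command mentioned in point 2, a small shell wrapper is one possible workaround. This is only a sketch: the function name bazel_with_gazelle is made up, and it assumes a //:gazelle_haskell_modules target is already defined in the workspace.

```shell
#!/usr/bin/env bash
# bazel_with_gazelle: hypothetical wrapper that refreshes the generated
# targets first and then forwards all arguments to bazel unchanged.
bazel_with_gazelle() {
    # Stop early if the gazelle run fails, so that we do not build against
    # a half-updated configuration.
    bazel run //:gazelle_haskell_modules || return 1
    bazel "$@"
}
```

With this in a shell profile, bazel_with_gazelle build MYPACKAGE would stand in for the longer command. It does not remove the cost of the extra gazelle run on every invocation.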
Point 3 seems to be an absolute obstacle; the other points are less critical, but they affect developer experience in a very bad way. This is why I'm thinking about fixing haskell_library. Is it possible to fix it in a way similar to haskell_module (using ABI hashes etc.)?
Re 1. Gazelle is indeed intended to be used in a way where you check in the generated BUILD files. Gazelle is designed to update the files in place, leave alone the parts that don't concern it, and respect manual edits when indicated, e.g. using # keep comments. The noise should be mostly the same noise you would get if you were to define haskell_module targets manually.
Re 2. On automation, there is autogazelle, which does something like what you describe. In terms of performance, from past experience it completed in about 3-6 seconds on a project with nearly a thousand modules.
Re 3. Gazelle has a builtin directive called map_kind to support this use-case. So, this shouldn't be a problem.