bazel-diff icon indicating copy to clipboard operation
bazel-diff copied to clipboard

Transitive dependency of repository rule do not change bazel-diff output hashes

Open tingilee opened this issue 2 years ago • 2 comments

In this minimal repo https://github.com/tingilee/repro-pip-bazel-diff/, we have found that transitive dependencies in the repository rules do not change the output hashes.

https://github.com/tingilee/repro-pip-bazel-diff/blob/master/BUILD#L15 :lib depends on testcontainers and testcontainers dependent on wrapt.

I ran the following at commit https://github.com/tingilee/repro-pip-bazel-diff/commit/697f9090ae87b97f8dad7edd94520c34ea89aaa8 and one commit prior.

bazel run @bazel_diff//:bazel-diff -- generate-hashes -w /Users/jacqueline.lee/repro-pip-bazel-diff -b /usr/local/bin/bazel /tmp/hashes.json

And then ran

bazel run @bazel_diff//:bazel-diff  -- -sh /tmp/starting_hashes_before_change.json -fh /tmp/starting_hashes_after_change.json -w /Users/jacqueline.lee/repro-pip-bazel-diff -b /usr/local/bin/bazel -o /tmp/impacted

The expectation is :lib should be in the impacted targets since its transitive dependency has changed, but this is not true.

tingilee avatar May 19 '22 21:05 tingilee

It seems to me that bazel-diff is relying on the definition of a repository rule to understand the dependency graph, rather than the result of executing the repository rule. In the case of rules_python, and I imagine many other rulesets, the dependency graph is produced by some code generation inside the repository rule implementation.

Taking this example repro above, we can see that the //external:all-targets targets that bazel-diff queries for, which define a repository rule, do not express dependencies. For example,

 % bazel query --output=build //external:my_deps_testcontainers
# /Users/alex.eagle/repros/pip_bazel-diff/WORKSPACE:31:13
whl_library(
  name = "my_deps_testcontainers",
  generator_name = "my_deps_testcontainers",
  generator_function = "install_deps",
  repo = "my_deps",
  requirement = "testcontainers==3.5.4     --hash=sha256:7a5583922cbb3da5ed255975f1927d8b29bf6a638323a85fd41b6e46c48ada8f",
...
)

doesn't express any dependency on wrapt.

If we instead used the result of executing the repository rule, then the dependency is present:

% bazel query --output=build @my_deps_testcontainers//:pkg    
# /private/var/tmp/_bazel_alex.eagle/a060b90d6e0425005f043993b0078cee/external/my_deps_testcontainers/BUILD.bazel:22:11
py_library(
  name = "pkg",
  tags = ["__PYTHON_RULES_MIGRATION_DO_NOT_USE_WILL_BREAK__", "pypi_name=testcontainers", "pypi_version=3.5.4"],
  generator_name = "pkg",
  generator_function = "py_library",
  generator_location = "/private/var/tmp/_bazel_alex.eagle/a060b90d6e0425005f043993b0078cee/external/my_deps_testcontainers/BUILD.bazel:22:11",
  deps = ["@my_deps_deprecation//:pkg", "@my_deps_docker//:pkg", "@my_deps_wrapt//:pkg"],
  data = ["@my_deps_testcontainers//:testcontainers-3.5.4.dist-info/LICENSE.txt", "@my_deps_testcontainers//:testcontainers-3.5.4.dist-info/METADATA", "@my_deps_testcontainers//:testcontainers-3.5.4.dist-info/WHEEL", "@my_deps_testcontainers//:testcontainers-3.5.4.dist-info/top_level.txt"],
 ...
)
# Rule pkg instantiated at (most recent call last):
#   /private/var/tmp/_bazel_alex.eagle/a060b90d6e0425005f043993b0078cee/external/my_deps_testcontainers/BUILD.bazel:22:11 in <toplevel>
#   /private/var/tmp/_bazel_alex.eagle/a060b90d6e0425005f043993b0078cee/external/rules_python/python/defs.bzl:52:22       in py_library

To prove this theory, I made a trivial edit in rules_python, so that the whl_library target from that first query will express the dependency:

diff --git a/python/pip_install/parse_requirements_to_bzl/__init__.py b/python/pip_install/parse_requirements_to_bzl/__init__.py
index 9519d62..8ee5da2 100644
--- a/python/pip_install/parse_requirements_to_bzl/__init__.py
+++ b/python/pip_install/parse_requirements_to_bzl/__init__.py
@@ -144,6 +144,7 @@ def generate_parsed_requirements_contents(
                 whl_library(
                     name = name,
                     requirement = requirement,
+                    deps = ["@my_deps_wrapt//:whl"] if not name.startswith("my_deps_wrapt") else [],
                     annotation = _get_annotation(requirement),
                     **_config
                 )
diff --git a/python/pip_install/pip_repository.bzl b/python/pip_install/pip_repository.bzl
index d9888a2..fe39908 100644
--- a/python/pip_install/pip_repository.bzl
+++ b/python/pip_install/pip_repository.bzl
@@ -408,6 +408,7 @@ whl_library_attrs = {
         ),
         allow_files = True,
     ),
+    "deps": attr.label_list(),
     "repo": attr.string(
         mandatory = True,
         doc = "Pointer to parent repo name. Used to make these rules rerun if the parent repo changes.",

and now we get the output we expected:

----------------------
Impacted Targets between c0328895e7adcf7492671680f2622b075a588c27 and 6d47b43f8beabf26bc7d1a4d16aed149e963a5ea:
//external:my_deps_websocket_client
//:requirements.txt
//:requirements.update
//external:my_deps_deprecation
//:requirements_test
//:bin
//:requirements
//:lib
//:requirements.in
//external:my_deps_idna
//external:my_deps_urllib3
//external:my_deps_charset_normalizer
//external:my_deps_testcontainers
//external:my_deps_requests
//external:my_deps_certifi
//external:my_deps_pyparsing
//external:my_deps
//external:my_deps_packaging
//external:my_deps_docker
//external:my_deps_wrapt

alexeagle avatar May 19 '22 21:05 alexeagle

FWIW this style of issue was one of our motivations for putting together a similar project at https://github.com/bazel-contrib/target-determinator - we put in quite a bit of care to follow deps through repository rules to all affecting files.

I manually tested out the supplied repro, and that implementation detected the change properly.

The target-determinator repo includes a test suite of edge-cases which runs those tests with bazel-diff, bazel-differ, and the target-determinator implementation (https://github.com/bazel-contrib/target-determinator/tree/main/tests/integration) - I'm not sure we have an exact test case for this case, but it could be nice to add a reduced version of this situation to that test suite to track improvements :)

We'd also absolutely love to collaborate with folks (particularly those who have contributed to bazel-diff) to consolidate all of our varied experiences together!

illicitonion avatar May 20 '22 17:05 illicitonion

This is expect behavior. If you want to include these transitive targets I would look into the -s=<seedFilepaths> option.

tinder-maxwellelliott avatar Mar 01 '23 14:03 tinder-maxwellelliott