🚀 feature request

Relevant Rules

gazelle

Description

This is the same as #1862, but for the data attribute on generated targets.

Describe the solution you'd like

# foo/bar/baz_test.py
import pathlib

# gazelle:include_data //foo/bar:test_data
TEST_DATA_DIR = pathlib.Path(__file__).parent / "test_data"

def test_main():
    assert TEST_DATA_DIR.exists()

Running Gazelle generates:

py_test(
    name = "baz_test",
    srcs = ["baz_test.py"],
    data = ["//foo/bar:test_data"],
)

Note that Gazelle does not generate the test_data target - it's still on the user to do that themselves.

Describe alternatives you've considered

The current method is to manually add data attributes.

Apr 23 '24 21:04 dougthor42

Note: one issue with this request is that Gazelle doesn't currently do anything with data. I assume we don't want Gazelle to clear out any manually-added data targets (thus forcing people to define data using the proposed annotation), so more work will be needed to ensure that this feature only ensures that the include_data values are present in data but does not remove any values.

This also conflicts with how the deps attribute is populated with Gazelle: if a Python import is removed, Gazelle will remove it from deps. This difference in behavior can lead to confusion and possible pollution of data, such as when someone removes a existing # gazelle:include_data //foo:bar annotation - the //foo:bar target will still be part of data after running Gazelle again.

Example of not removing manually added values:

# some existing BUILD.bazel file, possibly generated by Gazelle before adding the annotation to the python file
py_library(
    name = "foo",
    srcs = ["foo.py"],
    data = ["//manually/added/data:target"],
)

# foo.py
# gazelle:include_data //foo:bar

Running Gazelle

Want

py_library(
    name = "foo",
    srcs = ["foo.py"],
    data = [
        "//foo:bar",
        "//manually/added/data:target",
    ],
)

Don't want

py_library(
    name = "foo",
    srcs = ["foo.py"],
    data = ["//foo:bar"],
)

Apr 23 '24 22:04 dougthor42

I think the comment in python code controlling data is super neat. I would gate this feature under some boolean gazelle directive so that we can enable it everywhere at some point. In the next release it would be disabled and then once we know that it is working well, we enable it by default forcing users to either explicitly disable it or to update their code, then at some point we could remove the feature flag.

What do you think?

Apr 30 '24 14:04 aignas

Agreed, being gated by a feature flag is definitely the way to go.

# The default "now"
# gazelle:python_experimental_enable_annotation_include_data false

# The default "soon"
# gazelle:python_experimental_enable_annotation_include_data true

# Then either promote it to a non-experimental flag or remove it as you said.
# gazelle:python_enable_annotation_include_data true  # default
# gazelle:python_enable_annotation_include_data false

we enable it ... [and force users to] update their code

Can you elaborate on this? Are you saying that users must use # gazelle:include_data //foo/bar:test_data to add items to the data attribute?

May 08 '24 21:05 dougthor42

Thinking about this more, I do think that the data management through comments in code has to stay opt-in. We could make it opt out instead where you would have to add the keep comments to the data dependencies instead, but that may be a surprising behaviour to some existing codebases.

I am fine with just adding the opt-in feature flag and deciding on whether we want opt-out behaviour sometime in the future.

May 09 '24 23:05 aignas

Add a new python comment annotation that adds targets to `data`

🚀 feature request

Relevant Rules

Description

Describe the solution you'd like

Describe alternatives you've considered

Example of not removing manually added values:

Running Gazelle

Want

Don't want