rules_python icon indicating copy to clipboard operation
rules_python copied to clipboard

[gazelle] Support relative python imports

Open dougthor42 opened this issue 1 year ago • 5 comments

🚀 feature request

Relevant Rules

  • gazelle

Description

Sometimes a python code base uses relative imports:

from .baz import Baz
from .same_package_other_module import cat_in_the_hat
from ..one_level_up import thing1
from ...two_levels_up import thing2
from .. import moduleA

While I don't necessarily agree with this style, it's something that Python supports (see PEP 328) and thus I believe that Gazelle should support it too.

Right now, Gazelle simply ignores these imports. Here's an example:

example/
    __init__.py
    a.py
    b.py
    BUILD.bazel
# example/a.py
from . import b
del b
# example/b.py
pass
# example/BUILD.bazel
# as generated by Gazelle with "file" generation mode
load("@rules_python//python:defs.bzl", "py_library")

py_library(
    name = "a",
    srcs = ["a.py"],
    visibility = ["//:__subpackages__"],
)

py_library(
    name = "b",
    srcs = ["b.py"],
    visibility = ["//:__subpackages__"],
)

If we change a.py to use an absolute import:

# example/a.py
from example import b
del b

Gazelle will add the dep:

# example/BUILD.bazel
# as generated by Gazelle with "file" generation mode
load("@rules_python//python:defs.bzl", "py_library")

py_library(
    name = "a",
    srcs = ["a.py"],
    visibility = ["//:__subpackages__"],
    deps = [":b"],
)

...

Describe the solution you'd like

I'm somewhat familiar with the Gazelle code base. I believe it will be pretty easy (for the most part - see Issues below) below) to generate the full target path for the current file being processed, and then adjust the target based on how many dots . are in the relative import.

Because of the Issues, I'd imagine that this would be an experimental, opt-in feature for a while. It would be guarded by a directive:

# gazelle:experimental_allow_relative_imports true

Here are some examples of the logic, assuming a slightly more complex dir structure:

toplevel/
    __init__.py
    MODULE.bazel
    BUILD.bazel
    foo/
        __init__.py
        BUILD.bazel/
        bar.py
    baz.py
    example/
        __init__.py
        a.py
        b.py
        BUILD.bazel
# relative import of another module "b.py" in same package "example"
# from . import b
read "from . import b" in file "example/a.py"
what is our current target? "//example:a"
We have 1 dot, so trim off 1 component ":a". target_stem = "//example"
Add the import "b". dep target name = "//example:b"
# relative import of a different packages's module
# from ..foo import bar
read "from ..foo import bar" in file "example/a.py"
current target? "//example:a"
We have 2 dots, so trim off 2 components "example:a". target_stem = "//"
Add the name from the dots and from the import. dep_target_name = "//foo:bar
# relative import of the parent packages's module "baz"
# from .. import baz
read "from .. import baz" in file "example/a.py"
current target? "//example:a"
We have 2 dots, so trim off 2 components "example:a". target_stem = "//"
There's no name after the dots, so the thing we're importing is a module (probably).
dep_target_name = "//:baz"

The above examples assume file-level generation. More thought will be needed to support package-level generation.

Issues

It's difficult to know if from ..foo import bar should be //:foo or //foo:bar. Is bar a class/function? Or is bar another module? In the example above, it's a python module, but that's not always the case.

In the from .. import baz case, it's possible that baz is an identifier defined in the parent package's __init__.py, and thus the dep target would be //:__init__.

Describe alternatives you've considered

I tried convincing the developers to use absolute imports, but they just weren't having it :rofl:

So I also tried using the "package" generation mode, but the issue was that unit test files also used relative imports and would not include the dep.

The current workaround is to add # gazelle:include_dep //example:b annotations, but those are prone to diverging from the actual code and can be tedious to write for large projects.

dougthor42 avatar Sep 08 '24 03:09 dougthor42

Yeah, the issues you've mentioned is a real pain. That said, I think gazelle should be able to parse the python files for symbols and have an internal mapping somewhere of what symbols are in which targets.

To have something that works and is still experimental, you could implement it in stages:

  1. Allow using relative imports in the BUILD.bazel file in the gazelle: resolve directive - this may be a good way to test things manually, or have an escape hatch when the auto-detection does not work.
  2. Implement the relative import handling without handling the __init__ targets.
  3. If you went with the package generation mode, at least you would not need to handle the __init__ targets and then the semantics could be easier as you might get away without supporting the __init__ like targets.
  4. Somehow attempt to first parse the python files within the current workspace to understand what symbols are there - you only need to get the def and class and global variables. How hard could that be... famous last words...

What other things would we need to have around this issue?

aignas avatar Sep 08 '24 12:09 aignas

It's difficult to know if from ..foo import bar should be //:foo or //foo:bar. Is bar a class/function? Or is bar another module? In the example above, it's a python module, but that's not always the case.

In the from .. import baz case, it's possible that baz is an identifier defined in the parent package's init.py, and thus the dep target would be //:init.

Is this unique for relative imports? Normal python imports hit the same issue, which is resolved by the language extension walking up the module import path and looking in the gazelle rule index (["foo.bar.baz", "foo.bar", "foo"]). As long as the absolute module path to look for is constructed from the relative import, this should work.

I started working on relative imports myself, but abandoned it after hitting some issues with getting the tests to pass with other internal patches that I've made on the extension. I was able to convince our devs to migrate to absolute imports, so getting this landed upstream wasn't a priority.

I stripped out my changes to have just the relative import behavior and the tests appear to pass. If someone wants to take on getting additional tests in place and landing it, go ahead. All of the changes are on the HEAD commit of this branch.

Out-of-scope, but noteworthy for this change -- It would be much easier to maintain the parser if it used treesitter queries. The tests for the parser seem sufficient, but I'm not 100% confident that my change in the parser didn't break something. Using treesitter queries would simplify the code, but I'm not sure if there was a reason for not doing so when the extension was originally written.

alex-torok avatar Dec 03 '24 16:12 alex-torok

Is this unique for relative imports?

No, I don't think it's unique to relative imports. I do think that using a relative import for a module member is rare, though. At least, I've never seen from .. import SomeClassDefinedInInit.

I was able to convince our devs to migrate to absolute imports

Nice! Personally I'd be just fine with Gazelle not supporting relative imports (even though I opened this issue, haha). When projects get to be the size that Bazel/Gazelle is really helpful, relative imports cause, IMO, undue confusion.

It would be much easier to maintain the parser if it used treesitter queries.

TIL! I'd be happy to review a PR that switches to queries, though I admit I know very little about tree-sitter.

I'm not sure if there was a reason for not [using queries] when the extension was originally written.

That would be a question for @hunshcn who added it in #1895. It's possible it's just another "TIL" haha.

dougthor42 avatar Dec 03 '24 19:12 dougthor42

That would be a question for @hunshcn who added it in #1895. It's possible it's just another "TIL" haha.

Yep. There is no reason for not using treesitter queries. This is what I just learned today. I'm happy with this change.

hunshcn avatar Dec 04 '24 03:12 hunshcn

@dougthor42 should this issue be closed? I'm reading #2865 and looking at the unit test but not sure if this is supported yet for # gazelle:python_generation_mode file.

jimmyt857 avatar Oct 25 '25 01:10 jimmyt857