[gazelle] Support relative python imports
🚀 feature request
Relevant Rules
- gazelle
Description
Sometimes a python code base uses relative imports:
from .baz import Baz
from .same_package_other_module import cat_in_the_hat
from ..one_level_up import thing1
from ...two_levels_up import thing2
from .. import moduleA
While I don't necessarily agree with this style, it's something that Python supports (see PEP 328) and thus I believe that Gazelle should support it too.
Right now, Gazelle simply ignores these imports. Here's an example:
example/
__init__.py
a.py
b.py
BUILD.bazel
# example/a.py
from . import b
del b
# example/b.py
pass
# example/BUILD.bazel
# as generated by Gazelle with "file" generation mode
load("@rules_python//python:defs.bzl", "py_library")
py_library(
name = "a",
srcs = ["a.py"],
visibility = ["//:__subpackages__"],
)
py_library(
name = "b",
srcs = ["b.py"],
visibility = ["//:__subpackages__"],
)
If we change a.py to use an absolute import:
# example/a.py
from example import b
del b
Gazelle will add the dep:
# example/BUILD.bazel
# as generated by Gazelle with "file" generation mode
load("@rules_python//python:defs.bzl", "py_library")
py_library(
name = "a",
srcs = ["a.py"],
visibility = ["//:__subpackages__"],
deps = [":b"],
)
...
Describe the solution you'd like
I'm somewhat familiar with the Gazelle code base. I believe it will be pretty easy (for the most part - see Issues below) below) to generate the full target path for the current file being processed, and then adjust the target based on how many dots . are in the relative import.
Because of the Issues, I'd imagine that this would be an experimental, opt-in feature for a while. It would be guarded by a directive:
# gazelle:experimental_allow_relative_imports true
Here are some examples of the logic, assuming a slightly more complex dir structure:
toplevel/
__init__.py
MODULE.bazel
BUILD.bazel
foo/
__init__.py
BUILD.bazel/
bar.py
baz.py
example/
__init__.py
a.py
b.py
BUILD.bazel
# relative import of another module "b.py" in same package "example"
# from . import b
read "from . import b" in file "example/a.py"
what is our current target? "//example:a"
We have 1 dot, so trim off 1 component ":a". target_stem = "//example"
Add the import "b". dep target name = "//example:b"
# relative import of a different packages's module
# from ..foo import bar
read "from ..foo import bar" in file "example/a.py"
current target? "//example:a"
We have 2 dots, so trim off 2 components "example:a". target_stem = "//"
Add the name from the dots and from the import. dep_target_name = "//foo:bar
# relative import of the parent packages's module "baz"
# from .. import baz
read "from .. import baz" in file "example/a.py"
current target? "//example:a"
We have 2 dots, so trim off 2 components "example:a". target_stem = "//"
There's no name after the dots, so the thing we're importing is a module (probably).
dep_target_name = "//:baz"
The above examples assume file-level generation. More thought will be needed to support package-level generation.
Issues
It's difficult to know if from ..foo import bar should be //:foo or //foo:bar. Is bar a class/function? Or is bar another module? In the example above, it's a python module, but that's not always the case.
In the from .. import baz case, it's possible that baz is an identifier defined in the parent package's __init__.py, and thus the dep target would be //:__init__.
Describe alternatives you've considered
I tried convincing the developers to use absolute imports, but they just weren't having it :rofl:
So I also tried using the "package" generation mode, but the issue was that unit test files also used relative imports and would not include the dep.
The current workaround is to add # gazelle:include_dep //example:b annotations, but those are prone to diverging from the actual code and can be tedious to write for large projects.
Yeah, the issues you've mentioned is a real pain. That said, I think gazelle
should be able to parse the python files for symbols and have an internal
mapping somewhere of what symbols are in which targets.
To have something that works and is still experimental, you could implement it in stages:
- Allow using relative imports in the
BUILD.bazelfile in thegazelle: resolvedirective - this may be a good way to test things manually, or have an escape hatch when the auto-detection does not work. - Implement the relative import handling without handling the
__init__targets. - If you went with the
packagegeneration mode, at least you would not need to handle the__init__targets and then the semantics could be easier as you might get away without supporting the__init__like targets. - Somehow attempt to first parse the python files within the current workspace
to understand what symbols are there - you only need to get the
defandclassand global variables. How hard could that be... famous last words...
What other things would we need to have around this issue?
It's difficult to know if from ..foo import bar should be //:foo or //foo:bar. Is bar a class/function? Or is bar another module? In the example above, it's a python module, but that's not always the case.
In the from .. import baz case, it's possible that baz is an identifier defined in the parent package's init.py, and thus the dep target would be //:init.
Is this unique for relative imports? Normal python imports hit the same issue, which is resolved by the language extension walking up the module import path and looking in the gazelle rule index (["foo.bar.baz", "foo.bar", "foo"]). As long as the absolute module path to look for is constructed from the relative import, this should work.
I started working on relative imports myself, but abandoned it after hitting some issues with getting the tests to pass with other internal patches that I've made on the extension. I was able to convince our devs to migrate to absolute imports, so getting this landed upstream wasn't a priority.
I stripped out my changes to have just the relative import behavior and the tests appear to pass. If someone wants to take on getting additional tests in place and landing it, go ahead. All of the changes are on the HEAD commit of this branch.
Out-of-scope, but noteworthy for this change -- It would be much easier to maintain the parser if it used treesitter queries. The tests for the parser seem sufficient, but I'm not 100% confident that my change in the parser didn't break something. Using treesitter queries would simplify the code, but I'm not sure if there was a reason for not doing so when the extension was originally written.
Is this unique for relative imports?
No, I don't think it's unique to relative imports. I do think that using a relative import for a module member is rare, though. At least, I've never seen from .. import SomeClassDefinedInInit.
I was able to convince our devs to migrate to absolute imports
Nice! Personally I'd be just fine with Gazelle not supporting relative imports (even though I opened this issue, haha). When projects get to be the size that Bazel/Gazelle is really helpful, relative imports cause, IMO, undue confusion.
It would be much easier to maintain the parser if it used treesitter queries.
TIL! I'd be happy to review a PR that switches to queries, though I admit I know very little about tree-sitter.
I'm not sure if there was a reason for not [using queries] when the extension was originally written.
That would be a question for @hunshcn who added it in #1895. It's possible it's just another "TIL" haha.
That would be a question for @hunshcn who added it in #1895. It's possible it's just another "TIL" haha.
Yep. There is no reason for not using treesitter queries. This is what I just learned today. I'm happy with this change.
@dougthor42 should this issue be closed? I'm reading #2865 and looking at the unit test but not sure if this is supported yet for # gazelle:python_generation_mode file.