rules_python
rules_python copied to clipboard
TFX namespace packages are not installed properly
🐞 bug report
Affected Rule
requirement
Is this a regression?
No
Description
I'm trying to install the https://github.com/tensorflow/tfx package properly, which has two namespace packages on PyPI:
- one is tfx, which has a comprehensive list of dependencies;
- the other one is ml-pipelines-sdk, a leaner version with fewer dependencies.
Note, the first version (tfx) depends on the seconds one (ml-pipelines-sdk). The relevant parts in the setup.py
is at https://github.com/tensorflow/tfx/blob/4e08d44b4a99bfc93a6cc94b2b9c242c94b2ec8e/setup.py#L284-L297.
I am more interested in the first one.
The entry on my requirement file is like
cat requirements.in
tfx==1.6.0
When installing tfx
, it's supposed to be installed in just one tfx
folder. But with rules_python, tfx
appears in two places, e.g.
ls /path/to/bazel-bin/myapp/debug_python.runfiles/vendor_python_ml_pipelines_sdk/tfx
__init__.py dependencies.py dsl examples experimental orchestration proto py.typed types utils version.py
ls /path/to/bazel-bin/myapp/debug_python.runfiles/vendor_python_tfx/tfx
__init__.py components dependencies.py dsl examples experimental extensions orchestration proto py.typed scripts tools types utils v1 version.py
Since vendor_python_ml_pipelines_sdk/tfx
is before vendor_python_tfx/tfx
in sys.path, so import tfx
also import the first one, and results in error.
When I installed tfx without bazel with python -m pip install tfx
, there is just
ls /path/to/venv/lib/python3.8/site-packages/tfx
__init__.py components dependencies.py dsl examples experimental extensions orchestration proto py.typed scripts tools types utils v1 version.py
My understanding is that the contents of ml-pipelines-sdk
and tfx
should be merged into a single tfx
folder when installing tfx
because they are both under the namespace tfx. But rules_python
may not have handled such case properly, so two tfx
folders appear and cause confusion during import.
🔬 Minimal Reproduction
try
pip-compile --allow-unsafe --annotation-style=line --output-file=generated_requirements.txt requirements.in
with the single-entry requirements.in
shown above and then put generated_requirements.txt
in WORKSPACE
, e.g.
load("@rules_python//python:pip.bzl", "pip_install", "pip_parse")
pip_parse(
name = "vendor_python",
timeout = 1200,
quiet = False,
requirements_lock = "//path/to:generated_requirements.txt",
)
load("@vendor_python//:requirements.bzl", "install_deps")
install_deps()
🔥 Exception or Error
from tfx.components.trainer import executor
doesn't work. Note tfx.components
package is only included in the comprehensive version (tfx
), NOT in the lean version (ml-pipelines-sdk
) (See the difference between ls
outputs above).
🌍 Your Environment
Operating System:
macos
Output of bazel version
:
bazel 4.2.1
Rules_python version:
Tried both 0.4.0 and 0.6.0, neither works.
@hrfuller is this related to something you experienced at Twitter? We use Tensorflow at my company but not TFX, and so wouldn't have seen this.
This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!
This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"