TFX 1.15 docker image contains conflicting dependencies

Open IzakMaraisTAL opened this issue 7 months ago • 2 comments

System information

  • Have I specified the code to reproduce the issue: Yes
  • Environment in which the code is executed: Linux
  • TFX Version: 1.15.0 and 1.15.1
  • Python version: 3.10.14

Describe the current behavior

The TFX Docker image contains two packages whose version requirements conflict: apache-beam and google-cloud-datastore.

docker run --rm --entrypoint python tensorflow/tfx:1.15.1 -m pip list | grep -E 'apache|google-cloud-datastore'

apache-beam                              2.56.0
google-cloud-datastore                   1.15.5

The apache-beam dependency constraints show that version 2.56.0 requires google-cloud-datastore>=2.0.0,<3. The google-cloud-datastore 1.15.5 installed in the base image violates this constraint and may cause runtime errors.
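To make the violation concrete, here is a minimal sketch of the range check using only the version numbers quoted above. The helpers `parse_version` and `satisfies` are hypothetical names written for this illustration. They handle plain numeric releases only, so a real check should rely on a full PEP 440 implementation instead.

```python
# Simplified range check for plain numeric versions such as "1.15.5".
# Assumption: no pre-release, post-release or local version segments.

def parse_version(text, width=3):
    """Turn '2.56.0' into (2, 56, 0), padding short versions with zeros."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies(installed, lower_inclusive, upper_exclusive):
    """Return True if lower_inclusive <= installed < upper_exclusive."""
    version = parse_version(installed)
    return parse_version(lower_inclusive) <= version < parse_version(upper_exclusive)


# Values taken from the report: the image ships google-cloud-datastore 1.15.5,
# while apache-beam 2.56.0 asks for >=2.0.0,<3.
print(satisfies("1.15.5", "2.0.0", "3"))  # False
```

Any 2.x release of google-cloud-datastore would pass the same check, which is what the constraint asks for.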

Describe the expected behavior

The TFX Docker image should contain no conflicting dependencies.

Name of your Organization (Optional)

Takealot.com

Other info / logs

We use the TFX Docker image as a base image into which we install additional packages. For this to work, we must not install conflicting dependencies. The solution is to use a dependency solver (such as pip-tools' pip-compile or uv pip compile) to constrain the packages we install against those already in the image. However, if the dependencies already in the image conflict with each other, it is impossible to install additional packages reliably.
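As a rough illustration of the constraining step, the sketch below turns `pip list` output like that shown earlier into exact pins suitable for a constraints file. The function name `pins_from_pip_list` is hypothetical, and `pip freeze` already emits this format directly, so this is only meant to show what the solver is given.

```python
# Sketch: convert `pip list` output into "name==version" constraint lines.
# Assumption: the default column layout, with the name first and version second.

def pins_from_pip_list(output):
    """Return one 'name==version' pin per package row in `pip list` output."""
    pins = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the "Package Version" header and the dashed separator.
        if len(fields) < 2 or fields[0] == "Package" or set(fields[0]) == {"-"}:
            continue
        pins.append(f"{fields[0]}=={fields[1]}")
    return pins


# The two rows reported for tensorflow/tfx:1.15.1.
image_packages = """\
apache-beam                              2.56.0
google-cloud-datastore                   1.15.5
"""

for pin in pins_from_pip_list(image_packages):
    print(pin)
```

Written to a file, these pins can be passed to the solver as constraints, for example through pip's `-c` / `--constraint` option. With the two pins above, a solver that honours apache-beam's declared requirements would be expected to report a conflict before any additional package is considered, which is the situation this issue describes.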

This was not a problem in TFX 1.14.

IzakMaraisTAL, Jul 16 '24 12:07