[Task]: Document compatibility aspirations & test coverage policies for optional Beam dependencies

Open • tvalentyn opened this issue 1 year ago • 1 comment

What needs to happen?

Certain aspects of Beam functionality depend on actively evolving libraries. For example, RunInference model handlers might require dependencies like PyTorch or TensorFlow.

We should document Beam's policy on compatibility with third-party libraries that are not already in Beam's dependency chain.

Then, we should make sure our compatibility suites test against at least the lowest and the highest supported version of each such library: https://github.com/apache/beam/blob/master/.github/workflows/beam_PreCommit_Python_Coverage.yml

Testing the in-between versions can be done as needed. For example, we test against all supported versions of Pyarrow and Pandas, but those libraries are already in our dependency chain, so their requirements are spelled out. Dependencies like TensorFlow, on the other hand, are optional and not part of 'extras'.
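
To make the "lowest and highest supported version" idea concrete, here is a minimal sketch of how a test matrix could be reduced to its boundary versions. The library names and version lists are hypothetical placeholders, not Beam's actual supported ranges, and the helper names (`version_key`, `boundary_versions`, `build_test_matrix`) are invented for illustration. It assumes plain dotted numeric versions such as "2.10.0".

```python
from __future__ import annotations


def version_key(version: str) -> tuple[int, ...]:
    """Turn '2.10.0' into (2, 10, 0) so versions sort numerically, not lexically."""
    return tuple(int(part) for part in version.split("."))


def boundary_versions(supported: list[str]) -> list[str]:
    """Return the lowest and highest supported versions (one entry if they coincide)."""
    ordered = sorted(supported, key=version_key)
    if not ordered:
        return []
    if len(ordered) == 1:
        return [ordered[0]]
    return [ordered[0], ordered[-1]]


def build_test_matrix(supported: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each optional dependency to the versions a compatibility suite must cover."""
    return {name: boundary_versions(versions) for name, versions in supported.items()}


# Hypothetical supported versions, for illustration only.
SUPPORTED_VERSIONS = {
    "torch": ["2.0.1", "1.13.1", "2.1.0"],
    "tensorflow": ["2.13.0", "2.12.0", "2.15.0"],
}

if __name__ == "__main__":
    for name, versions in build_test_matrix(SUPPORTED_VERSIONS).items():
        print(name, versions)
```

In-between versions could be added to the matrix per library when a regression makes that worthwhile, which matches the "as needed" policy described above.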

Issue Priority

Priority: 2 (default / most normal work should be filed as P2)

Issue Components

  • [X] Component: Python SDK
  • [ ] Component: Java SDK
  • [ ] Component: Go SDK
  • [ ] Component: Typescript SDK
  • [ ] Component: IO connector
  • [ ] Component: Beam YAML
  • [ ] Component: Beam examples
  • [ ] Component: Beam playground
  • [ ] Component: Beam katas
  • [ ] Component: Website
  • [ ] Component: Spark Runner
  • [ ] Component: Flink Runner
  • [ ] Component: Samza Runner
  • [ ] Component: Twister2 Runner
  • [ ] Component: Hazelcast Jet Runner
  • [ ] Component: Google Cloud Dataflow Runner

tvalentyn, Apr 09 '24 19:04

I think we should introduce additional extras that would define allowed version ranges for optional Beam dependencies.

Having many extras is very common in projects that support a certain use case on multiple backends.
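
As a rough illustration of what such extras might look like, the sketch below declares one extra per optional backend and reads back the bounds each one declares. The extras names and version ranges are hypothetical and are not taken from Beam's `setup.py`; `declared_bounds` is an invented helper that only understands `>=` and `<` specifiers.

```python
import re

# Hypothetical extras and version ranges, for illustration only.
# A mapping like this could be passed to setuptools.setup(extras_require=...).
OPTIONAL_EXTRAS = {
    "torch": ["torch>=1.9.0,<3"],
    "tensorflow": ["tensorflow>=2.12.0,<3"],
}

# Matches only '>=' lower bounds and '<' upper bounds with numeric versions.
_BOUND = re.compile(r"(>=|<)\s*([0-9][0-9.]*)")


def declared_bounds(requirement: str) -> dict:
    """Return the '>=' and '<' bounds found in a requirement string."""
    return {op: version for op, version in _BOUND.findall(requirement)}


if __name__ == "__main__":
    for extra, requirements in OPTIONAL_EXTRAS.items():
        for requirement in requirements:
            print(extra, declared_bounds(requirement))
```

With extras like these, a user would install a backend through pip's standard extras syntax, for example `pip install "apache-beam[torch]"` (again assuming an extra with that name exists), and the declared lower bound would be the natural version to pin in a compatibility suite.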

tvalentyn, May 16 '24 21:05