
replace generated first party protobuf dependencies with third party dependencies

Open shanipribadi opened this issue 2 years ago • 3 comments

Is your feature request related to a problem? Please describe. Right now I am using Pants in a monorepo project containing both Python and protobuf code. The protobuf code that I am defining imports a proto from a 3rd party dependency whose runtime library is already provided by a PyPI package.

I am vendoring the 3rd party protos under the src/protos directory itself so that code can be generated from my own proto definitions.

I need to package both the application pex_binary as well as python_distribution of my protobuf_sources for consumption. Because of that, I need to define a python_distribution for the 3rd party protobuf as well to satisfy pants requirements. But I want the python_distribution or pex_binary of my own code to actually import the pypi package, rather than the python_distribution of the 3rd party protobuf.

└───src
    ├───protos
    │   ├───google // provided by googleapis-common-protos on PyPI but vendored here for proto dependencies
    │   │   └───type
    │   │           BUILD // created to satisfy pants requirement for packaging my.package python_distribution
    │   │           money.proto
    │   │
    │   └───my
    │       └───package
    │               BUILD // python_distribution for my.package to publish
    │               message.proto // import google/type/money.proto
    │
    └───python
        └───myapp
                app.py // import my.package.message_pb2
                BUILD // python_distribution and pex_binary of myapp

This repo has full example of the setup. https://github.com/shanipribadi/pants-pypb-example

Describe the solution you'd like It would help if we could define a key in protobuf_source that maps to the 3rd party library to be used as the dependency target instead of the generated target.

Maybe something like

google/type/BUILD

protobuf_sources(
    provided_by={
        "python": "//:root#googleapis-common-protos",
    },
)

Describe alternatives you've considered What I am doing right now is

  • adding module_mapping under python_requirements
python_requirements(
    name="root",
    module_mapping={
      "googleapis-common-protos": ("google.api", "google.logging.type", "google.rpc", "google.type",)
    }
)
  • adding explicit dependencies=["//:root#googleapis-common-protos",] under package targets (python_distribution of my.package proto) as well as python_sources target of myapp (that imported my.package proto)
  • adding transitive exclusion "!!src/protos/google/type:type", on all package targets (python_distribution of my.package proto, myapp, and pex_binary for myapp).
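
To make the effect of the `!!` transitive exclusion concrete, here is a minimal sketch of a dependency walk that never enters an excluded target. This is not Pants's actual implementation; the `transitive_deps` helper, the graph, and the `:bin` / `:lib` target names are hypothetical and only modelled on the layout above.

```python
from collections import deque


def transitive_deps(graph, roots, transitive_excludes=frozenset()):
    """Return every target reachable from `roots`, never entering excluded targets."""
    seen = set()
    queue = deque(r for r in roots if r not in transitive_excludes)
    while queue:
        target = queue.popleft()
        if target in seen:
            continue
        seen.add(target)
        for dep in graph.get(target, ()):
            # An excluded target is dropped wherever it appears in the graph.
            if dep not in transitive_excludes and dep not in seen:
                queue.append(dep)
    return seen


# Hypothetical addresses, loosely following the directory tree in this issue.
GRAPH = {
    "src/python/myapp:bin": ["src/python/myapp:lib"],
    "src/python/myapp:lib": [
        "src/protos/my/package:package",
        "//:root#googleapis-common-protos",  # explicit dependency on the PyPI package
    ],
    "src/protos/my/package:package": ["src/protos/google/type:type"],
    "src/protos/google/type:type": [],
    "//:root#googleapis-common-protos": [],
}

closure = transitive_deps(
    GRAPH,
    ["src/python/myapp:bin"],
    transitive_excludes={"src/protos/google/type:type"},
)
```

With the exclusion in place, the vendored `src/protos/google/type:type` target drops out of the closure while the PyPI requirement stays in, which is the outcome the workaround is after.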

This seems to work for now, but I'm not sure if this is the right approach, and it seems unwieldy.


shanipribadi avatar Mar 26 '22 11:03 shanipribadi

Hey @shanipribadi , I'm really sorry this didn't get a reply sooner! I think this is an important feature and I appreciate you bringing this up.

One doubt that I have: why do you need Pants to generate the Protobuf files for you? I took a look at the requirement you're using and it looks like it already has the generated Python code included. Could you simply use the requirement like a normal third-party requirement? https://github.com/googleapis/python-api-common-protos/tree/main/google/api

Eric-Arellano avatar Aug 04 '22 19:08 Eric-Arellano

Resurrecting this issue since I have a need for it as well and am currently trying to wrap my head around how to best fit it into Pants.

One doubt that I have: why do you need Pants to generate the Protobuf files for you?

The need isn't for Pants to generate Python code from the 3rd-party protobuf, it's to be able to generate 1st-party code that has a dependency on these 3rd-party protobufs. protoc needs all imports within a .proto to be available in Pants' sandbox in order to do so (regardless of whether you're importing a 1st- or 3rd-party protobuf).
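
As a rough illustration of that constraint, the sketch below lists the `import` statements in a .proto file and reports which ones cannot be found under a set of source roots. It is only a regex-based approximation, not how protoc or Pants resolve imports; the helper names and the sample message are assumptions made up for this example.

```python
import re
from pathlib import Path

# Matches `import "path";`, `import public "path";` and `import weak "path";`.
IMPORT_RE = re.compile(r'^\s*import\s+(?:public\s+|weak\s+)?"([^"]+)"\s*;', re.MULTILINE)


def proto_imports(proto_text):
    """Return the paths named by import statements, in order of appearance."""
    return IMPORT_RE.findall(proto_text)


def missing_imports(proto_text, source_roots):
    """Return imported paths that exist under none of the given source roots."""
    return [
        imported
        for imported in proto_imports(proto_text)
        if not any((Path(root) / imported).is_file() for root in source_roots)
    ]


# Hypothetical content for src/protos/my/package/message.proto.
MESSAGE_PROTO = """
syntax = "proto3";
package my.package;

import "google/type/money.proto";

message Invoice {
  google.type.Money total = 1;
}
"""

# With the vendored copy in place, src/protos would satisfy the import;
# without it, the path is reported as missing.
unresolved = missing_imports(MESSAGE_PROTO, ["src/protos"])
```

Any path this reports as missing is one that would have to be placed in the sandbox before generating code for the 1st-party proto.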

The only way to do this currently is (afaik) what OP does, i.e. to have these 3rd-party protobufs in your repository at the expected paths (e.g. src/protos/google/type/money.proto). The problems with this are:

  1. You need to have 3rd-party code in your repository instead of including it from a URL a la http_source for file/resource.
  2. Pants will generate code for these 3rd-party protobufs even though the code is most likely already available from elsewhere (e.g. the Python dependency googleapis-common-protos in OP's case).
  3. If a Python package is published through Pants that contains this generated 1st-party code, you likely want your package to depend on the PyPI package that provides the 3rd-party code rather than having your package conflict with it (assuming Pants does bundle the generated 3rd-party code; I haven't tried it).

Not entirely sure how to solve it fully. The straightforward solution would be to simply add a protobuf_dependency(source, provides) target that can be (automatically?) added as a dependency to a protobuf_source. E.g.:

protobuf_dependency(
    source=http_source("https://raw.githubusercontent.com/googleapis/googleapis/91fb1b8865334acaf8f8763b6035b57cb3ce3248/google/type/date.proto", len=123, sha256="foobar",),
    provides="google/type/date.proto",
)

// any `protobuf_source` with an import of `google/type/date.proto` could now be automatically inferred to depend on this `protobuf_dependency`.

This does not explicitly solve the existing 3rd-party code package becoming a dependency of your generated 1st-party code, but if Pants spots the import of google.type.date_pb2 in your generated 1st-party code and you have a 3rd-party Python dependency that provides it, maybe it would already "just work"?
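
A minimal sketch of the lookup this would rely on: given an imported module such as google.type.date_pb2, find the requirement whose mapped module is its longest dotted prefix. The mapping is the module_mapping from OP's workaround; the `requirement_for_module` helper is hypothetical and is not Pants's actual dependency inference.

```python
# module_mapping as given in OP's python_requirements target.
MODULE_MAPPING = {
    "googleapis-common-protos": (
        "google.api",
        "google.logging.type",
        "google.rpc",
        "google.type",
    ),
}


def requirement_for_module(module, module_mapping):
    """Return the requirement whose mapped module is the longest dotted prefix of `module`."""
    best, best_len = None, -1
    for requirement, prefixes in module_mapping.items():
        for prefix in prefixes:
            # Match whole dotted components only, so "google.typefoo" does not match "google.type".
            if module == prefix or module.startswith(prefix + "."):
                if len(prefix) > best_len:
                    best, best_len = requirement, len(prefix)
    return best


owner = requirement_for_module("google.type.date_pb2", MODULE_MAPPING)
```

Under this assumption, an import of google.type.date_pb2 in the generated 1st-party code resolves to googleapis-common-protos, while modules such as my.package.message_pb2 resolve to nothing and would have to come from 1st-party targets.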

jyggen avatar Jan 09 '24 16:01 jyggen

This has become quite urgent for us, so I tried a workaround like the OP's as well.

However, when I run a typecheck with Pants (2.17.0), I get this strange error:

18:53:57.50 [INFO] Starting: Building extra_type_stubs.pex
18:53:57.51 [INFO] Starting: Building mypy.pex from resource://pants.backend.python.typecheck.mypy/mypy.lock
18:53:59.39 [INFO] Completed: Building extra_type_stubs.pex
18:54:08.25 [INFO] Completed: Building mypy.pex from resource://pants.backend.python.typecheck.mypy/mypy.lock
18:54:08.66 [INFO] Starting: Building requirements_venv.pex
18:54:12.46 [INFO] Completed: Building requirements_venv.pex
Error: 2.48 [ERROR] 1 Exception encountered:

Engine traceback:
  in `check` goal

ProcessExecutionFailure: Process 'Building requirements_venv.pex' failed with exit code 1.
stdout:

stderr:
[Errno 2] No such file or directory: '../../../../../../../../../../installed_wheels/XXXSHORTHASH[271](https://github.com/XXXID/XXXID/actions/runs/XXXID/job/XXXID#step:7:272)XXXLONGHASH/opentelemetry_exporter_otlp_proto_http-1.21.0-py3-none-any.whl/opentelemetry/exporter/otlp' -> '/home/runner/.cache/pants/named_caches/pex_root/venvs/XXXLONGHASH/XXXLONGHASH.lck.work/lib/python3.11/site-packages/pex-ns-pkgs/2/opentelemetry/exporter/otlp'

It turns out that opentelemetry-exporter-otlp-proto-http depends on googleapis-common-protos, so this must be related somehow. I've tried the following:

  • Make sure we import the exact same version of the protos and have the same local copies.
  • Move our OpenTelemetry dependencies to "resolves_to_only_binary" and "resolves_to_no_binary". But no dice.

originalrkk avatar Feb 12 '24 18:02 originalrkk