Compatibility of generated Python code with protoc >= 3.19.0

Open aabmass opened this issue 3 years ago • 14 comments

I have some questions regarding the compatibility guarantees of generated code with individual language APIs, specifically Python. According to https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates:

each language has its own major version that can be incremented independently of other languages, as covered later in this topic with the Python release. [...] The first instance of this new versioning scheme is the new version of the Python API, 4.21.0

As everyone is aware from https://github.com/protocolbuffers/protobuf/issues/10051

Python upb requires generated code that has been generated from protoc 3.19.0 or newer.

That makes sense going forward when trying to use upb with protobuf 4.x. Some questions:

  1. Will code generated with protoc >= 3.19.0 continue to work with Python protobuf 3.x? When will protoc stop being compatible with protobuf 3.x?
  2. Now that individual language API major versions are incremented independently, how are users expected to know which protoc version generates code compatible with which individual language API version? Right now, the latest protoc version is 3.21.10 which "targets" python 4.x afaict. The mismatch of major versions here is confusing.

I found https://github.com/protocolbuffers/protobuf/issues/4945 referencing compatibility tests, but they were deleted in https://github.com/protocolbuffers/protobuf/pull/8570. Apologies if this is documented somewhere and I missed it.

Additional context

https://github.com/open-telemetry/opentelemetry-python/issues/2880#issuecomment-1334556835

aabmass avatar Dec 02 '22 02:12 aabmass

Historically protobuf has been very bad about providing a crisp definition here. The good news is that I have something in progress that should define this more carefully and I will try to publish it in the next month or so (holidays may interfere)

fowles avatar Dec 02 '22 14:12 fowles

@fowles that's great to hear, looking forward to it.

In the shorter term, could I get some clarity on if code generated with latest protoc version 3.21 is backward compatible with protobuf 3.x python API? I am aware of the API breaking changes, just wondering about the generated code. Based on some experiments, things seem to work fine.

The context here is that many library authors would like to support both 3.x and 4.x for some time to avoid creating dependency conflicts for library users in their transitive dependencies.

aabmass avatar Dec 02 '22 17:12 aabmass

The code generated by 4.21.x happens to be backwards compatible with 3.20.x and (I think) 3.19.x. The eventual criteria we are aiming for will not guarantee that code generated from a newer compiler supports an older runtime.

fowles avatar Dec 02 '22 19:12 fowles

The code generated by 4.21.x happens to be backwards compatible with 3.20.x and (I think) 3.19.x.

Anecdotally, protoc 3.19.4 works with opentelemetry-proto >= 3.19 and gives the upb benefits for 4.x. protoc 3.20 did not work with protobuf 3.19.x.

aabmass avatar Dec 05 '22 16:12 aabmass

I have to say I am truly puzzled. How is a Python project using protobuf supposed to cleanly manage its dependency?

I would really appreciate any documentation, best practices, or at least sanctioned reference projects.

Our thoughts were this:

  • Compiled _pb2.py files don't belong under version control because they are compiled.
  • Therefore we have to detect and use protoc during setup.py
  • Since protoc doesn't come from pypi (except for unofficial short-term packages), we have to use the protoc from the developer or build environment
  • However, we cannot make too strict assumptions on the protoc version since they come from different sources
  • So we have to support multiple protoc versions
  • But we have had incompatibilities between protobuf and the generated _pb2.py files if the protoc and protobuf versions don't match
  • So we need to set a dependency to the specific protobuf version based on the protoc version during compilation
  • Previous to the 2022 versioning change we basically depended on protobuf=m.n.* based on m.n.p from protoc --version (which already felt pretty unclean)
  • But now, on the one hand, to keep ensuring protoc/protobuf compatibility we'd have to depend on protobuf=*.n.* - which to my knowledge is not possible to formulate as requirement
  • And of course this raises the question of compatibility between our code and the protobuf library due to major version changes in protobuf.
  • But how does one retain compatibility to a range of protoc versions while also specifying compatibility to protobuf?

What we came up with is to sort of hardcode a minor->major map in our setup.py, so the build works sort of like this

  • Get m.n.p from protoc
  • m' = 3 if n < 21 else 4
  • Depend on protobuf>=m'.n,<m'.(n+1)
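
The three steps above can be sketched roughly as follows. This is only an illustration of the mapping described in this comment, not a sanctioned approach: the function name `protobuf_requirement` is hypothetical, and the sketch assumes that `protoc --version` prints a three-part `m.n.p` version as the comment describes.

```python
import re


def protobuf_requirement(protoc_version_output: str) -> str:
    """Map `protoc --version` output to a pip requirement string.

    Assumes a three-part m.n.p version, as described in the comment above.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", protoc_version_output)
    if match is None:
        raise ValueError(f"no m.n.p version found in {protoc_version_output!r}")
    minor = int(match.group(2))
    # Hardcoded major-version jump: minor versions below 21 belong to the
    # 3.x Python runtime, 21 and above to 4.x.
    major = 3 if minor < 21 else 4
    return f"protobuf>={major}.{minor},<{major}.{minor + 1}"


if __name__ == "__main__":
    print(protobuf_requirement("libprotoc 3.19.4"))
    print(protobuf_requirement("libprotoc 3.21.10"))
```

The single `3 if minor < 21 else 4` line is exactly the hardcoded part: every future major jump would need another branch there.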

But this means hardcoding these major version jumps.

Also when a major jump happens, let's say to 5.25.0 and we want to support the latest protoc 3.25.0 on development environments, we need to 1) check/hope that the major version is compatible with 5. and 2) add this to the hardcoded list because there will never be a 4.25.0 - right?

tilsche avatar Feb 02 '23 10:02 tilsche

The code generated by 4.21.x happens to be backwards compatible with 3.20.x and (I think) 3.19.x. The eventual criteria we are aiming for will not guarantee that code generated from a newer compiler supports an older runtime.

Would it be possible to include the compiler version as a comment in the generated code? E.g. for python this could look like this:

# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# compiler: protoc 3.19.4
# requires: protobuf >= 3.19.4, == 3.* ; python_version < "2.7"
# source: some_file.proto
"""Generated protocol buffer code."""

While this won't help for "ancient" code, where the compatibility issues hurt the most, it could at least help future generations to put the puzzle pieces together.
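
To illustrate how tooling might consume such a header, here is a minimal sketch. It assumes the `# compiler: protoc ...` line proposed above, which is a suggestion in this comment and not something protoc is known to emit in that form; the function name `compiler_version_from_header` is hypothetical.

```python
import re
from typing import Optional


def compiler_version_from_header(source: str, max_lines: int = 10) -> Optional[str]:
    """Return the protoc version named in a proposed '# compiler:' header line.

    Only the first `max_lines` lines are inspected; returns None if absent.
    """
    for line in source.splitlines()[:max_lines]:
        match = re.match(r"#\s*compiler:\s*protoc\s+(\S+)", line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    header = (
        "# -*- coding: utf-8 -*-\n"
        "# Generated by the protocol buffer compiler.  DO NOT EDIT!\n"
        "# compiler: protoc 3.19.4\n"
        "# source: some_file.proto\n"
    )
    print(compiler_version_from_header(header))
```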

urld avatar Oct 10 '23 14:10 urld

We have work under way to add explicit compatibility checks to our various codegen targets.

fowles avatar Oct 16 '23 14:10 fowles

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.

This issue is labeled inactive because the last activity was over 90 days ago.

github-actions[bot] avatar Jan 16 '24 10:01 github-actions[bot]

We have work under way to add explicit compatibility checks to our various codegen targets.

Has there been any progress here? This issue remains a very prominent pain point for us.

tilsche avatar Jan 17 '24 12:01 tilsche

Yes, we're making progress on "poison pills" that enforce https://protobuf.dev/support/cross-version-runtime-guarantee/, first in Java, C++ and Python.

The code generated by 4.21.x happens to be backwards compatible with 3.20.x and (I think) 3.19.x. The eventual criteria we are aiming for will not guarantee that code generated from a newer compiler supports an older runtime.

Would it be possible to include the compiler version as a comment in the generated code? E.g. for python this could look like this:

# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# compiler: protoc 3.19.4
# requires: protobuf >= 3.19.4, == 3.* ; python_version < "2.7"
# source: some_file.proto
"""Generated protocol buffer code."""

While this won't help for "ancient" code, where the compatibility issues hurt the most, it could at least help future generations to put the puzzle pieces together.

I think we have had the version info for debugging in the comments like this since 25.x. We're going to have code to validate that a compatible runtime version is linked.
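
As a rough illustration of what such a validation could look like, the sketch below checks only the one rule stated earlier in this thread: code generated by a newer compiler is not guaranteed to work with an older runtime. It is not the actual protobuf implementation; the function names are hypothetical, and the full guarantee linked above may include further conditions that are not modelled here.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse the leading 'major.minor.patch' of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def check_runtime_not_older(gencode_version: str, runtime_version: str) -> None:
    """Raise if the runtime is older than the version the code was generated for."""
    if parse_version(runtime_version) < parse_version(gencode_version):
        raise RuntimeError(
            f"generated code targets protobuf {gencode_version}, "
            f"but the installed runtime is {runtime_version}"
        )


if __name__ == "__main__":
    check_runtime_not_older("3.19.4", "4.21.0")  # older gencode, newer runtime: passes
    try:
        check_runtime_not_older("4.21.0", "3.19.4")
    except RuntimeError as error:
        print(error)
```

In a real project the runtime version could be taken from `google.protobuf.__version__`.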

shaod2 avatar Jan 17 '24 20:01 shaod2

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.

This issue is labeled inactive because the last activity was over 90 days ago.

github-actions[bot] avatar Apr 18 '24 10:04 github-actions[bot]

Commenting for activity. This unsynced versioning system w/o an easy way to run protoc from python affects typeshed as well where we have to download protobuf & protoc from different sources and duplicate version requirements, where different stubs may have different proto requirements.

Edit: as for running protoc from python, I just found that https://pypi.org/project/grpcio-tools/ is a thing. python3 -m grpc_tools.protoc works fine. But doesn't solve the issue of ensuring version compatibility. For our needs, "running the latest protoc" is sufficient.
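
For reference, a minimal sketch of invoking that module from a build script. It assumes grpcio-tools is installed in the current interpreter; the helper names `protoc_command` and `compile_proto` and the file paths are made up for illustration.

```python
import subprocess
import sys
from typing import List


def protoc_command(proto_file: str, out_dir: str, include_dir: str = ".") -> List[str]:
    """Build the command line for the protoc bundled with grpcio-tools."""
    return [
        sys.executable,
        "-m",
        "grpc_tools.protoc",
        f"-I{include_dir}",
        f"--python_out={out_dir}",
        proto_file,
    ]


def compile_proto(proto_file: str, out_dir: str, include_dir: str = ".") -> None:
    """Run the bundled protoc; raises CalledProcessError if compilation fails."""
    subprocess.run(protoc_command(proto_file, out_dir, include_dir), check=True)


if __name__ == "__main__":
    # Hypothetical paths; the output directory must already exist.
    print(protoc_command("some_file.proto", "gen"))
```

Using `sys.executable` keeps the protoc version tied to whatever grpcio-tools is installed in the same environment, which still leaves the compatibility question with the protobuf runtime open.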

Avasam avatar Apr 18 '24 17:04 Avasam

Slightly related, but is there any reason why apt-get only has an old version of protobuf? I'm a bit surprised the latest releases aren't released as apt packages too. https://tracker.debian.org/pkg/protobuf

AndrewQuijano avatar May 27 '24 03:05 AndrewQuijano