phenopacket-schema icon indicating copy to clipboard operation
phenopacket-schema copied to clipboard

Relax protobuf dependencies further in python package

Open holtgrewe opened this issue 1 year ago • 1 comments

You are limiting the protobuf versions to >=3.20.2,<4.0.0. Is this necessary? protobuf 4.x is available since 2022.

https://github.com/phenopackets/phenopacket-schema/blob/d69fc9e6fe425ed0ca05ec81ee5b8f0380f4754d/python/pyproject.toml#L36

This prevents anything else depending on phenopackets to use a more recent protobuf version.

holtgrewe avatar Sep 07 '24 06:09 holtgrewe

@iimpulse can you look into this, please?

julesjacobsen avatar Sep 09 '24 08:09 julesjacobsen

@iimpulse This is also an issue for me. Is there a specific reason the version is pinned?

ghost avatar Jan 13 '25 16:01 ghost

Hi @gilmorera @holtgrewe

Is there a specific reason the version is pinned?

Phenopacket Schema uses Protobuf 3.20.3: https://github.com/phenopackets/phenopacket-schema/blob/d69fc9e6fe425ed0ca05ec81ee5b8f0380f4754d/pom.xml#L53

therefore, out of caution, we allow protobuf<4.0.0, per Semantic versioning. In principle, protobuf>=4.0.0 may work OK, it's hard to be sure though..


I would like to fix this issue with a PR where that checks if the messages serialized by protobuf-java==3.20.3 are decoded properly with a newer Python protobuf.

This seems to work for Python bindings generated with Protobuf compiler 3.20.3 (in Phenopacket Schema) or with 4.29.3 (latest protobuf-java) and for protobuf==5.29.3 (latest Protobuf release for Python).

I'll be posting updates here..

ielis avatar Jan 13 '25 22:01 ielis

Thanks for your quick response. I'll check back later for your updates.

ghost avatar Jan 14 '25 21:01 ghost

any progress on this?

stolpeo avatar Jul 21 '25 13:07 stolpeo

@julesjacobsen @ielis @iimpulse As far as I can see the latest Protobuf version would have some advantages for us

  1. In early proto3, you couldn't tell whether a scalar field was unset or just set to the default (e.g., 0 or ""). This led to ambiguous semantics. We can now write-optional string name = 1; 2.We can specify json names: string full_name = 1 [json_name = "fullName"];

For the planned update of phenopacket schema, we should therefore update to the current version of protobuf. I do not see anything that should break any existing code (except that we may be able to remove some code for dealing with default vs missing values in JSON output).

So I would suggest we change to "<6". Thoughts?

pnrobinson avatar Jul 30 '25 13:07 pnrobinson

As far as I can see the latest Protobuf version would have some advantages for us

👍

In early proto3, you couldn't tell whether a scalar field was unset or just set to the default (e.g., 0 or ""). This led to ambiguous semantics. We can now write-optional string name = 1;

This would be awesome!


For the planned update of phenopacket schema, we should therefore update to the current version of protobuf.

I am not sure am 100% on board here. I believe we should indicate a range of protobuf versions that phenopackets are OK with; the range tested for functionality with old and new data. That is also the rationale behind this PR. The PR adds a protobuf file created with protobuf>=3.20.2,<4.0.0 version (old) and tests if the file can be deserialized with protobuf>=3.20.2,<7.0.0 (new in the PR) to get the same value back.

Therefore, for the planned update, I do not think we should remove support for older protobuf. Please let me know if there is anything I don't see though.


So I would suggest we change to "<6"

👍 with a small change - we should change to protobuf>=3.20.2,<7.0.0.

ielis avatar Jul 31 '25 07:07 ielis