phenopacket-schema
phenopacket-schema copied to clipboard
Relax protobuf dependencies further in python package
You are limiting the protobuf versions to >=3.20.2,<4.0.0. Is this necessary? protobuf 4.x is available since 2022.
https://github.com/phenopackets/phenopacket-schema/blob/d69fc9e6fe425ed0ca05ec81ee5b8f0380f4754d/python/pyproject.toml#L36
This prevents anything else depending on phenopackets to use a more recent protobuf version.
@iimpulse can you look into this, please?
@iimpulse This is also an issue for me. Is there a specific reason the version is pinned?
Hi @gilmorera @holtgrewe
Is there a specific reason the version is pinned?
Phenopacket Schema uses Protobuf 3.20.3:
https://github.com/phenopackets/phenopacket-schema/blob/d69fc9e6fe425ed0ca05ec81ee5b8f0380f4754d/pom.xml#L53
therefore, out of caution, we allow protobuf<4.0.0, per Semantic versioning. In principle, protobuf>=4.0.0 may work OK, it's hard to be sure though..
I would like to fix this issue with a PR where that checks if the messages serialized by protobuf-java==3.20.3 are decoded properly with a newer Python protobuf.
This seems to work for Python bindings generated with Protobuf compiler 3.20.3 (in Phenopacket Schema) or with 4.29.3 (latest protobuf-java) and for protobuf==5.29.3 (latest Protobuf release for Python).
I'll be posting updates here..
Thanks for your quick response. I'll check back later for your updates.
any progress on this?
@julesjacobsen @ielis @iimpulse As far as I can see the latest Protobuf version would have some advantages for us
- In early proto3, you couldn't tell whether a scalar field was unset or just set to the default (e.g., 0 or ""). This led to ambiguous semantics. We can now write-optional string name = 1; 2.We can specify json names: string full_name = 1 [json_name = "fullName"];
For the planned update of phenopacket schema, we should therefore update to the current version of protobuf. I do not see anything that should break any existing code (except that we may be able to remove some code for dealing with default vs missing values in JSON output).
So I would suggest we change to "<6". Thoughts?
As far as I can see the latest Protobuf version would have some advantages for us
👍
In early proto3, you couldn't tell whether a scalar field was unset or just set to the default (e.g., 0 or ""). This led to ambiguous semantics. We can now write-optional string name = 1;
This would be awesome!
For the planned update of phenopacket schema, we should therefore update to the current version of protobuf.
I am not sure am 100% on board here. I believe we should indicate a range of protobuf versions that phenopackets are OK with; the range tested for functionality with old and new data. That is also the rationale behind this PR. The PR adds a protobuf file created with protobuf>=3.20.2,<4.0.0 version (old) and tests if the file can be deserialized with protobuf>=3.20.2,<7.0.0 (new in the PR) to get the same value back.
Therefore, for the planned update, I do not think we should remove support for older protobuf. Please let me know if there is anything I don't see though.
So I would suggest we change to "<6"
👍 with a small change - we should change to protobuf>=3.20.2,<7.0.0.