gapic-generator-python
gapic-generator-python copied to clipboard
Fix attribute types for autogenerated protobuf classes
Issue
- The autogenerated protobuf classes are of the following form:
class ProtobufClass(proto.Message): """... Attributes: # Primitive types attribute_1 (str):... attribute_2 (int):... attribute_3 (float):... ... # Compound types attribute_A (TYPE_A):... attribute_B (Sequence[TYPE_B]):... attribute_C (Mapping[KEY_TYPE_C, VALUE_TYPE_C]):... """ attribute_1 = proto.Field(proto.STRING, number=1) attribute_2 = proto.Field(proto.INT32, number=2) attribute_3 = proto.Field(proto.FLOAT, number=3) ... attribute_A = proto.Field(proto.MESSAGE, number=4, message="TYPE_A") attribute_B = proto.RepeatedField(proto.MESSAGE, number=5, message="TYPE_B") attribute_C = proto.MapField(...) ...- ✔️ The docstrings seem perfectly correct (they match the actual types at runtime).
- ❌ The attributes indicate protobuf types (always
proto.Fieldproto.RepeatedField, orproto.MapField?).
- As a consequence, type checkers (e.g. pyright) can report issues such as:
-
Cannot access member "MEMBER" for type "Field" Member "MEMBER" is unknown -
Expression of type "Field" cannot be assigned to return type "TYPE" "Field" is incompatible with "TYPE" -
"RepeatedField" is not iterable "__iter__" method not defined -
Argument of type "RepeatedField" cannot be assigned to parameter "__obj" of type "Sized" in function "len" "RepeatedField" is incompatible with protocol "Sized" "__len__" is not present
-
Proposal
- Adding type annotations at the attribute level (
attribute_X: TYPE_X = proto.*) may consistently fix these type checker issues. - In a nutshell, here is what it could look like:
class ProtobufClass(proto.Message): """... Attributes: attribute_1 (TYPE_1):... attribute_2 (TYPE_2):... attribute_3 (Sequence[TYPE_3]):... attribute_4 (Mapping[KEY_TYPE_4, VALUE_TYPE_4]):... """ attribute_1: TYPE_1 = proto.Field(proto.PRIMITIVE_PROTO_TYPE, number=1) attribute_2: TYPE_2 = proto.Field(proto.MESSAGE, number=2, message="TYPE_2") attribute_3: Sequence[TYPE_3] = proto.RepeatedField(proto.MESSAGE, number=3, message="TYPE_3") attribute_4: Mapping[KEY_TYPE_4, VALUE_TYPE_4] = proto.MapField(...) ...
Why is it an issue
- It's likely to generate more and more friction as static type checking becomes the norm (+ the numbers of users and client libraries keep increasing).
- Some client libraries are pretty large (e.g. google.cloud.documentai.Document) and IDE autocompletion is a must-have.
- This can lead developers to add unnecessary boilerplate (e.g.
typing.TypeAlias) or casting (e.g.typing.castor worse). - This has already been reported: SA#73126100 & gapic-generator-python#694.
Thanks @PicardParis for your request. In case you feel like you can fix it yourself, feel free to do so by opening a PR (that would be greatly appreciated given we don't have enough resources to address all open issues in timely manner). The changes are expected to be done around here:
https://github.com/googleapis/gapic-generator-python/blob/main/gapic/templates/%25namespace/%25name_%25version/%25sub/types/_message.py.j2#L17
https://github.com/googleapis/gapic-generator-python/blob/main/gapic/ads-templates/%25namespace/%25name/%25version/%25sub/types/_message.py.j2#L17
Hi @vam-google, @parthea,
- I've set up the dev environment in a VM and made a few positive tests to update/generate classes. I should be able to go on and propose a PR.
- Why are there 2 sets of templates (
templatesandads-templates)? Is it enough to modifytemplates(forgoogle.cloud.*) or should both be updated? - To avoid confusion, can you assign the issue to me? (I don't have any rights for this repo)