python-betterproto icon indicating copy to clipboard operation
python-betterproto copied to clipboard

Default values are not serialized in oneofs when calling .to_dict() or .to_json() (which is different behaviour from official protobuf impl)

Open dhendry opened this issue 4 years ago • 4 comments

Consider this protobuf message:

message OneOfDemo{
    oneof value_type {
        bool bool_value = 1;
        int64 int64_value = 2;
    }
}

Using the official protobuf codegen and library (libprotoc 3.12.3):

from google.protobuf.json_format import MessageToDict

# Official protobuf:
print("True: ", json.dumps(MessageToDict(OneOfDemo(bool_value=True))))
print("False:", json.dumps(MessageToDict(OneOfDemo(bool_value=False))))
print("1:    ", json.dumps(MessageToDict(OneOfDemo(int64_value=1))))
print("0:    ", json.dumps(MessageToDict(OneOfDemo(int64_value=0))))

Output:
True:  {"boolValue": true}
False: {"boolValue": false}
1:     {"int64Value": "1"}
0:     {"int64Value": "0"}

Using betterproto (1.2.5)

# Generated code: 
@dataclass
class OneOfDemoBetter(betterproto.Message):
    bool_value: bool = betterproto.bool_field(1, group="value_type")
    int64_value: int = betterproto.int64_field(2, group="value_type")

print("True: ", OneOfDemoBetter(bool_value=True).to_json())
print("False:", OneOfDemoBetter(bool_value=False).to_json())
print("1:    ", OneOfDemoBetter(int64_value=1).to_json())
print("0:    ", OneOfDemoBetter(int64_value=0).to_json())

Output:
True:  {"boolValue": true}
False: {}
1:     {"int64Value": "1"}
0:     {}

This obviously leads to the following inconsistency:

od = OneOfDemoBetter(bool_value=False)
print(betterproto.which_one_of(od, "value_type"))

od2 = OneOfDemoBetter().from_json(od.to_json())
print(betterproto.which_one_of(od2, "value_type"))

Output:
('bool_value', False)
('', None)

Misc

  • Python 3.8.1
  • I love what you are trying to do with this project btw - the official generated protobuf code is AWEFUL and this is a HUGE step forwards. This one difference in behaviour is blocking wholesale adoption of your project unfortunately
  • Tangentially related ideas and observations
    • Would it make sense to have values in the oneof be declared Optional? (ie: bool_value: Optional[bool]) since only one is expected to be set? Personally I would prefer this but I can understand if its getting too far away from the idea of "there are no nulls in proto3".
    • .to_json() is missing the casing arg (and include_default_values) from .to_dict()
    • Stretch/wishlist: personally I would find it super valuable if there were some way to default to Casing.SNAKE for all .to_json() operations automatically. Perhaps something like a default option you could specify on a per message basis?

Great work on this project! Keep it up!

dhendry avatar Jul 08 '20 16:07 dhendry

I hacked together a proposed fix for this in https://github.com/danielgtaylor/python-betterproto/pull/110. I'm also really impressed with this project!

bradykieffer avatar Jul 08 '20 17:07 bradykieffer

See also:

  • https://github.com/danielgtaylor/python-betterproto/issues/95 - to_dict() fails to write boolean fields without default when value is False
  • https://github.com/danielgtaylor/python-betterproto/issues/63 - oneof with enum : which_one_of returns the wrong value when enum is the first field

Thank you for your great write-up! This makes a clear case to include the default values in the output.

I'll respond further in your PR 😄

boukeversteegh avatar Jul 08 '20 20:07 boukeversteegh

As for your other suggestions, they are all valuable ideas. As it will be hard to track them being inside this issue, if you like, please create a proposal for those so we can categorize / prioritize them, and discuss them separately.

boukeversteegh avatar Jul 08 '20 20:07 boukeversteegh

I'm not sure whether this is the correct place to comment, but I've noticed that the default values are also not serialized when calling bytes() on a dataclass for a oneoff.

I have a message with a oneoff field. One of the options of the oneoff is a submessage with two required fields, an enum field and a uint32 field. when calling bytes(message) with the default value (which is 0) on the uint32 field, it is not added to the binary data.

When I do the same using the default python protobuff implementation this field is added to the binary data.

enum Enum {
    a = 1;
    b = 2;
};

enum OtherEnum {
    c = 3;
    d = 4;
};

message First{
    required OtherEnum enum_name  = 1;
    required uint32 uint32_name = 2;
};

message Second{
    required uint32 offset = 1;
    required bytes data   = 2;
}
message MyMessage{
    required Enum myEnum = 1;

    oneof params {
        First first = 2;
        Second second = 3;
    }
};

So in the case of MyMessage(myEnum=Enum.a, first=First(enum_name=OtherEnum.d, uint32_name =0) when bytes is called, the uint32_name field is not present in the binary data even though it is required.

If this belongs in a seperate issue please let me know.

sach-edna avatar Mar 29 '21 09:03 sach-edna