python-betterproto
python-betterproto copied to clipboard
Bug: Empty string overriding actual parsed value
I don't have an example of how this breaks to share (yet), but this change in 2.0.0b3 is causing issues for us:
https://github.com/danielgtaylor/python-betterproto/blob/master/src/betterproto/init.py#L749-L751
We have a deeply nested protobuf object inside an array, and the first field [string(1)] is getting SerializeToString() serialized correctly, but when FromString() is called, it parses it, assigns the string correctly, and then somehow overwrites it again with an empty string.
Sorry, I'm a developer and I understand it's not helpful without some test code to demonstrate this, but just reporting the issue we have.
I'll try and get together an example next week.
UPDATE:
I'm not sure the link I pasted above is the root cause, here's an example that shows a few issues:
from dataclasses import dataclass
from typing import List
import betterproto
@dataclass(eq=False, repr=False)
class SubItem(betterproto.Message):
count: float = betterproto.double_field(1)
@dataclass(eq=False, repr=False)
class Item(betterproto.Message):
time: str = betterproto.string_field(1)
sub_items: List["SubItem"] = betterproto.message_field(2)
@dataclass(eq=False, repr=False)
class TopItem(betterproto.Message):
item: "Item" = betterproto.message_field(1)
good_item = TopItem(
item=Item(
time='test',
sub_items=[
SubItem(count=1),
SubItem(count=2),
]
)
)
bad_item = TopItem(
item=Item(
time='test',
sub_items=[
SubItem(count=0), # A "0" here seems to cause all sorts of misbehavior.
SubItem(count=1),
]
)
)
def test_passing():
encoded_item = good_item.SerializeToString()
decoded = TopItem.FromString(encoded_item)
assert decoded.item.time == 'test'
assert len(decoded.item.sub_items) == 2
assert decoded.item.sub_items[0].count == 1
assert decoded.item.sub_items[1].count == 2
def test_breaking_no_time_if_before_list_with_0_for_value():
encoded_item = bad_item.SerializeToString()
decoded = TopItem.FromString(encoded_item)
assert len(decoded.item.sub_items) == 2
def test_breaking_with_missing_sub_item():
encoded_item = bad_item.SerializeToString()
decoded = TopItem.FromString(encoded_item)
assert len(decoded.item.sub_items) == 2
$ python --version
Python 3.8.6
$ pip list | grep proto
betterproto 2.0.0b3
googleapis-common-protos 1.52.0
proto-plus 1.13.0
protobuf 3.14.0
Seems to be an issue with Lists of Message's that have double fields when they are set to 0, they don't get a placeholder set for them in the resulting bytes.
https://github.com/danielgtaylor/python-betterproto/blob/master/src/betterproto/init.py#L722-L729
Here's an even shorter example.
from dataclasses import dataclass
from typing import List
import betterproto
def test_serializing():
@dataclass(eq=False, repr=False)
class Y(betterproto.Message):
count: float = betterproto.double_field(1)
@dataclass(eq=False, repr=False)
class X(betterproto.Message):
# For some reason this leading field is necessary to trigger this condition.
name: str = betterproto.string_field(1)
y: List[Y] = betterproto.message_field(2)
x = X(name='fun', y=[Y(count=0), Y(count=1)])
encoded_x = x.SerializeToString()
decoded_x = X.FromString(encoded_x)
assert decoded_x == X(name='fun', y=[Y(count=0), Y(count=1)])
> assert decoded_x == X(name='fun', y=[Y(count=0), Y(count=1)])
E AssertionError: assert X(name='', y=[Y(count=1.0)]) == X(name='fun', y=[Y(count=0), Y(count=1)])
Hey @harmon, could you check if you have this problem, when installing betterproto from master as well? 🙂 I had that at one point as well and I think this: https://github.com/danielgtaylor/python-betterproto/pull/417 was the root-cause of it. Unfortunately that fix is not in a pipy release yet 🙂
This was fixed in b6