Set (and other 'supported') Fields not deserialized properly in faust-streaming - worked in robinhood faust 1.8.0.
Steps to reproduce
Create a Faust record with a field of type Set, then serialize and deserialize it using the json serializer (it works with the default serializer). The resulting record holds a list in that field instead of a set. This is a problem because the documentation clearly states that Set and Tuple are supported field types.
Example:
from faust import Record
from typing import Any, Set

class RecBug(Record, coerce=True):
    set_: Set[str]

r_bug = RecBug({'a', 'b', 'c'})
r_bug_d = r_bug.dumps(serializer='json')
r_bug_dl = RecBug.loads(r_bug_d, serializer='json')
if not isinstance(r_bug_dl.set_, set):
    raise TypeError('Set field is of type {}.'.format(type(r_bug_dl.set_)))
Running the above raises a TypeError, because the set_ field is no longer of type set after the round trip through json:
Traceback (most recent call last):
  File "..."
    raise TypeError('Set field is of type {}.'.format(type(r_bug_dl.set_)))
TypeError: Set field is of type <class 'list'>.
Expected behavior
A field of type Set should still be a set after being serialized to json and deserialized again. This used to work in robinhood/faust, installed with pip install faust[fast,statsd]==1.8.0
Actual behavior
The field's type changes to list after deserialization. The set is converted to a list in faust.utils.json, but it is never converted back.
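To illustrate why the type is lost, here is a minimal sketch using only the standard library json module. The encode_default hook is a hypothetical stand-in for a set-to-list conversion during encoding; it is not the actual code in faust.utils.json.

```python
import json


def encode_default(obj):
    # Hypothetical hook: JSON has no set type, so sets are written as lists.
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError('Cannot serialize {!r}'.format(type(obj)))


payload = json.dumps({'set_': {'a', 'b', 'c'}}, default=encode_default)
decoded = json.loads(payload)

# The payload carries no hint that the value was a set,
# so the decoder can only hand back a list.
print(type(decoded['set_']))
```

Without a step that consults the field annotation after decoding, nothing in the payload can restore the original type.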
Remarks
A few remarks based on my observation of the problem. I think the conversion back to a set used to happen while the record was being initialized in robinhood/faust 1.8.0. I think that SetNode, TupleNode etc. served this purpose somehow, but the general TypeNode is the only one used during initialization in the above example.
A quick workaround, which shows how this can be solved during initialization of the record, is to create a custom FieldDescriptor:
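As a rough idea of what annotation-driven conversion during initialization could look like, here is a sketch in plain Python. It does not use Faust and is not how SetNode or TupleNode are implemented; coerce_to_annotations and Example are hypothetical names, and typing.get_origin requires Python 3.8 or newer.

```python
from typing import Any, Dict, List, Set, Tuple, get_origin, get_type_hints


def coerce_to_annotations(cls: type, data: Dict[str, Any]) -> Dict[str, Any]:
    """Convert decoded values back to the container type named in cls's annotations."""
    hints = get_type_hints(cls)
    result = {}
    for name, value in data.items():
        # get_origin(Set[str]) is set, get_origin(Tuple[int, int]) is tuple.
        origin = get_origin(hints.get(name))
        if (origin in (set, frozenset, tuple)
                and value is not None
                and not isinstance(value, origin)):
            value = origin(value)
        result[name] = value
    return result


class Example:
    set_: Set[str]
    pair: Tuple[int, int]
    names: List[str]


decoded = {'set_': ['a', 'b'], 'pair': [1, 2], 'names': ['x']}
coerced = coerce_to_annotations(Example, decoded)
```

Only the container types that JSON cannot represent are rebuilt; fields annotated as lists are left untouched.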
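from typing import Any, Optional, Set
from faust.models.fields import FieldDescriptor

class SetField(FieldDescriptor[Set]):
    def prepare_value(self, value: Any, *, coerce: Optional[bool] = None) -> Optional[Set]:
        # Turn any non-set value (e.g. the list produced by json decoding) back into a set.
        if self.should_coerce(value, coerce):
            if value is not None and not isinstance(value, set):
                return set(value)
        return value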
and use it as
class Rec(Record, coerce=True):
    set_: Set = SetField(default=set(), required=False)
In this case, the same experiment, namely
r = Rec({'a', 'b', 'c'})
r_d = r.dumps(serializer='json')
r_dl = Rec.loads(r_d, serializer='json')
if not isinstance(r_dl.set_, set):
    raise TypeError('Set field in final rec is of type {}.'.format(type(r_dl.set_)))
does not raise the TypeError.
Versions
- Python version 3.6.9 and 3.8
- Faust version 0.6.3
- Operating system Ubuntu 20.04.2 LTS