
Set (and other 'supported') Fields not deserialized properly in faust-streaming - worked in robinhood faust 1.8.0.

Open matejlogar opened this issue 4 years ago • 0 comments

Steps to reproduce

Create a faust record with a field of type Set. Serialize and deserialize it using the json serializer (the default serializer works). The resulting record holds the field as a list instead of a set. This is a problem because the documentation clearly states that Set and Tuple are supported field types.

Example:

from faust import Record
from typing import Any, Set

class RecBug(Record, coerce=True):
    set_: Set[str]


r_bug = RecBug({'a', 'b', 'c'})
r_bug_d = r_bug.dumps(serializer='json')
r_bug_dl = RecBug.loads(r_bug_d, serializer='json')
if not isinstance(r_bug_dl.set_, set):
    raise TypeError('Set field is of type {}.'.format(type(r_bug_dl.set_)))

Running the above raises a TypeError because the set_ field is no longer of type set after the round trip to and from json:

Traceback (most recent call last):
  File "..."
    raise TypeError('Set field is of type {}.'.format(type(r_bug_dl.set_)))
TypeError: Set field is of type <class 'list'>.

Expected behavior

A field of type Set should still be a set after being serialized to json and deserialized. This used to work in robinhood/faust, installed with pip install faust[fast,statsd]==1.8.0.

Actual behavior

The field's type changes to list after deserialization. The set is converted to a list in faust.utils.json, but it is never converted back.
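The lossy round trip can be reproduced without faust. The sketch below uses only the standard library json module; the encode_set hook is a hypothetical stand-in for the set handling described above, not the actual code in faust.utils.json.

```python
import json


def encode_set(obj):
    # Hypothetical stand-in for the set handling in faust.utils.json:
    # JSON has no set type, so a set has to be written out as an array.
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


payload = json.dumps({"set_": {"a", "b", "c"}}, default=encode_set)
restored = json.loads(payload)

# The array is decoded as a list; nothing in the payload says it was a set.
print(type(restored["set_"]).__name__)  # list
```

Because the type information is gone from the payload, the only place it can be restored is on the model side, from the field annotation.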

Remarks

A few remarks based on observing the problem. I think the conversion back to Set used to happen during the initialization of the record in robinhood/faust 1.8.0. I think SetNode, TupleNode, etc. served this purpose somehow, but the general TypeNode is the only one used during initialization in the example above.
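To illustrate what converting back during initialization could look like, here is a minimal sketch that rebuilds a container from the field annotation. It relies on typing.get_origin, so it needs Python 3.8 or newer. coerce_container is a hypothetical helper written for this issue; it is not how SetNode or TupleNode are implemented in faust.

```python
from typing import Any, Set, Tuple, get_origin


def coerce_container(value: Any, annotation: Any) -> Any:
    """Hypothetical helper: rebuild the container type named by the annotation."""
    origin = get_origin(annotation)  # set for Set[str], None for a plain class
    if value is None or origin is None:
        return value
    if origin in (set, frozenset, list, tuple):
        # Re-create the annotated container from whatever iterable was decoded.
        return origin(value)
    return value


print(coerce_container(["a", "b", "c"], Set[str]) == {"a", "b", "c"})  # True
print(coerce_container(["a", 1], Tuple[str, int]))  # ('a', 1)
```

The sketch only handles the outermost container and does not coerce the elements, which a real fix would presumably also need to do.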

A quick workaround, which shows how this can be solved during the initialization of the record, is to create a custom FieldDescriptor:

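from typing import Any, Optional, Set

from faust.models.fields import FieldDescriptor


class SetField(FieldDescriptor[Set]):
    def prepare_value(self, value: Any, *, coerce: Optional[bool] = None) -> Optional[Set]:
        # Convert any non-set value (e.g. the list produced by json) back to a set.
        if self.should_coerce(value, coerce):
            if value is not None and not isinstance(value, set):
                return set(value)
        return value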

and use it as

class Rec(Record, coerce=True):
    set_: Set = SetField(default=set(), required=False)

In this case, the same experiment, namely

r = Rec({'a', 'b', 'c'})
r_d = r.dumps(serializer='json')
r_dl = Rec.loads(r_d, serializer='json')
if not isinstance(r_dl.set_, set):
    raise TypeError('Set field in final rec is of type {}.'.format(type(r_dl.set_)))

does not raise the TypeError.

Versions

  • Python version 3.6.9 and 3.8
  • Faust version 0.6.3
  • Operating system Ubuntu 20.04.2 LTS

matejlogar, Aug 18 '21 15:08