python-betterproto icon indicating copy to clipboard operation
python-betterproto copied to clipboard

Integer keys in maps are not parsed correctly from JSON

Open jonmather opened this issue 4 years ago • 5 comments

JSON requires that keys be represented as strings, however proto allows map keys to be any integral or string type. So when specifying a message with a JSON representation, integer keys must be represented as strings. These are not picked up by the .from_json() method, which just runs json.loads(value).

Example: The proto message

message MapTest {
  map<string,double> string_map = 1;
  map<int32,double> int_map = 2;
}

with JSON example:

{
    "string_map": {
        "hello": 2.3,
        "there": 5.6
    },
    "int_map": {
        "1": 7.9,
        "2": 8.3
    }
}

Would be parsed by better proto as MapTest(string_map={'hello': 2.3, 'there': 5.6}, int_map={'1': 7.9, '2': 8.3})

To overcome this we just need to add a custom JSON decoder to the from_json method which looks for any keys which are integers.

An example implementation, adapted from this link would be:

class Decoder(json.JSONDecoder):
    def decode(self, s):
        result = super().decode(s)  # result = super(Decoder, self).decode(s) for Python 2.x
        return self._decode(result)

    def _decode(self, o):
        if isinstance(o, str):# or isinstance(o, unicode):
            try:
                return int(o)
            except ValueError:
                return o
        elif isinstance(o, dict):
            try:
                return {int(k): self._decode(v) for k, v in o.items()}
            except ValueError:
                return {k: self._decode(v) for k, v in o.items()}
        elif isinstance(o, list):
            return [self._decode(v) for v in o]
        else:
            return o

We'd then run json.loads(value, cls=Decoder)

jonmather avatar May 12 '21 18:05 jonmather

That however breaks strings that are integers currently

Gobot1234 avatar May 12 '21 21:05 Gobot1234

An alternative approach would be a type check for map keys in the .from_dict() method, which is then a bit more targeted and should resolve the issue.

jonmather avatar May 13 '21 11:05 jonmather

I imagine that would hurt performance pretty badly. Not too sure where's best to stand on that.

Gobot1234 avatar May 13 '21 12:05 Gobot1234

There are already a lot of type checks in .from_dict(), e.g. for timedelta and datetime. Doesn't seem like adding one for int types, and only if the field type is a map, would have that much of an impact?

jonmather avatar May 26 '21 19:05 jonmather

That's a fair point I guess we'll just have to benchmark it

Gobot1234 avatar May 26 '21 19:05 Gobot1234