python-betterproto Integer keys in maps are not parsed correctly from JSON

JSON requires that keys be represented as strings, however proto allows map keys to be any integral or string type. So when specifying a message with a JSON representation, integer keys must be represented as strings. These are not picked up by the .from_json() method, which just runs json.loads(value).

Example: The proto message

message MapTest {
  map<string,double> string_map = 1;
  map<int32,double> int_map = 2;
}

with JSON example:

{
    "string_map": {
        "hello": 2.3,
        "there": 5.6
    },
    "int_map": {
        "1": 7.9,
        "2": 8.3
    }
}

Would be parsed by better proto as MapTest(string_map={'hello': 2.3, 'there': 5.6}, int_map={'1': 7.9, '2': 8.3})

To overcome this we just need to add a custom JSON decoder to the from_json method which looks for any keys which are integers.

An example implementation, adapted from this link would be:

class Decoder(json.JSONDecoder):
    def decode(self, s):
        result = super().decode(s)  # result = super(Decoder, self).decode(s) for Python 2.x
        return self._decode(result)

    def _decode(self, o):
        if isinstance(o, str):# or isinstance(o, unicode):
            try:
                return int(o)
            except ValueError:
                return o
        elif isinstance(o, dict):
            try:
                return {int(k): self._decode(v) for k, v in o.items()}
            except ValueError:
                return {k: self._decode(v) for k, v in o.items()}
        elif isinstance(o, list):
            return [self._decode(v) for v in o]
        else:
            return o

We'd then run json.loads(value, cls=Decoder)

May 12 '21 18:05 jonmather

That however breaks strings that are integers currently

May 12 '21 21:05 Gobot1234

An alternative approach would be a type check for map keys in the .from_dict() method, which is then a bit more targeted and should resolve the issue.

May 13 '21 11:05 jonmather

I imagine that would hurt performance pretty badly. Not too sure where's best to stand on that.

May 13 '21 12:05 Gobot1234

There are already a lot of type checks in .from_dict(), e.g. for timedelta and datetime. Doesn't seem like adding one for int types, and only if the field type is a map, would have that much of an impact?

May 26 '21 19:05 jonmather

That's a fair point I guess we'll just have to benchmark it

May 26 '21 19:05 Gobot1234