
Enable custom type serialization in `JSONDisk`

Open randomir opened this issue 2 years ago • 2 comments

Currently, `JSONDisk` uses `json.loads` and `json.dumps` from the stdlib `json` module to (de-)serialize keys and values. Both the module and the arguments passed to `loads`/`dumps` are hard-coded, which makes it hard to (de-)serialize custom/user types such as NumPy types, Pydantic models, or even standard Python types like `set` or `datetime`.

To implement custom JSON serialization, one currently has to rewrite practically the whole `JSONDisk` class. Being able to provide `json.JSONEncoder`/`json.JSONDecoder` subclasses instead would be a lot easier.

I'm imagining new `JSONDisk` arguments, `json_encoder` and `json_decoder`, that the user could provide on instantiation and that would then be used during (de-)serialization.

randomir avatar Aug 09 '23 14:08 randomir
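
For illustration, here is a minimal sketch of the kind of encoder/decoder pair the proposal has in mind, using only the stdlib. The class names, the `__set__`/`__datetime__` tags, and the hook function are hypothetical and not part of diskcache; the proposed `json_encoder`/`json_decoder` arguments do not exist in the library as described in this issue.

```python
import json
from datetime import datetime


class ExtendedEncoder(json.JSONEncoder):
    """Hypothetical encoder that tags sets and datetimes so they can be restored."""

    def default(self, o):
        if isinstance(o, set):
            # Assumes the set's elements are mutually orderable.
            return {"__set__": sorted(o)}
        if isinstance(o, datetime):
            return {"__datetime__": o.isoformat()}
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(o)


def extended_object_hook(d):
    """Reverse the tagging done by ExtendedEncoder."""
    if "__set__" in d:
        return set(d["__set__"])
    if "__datetime__" in d:
        return datetime.fromisoformat(d["__datetime__"])
    return d


class ExtendedDecoder(json.JSONDecoder):
    """Hypothetical decoder that installs the object hook above."""

    def __init__(self, **kwargs):
        super().__init__(object_hook=extended_object_hook, **kwargs)


value = {"tags": {1, 2, 3}, "at": datetime(2023, 8, 9, 14, 8)}
text = json.dumps(value, cls=ExtendedEncoder)
restored = json.loads(text, cls=ExtendedDecoder)
```

With the stdlib, such classes are passed through the `cls` argument of `json.dumps`/`json.loads`, which is exactly the argument that cannot be reached when the calls are hard-coded.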

I would rather not add more arguments. What about changing references like “json.loads” into instance attribute lookups like “self.json_loads” and defining “json_loads = json.loads” as a class attribute? Then you can inherit and assign those attributes as you like.

grantjenks avatar Aug 09 '23 18:08 grantjenks
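
A rough sketch of how the class-attribute approach might look, using a stand-in class rather than diskcache's actual `JSONDisk` (the names `SketchJSONDisk`, `SetFriendlyDisk`, `encode`, and `decode` are hypothetical). One detail worth noting: `json.dumps` and `json.loads` are plain Python functions, so assigning them directly as class attributes would make `self.json_loads(...)` bind them as methods and pass `self` as the first argument. Wrapping them in `staticmethod` avoids that.

```python
import functools
import json
import zlib


class SketchJSONDisk:
    """Stand-in for illustration only; not diskcache.JSONDisk."""

    # staticmethod prevents the functions from being bound to the instance.
    json_dumps = staticmethod(json.dumps)
    json_loads = staticmethod(json.loads)

    def __init__(self, compress_level=1):
        self.compress_level = compress_level

    def encode(self, obj):
        text = self.json_dumps(obj)
        return zlib.compress(text.encode("utf-8"), self.compress_level)

    def decode(self, data):
        return self.json_loads(zlib.decompress(data).decode("utf-8"))


def _encode_set(o):
    """Fallback for json.dumps: serialize sets as sorted lists."""
    if isinstance(o, set):
        return sorted(o)
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


class SetFriendlyDisk(SketchJSONDisk):
    # Customization happens by reassigning the class attribute in a subclass.
    json_dumps = staticmethod(functools.partial(json.dumps, default=_encode_set))


disk = SetFriendlyDisk()
round_tripped = disk.decode(disk.encode({"a": {3, 1, 2}}))
```

Here the set comes back as a list, because only the dumps side was customized; a subclass would also override `json_loads` to restore the original type.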

That sounds good as well. But if generalizing it this way, it would be nice to define:

import json
from typing import Any

def json_dumps(obj: Any, **kwargs) -> bytes:
    return json.dumps(obj, **kwargs).encode('utf-8')

i.e., a dumps method that returns bytes rather than a str that still needs to be encoded. That's because some libraries, like orjson, already return byte-encoded JSON.

randomir avatar Aug 09 '23 20:08 randomir
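
To make the bytes-based contract concrete, here is a small stdlib-only sketch of a matching dumps/loads pair and a compress/decompress round trip. The `json_loads` counterpart and the `pack`/`unpack` helpers are assumptions added for illustration; they are not taken from diskcache.

```python
import json
import zlib
from typing import Any


def json_dumps(obj: Any, **kwargs) -> bytes:
    """Serialize to UTF-8 encoded JSON bytes."""
    return json.dumps(obj, **kwargs).encode("utf-8")


def json_loads(data: bytes, **kwargs) -> Any:
    """Deserialize from JSON bytes; json.loads accepts bytes directly."""
    return json.loads(data, **kwargs)


def pack(obj: Any, compress_level: int = 1) -> bytes:
    # No .encode() step is needed here because json_dumps already returns bytes.
    return zlib.compress(json_dumps(obj), compress_level)


def unpack(data: bytes) -> Any:
    # No .decode() step is needed because json_loads takes bytes.
    return json_loads(zlib.decompress(data))


payload = pack({"k": [1, 2, 3]})
result = unpack(payload)
```

With this contract, a library whose dumps function returns bytes (as orjson's does) could in principle be dropped in for `json_dumps` without an extra encode step, provided its keyword arguments are compatible with how the functions are called.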