dataclasses-json icon indicating copy to clipboard operation
dataclasses-json copied to clipboard

Loading an optional, not present field causes KeyError

Open ghost opened this issue 5 years ago • 5 comments

Hello,

When defining a field as Optional and then loading a JSON that does not contain that key causes a KeyError to be thrown from core.py line 151:

I am aware that this is documented but it could be handled in a nicer way (see below).

Traceback (most recent call last):
  File "***/test.py", line 15, in <module>
    pprint.pprint(TestData.from_json("{}"))
  File "***c\venv\lib\site-packages\dataclasses_json\api.py", line 78, in from_json
    return cls.from_dict(kvs, infer_missing=infer_missing)
  File "***\venv\lib\site-packages\dataclasses_json\api.py", line 85, in from_dict
    return _decode_dataclass(cls, kvs, infer_missing)
  File "***\venv\lib\site-packages\dataclasses_json\core.py", line 150, in _decode_dataclass
    field_value = kvs[field.name]
KeyError: 'test'

Code to reproduce the issue:

import pprint
from dataclasses import dataclass
from typing import Optional

from dataclasses_json import dataclass_json, DataClassJsonMixin


@dataclass_json
@dataclass
class TestData(DataClassJsonMixin):
    test: Optional[str]


if __name__ == "__main__":
    pprint.pprint(TestData.from_json("{}"))

Python version used: 3.8.2 Operating system: Windows 10 Pro

Suggested fix:

Add a check for KeyError instead of checking for None.

ghost avatar May 17 '20 14:05 ghost

I think this is the expected behavior. Let me explain why:

  • typing.Optional indicates that something may be None, not that a parameter is actually optional in the sense that you do not have to provide it
  • Thus, you are creating TestData with an empty dict, not providing any data for the required field "test".

If you don't want the crash to happen, define test as test: Optional[str] = None or add "test": None into your initialization dict.

I do not quite understand what you mean by Add a check for KeyError instead of checking for None. If I understand correctly, you'd want dataclass_json to fill the None value out for you, which I don't think we should do. Or are you suggesting that the error message is a bit misleading, and you'd rather want something like Missing parameter 'test' when initializing ' TestData' ?

RunOrVeith avatar May 19 '20 07:05 RunOrVeith

You are correct, I would assume that if I mark a field as Optional the loading should substitute the missing value by None. I believe this would reduce redundant code. As a separate issue implementing meaningful error messages for unexpected or missing keys would also be desirable but that's probably a topic for a different issue.

ghost avatar May 25 '20 20:05 ghost

I don't think loading should substitute the missing value for you because there are many cases where you want users to always explicitly specify some value, even if it is None.

To be clear, I don't think you even have to add "test": None into your initialization dict; defining test as test: Optional[str] = None should be enough.

kevinmu avatar May 29 '20 22:05 kevinmu

Just adding some (probably non-valuable) opinion here... coming from the javascript / typescript world, this is quite frustrating ( as @janoszen details)

If you're writing a flask / web app - you never know what the client is going to send you... you need to be able to deserialize the json into a class to "validate it" before moving down the stack.

The behavior pretty much dictates that every dataclass has to define all fields as Optional[type] = None. This distorts the meaning / intent of the dataclass -- I just don't see much value there?

Thoughts?

Thanks, Chad

chadbr avatar Oct 28 '20 23:10 chadbr

I think the confusion here is the name "Optional" in python. Please see the description in the 3rd "Note" in this section about types in python.

If you want to validate, you should use the underlying schema:

@dataclass_json
@dataclass
class Foo:
    x: str
     

Foo.schema().validate(data={})
# {}
Foo.schema().validate(data={"y": 5})
# {'y': ['Unknown field.']}
Foo.schema().validate(data={"y": 5, "x": 5})
# {'x': ['Not a valid string.'], 'y': ['Unknown field.']}
Foo.schema().load({})
# KeyError: 'x'

RunOrVeith avatar Oct 29 '20 16:10 RunOrVeith

~Hi, sorry to rez this thread, but doesn't it make more sense to raise a ValidationError if a field that is expected to exist is not provided?~

Edit: I see that it doesn't raise a ValidationError because the example in the OP was using from_json, rather than load which does raise a ValidationError.

PatrickF1 avatar Mar 04 '24 05:03 PatrickF1