
Update `raise_on_unknown_json_key` flag to raise a more helpful error for debugging purposes

Open rnag opened this issue 2 years ago • 0 comments

  • Dataclass Wizard version: 0.21.0

Description

I want to update the UnknownJSONKey exception that gets raised when the raise_on_unknown_json_key flag is enabled to include a list of all the unknown JSON keys, rather than only the first such unknown key in the JSON object.

I also want the error to include a resolution message with developer-friendly details on the dataclass fields that should be added to the model to resolve the issue. I envision this being really helpful for others -- at least, I found myself needing such a feature recently when I was attempting to parse a (not well-documented) API response from a web service.

What I Did

Consider this simple, but rather contrived, example:

from __future__ import annotations

from dataclasses import dataclass
from dataclass_wizard import JSONWizard


@dataclass
class MainClass(JSONWizard):
    class _(JSONWizard.Meta):
        raise_on_unknown_json_key = True

    my_first_field: str


data = {
    'my-first-field': 'my-string',
    'my-second-field': '7',
    'myThirdField': [
        {
            'inner-field-1': '1.23',
            'InnerField2': True
        }
    ],
    'my-fourth-field': '2021-12-31'
}

c = MainClass.from_dict(data)   # error!

# shouldn't get this far...
print(c)

Here's the error I currently get:

dataclass_wizard.errors.UnknownJSONKey: A JSON key is missing from the dataclass schema for class `MainClass`.
  unknown key: 'my-second-field'
  dataclass fields: ['my_first_field']
  input JSON object: {"my-first-field": "my-string", "my-second-field": "7", "myThirdField": [{"inner-field-1": "1.23", "InnerField2": true}], "my-fourth-field": "2021-12-31"}

This is of course expected behavior, since we enabled the raise_on_unknown_json_key flag in the Meta config.

The problem here, however, is that multiple fields in the JSON object are missing from the dataclass schema, yet only the first one is reported. It would be very helpful if the output listed all of the unknown JSON keys, along with an auto-generated dataclass schema showing the lines that should be added to the model. I imagine the generated schema will be especially helpful when designing a model that is expected to match an API output 1:1.
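As a rough illustration of collecting every unknown key in one pass instead of stopping at the first, here is a stdlib-only sketch. The helpers `to_snake_case` and `find_unknown_keys` are hypothetical names for this example and are not part of dataclass-wizard; the key normalization is a simplified assumption about how JSON keys map to field names.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, fields


def to_snake_case(key: str) -> str:
    """Hypothetical normalizer: 'my-second-field' or 'myThirdField' -> snake_case."""
    key = key.replace('-', '_').replace(' ', '_')
    # Insert an underscore at each lowercase/digit -> uppercase boundary.
    key = re.sub(r'(?<=[a-z0-9])(?=[A-Z])', '_', key)
    return key.lower()


def find_unknown_keys(cls, json_dict: dict) -> list[str]:
    """Return every key in `json_dict` that maps to no dataclass field on `cls`."""
    known = {f.name for f in fields(cls)}
    return [k for k in json_dict if to_snake_case(k) not in known]


@dataclass
class MainClass:
    my_first_field: str


data = {
    'my-first-field': 'my-string',
    'my-second-field': '7',
    'myThirdField': [{'inner-field-1': '1.23', 'InnerField2': True}],
    'my-fourth-field': '2021-12-31',
}

print(find_unknown_keys(MainClass, data))
# ['my-second-field', 'myThirdField', 'my-fourth-field']
```

The keys come back in the order they appear in the input object, which keeps the report easy to compare against the raw JSON.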

For example, this is a sample of the output I might expect:

dataclass_wizard.errors.UnknownJSONKey: There are 3 JSON keys missing from the dataclass schema for class `MainClass`.
  unknown keys: ['my-second-field', 'myThirdField', 'my-fourth-field']
  dataclass fields: ['my_first_field']
  input JSON object: {"my-first-field": "my-string", "my-second-field": "7", "myThirdField": [{"inner-field-1": "1.23", "InnerField2": true}], "my-fourth-field": "2021-12-31"}
  suggested resolution: Update the dataclass schema to add the new fields below.

    @dataclass
    class MainClass(JSONWizard):
        ...
        my_second_field: int | str
        my_third_field: list[MyThirdField]
        my_fourth_field: date


    @dataclass
    class MyThirdField:
        inner_field_1: float | str
        inner_field2: bool

Notes

  • There is in fact a (somewhat) trivial approach. We can use a localized import of dataclass_wizard.wizard_cli.PyCodeGenerator to generate the desired dataclass schema. We also need to strip out the keys that already map to fields in the dataclass before passing the JSON object to the generator. Something like this:

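    import json
    from dataclass_wizard.wizard_cli import PyCodeGenerator  # localized import

    # Drop keys that already map to a dataclass field (helper names are placeholders).
    unknown_keys = {k: v for k, v in json_dict.items()
                    if normalize_key(k) not in key_to_dataclass_field}
    py_code = PyCodeGenerator(file_contents=json.dumps(unknown_keys),
                              experimental=True).py_code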
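    To make the kind of inference behind the suggested fields concrete, here is a small stdlib-only sketch. It is not how PyCodeGenerator works internally; `suggest_annotation` is a hypothetical helper with simplified heuristics, chosen only to reproduce annotations like the ones in the sample output (for example `int | str` for the string '7').

```python
from __future__ import annotations

from datetime import date


def suggest_annotation(value) -> str:
    """Guess an annotation string for a JSON value (illustrative heuristics only)."""
    if isinstance(value, bool):  # check bool first, since bool subclasses int
        return 'bool'
    if isinstance(value, int):
        return 'int'
    if isinstance(value, float):
        return 'float'
    if isinstance(value, str):
        # Numeric strings may really be numbers, so suggest a union with str.
        for caster, name in ((int, 'int'), (float, 'float')):
            try:
                caster(value)
            except ValueError:
                continue
            return f'{name} | str'
        try:
            date.fromisoformat(value)
        except ValueError:
            return 'str'
        return 'date'
    if isinstance(value, list):
        return 'list'
    if isinstance(value, dict):
        return 'dict'
    return 'Any'


for sample in ('7', '1.23', '2021-12-31', 'my-string', True):
    print(repr(sample), '->', suggest_annotation(sample))
```

    Nested objects and list element types are deliberately left out of this sketch; generating a separate class such as MyThirdField is the part that the code generator would handle.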
    

rnag • Feb 01 '22 01:02