dataclass-wizard
Upcoming Changes in V1
Discussed in https://github.com/rnag/dataclass-wizard/discussions/153
Originally posted by rnag, November 27, 2024
I want to add my thoughts on planned (breaking) changes in the next major release V1.
Planned Changes
- [ ] There will/should be no default key transform on dump (serialization). So if dataclass fields are defined in `snake_case`, then in JSON output it will also be in `snake_case`.
  - [ ] Thus we can remove helper class `JSONPyWizard`, as it was only a stop-gap solution.
- [x] Similarly, there will be no "auto" key transform on load (de-serialization) anymore. See my note/comment under "Performance Improvements" below.
- [ ] `JSONWizard` should be the default class name, it is time to do away with alias to `JSONSerializable`, which IMO doesn't make much sense to retain in a library called Dataclass Wizard.
- [x] The `__str__()` will no longer be default (or at least the same) on `JSONWizard` subclass.
  - Pretty-printing a dataclass instance as JSON is a bit unexpected and humorous to me (maybe childish?). Not sure what the new default will be. To use an example of a library, `pydantic` does it weirdly, it prints the field names with `repr`'d values separated by a space, and no class name. Maybe there's a middle ground or it could involve leveraging `pprint`. I'll have to think on it.
- [x] We will no longer automatically (silently) convert a `float` value or a float in a `str` (ex. `123.4` or `'12.3'`) to an `int` if the annotated type is `int`. There seems to be a lot of concern over this, and it appears to be tied to unintentional data loss. I agree: we shouldn't lose the fractional part when converting to `int`, especially as in Python we should strive to be more explicit and not do "silent" conversions like these.
- [ ] The `@dataclass` decorator may no longer be required? For convenience, our library can use `@dataclass_transform` and apply it ourselves if a class isn't decorated with it. Especially true as most IDEs like PyCharm now support it. I think this would be a huge help for users, and me personally, as I sometimes forget to apply `@dataclass`.
- [ ] All deprecated stuff should and can be removed (ex. `__pre_as_dict__()` hook)
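To make the `int` item above concrete, here is a minimal sketch of what a "no silent truncation" conversion could look like. `to_int_strict` is a hypothetical name for illustration and is not part of the library. Accepting whole-number floats such as `3.0` is my own assumption here; the final behavior in V1 may differ.

```python
def to_int_strict(value):
    """Convert to int without silently dropping a fractional part.

    Hypothetical helper for illustration only, not the library's API.
    """
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            # Not an integer literal; try it as a float string such as '12.3'.
            # float() raises ValueError itself for non-numeric text.
            value = float(value)
    if isinstance(value, float):
        # Assumption: whole-number floats are fine, since nothing is lost.
        if value.is_integer():
            return int(value)
        raise ValueError(f'refusing to drop the fractional part of {value!r}')
    raise TypeError(f'cannot convert {type(value).__name__} to int')


to_int_strict('42')   # returns 42
to_int_strict(3.0)    # returns 3
# to_int_strict(123.4) and to_int_strict('12.3') both raise ValueError
```

The point is only that a lossy input fails loudly instead of being truncated.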
Performance improvements
- [x] Improving some helper conversion functions. For example `as_str()` is unnecessary, simply using builtin `str()` appears to be the fastest approach. What a shocker 😮
  - Though for best practice, we can also support `None` when loading to `str` type. Something like `'' if x is None else str(x)` seems like a good middle ground to have 🤔
- [x] Methods under `LoadMixin` should now return a string instead of being defined as regular functions. This will boost performance, as we now `exec` the function anyway, so there's no need to nest functions when parsing individual fields.
  - [ ] If I have time, I can also do a similar thing for `DumpMixin` and `dumpers.py`. My reasoning is, perhaps by default we can use the type annotations on a field to determine how to dump/serialize it. For example, if the annotated type is `str | None`, then have `kwargs[field] = value` in string code to return the field value, with no need to check the type of the value the way `dataclasses` does it, e.g. `if type(obj) in _ATOMIC_TYPES: ...` each time `asdict` is called. Though my follow-up thought was, it will prove tricky for cases like `Optional` and `Union`. For `Union` type annotated fields, maybe it's best to check the type of the value directly after all.
  - [ ] Coincidentally, this also means some (or all?) of the Parsers in `parsers.py` can be removed, as they will be unnecessary.
- [ ] The default behavior should be to iterate over dataclass fields on de-serialization, instead of looping over the JSON object. This will have the minor benefit of eliminating a `for` loop. I am thinking maybe having a `Meta` setting such as `input_letter_case` or similar, so e.g. if set to `input_letter_case='CAMEL'`, then it will automatically map the `my_str` dataclass field to `myStr` in the input JSON object. Plus of course, another setting such as `wizard_mode=True` or `auto_key_transform=True` would effectively disable "minor optimization" mode and loop over the input JSON object, as this library is currently doing it, and as the example on the front page of the docs clearly illustrates to users.
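As a rough illustration of the last two ideas combined (generating the load function as a string and `exec`-ing it, and iterating over dataclass fields rather than over the JSON object), here is a minimal sketch. `make_loader` and `snake_to_camel` are hypothetical names and not the library's API. The `input_letter_case` argument only mirrors the proposed `Meta` setting, and the sketch does no type conversion at all.

```python
from dataclasses import dataclass, fields
from typing import Optional


def snake_to_camel(name: str) -> str:
    """Map e.g. 'my_str' to 'myStr' (hypothetical helper)."""
    first, *rest = name.split('_')
    return first + ''.join(part.capitalize() for part in rest)


def make_loader(cls, input_letter_case: Optional[str] = None):
    """Build a load function for `cls` from generated source code.

    The loop over fields runs once, at generation time; the generated
    function contains one straight-line lookup per field.
    """
    lines = ['def load(data):', '    kwargs = {}']
    for f in fields(cls):
        key = snake_to_camel(f.name) if input_letter_case == 'CAMEL' else f.name
        lines.append(f'    if {key!r} in data:')
        lines.append(f'        kwargs[{f.name!r}] = data[{key!r}]')
    lines.append('    return cls(**kwargs)')

    namespace = {'cls': cls}
    exec('\n'.join(lines), namespace)
    return namespace['load']


@dataclass
class Example:
    my_str: str
    my_int: int = 0


load_example = make_loader(Example, input_letter_case='CAMEL')
instance = load_example({'myStr': 'hello', 'myInt': 3})
```

Because the key for each field is decided when the function is generated, loading never has to transform or guess keys at runtime, which is the trade-off against the current "auto" key transform.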
I had more changes planned, if I remember them I will add or jot them down here. Thanks all, and kindly let me know any comments or feedback down below! 👋