dataclassframe
guarantee record type invariance?
@joshlk
This package provides an effective way to create pandas dataframes from data that fits a dataclass. Thanks for writing it.
I saw this comment which echoed something I thought too: https://github.com/joshlk/dataclassframe/blob/main/dataclassframe/dataclassframe_.py#L93
That is, should the record type/schema of the underlying dataframe always be exactly the same dataclass? I tried the package myself and can easily add a column to the dataframe when using dataclassframe.
My suggestion would be to either enforce this behavior OR provide an optional "record_class_frozen" argument to the DataClassFrame constructor.
NOTE: I know Python in general doesn't have runtime type checking, so I'm not sure how this would be enforced.
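For what it's worth, here is a minimal sketch of what a runtime check might look like. It is not taken from the dataclassframe code base: `schema_mismatch` and `Point` are hypothetical names used only for illustration, and the function uses only the standard library to compare a list of column names against the fields of a dataclass.

```python
from dataclasses import dataclass, fields, is_dataclass


def schema_mismatch(record_class, columns):
    """Return (missing, extra) column names relative to the dataclass fields.

    Hypothetical helper, not part of dataclassframe.
    """
    if not is_dataclass(record_class):
        raise TypeError("record_class must be a dataclass")
    expected = [f.name for f in fields(record_class)]
    actual = list(columns)
    # Fields declared on the dataclass but absent from the columns.
    missing = [name for name in expected if name not in actual]
    # Columns present in the data but not declared on the dataclass.
    extra = [name for name in actual if name not in expected]
    return missing, extra


@dataclass
class Point:  # hypothetical record class, for illustration only
    x: int
    y: int


print(schema_mismatch(Point, ["x", "y"]))       # no differences
print(schema_mismatch(Point, ["x", "y", "z"]))  # "z" is reported as extra
```

With a pandas DataFrame, the column names could be passed in as `df.columns`. This only compares names, not dtypes, so it would catch an added or dropped column but not a changed column type.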
Hi @mphelp,
Thanks for your interest in the project 😃.
One of the main principles of the DataClassFrame is that it's column-wise immutable, i.e. the schema, including the column types, should not change. So I would like to enforce this as much as possible.
Currently it's quite easy to break this principle by manipulating the underlying DataFrame data (the .df attribute), and there is no way I can prevent that. Could you provide some examples of how you changed the schema of a DataClassFrame?
Thanks again for your interest, Josh