datafiles
Integration with schema validation libraries
Hi,
I discovered this project two days ago and I think it's pretty cool. However, I still haven't found how to properly integrate it with dataclasses that use schema libraries. I am using marshmallow-dataclass, but we could also think about people using pydantic dataclasses.
For example, how to properly:
- load data from files and deserialize it using marshmallow Schema(...).load(...)
- write data to files coming from marshmallow Schema(...).dump(...)
I think that combining your library with such libraries would make the whole thing amazing.
Any idea on how to do this well?
I haven't looked into the implementation details of those libraries too much, but my initial thought is that one could explore registering a marshmallow.Schema subclass as a custom converter: https://datafiles.readthedocs.io/en/latest/types/custom/#converter-registration
# Python Model

import datetime as dt

class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime.now()

    def __repr__(self):
        return "<User(name={self.name!r})>".format(self=self)

# Marshmallow Schema

from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

# Datafiles Model

from datafiles import datafile

@datafile("samples/{self.key}.yml")
class Sample:
    key: int
    user: User

# Datafiles Converter

from datafiles import converters

class UserConverter(UserSchema):
    @classmethod
    def to_preserialization_data(cls, python_value, **kwargs):
        schema = cls()
        return schema.dump(python_value)

    @classmethod
    def to_python_value(cls, deserialized_data, **kwargs):
        schema = cls()
        return schema.load(deserialized_data)

converters.register(User, UserConverter)

# Usage

sample = Sample(42, User(name="Monty", email="[email protected]"))
sample.user.name = "Flying Circus"
Hey, thanks for the suggestion. However, I've just tried it and nothing happens: the registered converter is never used.
I'm wondering whether converters work on a top-level model, or only on model fields.
Here's what I tried:
from datafiles import datafile
from datafiles.converters import Converter, register
from marshmallow import RAISE
from marshmallow import Schema
from marshmallow_dataclass import class_schema

@datafile("./data/clusters/{self.name}.yaml")
class TestModel:
    name: str
    role: str
    datacenter: str
    environment: str

class TestModelBase(Schema):
    class Meta:
        unknown = RAISE

TestModelSchema = class_schema(
    TestModel,
    base_schema=TestModelBase,
)

class TesModelConverter(TestModelSchema):
    @classmethod
    def to_preserialization_data(cls, python_value, **kwargs):
        return cls().dump(python_value)

    @classmethod
    def to_python_value(cls, deserialized_data, **kwargs):
        return cls().load(deserialized_data)

register(TestModel, TesModelConverter)
The converter is never used when I iterate over TestModel.objects.all() or when I manually instantiate a TestModel.
I've looked at your source code: you invoke create_mapper in the post_init, but the map_type function is only called on the fields, not when an instance of the model gets created. I assume that this would only work for nested models but never on the top model.
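To make the point concrete, here is a toy model of field-level converter lookup that uses only the standard library. It is not datafiles' actual code: REGISTRY, register, to_data, Cluster, Wrapper and ClusterConverter are hypothetical names, and the lookup logic is only my reading of the behaviour described above.

```python
from dataclasses import dataclass, fields, is_dataclass

# Hypothetical stand-in for a converter registry: type -> converter class.
REGISTRY = {}


def register(cls, converter):
    REGISTRY[cls] = converter


def to_data(instance):
    """Serialize a dataclass instance field by field.

    Only the *field* types are looked up in the registry; the type of
    `instance` itself is never consulted.
    """
    data = {}
    for field in fields(instance):
        value = getattr(instance, field.name)
        converter = REGISTRY.get(field.type)
        if converter is not None:
            data[field.name] = converter.to_preserialization_data(value)
        elif is_dataclass(value):
            data[field.name] = to_data(value)
        else:
            data[field.name] = value
    return data


@dataclass
class Cluster:
    name: str
    role: str


class ClusterConverter:
    calls = 0  # counts how often the converter is actually invoked

    @classmethod
    def to_preserialization_data(cls, python_value, **kwargs):
        cls.calls += 1
        return {"name": python_value.name.upper(), "role": python_value.role}


@dataclass
class Wrapper:
    key: int
    cluster: Cluster


register(Cluster, ClusterConverter)

# Top-level use: the registered converter is skipped.
top_level = to_data(Cluster("db1", "primary"))
top_level_calls = ClusterConverter.calls

# Nested use: the converter is found through the `cluster` field.
nested = to_data(Wrapper(1, Cluster("db1", "primary")))
nested_calls = ClusterConverter.calls
```

In this toy version the converter only fires once Cluster appears as a field of another model, which matches what I see with TestModel.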
> I assume that this would only work for nested models but never on the top model.
Yeah, I think you're right. Does your example work if you nest it one layer?