Integration with schema validation libraries

Open devlounge opened this issue 3 years ago • 4 comments

Hi,

I discovered this project 2 days ago and I think it's pretty cool. However, I still haven't found how to properly integrate it with dataclasses that use schema libraries. I am using marshmallow-dataclass, but the same question applies to people using pydantic dataclasses.

For example, how to properly:

  • load data from files and deserialize it using marshmallow Schema().load()
  • write data to files from the output of marshmallow Schema().dump(...)

I think that combining your library with such libraries would make the whole thing amazing.

Any idea on how to do this well?
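
To make the round trip I have in mind concrete, here is a stdlib-only sketch. Everything in it is a stand-in: json replaces the YAML file, and the hand-written load/dump functions play the role of marshmallow's Schema().load() and Schema().dump(). None of these names come from datafiles or marshmallow.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class User:
    name: str
    email: str


def load(data: dict) -> User:
    """Validate raw file data, then build the object (stand-in for Schema().load())."""
    expected = {"name", "email"}
    if set(data) != expected:
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
    if "@" not in data["email"]:
        raise ValueError("email must contain '@'")
    return User(**data)


def dump(user: User) -> dict:
    """Turn the object back into plain data (stand-in for Schema().dump())."""
    return asdict(user)


# Write through dump(), read back through load().
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "user.json"
    path.write_text(json.dumps(dump(User("Monty", "monty@example.com"))))
    restored = load(json.loads(path.read_text()))
```

The open question is where a file-backed model should call the equivalents of load and dump so that validation happens on every read and write.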

devlounge, Dec 17 '21 10:12

I haven't looked closely at the implementation details of those libraries, but my initial thought is that one could explore registering a marshmallow.Schema subclass as a custom converter: https://datafiles.readthedocs.io/en/latest/types/custom/#converter-registration

# Python Model

import datetime as dt

class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime.now()

    def __repr__(self):
        return "<User(name={self.name!r})>".format(self=self)

# Marshmallow Schema

from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

# Datafiles Model

from datafiles import datafile

@datafile("samples/{self.key}.yml")
class Sample:

    key: int
    user: User

# Datafiles Converter

from datafiles import converters

class UserConverter(UserSchema):

    @classmethod
    def to_preserialization_data(cls, python_value, **kwargs):
        schema = cls()
        return schema.dump(python_value)

    @classmethod
    def to_python_value(cls, deserialized_data, **kwargs):
        schema = cls()
        return schema.load(deserialized_data)

converters.register(User, UserConverter)

# Usage

sample = Sample(42, User(name="Monty", email="[email protected]"))
sample.user.name = "Flying Circus"

jacebrowning, Dec 17 '21 14:12

Hey, thanks for the suggestion. However, I've just tried it and nothing happens: the registered converter is never used.

I'm wondering whether converters work on the top-level model at all, or only on model fields.

Here's what I tried:

from datafiles import datafile
from datafiles.converters import Converter, register
from marshmallow import RAISE
from marshmallow import Schema
from marshmallow_dataclass import class_schema


@datafile("./data/clusters/{self.name}.yaml")
class TestModel:

    name: str
    role: str
    datacenter: str
    environment: str


class TestModelBase(Schema):

    class Meta:
        unknown = RAISE


TestModelSchema = class_schema(
    TestModel,
    base_schema=TestModelBase,
)


class TestModelConverter(TestModelSchema):

    @classmethod
    def to_preserialization_data(cls, python_value, **kwargs):
        return cls().dump(python_value)

    @classmethod
    def to_python_value(cls, deserialized_data, **kwargs):
        return cls().load(deserialized_data)


register(TestModel, TestModelConverter)

The converter is never used when I iterate over TestModel.objects.all() or when I manually instantiate a TestModel.

devlounge, Jan 20 '22 23:01

I've looked at your source code: you invoke create_mapper in the post_init, but the map_type function is indeed only called on the fields, not when an instance of the model itself gets created. I assume that this would only work for nested models but never on the top model.
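
A stdlib-only sketch of the behaviour I think I'm seeing. This is not the datafiles implementation: REGISTRY, register and build_field_converters are hypothetical names, and the only assumption is that converters are looked up per field type and never for the model class itself.

```python
from dataclasses import dataclass, fields

# Hypothetical registry standing in for the real one: type -> converter.
REGISTRY = {}


def register(python_type, converter):
    REGISTRY[python_type] = converter


def build_field_converters(model_cls):
    """Look up a converter for each field's annotated type (fields only)."""
    return {f.name: REGISTRY.get(f.type) for f in fields(model_cls)}


@dataclass
class Cluster:
    name: str
    role: str


class ClusterConverter:
    """Placeholder converter registered for the model class itself."""


register(Cluster, ClusterConverter)

# Only the field types (str, str) are looked up; Cluster itself never is.
mapping = build_field_converters(Cluster)
```

Under that assumption, registering a converter for the class you are mapping has no effect, because the lookup never asks about that class.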

devlounge, Jan 21 '22 00:01

> I assume that this would only work for nested models but never on the top model.

Yeah, I think you're right. Does your example work if you nest it one layer?
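
Roughly what I mean by nesting, as a stdlib-only sketch rather than real datafiles code. CONVERTERS, Inventory and dump are hypothetical names, and the per-field lookup is an assumption about the behaviour, not the actual implementation.

```python
from dataclasses import dataclass, fields

# Hypothetical type -> converter registry.
CONVERTERS = {}


@dataclass
class Cluster:
    name: str
    role: str


class ClusterConverter:
    """Hypothetical converter; a marshmallow schema could sit behind this method."""

    calls = 0

    @classmethod
    def to_preserialization_data(cls, python_value, **kwargs):
        cls.calls += 1
        return {"name": python_value.name, "role": python_value.role}


CONVERTERS[Cluster] = ClusterConverter


@dataclass
class Inventory:
    key: int
    cluster: Cluster  # the converted type is now a field, one layer down


def dump(instance):
    """Serialize field by field, using a converter when the field type has one."""
    data = {}
    for f in fields(instance):
        value = getattr(instance, f.name)
        converter = CONVERTERS.get(f.type)
        data[f.name] = converter.to_preserialization_data(value) if converter else value
    return data


result = dump(Inventory(1, Cluster("alpha", "worker")))
```

With the converted type sitting in a field, the per-field lookup finds the converter and calls it once for the nested value.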

jacebrowning, Jan 21 '22 01:01