marshmallow-sqlalchemy icon indicating copy to clipboard operation
marshmallow-sqlalchemy copied to clipboard

Feature Request: built-in validator that object already exists in the db (useful for foreign keys)

Open jeffwidman opened this issue 9 years ago • 6 comments

Some of my API's let people create new objects, and some of the fields are optional foreign keys. Currently when a foreign key gets passed in, I have some custom validators that check the database to verify that the FK object actually exists. I could catch the exception, but rather check it on initial POST/PATCH.

This strikes me as something that would be useful as a built-in validator for marshmallow-sqlalchemy

I think the implementation would be fairly straightforward, although it's an open question in my mind whether to use session.query(foreign object).options(load_only(pk_field)).get(pk id) or something a little more fancy like:

if not db.session.query(db.exists([GearCategory.id]).where(GearCategory.id == category_id)).scalar():
   raise Exception("Gear Category ID [%s] does not exist" % category_id)

The get() will be far faster when the object is already in the session, but slower than the EXISTS if the query has to actually hit the DB.

jeffwidman avatar Dec 18 '15 11:12 jeffwidman

Edit #2

After spending a few more days on this, I now understand the original request and would also benefit from such functionality.

@jeffwidman - if you wouldn't mind, I would greatly benefit from seeing an example foreign key validator as I'm just not smart enough to figure this out on my own.

whoiswes avatar Mar 04 '16 16:03 whoiswes

No problem, see the category_id line for how I used it:

def object_id_exists(object_id, model):
    if not db.session.query(db.exists([model.id]).where(model.id == object_id)).scalar():
        raise ValidationError('%s ID: %i does not exist' % (model, object_id))


class GearItemSchema(fmarsh.Schema):
    class Meta:
        strict = True

    id = fmarsh.Integer()
    name = fmarsh.Str(required=True)
    url = fmarsh.URLFor('api.api_gear_item_get', gear_item_id='<id>', _external=True)
    category_id = fmarsh.Integer(required=True, validate=lambda x: va.object_id_exists(x, GearCategory))

jeffwidman avatar Mar 15 '16 09:03 jeffwidman

@jeffwidman - Thanks! I actually figured out validators late last week (through brute force and sheer luck) and here was my take on this (feedback welcome).

def validate_fk(rel):
    if isinstance(rel, list):
        errors = [({'id' : '%s does not exist' % obj.id}) for obj in rel if not obj._sa_instance_state.has_identity]
        if len(errors)>0:
            raise ValidationError(errors)
    else:
        if not rel._sa_instance_state.has_identity:
            raise ValidationError({'id': '%s does not exist' % rel.id})

I'm using this for a RESTful API, and one of the shortcomings I've come across is that I need separate schemas defined for read (GET) vs write (POST/PUT) operations, as I don't want to allow nested schemas to have attributes updated. It works but it's clunky. This validator has been a huge help.

I appreciate you posting your code - I may steal some/all of your concepts once I can fully digest/understand them.

Thanks again!

Wes

whoiswes avatar Mar 15 '16 13:03 whoiswes

That works, although I wouldn't use the private api _sa_instance_state... if you don't want to use db.exists(), you can just use db.query.get().... the get() method is special in that it will check the identity map first before hitting the database... so effectively doing exactly what you're doing, but through a public api for SQLAlchemy, rather than a private one. For speed, I'd also stick in load_only(id) so that when it does the get, it only queries for the ID.

jeffwidman avatar Mar 15 '16 20:03 jeffwidman

The code below is my solution for all foreign fields. It's strange that this is not in the library. I will be glad to see suggestions for improving this code.

from marshmallow import ValidationError
import marshmallow_sqlalchemy as ma
from app import db


def object_id_exists(object_id, table, table_id):
    if not db.session.query(f"EXISTS (SELECT {table_id} FROM {table} WHERE {table_id} = {object_id})").scalar():
        raise ValidationError(f'{table} {table_id}: {object_id} does not exist')


class ObjectExistsModelConverter(ma.ModelConverter):
    def property2field(self, prop, instance=True, field_class=None, **kwargs):
        field = super(ObjectExistsModelConverter, self).property2field(prop, instance=True, field_class=None, **kwargs)
        if not hasattr(prop, 'direction'):
            column = prop.columns[0]
            if column.foreign_keys:
                for fk in column.foreign_keys:
                    table, id = fk.target_fullname.split('.')
                    field.validators.append(
                        lambda x: object_id_exists(x, table, id)
                    )
        return field


class ModelSchemaOpts(ma.ModelSchemaOpts):
    def __init__(self, meta):
        if not hasattr(meta, 'model_converter'):
            meta.model_converter = ObjectExistsModelConverter
        if not hasattr(meta, 'include_fk'):
            meta.include_fk = True
        if not hasattr(meta, 'sql_session'):
            meta.sqla_session = db.session
        super(ModelSchemaOpts, self).__init__(meta)


class ModelSchema(ma.ModelSchema):
    OPTIONS_CLASS = ModelSchemaOpts

MyGodIsHe avatar Mar 29 '17 10:03 MyGodIsHe

Inspired by @whoiswes but doesn't have any references to private members and shouldn't make additional DB calls. Currently limited to single primary keys.

`def validate_fk(rel): # Preferred: relation = RelatedList(Related(validate=validate_fk)) # Over: relation = RelatedList(Related(), validate=validate_fk) def _record_exists(rec): return inspect(rec).has_identity

def _record_doesnt_exist_msg(rec):
    primary_keys = inspect(type(rec)).primary_key
    assert len(primary_keys) == 1
    id = getattr(rec, primary_keys[0].name)
    return f"ID {id} does not exist in {type(rec).__tablename__}"

if isinstance(rel, list):
    errors = {idx: _record_doesnt_exist_msg(rec) for idx, rec in enumerate(rel) if not _record_exists(rec)}
    if len(errors) > 0:
        raise ValidationError(errors)
else:
    if not _record_exists(rel):
        raise ValidationError(_record_doesnt_exist_msg(rel))`

The-Gopher avatar Jun 23 '20 00:06 The-Gopher