marshmallow-sqlalchemy
marshmallow-sqlalchemy copied to clipboard
I suspect a breaking change related to "include_relationships=True" and schema.load() in version 1.4.1
Hi ! I recently had to do some testing on my local for some portion of our code that's been running happily on staging. I didn't specify a version number for my local container instance, as such it pulled 1.4.1 which is the latest. I noticed that it seems to enforce implicitly defined fields (from the include_relationships=True option), during a schema.load() which I THINK isn't the desired or USUAL behavior.
Typically (in previous versions specifically 1.1.0 ) we've been able to deserialize into objects with schemas defined with the include_relationships option set to True - without having to provide values for relationship fields (which intuitively makes sense), however for some reason it's raising a Validation (Missing field) error on 1.4.1 . This behavior wasn't reproducable on 1.1.0 (not that I would know if this is on later versions in: 1.1.0 < version < 1.4.1 because we've not used any other up until my recent local testing which used 1.4.1).
class UserSchema(Base):
class Meta:
model = User
include_relationships=True # includes for example an 'address' relationship attr (could be something like a 'backref' or explicitly declared field on the sqlalchemy model)
include_fk = True
load_instance = True
# usage and expected behavior
schema = UserSchema(unknown='exclude', session=session)
user = schema.load(**data)
# returns <User> object
# Observed behavior
# .. same steps
# throws 'Missing data for required 'address' field - error'
I wasn't able to reproduce this. added a test to cover this case in https://github.com/marshmallow-code/marshmallow-sqlalchemy/pull/666 and it passes. @Curiouspaul1 can you post a more complete minimal reproducible example?
Hi @sloria That's strange, are you sure you tested with a load() and not a dump() also make sure your model has a field that defines a sqlalchemy relationship (see example below and or above). I can guarantee that this is still happening because I tried again to reproduce the behavior by upgrading the version to 1.4.1 and it does the same thing. I guess additional information could be the version numbers for libraries of interest, namely:
sqlalchemy==2.0.35
flask-sqlalchemy==3.1.1
flask-marshmallow==1.2.1
marshmallow==3.22.0
I had to create a simple test to reproduce behavior outside of the running application, so it's pretty basic steps to reproduce. All i'm doing is using a sqlalchemyautoschema instance to dump a sqlalchemy instance, as described above:
# user model
class User(db.Model):
# .. define columns ..
address = db.relationship('Address', backref='user')
class UserSchema(Base):
class Meta:
model = User
include_relationships=True # includes for example an 'address' relationship attr (could be something like a 'backref' or explicitly declared field on the sqlalchemy model)
include_fk = True
load_instance = True
# usage and expected behavior
schema = UserSchema(unknown='exclude', session=session)
user = schema.load(**data)
# returns <User> object
# Observed behavior
# .. same steps
# throws 'Missing data for required 'address' field - error'
PS: The User nd Address example aren't exactly the same models in our usecase, but its representative of what exactly is happening. Here's a snippet of the error i'm getting:
raise DataValidationError(message=message, errors=exc.messages, data=data)
│ │ │ │ └ {'quantity': 1.0, 'product_id': 2631}
│ │ │ └ {'location': ['Missing data for required field.']}
│ │ └ ValidationError({'location': ['Missing data for required field.']})
│ └ 'location: Missing data for required field'
└ <class 'app.utils.exceptions.DataValidationError'>
So as you can see, the location field (which is defined as a relationship between the model i'm trying to load and a Locations table) exactly how i defined it in my example above for User and Address - is throwing a missing field error, which isn't the expected behavior. I shouldn't have to specify the value for a relationship field in version 1.4.1 but it works in 1.1.0.
Hopefully this helps.
@sloria any update on this?
sorry--haven't had much time to work on marshmallow these days. would definitely appreciate help on this, even if it's an incomplete PR.
I am currently working on updating app dependencies and came across this issue. After some digging into my problem, which may not be exactly the same, I think this PR could be the breaking change.
In my use case, we had defined a Nested relationship on the schema, which seemed to trigger the error.
Prior to that commit, with models and schemas setup like:
class User(db.Model):
id = Column(Integer, primary_key=True)
name = Column(String, nullable=False)
friends = relationship(UserFriend, back_populates='user')
class UserFriend(db.Model):
user_id = ForeignKey(User.id, nullable=False)
name = Column(String, nullable=False)
user = relationship(User, back_populates='friends')
class UserFriendSchema(SQLAlchemyAutoSchema):
class Meta:
model = UserFriend
include_relationships = True
load_instance = True
class UserSchema(SQLAlchemyAutoSchema):
class Meta:
model = User
include_relationships = True
load_instance = True
friends = Nested(UserFriendSchema, many=True)
This was possible:
user_data = {'name': 'Bob', 'friends': ['name': 'Joe', 'name': 'Tom']}
new_user, errors = UserSchema.load(user_data, instance=None)
assert not errors
db.session.add(new_user)
db.session.commit()
assert new_user.name == 'Bob'
assert sorted([f.name for f in new_user.friends]) == ['Joe', 'Tom']
After that commit, this is the result:
user_data = {'name': 'Bob', 'friends': ['name': 'Joe', 'name': 'Tom']}
new_user, errors = UserSchema.load(user_data, instance=None)
# errors == {
# 'friends': {
# 0: {'user': ['Missing data for required field.']},
# 1: {'user': ['Missing data for required field.']},
# }
Since this is an add operation for a User with nested relationship data related to the object being added via load, we don't have access to the User.id to include in the user_data.
I did notice that by removing the defined Nested relationship, the friends relationship gets handled automatically as a Related field, which does seem to handle the relationship data correctly (i.e. the user object gets added to the db along with the related objects). However, the following warning is raised:
/marshmallow_sqlalchemy/fields.py:136: SAWarning: fully NULL primary key identity cannot load any object
which I have determined is caused by the fact that on line 133 of fields.py lookup_values is: [None]. The below change seems to be a valid solution because _deserialize returns self.related_model(**value) (line 113 of fields.py) as a result of raising NoResultFound:
def _get_existing_instance(self, related_model, value):
...
else:
# Use a faster path if the related key is the primary key.
lookup_values = [value.get(prop.key) for prop in self.related_keys]
# When lookup_values == [None], there is no primary_key data, so no need to call session.get
# Otherwise, SAWarning: fully NULL primary key identity cannot load any object will be raised
if len(lookup_values) == 1 and lookup_values[0] is None:
raise NoResultFound
try:
result = self.session.get(related_model, lookup_values)
except TypeError as error:
keys = [prop.key for prop in self.related_keys]
raise self.make_error("invalid", value=value, keys=keys) from error
if result is None:
raise NoResultFound
return result
Unfortunately, in my case, right now I think I need a Nested relationship instead of a Related field, so the above solution doesn't appear fix my use case. I'm by no means a marshmallow expert, so I'm not sure the best overall solution to these problems is. But I thought I would provide this info in case it helps with further troubleshooting.