mongoengine
mongoengine copied to clipboard
save() may break database consistency (concurrency issue)
save()
only saves changed fields in existing documents, This is pretty fine, but it saves with upsert=True
! So, if someone deleted the document in the meanwhile, we end-up with a document with only updated fields (and missing required fields). (We encountered to these mal-structured rows in our database.)
save()
should not use upsert=True
when updating only changed fields. It may check for results and raise an exception if no document is updated (sqlalchemy has a StaleDataError
for this porpose)
Wow, this is pretty bad... It leads to a malformed doc stored in the database:
In [16]: from mongoengine import *
In [17]: connect('testdb')
Out[17]: MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, read_preference=Primary())
In [18]: class Doc(Document):
...: num = IntField()
...: name = StringField()
...:
In [19]: Doc.drop_collection()
In [20]: Doc.objects.create(num=13, name='lucky')
Out[20]: <Doc: Doc object>
In [21]: doc = Doc.objects.first()
In [22]: Doc._get_collection().delete_one({}) # emulate another thread deleting the doc
Out[22]: <pymongo.results.DeleteResult at 0x11023baa0>
In [23]: doc.name = 'different'
In [24]: doc.save()
Out[24]: <Doc: Doc object>
In [25]: Doc._get_collection().find_one() # no num anywhere :(
Out[25]: {u'_id': ObjectId('586491ddc6e47b6b74de693d'), u'name': u'different'}
I agree this should rather fail with an error equivalent to StaleDataError
. We should, however, make sure that newly created documents still get upserted when you pre-populate their pk. For example, this should insert the doc into the database:
doc = Doc(id='aaabbbcccddd', name='lucky', num=13)
doc.save()
I'm currently using a workaround of calling save() with save_condition = {}. This causes the upsert to be False. Note that setting the pk of a new object must be done during init. otherwise the document is marked as existing and the save will fail.
I just hit this nasty bug put my db in a bad state