mongoengine
mongoengine copied to clipboard
New documents with user-defined primary keys cause a collection.update()
New Documents should be added to the collection with collection.save(). However, the test in the Document.save() method which tries to determine whether an instance is new gives the wrong result when the Document is defined with a custom primary key.
Here's a simple test which duplicates the incorrect behaviour. This fails on the dev branch.
def test_created(self):
class D1(Document):
normal_field = IntField(primary_key=False)
class D2(Document):
custom_pk = IntField(primary_key=True)
d1 = D1(normal_field = 1)
d2 = D2(custom_pk = 1)
# '_id' in self.to_mongo() is the test used by mongoengine to•
# determine whether an object is new
self.assertFalse('_id' in d1.to_mongo()) # passes
self.assertFalse('_id' in d2.to_mongo()) # fails
We were thinking that an appropriate change might be to have a "saved" flag (not persisted) on each model instance that indicates whether it was created from the database or is a new object. Then use this for the decision about whether to use save or update... thoughts?
I think your test assertions are incorrect -- you're assigning a value to the custom_pk
field, which is being "rewritten" to _id
for storage in the database. This is a good thing, as mongodb has a default unique index on _id
, so mongoengine doesn't have to create another index.
However, that said, I think that it's likely that mongoengine will issue an update with upsert=True
. Does this not work for you? (i.e. does the document get created with incorrect values?)
@drcrosta, I think you're right that mongoengine is behaving as intended when it rewrites 'custom_pk' to '_id'. However, in the Document.save() function, '_id' in d1.to_mongo() is used as the test of whether or not we are dealing with a new instance that needs to be created.
The test just illustrates this point. I wasn't clear about that - sorry.
@gareth-lloyd sure, but issuing an update with upsert=True
will still do the right thing, right? If there's a data loss or corruption issue here, let's identify the exact set of circumstances to replicate that issue, and add a test for it; if not, I don't see what we should change.
I think the source of my confusion is the difference in behaviour between unique keys and primary keys here. In fact, I came across this problem when I changed a field from a unique key to be the primary key (intending to save us an index) and everything broke.
Here's an example/test:
def test_primary_key(self):
class User(Document):
name = StringField()
email = StringField(unique=True)
User.objects.create(name='geoff', email='[email protected]')
u = User(name='frank', email='[email protected]')
self.assertRaises(OperationError, u.save)
class User2(Document):
name = StringField()
email = StringField(primary_key=True)
User2.objects.create(name='geoff', email='[email protected]')
u2 = User2(name='frank', email='[email protected]')
self.assertRaises(OperationError, u2.save)
self.assertEquals('geoff', User2.objects.get(email='[email protected]').name)
The final two assertions will currently fail on the dev branch. When I think I'm creating a new object, I would expect to fail if I chose a primary key that already existed. I also wouldn't expect to overwrite information in an existing record.
We could save with force_insert=True
, but I think my desired behaviour is a fairly common-sense interpretation of what it means to have a custom pk.
Not sure what to do here as it is native to MongoDB that save updates if exists and inserts if new.
Just because you have a custom primary key, I'm not sure we should handle this differently - I agree its odd that primary key and unique keys work differently, but that is the case with MongoDB.
Thoughts?
I can confirm this issue. When having a custom ID it is a problem that if I set ID before calling save (or by setting default on the field) it will see object as already created and proceeds to updating it.
I think thats a different issue to reported Mitar - is your issue with a custom pk when calling save() it does an update?
On Thu, Dec 20, 2012 at 2:38 PM, Mitar [email protected] wrote:
I can confirm this issue. When having a custom ID it is a problem that if I set ID before calling save (or by setting default on the field) it will see object as already created and proceeds to updating it.
— Reply to this email directly or view it on GitHubhttps://github.com/hmarr/mongoengine/issues/268#issuecomment-11575668.
Hm. I want to have UUID for document id. If I define UUID filed with primary key, I tried to set default to lambda: return uuid.uuid4()
but this have an issue in some other my code which checks for existence of pk and now it has even before document is saved. So I removed default and now I am using:
if not self.id or force_insert:
self.id = uuid.uuid4()
return super(MyDocument, self).save(force_insert=force_insert, *args, **kwargs)
but this is also not the best because from the code path it now goes to update code path, not save. It seems to work but I am nervous because it does not take the same code path as default primary key.
Perhaps not using save would be a better idea! default should be a callable so: default=uuid.uuid4 might work.
Ross
On Thu, Dec 20, 2012 at 4:38 PM, Mitar [email protected] wrote:
Hm. I want to have UUID for document id. If I define UUID filed with primary key, I tried to set default to lambda: return uuid.uuid4() but this have an issue in some other my code which checks for existence of pk and now it has even before document is saved. So I removed default and now I am using:
if not self.id or force_insert: self.id = uuid.uuid4() return super(MyDocument, self).save(force_insert=force_insert, *args, **kwargs)
but this is also not the best because from the code path it now goes to update code path, not save. It seems to work but I am nervous because it does not take the same code path as default primary key.
— Reply to this email directly or view it on GitHubhttps://github.com/hmarr/mongoengine/issues/268#issuecomment-11580396.
But still, this means that when I do MyDocument()
, id
already has a value, even if it is not saved. This does not just break my code, but has the same property as my code above of update code path once I call save
.
This should be fixed in 0.8 as we also check the MyDocument()._created flag as well to identify if we want to do a delta.
But in Django (which I am using) it is customary to check for existence of pk
to determine if object have been already saved and not. Using default
on primary key breaks this (and I need one more special case for handling MongoEngine objects).