mongoengine icon indicating copy to clipboard operation
mongoengine copied to clipboard

New documents with user-defined primary keys cause a collection.update()

Open gareth-lloyd opened this issue 12 years ago • 13 comments

New Documents should be added to the collection with collection.save(). However, the test in the Document.save() method which tries to determine whether an instance is new gives the wrong result when the Document is defined with a custom primary key.

Here's a simple test which duplicates the incorrect behaviour. This fails on the dev branch.

def test_created(self):
    class D1(Document):
        normal_field = IntField(primary_key=False)

    class D2(Document):
        custom_pk = IntField(primary_key=True)

    d1 = D1(normal_field = 1)
    d2 = D2(custom_pk = 1)

    # '_id' in self.to_mongo() is the test used by mongoengine to•
    # determine whether an object is new
    self.assertFalse('_id' in d1.to_mongo())    # passes
    self.assertFalse('_id' in d2.to_mongo())    # fails

gareth-lloyd avatar Aug 23 '11 13:08 gareth-lloyd

We were thinking that an appropriate change might be to have a "saved" flag (not persisted) on each model instance that indicates whether it was created from the database or is a new object. Then use this for the decision about whether to use save or update... thoughts?

colinhowe avatar Aug 23 '11 15:08 colinhowe

I think your test assertions are incorrect -- you're assigning a value to the custom_pk field, which is being "rewritten" to _id for storage in the database. This is a good thing, as mongodb has a default unique index on _id, so mongoengine doesn't have to create another index.

However, that said, I think that it's likely that mongoengine will issue an update with upsert=True. Does this not work for you? (i.e. does the document get created with incorrect values?)

dcrosta avatar Aug 23 '11 15:08 dcrosta

@drcrosta, I think you're right that mongoengine is behaving as intended when it rewrites 'custom_pk' to '_id'. However, in the Document.save() function, '_id' in d1.to_mongo() is used as the test of whether or not we are dealing with a new instance that needs to be created.

The test just illustrates this point. I wasn't clear about that - sorry.

gareth-lloyd avatar Aug 23 '11 16:08 gareth-lloyd

@gareth-lloyd sure, but issuing an update with upsert=True will still do the right thing, right? If there's a data loss or corruption issue here, let's identify the exact set of circumstances to replicate that issue, and add a test for it; if not, I don't see what we should change.

dcrosta avatar Aug 23 '11 16:08 dcrosta

I think the source of my confusion is the difference in behaviour between unique keys and primary keys here. In fact, I came across this problem when I changed a field from a unique key to be the primary key (intending to save us an index) and everything broke.

Here's an example/test:

def test_primary_key(self):
    class User(Document):
        name = StringField()
        email = StringField(unique=True)

    User.objects.create(name='geoff', email='[email protected]')

    u = User(name='frank', email='[email protected]')
    self.assertRaises(OperationError, u.save)

    class User2(Document):
        name = StringField()
        email = StringField(primary_key=True)
    User2.objects.create(name='geoff', email='[email protected]')

    u2 = User2(name='frank', email='[email protected]')
    self.assertRaises(OperationError, u2.save)
    self.assertEquals('geoff', User2.objects.get(email='[email protected]').name)

The final two assertions will currently fail on the dev branch. When I think I'm creating a new object, I would expect to fail if I chose a primary key that already existed. I also wouldn't expect to overwrite information in an existing record.

We could save with force_insert=True, but I think my desired behaviour is a fairly common-sense interpretation of what it means to have a custom pk.

gareth-lloyd avatar Aug 23 '11 18:08 gareth-lloyd

Not sure what to do here as it is native to MongoDB that save updates if exists and inserts if new.

Just because you have a custom primary key, I'm not sure we should handle this differently - I agree its odd that primary key and unique keys work differently, but that is the case with MongoDB.

Thoughts?

rozza avatar Nov 30 '11 15:11 rozza

I can confirm this issue. When having a custom ID it is a problem that if I set ID before calling save (or by setting default on the field) it will see object as already created and proceeds to updating it.

mitar avatar Dec 20 '12 14:12 mitar

I think thats a different issue to reported Mitar - is your issue with a custom pk when calling save() it does an update?

On Thu, Dec 20, 2012 at 2:38 PM, Mitar [email protected] wrote:

I can confirm this issue. When having a custom ID it is a problem that if I set ID before calling save (or by setting default on the field) it will see object as already created and proceeds to updating it.

— Reply to this email directly or view it on GitHubhttps://github.com/hmarr/mongoengine/issues/268#issuecomment-11575668.

rozza avatar Dec 20 '12 16:12 rozza

Hm. I want to have UUID for document id. If I define UUID filed with primary key, I tried to set default to lambda: return uuid.uuid4() but this have an issue in some other my code which checks for existence of pk and now it has even before document is saved. So I removed default and now I am using:

        if not self.id or force_insert:
            self.id = uuid.uuid4()
        return super(MyDocument, self).save(force_insert=force_insert, *args, **kwargs)

but this is also not the best because from the code path it now goes to update code path, not save. It seems to work but I am nervous because it does not take the same code path as default primary key.

mitar avatar Dec 20 '12 16:12 mitar

Perhaps not using save would be a better idea! default should be a callable so: default=uuid.uuid4 might work.

Ross

On Thu, Dec 20, 2012 at 4:38 PM, Mitar [email protected] wrote:

Hm. I want to have UUID for document id. If I define UUID filed with primary key, I tried to set default to lambda: return uuid.uuid4() but this have an issue in some other my code which checks for existence of pk and now it has even before document is saved. So I removed default and now I am using:

    if not self.id or force_insert:
        self.id = uuid.uuid4()
    return super(MyDocument, self).save(force_insert=force_insert, *args, **kwargs)

but this is also not the best because from the code path it now goes to update code path, not save. It seems to work but I am nervous because it does not take the same code path as default primary key.

— Reply to this email directly or view it on GitHubhttps://github.com/hmarr/mongoengine/issues/268#issuecomment-11580396.

rozza avatar Dec 20 '12 16:12 rozza

But still, this means that when I do MyDocument(), id already has a value, even if it is not saved. This does not just break my code, but has the same property as my code above of update code path once I call save.

mitar avatar Dec 20 '12 16:12 mitar

This should be fixed in 0.8 as we also check the MyDocument()._created flag as well to identify if we want to do a delta.

rozza avatar Dec 20 '12 16:12 rozza

But in Django (which I am using) it is customary to check for existence of pk to determine if object have been already saved and not. Using default on primary key breaks this (and I need one more special case for handling MongoEngine objects).

mitar avatar Dec 20 '12 16:12 mitar