A more conservative approach to cleaning and updating models
I haven't used the issue template as it seemed geared to bugs.
Please correct me if I'm wrong! But it seems that django-anon will run Model.clean() on a model instance, and update its corresponding database row, even if no changes have been made.
In some circumstances this can be very wasteful, e.g.
import anon
from django.db import models

class Foo(models.Model):
    bar = models.CharField(default="", max_length=10)

# 100 rows; after the update below, 99 of them already have bar == ""
Foo.objects.bulk_create([Foo() for _ in range(100)])
Foo.objects.filter(pk=1).update(bar="baz")

class FooAnonymizer(anon.BaseAnonymizer):
    bar = ""

    class Meta:
        model = Foo
Has optimising for this case been explored?
I am aware that this can be overcome by defining a custom get_queryset(), but this feels more complex, more error-prone, and less declarative.
My understanding is that Django's QuerySet.bulk_update() does not attempt to filter out 'effectively no-op' updates before they are sent to the database, nor would it be appropriate for it to do so.
My thinking is that django-anon's specific use case could allow some assumptions to be made which would make it viable/safe to do this. Namely, assuming that a model instance's underlying database row is not changed by 'something else' between the time the row is fetched and the time the model is written back to the database (which would result in an incorrect delta calculation). This means that e.g. patch_object() could determine the delta for the model instance it's being called for, only call clean() when something has actually changed, and report back to its caller whether or not anything changed; the caller could then use that to decide whether to include the instance in the bulk_update() call.
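To make the idea concrete, here is a rough sketch using plain Python stand-ins rather than django-anon's actual internals. The names Row, patch_if_changed() and collect_updates() are hypothetical, and the way replacement values are resolved (a plain value, or a callable taking the object) is an assumption for illustration, not necessarily how django-anon does it.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical stand-in for a model instance (not a real Django model)."""
    pk: int
    bar: str = ""
    cleaned: bool = False

    def clean(self):
        # Records that clean() ran, so the sketch can show when it is skipped.
        self.cleaned = True


def patch_if_changed(obj, replacements):
    """Apply replacements to obj and return True only if a value changed.

    `replacements` maps a field name to a new value, or to a callable that
    takes the object and returns the new value (an assumption of this sketch).
    """
    changed = False
    for field, replacer in replacements.items():
        new_value = replacer(obj) if callable(replacer) else replacer
        if getattr(obj, field) != new_value:
            setattr(obj, field, new_value)
            changed = True
    if changed:
        # Only pay for clean() when something was actually modified.
        obj.clean()
    return changed


def collect_updates(objs, replacements):
    """Return only the objects that would need to go into bulk_update()."""
    return [obj for obj in objs if patch_if_changed(obj, replacements)]


# Mirrors the example above: 100 rows, only pk=1 has a non-empty `bar`.
rows = [Row(pk=1, bar="baz")] + [Row(pk=i) for i in range(2, 101)]
to_update = collect_updates(rows, {"bar": ""})
```

With this, only the single changed row would be cleaned and handed to bulk_update(). Note that the comparison relies on the in-memory value still matching the database row, which is exactly the assumption described above.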
If this is viable at all, it certainly adds complexity, which given this package's context feels like a risk. To me it seems like this'd work in all but the most esoteric uses of django-anon / Django's ORM.
I'm happy to look into this myself but first wanted to check if you/anyone else sees any issues with this, or has already tried it and found that it didn't work :)