Support for queryset filtering and relationship anonymization
Hello! Thanks for the module!
I am using it for a slightly different purpose, and I need to anonymize only some instances selectively.
Instance filtering
In my use case, I would like to anonymize only part of the table. I know I can override get_queryset, but I want to filter it dynamically (for instance, based on a value submitted through a form).
I am interested in something like:
# run anonymizer: be cautious, this will affect your current database!
person = Person.objects.last()
# anonymize full table:
PersonAnonymizer().run()
# anonymize only some instances
PersonAnonymizer().run(pks=[person.pk])
I simply updated the code in two places:
class BaseAnonymizer(object):
    def run(self, pks=None, select_chunk_size=None, **bulk_update_kwargs):
        self._declarations = self.get_declarations()
        queryset = self.get_queryset(pks=pks)  # here
        ...

    def get_queryset(self, pks=None):
        """Override this if you want to delimit the objects that should be
        affected by anonymization
        """
        qs = self.get_manager().all()
        if pks is not None:  # an empty list selects nothing instead of the full table
            qs = qs.filter(pk__in=pks)  # and here
        return qs
It would be great if this could be merged! I would rather not maintain a fork just for this small change. Would you be interested in a PR?
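To make the intended behavior concrete, here is a minimal sketch that does not depend on Django or on the library. SketchAnonymizer, its rows list, and the "name" field are hypothetical stand-ins I made up for illustration, not the real BaseAnonymizer. It also assumes that pks=None means "whole table" while an empty list means "nothing", which seems to me the safer choice for a destructive operation.

```python
class SketchAnonymizer:
    """Plain-Python stand-in for an anonymizer (hypothetical, not the library class)."""

    def __init__(self, rows):
        # A list of dicts stands in for the database table.
        self.rows = rows

    def get_queryset(self, pks=None):
        # pks=None keeps the current behavior: every row is selected.
        if pks is None:
            return list(self.rows)
        # Any explicit list, even an empty one, restricts the selection.
        wanted = set(pks)
        return [row for row in self.rows if row["pk"] in wanted]

    def run(self, pks=None):
        selected = self.get_queryset(pks=pks)
        for row in selected:
            row["name"] = "anonymous"
        # Return the number of rows touched, handy for checking the filter.
        return len(selected)


rows = [
    {"pk": 1, "name": "Ada"},
    {"pk": 2, "name": "Bob"},
    {"pk": 3, "name": "Cy"},
]
anonymizer = SketchAnonymizer(rows)
updated = anonymizer.run(pks=[2])  # only the row with pk 2 is rewritten
```

With this shape, calling run() without arguments behaves exactly as before, so existing users would not be affected.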
Relationship
Similarly, when anonymizing an instance, I would also like to anonymize its one-to-one relations (or m2m, or possibly FK). I can get the instance and run the anonymizer on each relation, like:
obj = Person.objects.last()
ProfileAnonymizer().run(pks=[obj.profile.pk])
But it would be great to be able to define everything in the PersonAnonymizer, for instance:
class PersonAnonymizer(BaseAnonymizer):
    birthdate = "1900-01-01"
    birthcity = ""

    class Meta:
        model = Person
        onetoone = {
            "profile": "my_app.anonymizers.ProfileAnonymizer",
        }


class ProfileAnonymizer(BaseAnonymizer):
    field_1 = None

    class Meta:
        model = Profile
In my current fork I did something like:
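from importlib import import_module


def custom_import(name):
    # Resolve a dotted path such as "my_app.anonymizers.ProfileAnonymizer".
    # import_module also loads submodules that have not been imported yet,
    # which __import__ followed by getattr does not guarantee.
    module_path, _, attr = name.rpartition(".")
    return getattr(import_module(module_path), attr)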
class BaseAnonymizer(object):
    def run(self, pks=None, select_chunk_size=None, **bulk_update_kwargs):
        ...
        # Cascade to one-to-one relations
        if hasattr(self.Meta, "onetoone"):
            for relation, class_import in self.Meta.onetoone.items():
                anonymizer = custom_import(class_import)
                for obj in objs:
                    # Default to None: a missing reverse one-to-one raises on access
                    related_model = getattr(obj, relation, None)
                    if related_model:
                        anonymizer().run(pks=[related_model.pk])
This is a bigger feature than the first one, but would you be interested in it as well?
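For discussion, here is a rough, Django-free sketch of how the cascade could behave. Everything in it is hypothetical: the Person and Profile classes are plain objects, SketchAnonymizer is not the library's BaseAnonymizer, and the onetoone mapping points directly at a class instead of a dotted path to keep the example short. One difference from my fork: it collects the related pks first and calls the related anonymizer once, instead of once per object.

```python
class Profile:
    def __init__(self, pk, bio):
        self.pk = pk
        self.bio = bio


class Person:
    def __init__(self, pk, birthcity, profile=None):
        self.pk = pk
        self.birthcity = birthcity
        self.profile = profile  # stand-in for a one-to-one relation


class SketchAnonymizer:
    """Hypothetical stand-in: 'table' replaces the model, 'replacements' the field declarations."""

    table = []
    replacements = {}
    onetoone = {}  # relation name -> anonymizer class for the related objects

    def run(self, pks=None):
        objs = [obj for obj in self.table if pks is None or obj.pk in pks]
        for obj in objs:
            for field, value in self.replacements.items():
                setattr(obj, field, value)
        # Cascade: gather related pks, then run each related anonymizer once.
        for relation, anonymizer_cls in self.onetoone.items():
            related = (getattr(obj, relation, None) for obj in objs)
            related_pks = [rel.pk for rel in related if rel is not None]
            if related_pks:
                anonymizer_cls().run(pks=related_pks)
        return len(objs)


profiles = [Profile(10, "likes chess"), Profile(11, "likes tea")]
people = [
    Person(1, "Paris", profiles[0]),
    Person(2, "Rome", profiles[1]),
    Person(3, "Oslo"),  # no profile: the cascade skips it
]


class ProfileSketchAnonymizer(SketchAnonymizer):
    table = profiles
    replacements = {"bio": None}


class PersonSketchAnonymizer(SketchAnonymizer):
    table = people
    replacements = {"birthcity": ""}
    onetoone = {"profile": ProfileSketchAnonymizer}


PersonSketchAnonymizer().run(pks=[1])  # anonymizes person 1 and profile 10 only
```

Batching the related pks should matter mostly when many instances are anonymized at once, since each run() call presumably issues its own queries.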