django-anon

Support for queryset filtering and relationship anonymization

Open weber-s opened this issue 10 months ago • 0 comments

Hello! Thanks for the module!

I am using it for a slightly different purpose, and I need to anonymize only selected instances.

Instance filtering

In my use case, I would like to anonymize only part of the table. I know I can override get_queryset, but I want to filter dynamically (for instance, based on a value from a form).

I am interested in something like:

# run anonymizer: be cautious, this will affect your current database!
person = Person.objects.last()
# anonymize full table:
PersonAnonymizer().run()
# anonymize only some instances
PersonAnonymizer().run(pks=[person.pk])

I simply updated the code in two places:

class BaseAnonymizer(object):
    def run(self, pks=None, select_chunk_size=None, **bulk_update_kwargs):
        self._declarations = self.get_declarations()

        queryset = self.get_queryset(pks=pks) # here
...
    def get_queryset(self, pks=None):
        """Override this if you want to delimit the objects that should be
        affected by anonymization
        """
        qs = self.get_manager().all()
        if pks is not None:  # an empty list should match no rows, not the whole table
            qs = qs.filter(pk__in=pks)  # and here
        return qs

It would be great if this could be merged! I don't want to maintain a fork just for this small change. Are you interested in a PR?
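In the meantime, the same filtering seems achievable without patching the library, by passing the primary keys to the anonymizer's constructor and narrowing get_queryset there. Here is a minimal sketch. `PkFilterMixin`, `FakeQuerySet` and `FakeBaseAnonymizer` are hypothetical names, and the two Fake classes are only stand-ins for a Django queryset and for django-anon's BaseAnonymizer so that the snippet runs on its own. It also assumes BaseAnonymizer can be instantiated without arguments, as in the examples above.

```python
class FakeQuerySet:
    """Stand-in for a Django QuerySet; only supports filter(pk__in=...)."""

    def __init__(self, pks):
        self.pks = list(pks)

    def filter(self, pk__in):
        wanted = set(pk__in)
        return FakeQuerySet(pk for pk in self.pks if pk in wanted)


class FakeBaseAnonymizer:
    """Stand-in for anon.BaseAnonymizer; only get_queryset() is modelled."""

    def get_queryset(self):
        return FakeQuerySet([1, 2, 3])


class PkFilterMixin:
    """Hypothetical mixin: restrict anonymization to the given primary keys."""

    def __init__(self, pks=None):
        super().__init__()
        self.pks = pks

    def get_queryset(self):
        qs = super().get_queryset()
        # None means "whole table"; an empty list means "no rows at all".
        if self.pks is not None:
            qs = qs.filter(pk__in=self.pks)
        return qs


class PersonAnonymizer(PkFilterMixin, FakeBaseAnonymizer):
    pass


everyone = PersonAnonymizer().get_queryset().pks
only_two = PersonAnonymizer(pks=[2]).get_queryset().pks
nobody = PersonAnonymizer(pks=[]).get_queryset().pks
```

With the real library this would presumably become `class PersonAnonymizer(PkFilterMixin, BaseAnonymizer)` followed by `PersonAnonymizer(pks=[person.pk]).run()`. Built-in support for `run(pks=...)` would still be more convenient, since the mixin has to be added to every anonymizer.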

Relationship

Similarly, when anonymizing an instance, I would also like to anonymize its one-to-one relations (or m2m, or possibly FK). I can get the instance and run the anonymizer on each relation, like:

obj = Person.objects.last()
ProfileAnonymizer().run(pks=[obj.profile.pk])

But it would be great to be able to define everything in the PersonAnonymizer, for instance:

class PersonAnonymizer(BaseAnonymizer):
    birthdate = "1900-01-01"
    birthcity = ""

    class Meta:
        model = Person
        onetoone = {
            "profile": "my_app.anonymizers.ProfileAnonymizer",
        }



class ProfileAnonymizer(BaseAnonymizer):
    field_1 = None
    class Meta:
        model = Profile

In my current fork I did something like:

def custom_import(name):
    components = name.split(".")
    mod = __import__(components[0])
    for comp in components[1:]:
        mod = getattr(mod, comp)
    return mod


class BaseAnonymizer(object):
    def run(self, pks=None, select_chunk_size=None, **bulk_update_kwargs):
        ...
        # Cascade to one to one relation
        if hasattr(self.Meta, "onetoone"):
            for relation, class_import in self.Meta.onetoone.items():
                anonymizer = custom_import(class_import)
                for obj in objs:
                    related_model = getattr(obj, relation)
                    if related_model:
                        anonymizer().run(pks=[related_model.pk])
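One caveat with custom_import above: `__import__(components[0])` only imports the top-level package, so the getattr chain works only if the submodule has already been imported elsewhere. A possible alternative, sketched here with the standard library's `importlib.import_module`, imports the module part of the path explicitly. `resolve_dotted_path` is a hypothetical name, and the `json.decoder.JSONDecoder` path is just a stdlib example standing in for something like "my_app.anonymizers.ProfileAnonymizer".

```python
from importlib import import_module


def resolve_dotted_path(path):
    """Return the object named by a dotted path like "pkg.module.Name"."""
    module_path, _, attr_name = path.rpartition(".")
    if not module_path:
        raise ImportError(f"{path!r} is not a dotted path")
    # import_module imports the submodule itself, so the lookup does not
    # depend on some other code having imported it already.
    module = import_module(module_path)
    try:
        return getattr(module, attr_name)
    except AttributeError as exc:
        raise ImportError(
            f"{module_path!r} has no attribute {attr_name!r}"
        ) from exc


# Example with a standard-library class instead of a project anonymizer.
decoder_cls = resolve_dotted_path("json.decoder.JSONDecoder")
```

Django also ships `django.utils.module_loading.import_string`, which does the same job and might be the simpler choice inside the library.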

This is a bigger feature than the first one, but would you also be interested in it?

weber-s, Feb 17 '25 15:02