django-filter icon indicating copy to clipboard operation
django-filter copied to clipboard

Add group validation & filtering

Open rpkilby opened this issue 5 years ago • 36 comments

This PR introduces a Meta.groups option to the FilterSet, which allows users to define groups of Filters that override the validation and filtering behavior for the filters in that group. Example:

class UserFilter(django_filters.FilterSet):
    class Meta:
        model = User
        fields = ['username', 'first_name', 'last_name']
        groups = [
            RequiredGroup(['first_name', 'last_name']),
        ]

In the above, both first_name and last_name are mutually required, so if one of the filter params is provided, then both must be provided.

This PR should resolve #1020 and is an alternative to #1164 (cc @jonathan-golorry, any thoughts on this PR as an alternative to yours?).

Context:

Users periodically request how to validate and filter against multiple parameters. The goto advice has been to implement something based on MultiWidget, however there are three main deficiencies in my mind:

  • It's verbose, users will typically need to subclass MultiWidget, MultiValueField, and Filter.
  • It's difficult to implement. For new Django users, the MultiWidget/MultiValueField API is somewhat confusing (maybe I just didn't grok it for some reason, but I remember struggling with it in the past)
  • As mentioned elsewhere, validation errors don't account for sub-widgets and are raised under the unsuffixed parameter name. This is acceptable for forms usage, where all sub widgets and errors are rendered together, but this mismatch doesn't work in an API context. You might have query params foo_min and foo_max, but the errors are reported as foo.

I think this ultimately stems from a mismatch in expectations. Filters are generally designed to operate on a single query parameter, while MultiWidget is not. Aside from the difficulties in implementing, there are other limitations elsewhere (e.g., lack of schema support).

Proposal:

In contrast to a MultiWidget-based approach, Meta.groups are specially handled by the filterset and provided its entire set of filter params. Groups must implement both validation and filtering, however they are slightly different.

  • Group validation is only intended to validate the qualities of the group. For example, an ExclusiveGroup would validate that only one of its mutually exclusive parameters is provided. Validation of the individual filter parameters is still performed by the filter's underlying form field.
  • In contrast, group filtering completely overrides how filtering is performed for that set of filters. Internally, a group may still delegate to the underlying Filter.filter methods, however this is up to the group's implementation. For example, RequiredGroup calls the filter methods of its individual filters, while CombinedRequiredGroup constructs Q objects that are combined via & or | (or some other function).

In total, this PR implements four groups:

Name Description
ExclusiveGroup(filters) A group of mutually exclusive filters.
RequiredGroup(filters) A group of mutually required filters.
CombinedGroup(filters[, combine]) A group of filters that result in a combined query (a Q object).
CombinedRequiredGroup(filters[, combine]) A group of mutually required filters that result in a combined query.

Notes:

  • RequiredGroup effectively provides an alternative for the old Meta.together option.
  • Rendered docs can be found at https://rpkilby.github.io/django-filter/ref/groups.html
  • One objective of the PR was to merge docstrings into the actual site docs. The idea being that this reduces potential duplication. The docs page provides some structure/prose, and class/method docstrings are inserted where appropriate. If this style is 👍, then I'd want to eventually want to look into overhauling the rest of the reference docs in a similar manner.
  • Code changes/additions are actually fairly minimal. Most of the lines are docs/docstrings/tests.

TODO:

  • [ ] How is the clarity of the docs? What parts don't make sense?
  • [ ] Are better code examples needed?

rpkilby avatar Jan 28 '20 01:01 rpkilby

Hi @carltongibson. What do you want to do about the Django 1.11 failures? Q objects were made comparable in https://github.com/django/django/pull/7517, however this change wasn't added until Django 2.0. I could either rework the failing tests, or we could drop support for Django 1.11 per the official recommendations.

rpkilby avatar Jan 28 '20 21:01 rpkilby

I think this is a much better approach.

It looks like this would allow a single filter to be in multiple groups. Is that right?

One concern I have is how you handle excludes in the CombinedGroup. Consider:

m = Model.objects.create()
Related.objects.create(reverse=m, foo=True)
Related.objects.create(reverse=m, foo=False)
# 1
Model.objects.exclude(related__id__lt=0).exclude(related__foo=True)  # zero results
# 2
Model.objects.exclude(related__id__lt=0, related__foo=True)  # one result
# 3
Model.objects.filter(~Q(related__id__lt=0) & ~Q(related__foo=True))  # zero results

Excludes on related objects are weird.

jonathan-golorry avatar Jan 29 '20 15:01 jonathan-golorry

It looks like this would allow a single filter to be in multiple groups. Is that right?

Yes, but it's complicated. Validation should work, but filtering depends on the setup.

First, note the changes to filter_queryset. A group's params are 'extracted' from the rest of the cleaned_data. This is necessary so that the group and regular filters aren't both ran. e.g., with a combined group, you might expect a query like:

MyModel.objects.filter(Q(field_a=1) | Q(field_b=2))

Without extraction though, you would end up with something like:

MyModel.objects \
    .filter(Q(field_a=1) | Q(field_b=2)) \
    .filter(field_a=1) \
    .filter(field_b=2)

With a filter being shared across multiple groups, what would happen is that the filter would be extracted for the first group, but not the second. Some groups (like RequiredGroup) would fail, since they expect all their parameters to be provided, but this would work for ExclusiveGroups. e.g., you could have a setup where a filter is present in multiple mutually exclusive filter groups. Since that filter param would only be called once, there's no issue.

Long story short, filtering is going to be dependent on the implementations of the various groups. I don't think there is any rule that can be applied generally. Groups will just have to assert their expectations (e.g., as a mutually required group, I expect to filter on all my parameters) and provide as much helpful information as possible (maybe one of my params was preempted by another group).

One concern I have is how you handle excludes in the CombinedGroup.

I don't know if there's anything we can do here. That ORM call is not necessarily invalid, and exclusion across relationships generally requires consideration on the part of the developer. The CombinedGroup reference does include:

This is useful for enabling OR filtering, as well as combining filters that span multi-valued relationships (more info).

But maybe we could more specifically call out exclusion across relationships.

Excludes on related objects are weird.

Oh yes. I also ran into this with DRF filters. See https://github.com/philipn/django-rest-framework-filters/issues/217

rpkilby avatar Jan 29 '20 21:01 rpkilby

Hmm. I don't like having the order of the group definitions matter for duplicate field use. Is there a downside to allowing a single field to be in multiple groups? The data extraction be rewritten as:

individual_data = cleaned_data.copy()
for group in self.groups:
    group_data = group.extract_data(cleaned_data)
    for name in group_data:
        individual_data.pop(name, None)
    queryset = group.filter(queryset, **group_data)

for name, value in individual_data.items():

Also, do you know if there's an in depth explanation of expected exclude behaviour? Now I'm wondering if my second example would be considered a bug in django.

jonathan-golorry avatar Jan 30 '20 05:01 jonathan-golorry

Is there a downside to allowing a single field to be in multiple groups?

I don't think the queries will generally make sense. For example, what would it mean for a filter to be a member of an ExclusiveGroup and an OR-based CombinedGroup? e.g.,

class MyFilter(FilterSet):
    class Meta:
        model = MyModel
        fields = ['field_a', 'field_b', 'field_c']
        groups = [
            ExclusiveGroup(filters=['field_a', 'field_b']),
            CombinedGroup(filters=['field_b', 'field_c'], combine=operator.or_),
        ]

MyFilter({'field_b': 'b', 'field_c': 'c'}).qs would have a call like

MyModel.objects.filter(field_b='b').filter(Q(field_b='b') | Q(field_c='c'))

The above negates the combined query, and is somewhat pointless. Another example, take two exclusive groups (e.g., two mutually sets of arguments, of which some are shared).

        groups = [
            ExclusiveGroup(filters=['field_a', 'field_b']),
            ExclusiveGroup(filters=['field_b', 'field_c']),
        ]

While the validation is sensible, filtering on field_b would generate a call like

MyModel.objects.filter(field_b='b').filter(field_b='b')

The duplicate filter condition doesn't hurt the query, but it's not helping either.

I'd like to see some concrete examples for overlapping groups. This can always be improved later, and in the meantime, maybe we could make a note in the docs?


Also, do you know if there's an in depth explanation of expected exclude behaviour? Now I'm wondering if my second example would be considered a bug in django.

It's described in a note about exclusion here. However, I'm not sure if your case is a bug or not.

rpkilby avatar Jan 30 '20 23:01 rpkilby

Wouldn't your first example currently result in:

MyModel.objects.filter(field_b='b').filter(Q(field_c='c'))

That would be significantly different from:

MyModel.objects.filter(field_b='b').filter(Q(field_b='b') | Q(field_c='c'))
MyModel.objects.filter(field_b='b')  # simplified

jonathan-golorry avatar Jan 31 '20 08:01 jonathan-golorry

Codecov Report

Merging #1167 into master will decrease coverage by 1.07%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1167      +/-   ##
==========================================
- Coverage   99.43%   98.35%   -1.08%     
==========================================
  Files          15       16       +1     
  Lines        1235     1340     +105     
  Branches        0      226     +226     
==========================================
+ Hits         1228     1318      +90     
- Misses          7        8       +1     
- Partials        0       14      +14
Impacted Files Coverage Δ
django_filters/filters.py 98.84% <100%> (-1.16%) :arrow_down:
django_filters/filterset.py 98.98% <100%> (-1.02%) :arrow_down:
django_filters/groups.py 100% <100%> (ø)
django_filters/__init__.py 81.25% <0%> (-12.5%) :arrow_down:
django_filters/utils.py 97.2% <0%> (-2.8%) :arrow_down:
django_filters/fields.py 98.83% <0%> (-1.17%) :arrow_down:
django_filters/widgets.py 99.32% <0%> (-0.68%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update e6d5f2d...fb54361. Read the comment docs.

codecov-io avatar Jan 31 '20 14:01 codecov-io

Wouldn't your first example currently result in:

MyModel.objects.filter(field_b='b').filter(Q(field_c='c'))

Sorry, I wasn't clear. I was assuming your proposed code changes, where a filter can be applied by overlapping groups. In this case, field_b is used by both the exclusive and combined OR groups, generating:

MyModel.objects.filter(field_b='b').filter(Q(field_b='b') | Q(field_c='c'))

rpkilby avatar Jan 31 '20 19:01 rpkilby

Yeah, either implementation could trip people up. Looking at your example again, my interpretation of the intent is that MyFilter({'field_b': 'b', 'field_c': 'c'}).qs would produce:

MyModel.objects.filter(Q(field_b='b') | Q(field_c='c'))

But neither implementation does that. Mine results in:

MyModel.objects.filter(field_b='b').filter(Q(field_b='b') | Q(field_c='c'))

And I think yours results in:

MyModel.objects.filter(field_b='b').filter(Q(field_c='c'))

Maybe we could split validation groups and filtering groups into two different concepts. I don't really see why ExclusiveGroup and RequiredGroup need to extract data and run their own filter logic. If the group data extraction were non-destructive, then you could have as many combinations of ExclusiveGroup and RequiredGroup as you wanted.

    def get_data(self, cleaned_data):
        return {
            name: cleaned_data[name]
            for name in self.filters
            if name in cleaned_data
        }

Then only do group filtering for CombinedGroup. The actual filtering on ExclusiveGroup and RequiredGroup can be done by the individual filters.

        individual_data = cleaned_data.copy()
        for group in self.groups:
            if isinstance(group, CombinedGroup):
                group_data = group.get_data(cleaned_data)
                for name in group_data:
                    individual_data.pop(name, None)
                queryset = group.filter(queryset, **group_data)

        for name, value in individual_data.items():

(edited to fix a mistake in the filtering)

jonathan-golorry avatar Feb 02 '20 05:02 jonathan-golorry

Maybe we could split validation groups and filtering groups into two different concepts. I don't really see why ExclusiveGroup and RequiredGroup need to extract data and run their own filter logic.

The idea was ease of mental model. In general, django-filter does two things:

  • validate query params
  • construct a query from those params

And I didn't want groups to deviate from this, where some groups do one thing, others something else. That said, this "ease of mental model" kind of falls apart when users need to actually know what's going on under the hood.

What I'm leaning towards now is that a group may validate and it may filter, if it implements the respective methods. So, the ExclusiveGroup and RequiredGroup would just implement validation, and a CombinedGroup would just implement filtering. CombinedRequiredGroup would implement both. This should alleviate most issues with overlapping groups. Although, in cases where filtering does overlap, users would need to take care that the queries still make sense.

rpkilby avatar Feb 04 '20 00:02 rpkilby

Sounds good to me. Two important points to note in the docs:

  1. If a field is in one or more groups that implement filter, the field's individual filter or method will not run.
  2. If a field is in multiple groups that implement filter, it will get passed to all of them.

More complex behaviour than that should use a custom FilterGroup.

jonathan-golorry avatar Feb 04 '20 03:02 jonathan-golorry

Hey @rpkilby.

What do you want to do about the Django 1.11 failures?

Can we just ditch 1.11 I think. Let's go 2.2+ only. Simplifies everything. (You up to do that? 😀)

carltongibson avatar Mar 04 '20 20:03 carltongibson

Definitely. What is your timeline on the release? Aside from the Django failures, I'd still like to implement changes discussed above.

rpkilby avatar Mar 04 '20 20:03 rpkilby

What is your timeline on the release?

No rush. Bang it into shape. Pull it together. Go. End of the month say?

carltongibson avatar Mar 04 '20 20:03 carltongibson

Oh yeah, I can manage that. If I'm being optimistic, end of week... but I've also got a lot of other stuff on my plate.

rpkilby avatar Mar 04 '20 20:03 rpkilby

No rush! 🙂

carltongibson avatar Mar 04 '20 20:03 carltongibson

Sounds good. Did you have any thoughts on the documentation? If you could review the overall structure/idea, that would be useful.

  • Rendered docs can be found at https://rpkilby.github.io/django-filter/ref/groups.html
  • One objective of the PR was to merge docstrings into the actual site docs. The idea being that this reduces potential duplication. The docs page provides some structure/prose, and class/method docstrings are inserted where appropriate. If this style is 👍, then I'd want to eventually want to look into overhauling the rest of the reference docs in a similar manner.

rpkilby avatar Mar 04 '20 21:03 rpkilby

I literally just ran into this use case. I am filtering through a many to many relationship and I do NOT want it to chain. I will give this a whirl. Sounds like it is basically ready.

guzzijones avatar Mar 21 '20 01:03 guzzijones

I am your first satisfied customer. THANK YOU! This is exactly what I needed. For now I added this branch as a requirement in my pipfile.

guzzijones avatar Mar 21 '20 02:03 guzzijones

Desperately seeking this functionality... Is there any update about whether / when this might be merged into the master?

kense20xx avatar Apr 09 '20 14:04 kense20xx

I added this branch to my requirements.txt for now.

On Thu, Apr 9, 2020, 10:33 kense [email protected] wrote:

Desperately seeking this functionality... Is there any update about whether / when this might be merged into the master?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/carltongibson/django-filter/pull/1167#issuecomment-611560901, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACZ5TIPNGM2SFJHJXWOWHZLRLXMB7ANCNFSM4KMK6LWQ .

guzzijones avatar Apr 10 '20 17:04 guzzijones

any plan to merge this feature?

AsheKR avatar Apr 20 '20 08:04 AsheKR

Same demand here, hope to merge this feature into new master branch.

Aaxom avatar May 24 '20 08:05 Aaxom

+1, I am hoping this would solve multiple to-many filters.

TaipanRex avatar May 24 '20 11:05 TaipanRex

Same demand, any plan?

Detavern avatar Jun 26 '20 13:06 Detavern

Does anybody know why this PR was forgotten? :)

alex-pobeditel-2004 avatar Feb 17 '21 20:02 alex-pobeditel-2004

Any updates on this?

MuriloScarpaSitonio avatar Jun 02 '21 20:06 MuriloScarpaSitonio

For anyone who still needs many-to-many filters to AND instead of OR, we were able to get similar functionality using this PR as a guide - I dropped a rough implementation here: https://gist.github.com/russmatney/7c757989ea3d6b1343df841ce5f33bc4

russmatney avatar Nov 30 '21 21:11 russmatney

@carltongibson it's been 9 months since the docs were provided. Do you need help reviewing https://rpkilby.github.io/django-filter/ref/groups.html ? I need CombinedGroup too, but I understand you may be busy with other stuff, maybe there is something we could do to help you?

ppolewicz avatar Dec 14 '21 03:12 ppolewicz

@ppolewicz The issue is that I'm (still) not sure whether this is the way I'd like to go here, not a lack of time to hit the merge button.

carltongibson avatar Dec 15 '21 11:12 carltongibson