drf-dynamic-fields icon indicating copy to clipboard operation
drf-dynamic-fields copied to clipboard

Looking to contribute nested field filtering and DB optimization

Open mlissner opened this issue 8 months ago • 9 comments

Hi!

We use this package on our API at CourtListener.com, where we provide a public API used by many people and organizations.

We're getting two related requests from users that we want to address:

  1. Folks want to filter nested fields as described in https://github.com/dbrgn/drf-dynamic-fields/issues/15. We think this would be easy from an API perspective (just use double-underscores to indicate sub-items).

  2. Folks want this package to optimize the queries sent to the DB (probably using django's defer() method on the backend).

    This change is needed because we have a field in our API that contains the text of large documents and some users do not want to pull that from the DB in the first place.

I notice that this package hasn't been updated in several years. If we provide a clean PR with tests for each of these features, would that be welcome? If not, I think our plan will be to make a little fork with the functionality we need.

Thank you!

mlissner avatar Jun 16 '25 23:06 mlissner

I'd be curious to see what you mean by the optimised queries. We often do our optimisation at the queryset level rather than in the serializer. So if a client has requested fewer fields, you'd dynamically change the queryset to exclude the unnecessary fields.

Especially if this was added in a way that doesn't introduce new logic to the existing code, perhaps existing as an extra piece that can be mixed in, or injected that would be nice.

jtrain avatar Jun 17 '25 05:06 jtrain

if a client has requested fewer fields, you'd dynamically change the queryset to exclude the unnecessary fields

This is the idea, unless you're suggesting there's already a way to do this?

mlissner avatar Jun 17 '25 13:06 mlissner

        # NOTE: drf test framework builds a request object where the query
        # parameters are found under the GET attribute.
        params = getattr(request, "query_params", getattr(request, "GET", None))
        if params is None:
            warnings.warn("Request object does not contain query parameters")

        try:
            filter_fields = params.get("fields", None).split(",")
        except AttributeError:
            filter_fields = None

        try:
            omit_fields = params.get("omit", None).split(",")
        except AttributeError:
            omit_fields = []

        # Drop any fields that are not specified in the `fields` argument.
        existing = set(fields.keys())
        if filter_fields is None:
            # no fields param given, don't filter.
            allowed = existing
        else:
            allowed = set(filter(None, filter_fields))

        # omit fields in the `omit` argument.
        omitted = set(filter(None, omit_fields))

This is the current code on the mixin for finding which fields are truly required to be omitted. Note that it requires knowledge of the serializer. Of these omitted fields not all will correspond to a model field. Perhaps some will be related to a model field via a source.

Where do you envision the queryset mutation happening?

I think if you could work this in as a new feature that devs could optionally include in their codebases that would be well received and I would support you but this isn't my project, so I don't really have final say over that.

jtrain avatar Jun 18 '25 01:06 jtrain

We haven't really figured that out yet. I'm on the project management side. Depending on how this conversation goes we'll assign a dev and start working out the design.

mlissner avatar Jun 18 '25 02:06 mlissner

There are multiple ways you could take this. One idea off the top of my head is a new mixin for the viewset that would define a get_queryset method that takes the parent queryset and applies any simple deferrals required according to what is found in the request params.

The tricky part is that get_queryset doesn't have knowledge of the serializer. So you could fly blind and apply the deferrals based one some lookups on the viewset itself. Or you could perhaps instantiate a serializer for the purpose of using its fields property.

Another way is to wrap up similar behaviour into a helper function that someone could pass in the queryset and get a deferred one returned. This would have the benefit of avoiding a mixin.

jtrain avatar Jun 18 '25 05:06 jtrain

but this isn't my project, so I don't really have final say over that

Note: Due to new responsibilities at work and at home, I currently have very little free capacity for OSS review and maintenance. My personal requirement would be that any new features are conceptually simple (including the implementation) so that the maintenance burden remains low.

However, in case @jtrain would be interested in doing more maintenance, I'm open to transfering the project to an organization, like https://jazzband.co/ for example.

(And of course, forking and extending a project with new features is always a good option as well.)

dbrgn avatar Jun 18 '25 21:06 dbrgn

Thanks @dbrgn!

We won't know how simple the PR is until we do the work, so it sounds like what we should do is code it up and submit a PR. If you like it, great, we'll merge. If not, we'll fork.

And if @jtrain has ideas for Jazzband or other maintainership, we can adjust course as needed. But top priority, for this at least, seems to be to do the thing.

mlissner avatar Jun 18 '25 21:06 mlissner

Keeping things sync'ed here. We're planning to analyze this issue in our next sprint, then do the dev work in the following one, so we should have an update in about 3-4 weeks, I'd say.

mlissner avatar Jun 23 '25 18:06 mlissner

Here is the PR for this issue: https://github.com/dbrgn/drf-dynamic-fields/pull/42

albertisfu avatar Jul 09 '25 21:07 albertisfu