django icon indicating copy to clipboard operation
django copied to clipboard

Refs #16022 - Use a weak reference for file field memory optimization

Open massover opened this issue 2 years ago • 3 comments

Ticket

I believe that the tests should all pass. However, I have added a test to my example project that fails. I believe this is breaking if the weak ref is lost while trying to use FileField.save. The documented api itself is cyclic which is going to be tricky.

Let me show the performance difference I observed.

before:

(the drop is a new deployment)

image

after:

(the drop is a new deployment)

image

For my application, the memory utilization and latency saw such huge improvements. The memory utilization is evident in the graphs. The latency improvements come mostly in P99 (requests that are unlucky to catch gc). As @carltongibson notes, it's not a leak per se. It is eventually garbage collected. However, in my application, we were building up so much gen 2 garbage that when gc eventually starts, the latency takes a huge hit. With no reference cycle, there is less garbage.

Since this seems like a breaking change as far as I can see, I would hope that some other people can try this out and see if it makes a noticeable difference in their application (as it did for mine). Here is some code that should patch:

def patch_field_file():
    import weakref
    from django.db.models.fields.files import FieldFile

    def __init__(self, instance, field, name):
        super(FieldFile, self).__init__(None, name)
        self.instance = weakref.proxy(instance)
        self.field = field
        self.storage = field.storage
        self._committed = True

    FieldFile.__init__ = __init__

The way to verify it would be using your existing memory metrics (!) or using something like psutil in code (such as a middleware) and printing it.

class psutilMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        self.process = psutil.Process(os.getpid())
        self.memory_start = process.memory_full_info().rss / 1024 ** 2

    def __call__(self, request):
        response = self.get_response(request)
        memory_end = process.memory_full_info().rss / 1024 ** 2
        memory_diff = memory_end - self.memory_start
        print(f"{self.memory_start=}, {memory_end=},  {memory_diff=}")
        return response

I get that we all choose python and make tradeoffs on performance in this interpreted garbage collected language. For this improvement, I'm looking for data next, not to imply that because it made a huge impact on my project that it's the way to go. It would be nice if someone else could give their input on a real project! As an example, I believe that something like the following could be affected:

A User model and api endpoint with an "avatar" file field. If an instance of user has lots of other data on it, I believe that every get request to api/user/:id/ would be creating lots of garbage, which could be increasing request times and worker memory usage non trivially.

If the api can't break, and we'd like to move forward, here is a suggestion

  • descriptor based compatibility flag
  • global settings compatibility flag
class FieldFile(File):
    def __init__(self, instance, field, name):
        super().__init__(None, name)
        
        if field.weak:
           self.instance = weakref.proxy(instance)
        elif settings.FILE_FIELD_WEAKREF_COMPAT:
           self.instance =  weakref.proxy(instance)
        else:
           self.instance = instance     
        self.field = field
        self.storage = field.storage
        self._committed = True
        
 # where we can initialize a file field like:
 
 class Example(models.Model):
     f = models.FileField(weak=True)

massover avatar Apr 13 '22 15:04 massover

Hi @massover — Thanks again for this: good stuff.

However, I have added a test to my example project that fails. I believe this is breaking if the weak ref is lost while trying to use FileField.save. The documented api itself is cyclic which is going to be tricky.

So — yes this is the tricky bit — if you have only the reference to the FieldFile in scope then both the save() and delete() will be broken, since both reach back to save the instance by default. This is documented and is presumably the intended usage, since otherwise we'd just have save the instance after updating the file, which sounds clunky.

It's an open question as to how often folks would hit this: i.e. in real-life is the instance normally still in scope, and so still live, so this problem wouldn't come up? But I think a decent error message in each case, and tests on that would be a good starting point.

I like the idea of allowing the option to use a weakref on FileFiled here.

Without new API (but with new docs...) we could subclass FileFiled and set attr_class to a (say) WeakFieldFile — but it may be your approach is cleaner. I'd like to get other opinions before we settle — I think it depends on what, if any, changes are needed in save() and delete()... (I did consider using ref() and then having a @property instance to deference that, but likely proxy is sufficient.)

In summary:

  • I think there's a clear case to offer the weak option.
  • Maybe there's a stronger case to make it the default.
  • And if so, maybe we could deprecate the non-weak version.

Both the latter two need wider discussion/agreement I think.

Are you happy to push forward on the first part at least?

Thanks again! 🏅

carltongibson avatar Jun 22 '22 13:06 carltongibson

Hi @massover — Thanks again for this: good stuff.

However, I have added a test to my example project that fails. I believe this is breaking if the weak ref is lost while trying to use FileField.save. The documented api itself is cyclic which is going to be tricky.

So — yes this is the tricky bit — if you have only the reference to the FieldFile in scope then both the save() and delete() will be broken, since both reach back to save the instance by default. This is documented and is presumably the intended usage, since otherwise we'd just have save the instance after updating the file, which sounds clunky.

It's an open question as to how often folks would hit this: i.e. in real-life is the instance normally still in scope, and so still live, so this problem wouldn't come up? But I think a decent error message in each case, and tests on that would be a good starting point.

I like the idea of allowing the option to use a weakref on FileFiled here.

Without new API (but with new docs...) we could subclass FileFiled and set attr_class to a (say) WeakFieldFile — but it may be your approach is cleaner. I'd like to get other opinions before we settle — I think it depends on what, if any, changes are needed in save() and delete()... (I did consider using ref() and then having a @property instance to deference that, but likely proxy is sufficient.)

In summary:

  • I think there's a clear case to offer the weak option.
  • Maybe there's a stronger case to make it the default.
  • And if so, maybe we could deprecate the non-weak version.

Both the latter two need wider discussion/agreement I think.

Are you happy to push forward on the first part at least?

Thanks again! 🏅

Yes I can push forward on the first part. Thank you for the review!

massover avatar Jun 24 '22 19:06 massover

@carltongibson This pr now includes a FileField with a weak kwarg.

  • [ ] documentation
  • [ ] should we raise a custom exception or rely on the built in one?
======================================================================
ERROR: test_weak_file_field_save_out_of_scope (model_fields.test_filefield.FileFieldTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/massover/Personal/django/tests/model_fields/test_filefield.py", line 204, in test_weak_file_field_save_out_of_scope
    weakfile.save("new-name", new)
  File "/Users/massover/Personal/django/django/db/models/fields/files.py", line 97, in save
    setattr(self.instance, self.field.attname, self.name)
ReferenceError: weakly-referenced object no longer exists

Lastly, I updated the little example project with a command that shows the data sizes where the reference cycle starts to matter

1kb, memory_start=144.12109375MB, memory_end=144.8359375MB,  memory_diff=0.71484375MB
10kb, memory_start=144.83984375MB, memory_end=145.79296875MB,  memory_diff=0.953125MB
100kb, memory_start=145.89453125MB, memory_end=150.484375MB,  memory_diff=4.58984375MB
1mb, memory_start=148.53125MB, memory_end=203.13671875MB,  memory_diff=54.60546875MB
10mb, memory_start=183.9921875MB, memory_end=724.546875MB,  memory_diff=540.5546875MB
100mb, memory_start=534.40234375MB, memory_end=4619.51171875MB,  memory_diff=4085.109375MB

massover avatar Jun 29 '22 02:06 massover