toolbelt
toolbelt copied to clipboard
MultipartEncoder assumes total_len(file) is cheap
We were uploading files from Django to Box using https://github.com/box/box-python-sdk, which uses requests-toolbelt MultipartEncoder, and things were really slow, orders-of-magnitude slower than they should be, despite assurances that there was no throttling in place between the AWS servers and Box's upload server.
A profile explains why: Thread Profile for Box Upload.pdf
We're spending lots time making size
requests on the file being uploaded. Looks like multiple for every pass through the network loop.
This can be real slow if the "file" represents a network resource, and especially with Django's current implementation of the method. See related bug https://code.djangoproject.com/ticket/9631#comment:8
Does Part.bytes_left_to_write
need to recompute total_len(self.body)
every time? That is, does it assume that Part.body
may be changing while it's in process, or can we do this only once in __init__
where part.len
is set?
This is related to #171 , but I filed this under a separate issue, because it's not manifesting as "infinite loop" here.
In #171 there's a comment:
Is there anyway to determine the remaining amount? [...] As it stands, Django is breaking the way file-like objects work in Python.
but file objects have no support for __len__
at all, and several of the implementations in multipart.encoder.total_len
do return the size of the entire file (as is what I would expect of a method called total_len
), not something measured relative to the current seek position.
Does Part.bytes_left_to_write need to recompute total_len(self.body) every time?
Frankly, I don't remember. It's been years since I touched that part of this code. Do you have time to experiment/git history splunk to determine if it is necessary?
but file objects have no support for len at all,
It seems like you're making assumptions of what total_len
does. It's a simple way of grabbing the length of a string/bytes/file/IO object based on what we're doing. I also haven't reviewed #171.