cvat icon indicating copy to clipboard operation
cvat copied to clipboard

CVAT server doesn't handle interrupted uploads well

Open SpecLad opened this issue 3 years ago • 2 comments

When I was working on #4839, I checked how the server would handle an upload that was interrupted (e.g. by a network disconnection), and the result was not good. The server tries to read the entire request body into memory at once:

https://github.com/opencv/cvat/blob/e5d01359aa09521c9ca87f0ec77a6dede097211b/cvat/apps/engine/mixins.py#L94

If the upload is interrupted, that raises an exception, and all data that has already been transmitted is discarded. This is obviously suboptimal - the whole point of TUS is to support upload resumption, so if some data has been transmitted successfully, it should be written to the file, and the upload should then be able to resume from that point.

The current behavior is also problematic from another perspective: by reading the entire upload into memory, we might potentially consume an unbounded amount of memory.

I think the right thing to do here is to read the uploaded file in small chunks, and write each chunk to the destination file before handling the next one. That will solve both problems.

SpecLad avatar Nov 09 '22 09:11 SpecLad

The current behavior is also problematic from another perspective: by reading the entire upload into memory, we might potentially consume an unbounded amount of memory.

Okay, that wasn't quite accurate: the amount of memory is bounded by the DATA_UPLOAD_MAX_MEMORY_SIZE Django setting, which is currently set to 100 MiB. Still, we could avoid consuming even that much by reading the request body in chunks.

SpecLad avatar Nov 09 '22 10:11 SpecLad

Come to think of it, this also means that we don't accept TUS upload requests with a chunk size greater than 100 MB. Which is not great, but also not terrible, since neither the UI nor the SDK should send chunks bigger than that.

SpecLad avatar Nov 09 '22 16:11 SpecLad