
Rethink the rate limited queue

Open · Murderlon opened this issue on Oct 25, 2022 · 16 comments

Initial checklist

  • [X] I understand this is a feature request and questions should be posted in the Community Forum
  • [X] I searched issues and couldn’t find anything (or linked relevant results below)

Problem

Uppy has multiple uploaders: tus (used in Transloadit), xhr, aws-s3, and aws-s3-multipart. These uploaders make different HTTP requests per file.

Example uploading remote files (Google Drive and such) with Tus and Companion

  1. POST requests to Companion (sends back token, starts timeout timer for websocket connection)
  2. WebSocket request to /api/:token (the download is started in Companion once the socket connects, and reports progress back over the websocket)
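
A minimal sketch of this handshake, for illustration (the endpoint path, helper names, and message shape are assumptions, not Companion's exact API):

```ts
// Sketch of the remote-upload handshake described above; the POST
// endpoint and message format are assumptions for illustration.
async function uploadRemoteFile(companionUrl: string, fileId: string): Promise<void> {
  // 1. POST to Companion: it responds with a token and starts a
  //    timeout timer for the upcoming WebSocket connection.
  const res = await fetch(`${companionUrl}/drive/get/${fileId}`, { method: 'POST' })
  const { token } = await res.json()

  // 2. Open the WebSocket; Companion starts the download once the
  //    socket connects and reports progress back over it.
  const ws = new WebSocket(`${companionUrl.replace(/^http/, 'ws')}/api/${token}`)
  await new Promise<void>((resolve, reject) => {
    ws.addEventListener('message', (event) => {
      const msg = JSON.parse(event.data)
      if (msg.action === 'success') resolve()
      if (msg.action === 'error') reject(new Error('remote upload failed'))
    })
    ws.addEventListener('error', () => reject(new Error('socket error')))
  })
}
```

If the timeout timer fires before the WebSocket connects, the upload fails, which is exactly what the queue ordering described below can cause.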

Example uploading local files with AWS S3 Multipart

  1. POST create
  2. POST signing the part to be uploaded
  3. PUT request to upload the file
  4. POST request to mark the upload as finished
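
Roughly, the per-file sequence looks like this (the backend endpoints and payloads here are hypothetical; only the strict order of the four requests matters):

```ts
// Hypothetical endpoints illustrating the per-file order:
// create -> sign -> put -> finish, where the signed URL can expire.
async function uploadMultipart(file: Blob): Promise<void> {
  // 1. POST create: start the multipart upload, get an upload ID
  const { uploadId, key } = await postJSON('/s3/multipart', { filename: 'example' })

  // 2. POST sign: get a pre-signed URL for the part (these expire!)
  const { url } = await postJSON(`/s3/multipart/${uploadId}/1/sign`, { key })

  // 3. PUT: upload the part directly to S3 via the pre-signed URL
  const putRes = await fetch(url, { method: 'PUT', body: file })
  const etag = putRes.headers.get('ETag')

  // 4. POST finish: ask the backend to complete the multipart upload
  await postJSON(`/s3/multipart/${uploadId}/complete`, {
    key,
    parts: [{ PartNumber: 1, ETag: etag }],
  })
}

async function postJSON(url: string, body: unknown): Promise<any> {
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(body),
  })
  return res.json()
}
```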

We also have a rate limited queue, a separate implementation reused in all of these plugins. The rate limited queue holds individual requests, such as one POST create request for S3. If a user uploads 10K files, this helps manage the load on both the server and the browser.
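
For reference, the essence of such a queue can be sketched as follows. This is a simplified stand-in for illustration, not the actual RateLimitedQueue implementation in @uppy/utils:

```ts
// Simplified rate limited queue: at most `limit` request functions
// run concurrently; the rest wait in FIFO order.
class SimpleRateLimitedQueue {
  private running = 0
  private pending: Array<() => void> = []

  constructor(private limit: number) {}

  run<T>(fn: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        this.running++
        fn().then(resolve, reject).finally(() => {
          this.running--
          this.pending.shift()?.() // start the next queued request, if any
        })
      }
      if (this.running < this.limit) start()
      else this.pending.push(start)
    })
  }
}
```

Note that each call to run() enqueues a single request; as described below, the queue has no notion of which requests belong to the same file.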

Problem

For simplicity, let's say the queue limit is 1 and we upload 2 big files with AWS S3 Multipart. The flow is create, sign (returns pre-signed URLs, which can expire), put, finish. The queue lifecycle would look like this (empty means success and the request is popped from the queue):

  1. [create] (file 1)
  2. []
  3. [create] (file 2)
  4. []
  5. [sign] (file 1)
  6. []
  7. [sign] (file 2)
  8. []
  9. [put] (file 1)
  10. ... many more puts for file 1
  11. before we get to file 2, its pre-signed URLs have expired

The same problem occurred with remote uploads: the WebSocket would time out because big files take too long before a WebSocket connection is made. (That particular problem was "solved" by changing the order just in time, but that only works because there are only two requests to coordinate, and it's more of a hack than a guaranteed order.)

The core issue: the queue is not aware of which requests belong to the same file, so requests run in the wrong order and cause timeouts.

Solution

One solution could be to have an array of arrays in the queue, where each inner array holds the requests needed to fulfill a single file for the uploader in use:

  • Tus (local files): [[HEAD, POST, PUT], [HEAD, POST, PUT]]
  • AWS S3 Multipart: [[create, sign, put, finish], [create, sign, put, finish]]
  • etc.

The limit does not have to change. The default is 5 and we recommend not going higher than 20; in these cases the number of parallel requests is still acceptable.
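
A rough sketch of that idea, building on the minimal queue sketched earlier (this illustrates the shape, not a proposed API):

```ts
// Sketch: the unit of work in the queue is a file, i.e. an ordered
// chain of request functions, rather than an individual request.
// The limit then bounds concurrent *files* instead of requests.
type RequestChain = Array<() => Promise<unknown>>

class FileRateLimitedQueue {
  private queue = new SimpleRateLimitedQueue(5) // the default limit of 5

  async runChain(chain: RequestChain): Promise<void> {
    // The whole chain occupies a single queue slot, so create/sign/
    // put/finish for one file run back-to-back and pre-signed URLs
    // are used right after they are issued.
    await this.queue.run(async () => {
      for (const request of chain) {
        await request()
      }
    })
  }
}

// Usage, matching the examples above (create, sign, put, finish are
// hypothetical request functions for one file):
// fileQueue.runChain([create, sign, put, finish])
```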

Alternatives

It is important to note that the rate limited queue was introduced without thinking things through enough. This has resulted in many unforeseen bugs and patches over the months. I urge us to think deeply about any conclusion we reach here before going into implementation, so that we solve this once and for all.

Scope

One thing to take into account is a similar issue in the code base. See this comment:

https://github.com/transloadit/uppy/blob/84894b9b7a02b61df436b7933199be645350af06/packages/%40uppy/aws-s3/src/index.js#L1-L26

The scope of this issue may be limited to only the RateLimitedQueue, but in my opinion solutions must take a bigger problem in the codebase into account: the duplication of the pipeline and upload logic.

Three things are important to note:

  1. @uppy/aws-s3 currently contains an entire copy of @uppy/xhr-upload, modified in some parts for S3. Any fix to core uploading logic in either plugin means very carefully looking into the other plugin as well to patch it there too. Naturally, this is error prone and often forgotten.
  2. The Uppy pipeline, which is this.preProcessors, this.uploaders, and this.postProcessors (all files go through each stage together, instead of individual files going through these stages), seems to be partially at odds with the rate limited queue, at least conceptually (see the sketch after this list).
  3. The remote uploading logic is almost completely duplicated between Tus, XHR, and AWS S3, each with slight differences. This too is highly error prone.
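
To make point 2 concrete, here is the rough shape of that pipeline (a simplified sketch, not the actual Uppy core implementation):

```ts
// Sketch of the pipeline shape from point 2: every stage receives
// *all* file IDs at once, so per-file request ordering has to be
// handled inside a stage (e.g. by the rate limited queue), not by
// the pipeline itself. Simplified; not Uppy core's real code.
type Stage = (fileIDs: string[]) => Promise<void>

async function runPipeline(
  preProcessors: Stage[],
  uploaders: Stage[],
  postProcessors: Stage[],
  fileIDs: string[],
): Promise<void> {
  for (const stage of [...preProcessors, ...uploaders, ...postProcessors]) {
    await stage(fileIDs) // the whole batch moves through each stage together
  }
}
```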

Therefore I think my proposed solution is not good enough on its own; it would only further patch over a bigger problem. For the longevity of the project, I can't help but feel it's a bit of a business risk not to tackle these problems relatively soon with the current team.

Known quick wins

  • Use @uppy/aws-s3 instead of @uppy/aws-s3-multipart, as the former suffers less from these issues.
  • Set high timeouts for everything that has timeouts...

Questions / things to think about

  • Conceptually, how do you think the pipeline fits together with a rate limited queue? Uppy's original intent was that the pipeline would be used for pre-signing and such, rather than baking more complexity into a queue.
  • How much effort would it be to properly solve this?
  • How do we prioritize this?

Murderlon · Oct 25 '22 15:10