
Race condition saving files to S3

Open DJWassink opened this issue 5 years ago • 8 comments

When saving with tus-node-server and the S3 backend, the metadata gets stored in a `.info` file. This file is also used to reply to a HEAD request, which Uppy uses to determine whether an upload succeeded (which it does even after a PATCH resulted in a 502?). So there is a race condition where tus-node-server actually failed to save the file (hence the 502), but the HEAD call reports that the file exists, since tus-node-server only checks the `.info` metadata file.

Any idea how to solve this? Would an extra call to S3 to check if the main file also exists be a solution?
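The "extra call" idea could be sketched roughly like this, assuming the AWS SDK v2 client that S3Store uses. `mainFileExists` is a hypothetical helper name, not part of tus-node-server:

```javascript
// Hypothetical helper: before trusting the .info metadata, verify that the
// assembled object actually exists in the bucket. Assumes an AWS SDK v2
// S3 client (promise() API); `client`, `bucket` and `fileId` match the
// naming used elsewhere in S3Store only by assumption.
function mainFileExists(client, bucket, fileId) {
    return client.headObject({ Bucket: bucket, Key: fileId })
        .promise()
        .then(() => true)
        .catch((err) => {
            // A missing key is the expected "upload not finished" case;
            // anything else is a real error and should propagate.
            if (err.code === 'NotFound' || err.statusCode === 404) {
                return false;
            }
            throw err;
        });
}
```

The HEAD handler could then answer 404 when this resolves to `false`, at the cost of one extra S3 request per HEAD.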

DJWassink avatar May 07 '19 14:05 DJWassink

Instead of an extra call to check whether the file actually exists, maybe it would be a better idea to write a flag to the `.info` file indicating that the upload is completely done? This could happen in the `_finishMultipartUpload` method:

https://github.com/tus/tus-node-server/blob/4b44018d1cb1cc7c0c5c770f96581ef28c3cf1d9/lib/stores/S3Store.js#L330-L350
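The flag idea might look roughly like this. This is only a sketch: the function name `markUploadFinished`, the `isFinished` field, and the shape of `file` are all assumptions, not the actual S3Store internals:

```javascript
// Hypothetical sketch: after S3's completeMultipartUpload succeeds,
// rewrite the .info object with an `isFinished` flag. The HEAD handler
// would then only report the upload as complete when this flag is set.
// Assumes an AWS SDK v2 S3 client and a `file` object with an `id`.
function markUploadFinished(client, bucket, file) {
    const info = Object.assign({}, file, { isFinished: true });
    return client.putObject({
        Bucket: bucket,
        Key: `${file.id}.info`,
        Body: JSON.stringify(info),
    }).promise().then(() => info);
}
```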

DJWassink avatar May 08 '19 07:05 DJWassink

After some more debugging, it seems there is a case where `_finishMultipartUpload` never gets called even though all parts are uploaded. The metadata check then concludes that all parts are uploaded, but the multipart upload never actually finished.

No idea how to fix this. For now I have made a clone of tus-node-server that also checks whether there is still an ongoing multipart upload for the current `file_id`:

    // Resolves to true when no multipart upload for this key is still in
    // progress, i.e. completeMultipartUpload has been called (or the
    // upload was never started).
    isMultipartUploadFinished(file_id) {
        return this.client.listMultipartUploads({
            Bucket: this.bucket_name,
            Prefix: file_id,
        }).promise().then((data) => {
            // If the key still appears in the in-progress list, the
            // upload was never completed.
            return !data.Uploads.some((upload) => upload.Key === file_id);
        });
    }

This piece of code is added to the `S3Store` and called from the `HeadHandler`, which returns a 404 if the multipart upload is still ongoing, causing Uppy to start a completely new upload. Sadly this is a very dirty solution, since S3 keeps the previous upload's parts reserved, so we get charged for "unused" storage.
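The storage-cost problem could be mitigated by aborting the stale multipart upload once the HEAD handler decides it is dead. A minimal sketch, assuming an AWS SDK v2 client; `abortStaleUpload` is a hypothetical helper, and the `UploadId` would come from the same `listMultipartUploads` response used above:

```javascript
// Sketch: abort an abandoned multipart upload so S3 frees its parts and
// stops billing for them. Assumes AWS SDK v2's abortMultipartUpload.
function abortStaleUpload(client, bucket, fileId, uploadId) {
    return client.abortMultipartUpload({
        Bucket: bucket,
        Key: fileId,
        UploadId: uploadId,
    }).promise();
}
```

A bucket lifecycle rule (`AbortIncompleteMultipartUpload`) can achieve the same cleanup without application code.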

DJWassink avatar May 08 '19 09:05 DJWassink

As far as I know, tus-node-server has little protection against concurrent access, which can result in race conditions like the one you mentioned. I also don't know how to solve this, but we're happy to accept a PR if you can supply one.

Alternatively, I would recommend considering another tus server (such as the more mature https://github.com/tus/tusd), as tus-node-server is in bad shape right now (see https://github.com/tus/tus-node-server/blob/master/README.md).

Acconut avatar May 10 '19 06:05 Acconut

Hey, thanks for pointing me to tusd, I will certainly have another look at it (gotta sharpen my Go skills, haha). Anyway, the main issue is 'Complete Multipart Upload' not being called correctly after an upload is done. I tried a few different options but couldn't solve it reliably. I hope I can get tusd running, cheers!

DJWassink avatar May 10 '19 07:05 DJWassink

It's been a while since I last touched the S3 data store. I will try to have a look at it in the next few weeks so we can refactor it and add tests :)

rictorres avatar May 10 '19 09:05 rictorres

Hey guys. When do you think this issue could be resolved?

xpepermint avatar Jul 24 '19 07:07 xpepermint

@xpepermint Frankly, I am not sure if anyone is currently actively working on tus-node-server, so it's quite likely that no support is available from our side. Would it be possible for you to help?

Acconut avatar Jul 25 '19 11:07 Acconut

OK, thanks.

xpepermint avatar Jul 25 '19 11:07 xpepermint

This should be fixed by now.

Murderlon avatar Dec 13 '22 13:12 Murderlon