tus-node-server
Race condition saving files to S3
When saving files with tus-node-server using the S3 backend, the metadata gets saved in a `.info` file. This file is also used to reply to a HEAD request, which Uppy uses to determine whether an upload succeeded (which it does after a PATCH resulted in a 502). So there is a race condition: tus-node-server actually failed to save the file (hence the 502), but the HEAD call reports that the file exists, since tus-node-server only checks the `.info` metadata file.
Any idea how to solve this? Would an extra call to S3 to check if the main file also exists be a solution?
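A minimal sketch of what that extra check could look like, assuming an AWS SDK v2 S3 client like the one tus-node-server uses (the function name `objectExists` and the injected `client` parameter are my own, not part of the library):

```javascript
// Hypothetical sketch: verify that the uploaded object actually exists in S3
// before the HEAD handler reports the upload as complete.
// `client` is assumed to be an AWS SDK v2 S3 client.
function objectExists(client, bucket, key) {
  return client.headObject({ Bucket: bucket, Key: key }).promise()
    .then(() => true)
    .catch((err) => {
      // headObject rejects with a 404/NotFound error when the key is absent
      if (err.statusCode === 404 || err.code === 'NotFound') return false;
      throw err; // propagate genuine errors (permissions, network, ...)
    });
}
```

The HEAD handler could then return 404 when the `.info` file claims success but `objectExists` resolves to `false`. This costs one extra S3 request per HEAD.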
Instead of an extra call to check whether the file actually exists, maybe it would be a better idea to write a flag to the `.info` file indicating that the upload is completely done? This could happen in the `_finishMultipartUpload` method:
https://github.com/tus/tus-node-server/blob/4b44018d1cb1cc7c0c5c770f96581ef28c3cf1d9/lib/stores/S3Store.js#L330-L350
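A rough sketch of that flag idea, assuming an AWS SDK v2 S3 client and the `.info` key convention that `S3Store` already uses (the helper name `markUploadComplete` and the `completed` field are hypothetical, not existing library API):

```javascript
// Hypothetical sketch of the proposed flag: after the multipart upload has
// been finalized, rewrite the .info object with `completed: true` so the
// HEAD handler can distinguish "all parts seen" from "upload finalized".
// `client` is assumed to be an AWS SDK v2 S3 client.
function markUploadComplete(client, bucket, fileId, info) {
  const completedInfo = Object.assign({}, info, { completed: true });
  return client.putObject({
    Bucket: bucket,
    Key: `${fileId}.info`,
    Body: JSON.stringify(completedInfo),
  }).promise().then(() => completedInfo);
}
```

This would be called only after 'Complete Multipart Upload' succeeds, so a 502 mid-finalization leaves the flag unset and HEAD can report the upload as incomplete.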
After some more debugging, it seems there exists a case where `_finishMultipartUpload` never gets called even though the parts are uploaded. The metadata call will then think that all parts are uploaded, but the multipart upload never actually finished.
No idea how to fix this. For now I have made a clone of tus-node-server that also checks whether there is an ongoing multipart upload for the current `file_id`:
```js
isMultipartUploadFinished(file_id) {
  return this.client.listMultipartUploads({
    Bucket: this.bucket_name,
    Prefix: file_id,
  }).promise().then((data) => {
    // The upload is finished once no in-progress multipart
    // upload remains for this exact key.
    return data.Uploads.some((upload) => upload.Key === file_id) === false;
  });
}
```
This piece of code is added to the `S3Store` and called from the `HeadHandler`, which returns a 404 if the multipart upload is still ongoing, causing Uppy to create a completely new upload. Sadly, this is a very dirty solution, since S3 keeps the previous upload reserved, so we get charged for "unused" storage.
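One way to stop paying for those orphaned parts might be to abort the stale multipart upload once the 404 is returned. This is only a sketch under the same AWS SDK v2 assumptions as above (the helper name `abortStaleUploads` is mine); in production, a bucket lifecycle rule with `AbortIncompleteMultipartUpload` is the usual way to clean these up automatically:

```javascript
// Hypothetical cleanup sketch: abort any leftover multipart upload for a
// file id so S3 stops reserving (and billing for) the orphaned parts.
// `client` is assumed to be an AWS SDK v2 S3 client.
function abortStaleUploads(client, bucket, fileId) {
  return client.listMultipartUploads({ Bucket: bucket, Prefix: fileId })
    .promise()
    .then((data) => Promise.all(
      (data.Uploads || [])
        .filter((upload) => upload.Key === fileId)
        .map((upload) => client.abortMultipartUpload({
          Bucket: bucket,
          Key: upload.Key,
          UploadId: upload.UploadId, // aborting requires the specific UploadId
        }).promise())
    ));
}
```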
As far as I know, tus-node-server has little protection against concurrent access, which can result in the race conditions you mentioned. I also don't know how to solve this, but we're happy to accept a PR if you could supply one.
Alternatively, I would recommend you consider using another tus server (such as the more mature https://github.com/tus/tusd), as tus-node-server is in bad shape right now (see https://github.com/tus/tus-node-server/blob/master/README.md).
Hey, thanks for pointing me to tusd, I will certainly have another look at it (gotta sharpen my Go skills haha). Anyway, the main issue is 'Complete Multipart Upload' not being called correctly after an upload is done. I tried a few different options but couldn't solve it reliably. I hope I can get tusd running, cheers!
It's been a while since I last touched the S3 data store. I will try to have a look at it in the next few weeks so we can refactor it and add tests :)
Hey guys. When do you think this issue could be resolved?
@xpepermint Frankly, I am not sure if anyone is currently actively working on tus-node-server, so it's quite likely that no support is available from our side. Would it be possible for you to help?
OK, thanks.
This should be fixed by now