anonlink-entity-service
resumable CLK file upload
Users experience problems with the current file upload if the internet connection is poor. Eventually the upload times out and all the progress is lost.
Related problem: The clk uploads are collected entirely in memory before being written to minio storage. This will eventually lead to memory shortages.
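To illustrate the memory point: a minimal sketch of copying an upload in fixed-size chunks, so that only one chunk is held in memory at a time. The function name, the chunk size, and the in-memory stand-ins for the request body and the storage writer are assumptions made for the example, not code from the service.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; tune for the deployment


def stream_in_chunks(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink chunk by chunk; return the total bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the source is exhausted
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Stand-ins for the incoming request body and the storage writer.
upload = io.BytesIO(b"x" * 200_000)
stored = io.BytesIO()
copied = stream_in_chunks(upload, stored)
```

Peak memory then depends on the chunk size rather than on the size of the upload.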
Proposal 1:
As we need something quick-ish, why not use the features of minio directly?
They provide the server functionality and a client library that performs resumable uploads.
Basically, something along those lines: https://docs.minio.io/docs/upload-files-from-browser-using-pre-signed-urls
Calling POST on the clk endpoint returns the pre-signed url for the bucket to upload to.
In clkhash, use the minio python client to perform the upload.
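As a rough sketch of the client side: a pre-signed url can be used with a plain HTTP PUT, no storage credentials needed. The url below and the helper name are hypothetical, and the request is only built here, not sent. A single PUT like this is not resumable by itself.

```python
import urllib.request


def build_presigned_put(presigned_url, payload):
    """Build (but do not send) an HTTP PUT request for a pre-signed url."""
    return urllib.request.Request(presigned_url, data=payload, method="PUT")


# Hypothetical url, standing in for what the POST on the clk endpoint returns.
url = "https://minio.example.org/uploads/clks-123?X-Amz-Signature=abc"
request = build_presigned_put(url, b'{"clks": []}')
# urllib.request.urlopen(request) would perform the actual upload.
```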
Towards proposal 2:
Google Drive has a nice REST API for file uploads which also allows resuming. https://developers.google.com/drive/api/v3/resumable-upload I don't know if they open source any of the components involved, though. Maybe someone else implemented something somewhere?
Proposal 1 is #20 - I agree it would be the easier approach.
Okay, although minio does provide a client for resumable uploads, it unfortunately does not work with the pre-signed upload urls. :( For this to work, the user would have to instantiate a full minio client with credentials and everything.
Another problem is that we do not want to expose the minio root. Thus, minio is addressed differently from within the cluster than from outside. However, minio will sign the download url with the inside host and port, with no option to provide the outside address.
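Why the signed host matters: S3-style pre-signed urls (AWS Signature Version 4) include the host in what gets signed, so swapping the inside address for the outside one afterwards invalidates the signature. The toy example below only mimics that idea with a plain HMAC. It is not the real signing algorithm, and the key, hosts, and path are made up.

```python
import hashlib
import hmac

SECRET = b"hypothetical-secret-key"


def sign(host, path, secret=SECRET):
    """Toy signature over method, path and host (a simplification of SigV4)."""
    message = f"PUT\n{path}\nhost:{host}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


internal = sign("minio:9000", "/uploads/clks-123")
external = sign("storage.example.org", "/uploads/clks-123")
# False: a signature made for the inside host does not match the outside one.
signature_survives_host_rewrite = hmac.compare_digest(internal, external)
```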
This seems too much of a security nightmare... We should probably look at something else...
For proposal 2: The Google API Python client library is available under the Apache license here: https://github.com/google/google-api-python-client/blob/master/googleapiclient/http.py. So the client code for this proposal could easily be adapted from the Google lib.
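To make the resume idea concrete, here is a small self-contained simulation, loosely modelled on the Drive protocol: the client asks the server how many bytes it already holds, then continues from that offset, labelling each chunk with a `Content-Range` value. The `FakeUploadSession` class, the function names, and the `fail_at` parameter are all hypothetical stand-ins. No real HTTP is involved and none of this is taken from the Google library.

```python
class FakeUploadSession:
    """Hypothetical in-memory stand-in for a resumable-upload endpoint."""

    def __init__(self, total_size):
        self.total_size = total_size
        self.received = bytearray()

    def offset(self):
        """Status query: how many bytes the server already holds."""
        return len(self.received)

    def put_chunk(self, start, chunk):
        """Accept a chunk only if it continues exactly at the stored offset."""
        if start != len(self.received):
            raise ValueError("chunk does not start at the stored offset")
        self.received.extend(chunk)
        return len(self.received) == self.total_size


def content_range(start, end, total):
    """Format a Content-Range header value, e.g. 'bytes 0-9/100'."""
    return f"bytes {start}-{end}/{total}"


def upload_resumable(session, data, chunk_size, fail_at=None):
    """Send data in chunks, starting from the offset the server reports."""
    sent_ranges = []
    start = session.offset()
    while start < len(data):
        if fail_at is not None and start >= fail_at:
            raise ConnectionError("simulated connection drop")
        chunk = data[start:start + chunk_size]
        end = start + len(chunk) - 1
        sent_ranges.append(content_range(start, end, len(data)))
        session.put_chunk(start, chunk)
        start += len(chunk)
    return sent_ranges


data = bytes(range(100))
session = FakeUploadSession(len(data))
try:
    upload_resumable(session, data, chunk_size=30, fail_at=60)
except ConnectionError:
    pass  # the first 60 bytes reached the server before the drop
resumed = upload_resumable(session, data, chunk_size=30)
```

The second call only sends the missing tail, so a dropped connection costs at most one chunk of progress.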