nodejs-storage
Support batch requests
I've had a look around but I don't think this library uses the batch endpoint. It looks like it's supported in other languages such as Java, but not Node. Any plans to implement it?
Also, from looking at the docs it's not completely clear, but is object copying supported in a batch request? (The Java docs don't have a copy method from what I can see.)
We had a request for this a couple of years ago, and we turned away from it because only a few operations allow batching (https://github.com/googleapis/google-cloud-node/issues/2457). Specifically, file uploads and downloads aren't batchable; only metadata changes and file deletions are.
My only guess for why copy wouldn't be available is that it creates a new file, which would be similar to uploading, but a concrete answer on that would be great.
@frankyn what do you think?
- Should our API introduce batching support?
- Is there a list of remote operations that support batching?
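For context, a rough sketch (bucket and file names and the cacheControl value are assumptions) of what those batchable operation types cost today: one HTTP request per object, whether it's a metadata update or a delete.

const { Storage } = require('@google-cloud/storage');

const storage = new Storage();
const bucket = storage.bucket('my-bucket'); // hypothetical bucket name

async function updateCacheControl(names) {
  // One setMetadata request per file; a batch endpoint would bundle these
  // into a single multipart request on the wire.
  await Promise.all(
    names.map((name) =>
      bucket.file(name).setMetadata({ cacheControl: 'public, max-age=3600' })
    )
  );
}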
@stephenplusplus thanks for the quick reply.
Even if copy wasn't supported, it could still be used to delete a large number of files... Setting the number of API requests aside, if there were a batch Node API, would it be quicker / better / preferable to use bucket.deleteFiles (which does a .getFiles and then a delete on each in batches of 10), or to use bucket.getFiles and then do a batch delete on the files? (This would be from inside a Google Cloud (Firebase) function.)
I'm assuming batch would be better, but I'm not going to bother looking at implementing it if it's not worth it!
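For reference, a minimal sketch of the existing bucket.deleteFiles helper mentioned above (bucket name and prefix are assumptions). It lists the matching files and then deletes them one request at a time under the hood, so it saves code rather than API calls:

const { Storage } = require('@google-cloud/storage');

const bucket = new Storage().bucket('my-bucket'); // hypothetical bucket

async function clearTmpUploads() {
  await bucket.deleteFiles({
    prefix: 'uploads/tmp/', // hypothetical prefix to clear out
    force: true,            // keep going past individual failures
  });
}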
What about batching for uploading large numbers of small files? This seems to be the main reason for doing so and can make an enormous performance difference. It's already supported by the gsutil command.
"Each Cloud Storage upload transaction has some small overhead associated with it, which in bulk scenarios, can quickly dominate the performance of the operation. For example, when uploading 20,000 files that are 1kb each, the overhead of individual upload takes more time than all the entire upload time altogether. This concept of overhead-per-operation is not new, nor is the solution: batch your operations together"
Source: https://cloud.google.com/blog/products/gcp/optimizing-your-cloud-storage-performance-google-cloud-performance-atlas
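Until uploads are batchable, a common client-side workaround is to cap how many uploads are in flight at once, which is roughly what gsutil -m does. A sketch, with the bucket name, paths, and concurrency value as assumptions:

const { Storage } = require('@google-cloud/storage');

const bucket = new Storage().bucket('my-bucket'); // hypothetical bucket

async function uploadAll(localPaths, concurrency = 20) {
  const queue = [...localPaths];
  // Start `concurrency` workers that pull paths off a shared queue.
  const workers = Array.from({ length: concurrency }, async () => {
    while (queue.length > 0) {
      const path = queue.shift();
      await bucket.upload(path, { destination: path });
    }
  });
  await Promise.all(workers);
}

This doesn't remove the per-request overhead the blog post describes; it only keeps enough requests in flight to hide some of it.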
@gWOLF3 the batch API we’re talking about here doesn’t support file creation
What batch API is this... is there a batch API which does support file creation?
How is this implemented by gsutil? https://github.com/GoogleCloudPlatform/gsutil/blob/2bab315919b4aba8b2a95732571396803c1776db/gslib/commands/cp.py
I am also curious about parallel composite uploads for large files.
Same problem as @gWOLF3. We have to upload more than 1000 HTML files simultaneously and the performance is awful.
Batch requests in this case refer specifically to the Storage Batch API: https://cloud.google.com/storage/docs/json_api/v1/how-tos/batch
Helps with:
- Updating metadata, such as permissions, on many objects.
- Deleting many objects.
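The library doesn't wrap that endpoint, but it can be called directly. A hedged sketch of a raw batch delete (bucket/object names and scopes are assumptions, and it relies on Node 18+ global fetch):

const { GoogleAuth } = require('google-auth-library');

async function batchDelete(bucketName, objectNames) {
  const auth = new GoogleAuth({
    scopes: 'https://www.googleapis.com/auth/devstorage.read_write',
  });
  const client = await auth.getClient();
  const { token } = await client.getAccessToken();

  // Each part is one DELETE; the whole payload is a single HTTP request.
  const boundary = 'batch_boundary';
  const parts = objectNames.map(
    (name) =>
      `--${boundary}\r\n` +
      'Content-Type: application/http\r\n\r\n' +
      `DELETE /storage/v1/b/${bucketName}/o/${encodeURIComponent(name)} HTTP/1.1\r\n\r\n`
  );
  const body = parts.join('') + `--${boundary}--`;

  const res = await fetch('https://storage.googleapis.com/batch/storage/v1', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': `multipart/mixed; boundary=${boundary}`,
    },
    body,
  });
  return res.text(); // multipart response with one status per part
}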
I think you're looking more for performant parallel uploads, is that right?

I think you're looking more for performant parallel uploads, is that right?
Yes exactly. I know that currently the Storage batch API doesn't support batch operations for up- and downloads, so I'm wondering if there is any plan to support that eventually. I found the following feature request https://issuetracker.google.com/issues/142641783 but I'm not sure if there is any progress being made. Further, in my research I stumbled upon this video https://www.youtube.com/watch?v=oEto_3jr1ec, which suggests using composite objects to speed things up (https://goo.gle/2mh4Ei0). I didn't try that suggestion yet, but it seems to complicate things quite a bit.
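For anyone trying that suggestion, a rough sketch of the composite-object approach (bucket name and part naming are assumptions, and the input is assumed to be already split into part files; compose accepts at most 32 sources per call):

const { Storage } = require('@google-cloud/storage');

const bucket = new Storage().bucket('my-bucket'); // hypothetical bucket

async function parallelCompositeUpload(partPaths, destination) {
  // 1. Upload the pre-split parts in parallel.
  const partNames = partPaths.map((p, i) => `${destination}.part-${i}`);
  await Promise.all(
    partPaths.map((path, i) => bucket.upload(path, { destination: partNames[i] }))
  );

  // 2. Compose the parts into the final object (<= 32 sources per request).
  await bucket.combine(partNames, destination);

  // 3. Clean up the temporary part objects.
  await Promise.all(partNames.map((name) => bucket.file(name).delete()));
}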
Adding my 2 cents: I'd like support for this in order to improve the performance of deleting a large number of files.
Currently I have to do something like this:

const bucket = firebaseAdmin.storage().bucket();

// One delete request per file; there is no batch endpoint to lean on.
const promises = files.map((filePath) => {
  return bucket
    .file(filePath)
    .delete()
    .catch((e) => console.log("error deleting file", e.message));
});

await Promise.all(promises);
Ideally, I could just call bucket.deleteFiles(files)
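In the meantime, a hypothetical helper (not part of the library; the name and chunk size are made up) that wraps the same per-file deletes but limits how many are in flight at once:

async function deleteFilesByName(bucket, filePaths, chunkSize = 50) {
  for (let i = 0; i < filePaths.length; i += chunkSize) {
    const chunk = filePaths.slice(i, i + chunkSize);
    await Promise.all(
      chunk.map((filePath) =>
        bucket
          .file(filePath)
          .delete()
          .catch((e) => console.log("error deleting file", e.message))
      )
    );
  }
}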
I’d forgotten about this issue...
I think you're looking more for performant parallel uploads, is that right?
Ideally yes but as the batch API doesn’t support that I’d like to see batch deletions please!
Adding my 2 cents: I'd like support for this in order to improve the performance of deleting a large number of files.
Ideally, I could just call bucket.deleteFiles(files)
I've got the exact same use-case and would love to be able to use the batch API to delete multiple files.
This issue is duplicated and now being tracked by https://github.com/googleapis/nodejs-storage/issues/1868
@shaffeeullah not sure if that is a duplicate - it only mentions uploads/downloads and copying
@rhodgkins Good callout. This issue includes things like delete, which is supported by Cloud Storage batching (https://cloud.google.com/storage/docs/batch#overview). We'll look into it.
This would be super useful for bulk deletions!
The current guidance for bulk deletions is to leverage Object Lifecycle Management (OLM). Further additions for batch operations are unlikely to be implemented, and as such I am going to close this issue.
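For completeness, a minimal sketch of the OLM route (bucket name and the age condition are assumptions). The rule deletes matching objects server-side, so the client never issues per-object requests:

const { Storage } = require('@google-cloud/storage');

const bucket = new Storage().bucket('my-bucket'); // hypothetical bucket

async function addDeleteAfterOneDayRule() {
  await bucket.addLifecycleRule({
    action: 'delete',
    condition: { age: 1 }, // delete objects more than one day old
  });
}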