aws-sdk-rust
S3 - Delete all files under a folder (recursive delete)
Describe the feature
Delete all files underneath a folder (recursive delete) using a prefix.
Use Case
It would be nice to be able to give a prefix and delete everything under it, rather than iterating through all the objects and deleting them individually.
ex. Bucket structure
- folder1
- file 1
- file 2
- file 3
- folder2
- file 4
- folder3
Delete everything under "folder1/"
Proposed Solution
Ideally, delete_objects would accept a prefix builder method.
ex.
// Deletes all files in folder1
let bucket = "my-bucket";
let prefix = "folder1/";
let s3res = s3.delete_objects()
.bucket(bucket)
.prefix(prefix)
.send();
A possible workaround for now is to iterate through all the objects and build the delete_objects vec from the keys on each page. Though I'm not sure if there's any gotchas or issues with this approach.
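// Written against aws-sdk-s3 1.x. Older releases differ: the types lived in
// aws_sdk_s3::model and the paginator needed tokio_stream::StreamExt.
use aws_sdk_s3::types::{Delete, ObjectIdentifier};
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let shared_config = aws_config::load_from_env().await;
    let s3 = aws_sdk_s3::Client::new(&shared_config);
    let prefix = "prefix";
    let bucket = "bucket";
    let mut pages = s3.list_objects_v2()
        .bucket(bucket)
        .prefix(prefix)
        .into_paginator()
        .send();
    let mut delete_objects: Vec<ObjectIdentifier> = vec![];
    while let Some(page) = pages.next().await {
        // A page is a ListObjectsV2Output; the keys live on its `contents`.
        let page = page?;
        for object in page.contents() {
            if let Some(key) = object.key() {
                delete_objects.push(ObjectIdentifier::builder().key(key).build()?);
            }
        }
    }
    // DeleteObjects accepts at most 1,000 keys per request, so send in chunks.
    for chunk in delete_objects.chunks(1000) {
        let delete = Delete::builder().set_objects(Some(chunk.to_vec())).build()?;
        s3.delete_objects()
            .bucket(bucket)
            .delete(delete)
            .send()
            .await?;
    }
    println!("Objects deleted.");
    Ok(())
}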
Other Information
Possible temporary workaround provided in proposed solution.
Acknowledgements
- [ ] I may be able to implement this feature request
- [ ] This feature might incur a breaking change
A note for the community
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue, please leave a comment
Makes sense! This isn't something that will be included in the SDK, but it is a good candidate for a high-level S3 library built on the AWS SDK.
This could also be a helpful example.
> Though I'm not sure if there's any gotchas or issues with this approach.
With a sufficiently large list of objects, you may run out of RAM with this approach, because you're building up one large Vec of all of the objects. The basic approach itself is fine, and the issue can be avoided with a little tweaking, for example by taking the paginated objects a few thousand at a time and deleting those, so that you never have to deal with a huge Vec.
I think you want to do something like:
- list objects/versions for your target prefix with pagination
- take paginated results in chunks of 1000
- invoke the DeleteObjects API for each chunk
Based on the API examples, it looks like you can delete objects while holding a pagination cursor. This method avoids building a big vector when operating on large buckets.
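The steps above can be sketched without touching AWS at all. The block below is a minimal illustration, not SDK code: the `ObjectStore` trait, `MemoryStore`, and `delete_prefix` are hypothetical names made up for this sketch, and the cursor is assumed to be key-based (resume after the last key seen), which is why deleting a page does not disturb the listing of the next one. In a real program the two trait methods would presumably wrap the `list_objects_v2` paginator and `delete_objects`.

```rust
use std::collections::BTreeSet;

/// Hypothetical abstraction over the two S3 calls the steps rely on.
/// These names are illustrative and are not part of the AWS SDK.
trait ObjectStore {
    /// Returns up to one page of keys under `prefix` that sort after
    /// `start_after`, plus a cursor for the next page (None when done).
    fn list_page(&self, prefix: &str, start_after: Option<&str>) -> (Vec<String>, Option<String>);
    /// Deletes one batch of keys.
    fn delete_batch(&mut self, keys: &[String]);
}

/// S3's DeleteObjects accepts at most 1,000 keys per request.
const MAX_KEYS_PER_DELETE: usize = 1000;

/// Deletes every key under `prefix`, one page at a time, and returns the
/// number of keys deleted. Only a single page is held in memory at once.
fn delete_prefix<S: ObjectStore>(store: &mut S, prefix: &str) -> usize {
    let mut cursor: Option<String> = None;
    let mut deleted = 0;
    loop {
        let (keys, next) = store.list_page(prefix, cursor.as_deref());
        for batch in keys.chunks(MAX_KEYS_PER_DELETE) {
            store.delete_batch(batch);
            deleted += batch.len();
        }
        match next {
            Some(token) => cursor = Some(token),
            None => return deleted,
        }
    }
}

/// In-memory stand-in used only to exercise the loop above.
struct MemoryStore {
    keys: BTreeSet<String>,
    page_size: usize,
}

impl ObjectStore for MemoryStore {
    fn list_page(&self, prefix: &str, start_after: Option<&str>) -> (Vec<String>, Option<String>) {
        let page: Vec<String> = self
            .keys
            .iter()
            .filter(|k| k.starts_with(prefix))
            .filter(|k| start_after.map_or(true, |s| k.as_str() > s))
            .take(self.page_size)
            .cloned()
            .collect();
        // A full page means more keys may follow; resume after its last key.
        let next = if page.len() == self.page_size { page.last().cloned() } else { None };
        (page, next)
    }

    fn delete_batch(&mut self, keys: &[String]) {
        for key in keys {
            self.keys.remove(key);
        }
    }
}
```

Calling `delete_prefix(&mut store, "folder1/")` on a `MemoryStore` with a small `page_size` walks the prefix in several pages and leaves keys under other prefixes untouched. The trade-off against collecting everything first is more round trips in exchange for bounded memory.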