aws-sdk-rust
[request]: S3 get folder (directory) size
A note for the community
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue, please leave a comment
Tell us about your request
I'd like the S3 package to also include a way to get the total size of a folder (directory).
Something similar to
aws s3 ls --summarize --human-readable --recursive s3://bucket/folder
or from boto3
import boto3

def get_folder_size(bucket, prefix):
    total_size = 0
    for obj in boto3.resource('s3').Bucket(bucket).objects.filter(Prefix=prefix):
        total_size += obj.size
    return total_size
e.g. bucket_name/folder - 4 GB (total size including subdirectories)
Tell us about the problem you're trying to solve.
We have limits on how much data can be uploaded to a folder, so knowing the total size, and by extension being able to show it to the user, is important. Also, if a folder is very large (e.g. 100 GB), we'd like to prevent the user from downloading everything in one shot, whereas a folder of 30 MB and a few dozen files is fine to download at once.
Are you currently working around this issue?
We are still in the process of migrating some services to Rust, and by extension to aws-sdk-rust, so we are not working around it yet, but it is blocking us from moving over fully.
Additional context
No response
Hello and thanks for the feature request! The S3 client only contains generated code from models, but this would certainly be good functionality for a high-level library for S3. I'd suggest porting the boto code to Rust for the time being via list_objects_v2: https://docs.rs/aws-sdk-s3/0.6.0/aws_sdk_s3/client/struct.Client.html#method.list_objects_v2
Be sure that you use into_paginator() to ensure you read all pages of results.
use std::error::Error;
// StreamExt provides `.next()` on the paginator stream.
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let shared_config = aws_config::load_from_env().await;
    let s3 = aws_sdk_s3::Client::new(&shared_config);
    let prefix = "prefix";
    let bucket = "bucket";

    // The paginator requests further pages until the listing is exhausted.
    let mut pages = s3
        .list_objects_v2()
        .bucket(bucket)
        .prefix(prefix)
        .into_paginator()
        .send();

    let mut total: i64 = 0;
    while let Some(page) = pages.next().await {
        // `?` propagates a failed page request; a page without contents adds 0.
        total += page?
            .contents()
            .unwrap_or_default()
            .iter()
            .map(|obj| obj.size)
            .sum::<i64>();
    }
    println!("total size: {}", total);
    Ok(())
}
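Once `total` is known, the two checks described in the request (a per-folder upload limit, and blocking one-shot downloads of very large folders) plus a human-readable size could be sketched roughly as below. This is only an illustration using the standard library: `human_readable`, `DownloadPolicy`, `download_policy`, and `would_exceed_limit` are hypothetical names that are not part of aws-sdk-rust, and the thresholds are assumptions you would pick yourself.

```rust
/// Formats a byte count using binary units (1 KiB = 1024 bytes).
/// Hypothetical helper; not part of aws-sdk-rust.
fn human_readable(bytes: i64) -> String {
    const UNITS: [&str; 6] = ["Bytes", "KiB", "MiB", "GiB", "TiB", "PiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    // Divide by 1024 until the value fits the current unit.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{} {}", bytes, UNITS[0])
    } else {
        format!("{:.1} {}", value, UNITS[unit])
    }
}

/// Outcome of the bulk-download check (hypothetical type).
#[derive(Debug, PartialEq)]
enum DownloadPolicy {
    /// The folder is small enough to download in one shot.
    AllowAll,
    /// The folder is too large; the user should pick individual files.
    RequireSelection,
}

/// Decides whether a folder of `total_bytes` may be downloaded at once.
/// `max_bulk_bytes` is an assumed, application-defined threshold.
fn download_policy(total_bytes: i64, max_bulk_bytes: i64) -> DownloadPolicy {
    if total_bytes <= max_bulk_bytes {
        DownloadPolicy::AllowAll
    } else {
        DownloadPolicy::RequireSelection
    }
}

/// Returns true if adding `upload_bytes` would push the folder past
/// `folder_limit`. Saturating addition avoids overflow on huge inputs.
fn would_exceed_limit(current_total: i64, upload_bytes: i64, folder_limit: i64) -> bool {
    current_total.saturating_add(upload_bytes) > folder_limit
}
```

For example, with a 1 GiB bulk-download threshold, `download_policy(total, 1024 * 1024 * 1024)` would allow a 30 MiB folder and require a selection for a 100 GiB one. Note that the total is only a snapshot: objects can be added or removed between listing and upload, so a check like this is advisory rather than a hard guarantee.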
Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.