s3proxy

Add an S3 storage backend that uses the AWS SDK

Open gaul opened this issue 1 year ago • 4 comments

Similar to Azure in #606 and NIO.2 in #697, using the AWS SDK instead of jclouds might allow better S3 compatibility. Given that Apache jclouds will soon move to the Attic, it will not be possible to further evolve its S3 provider.

~~Adding an SDK-based storage backend poses challenges since there is a v1 and v2 API. AwsSdkTest currently uses v1 and migrating appears difficult even using the provided OpenRewrite recipe. I am not sure if you can have both the v1 and v2 API exist in the same application so migrating might be a prerequisite.~~

gaul, Dec 27 '24 17:12

S3Proxy now uses both the AWS SDK v1 and v2 for tests so there is no blocker to writing a provider using the latter.

gaul, Sep 04 '25 21:09

@klaudworks tagging you here if you are interested. There are a couple of ways to approach this, but looking at the initial commit for Azure, b33f3e2826211b282c1c163438b1be452693f24c, and its evolution might be helpful. The latter shows that there is a long tail of compatibility fixes that s3-tests uncovered. The former implemented all the operations, including multi-part upload and copying blobs, which may not be important to you. I would happily merge an incomplete implementation that only supported PutBucket, DeleteBucket, ListBuckets, PutObject, DeleteObject, GetObject, and ListObjects. The one sticking point for me is ensuring that we have adequate testing through the existing Java-based tests and s3-tests, but you can selectively disable failing tests. I also want to ensure that we use the supported AWS v2 SDK and not the deprecated v1 SDK.

gaul, Oct 11 '25 22:10

@gaul Thank you for linking me here. I will not be able to work on another provider for the next month. However, I'm still interested and there is a good chance that I pick this up again.

I'm also thinking about what the right approach for new providers would be. At some point we have to migrate off jclouds, and I am not sure if that can even be done gradually. It might be a better idea to build the next provider from scratch without any jclouds interfaces. I assume that once the architecture is set in stone, it will be quite easy to also migrate the azureblob-sdk and the transient provider to the same jclouds-free architecture.
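To make the "no jclouds interfaces" idea concrete, here is a minimal sketch of what a provider contract covering only the seven operations mentioned earlier in this thread might look like, together with an in-memory implementation in the spirit of the transient provider. Every name here (`BlobStoreSketch`, `SimpleBlobStore`, `InMemoryBlobStore`) is hypothetical and does not exist in S3Proxy; an AWS SDK v2 provider would implement the same interface by delegating to the SDK client, which is deliberately left out so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

/** Hypothetical sketch only; none of these types exist in S3Proxy. */
final class BlobStoreSketch {
    /** The seven operations named in this thread, expressed without jclouds types. */
    interface SimpleBlobStore {
        void putBucket(String bucket);
        void deleteBucket(String bucket);
        List<String> listBuckets();
        void putObject(String bucket, String key, byte[] data);
        byte[] getObject(String bucket, String key);
        void deleteObject(String bucket, String key);
        List<String> listObjects(String bucket, String prefix);
    }

    /** In-memory implementation; keys are kept sorted so listings are ordered. */
    static final class InMemoryBlobStore implements SimpleBlobStore {
        private final Map<String, TreeMap<String, byte[]>> buckets = new TreeMap<>();

        private TreeMap<String, byte[]> bucket(String name) {
            TreeMap<String, byte[]> objects = buckets.get(name);
            if (objects == null) {
                throw new NoSuchElementException("no such bucket: " + name);
            }
            return objects;
        }

        @Override
        public void putBucket(String bucket) {
            // Creating an existing bucket is treated as a no-op in this sketch.
            buckets.putIfAbsent(bucket, new TreeMap<>());
        }

        @Override
        public void deleteBucket(String bucket) {
            if (!bucket(bucket).isEmpty()) {
                throw new IllegalStateException("bucket not empty: " + bucket);
            }
            buckets.remove(bucket);
        }

        @Override
        public List<String> listBuckets() {
            return new ArrayList<>(buckets.keySet());
        }

        @Override
        public void putObject(String bucket, String key, byte[] data) {
            // Defensive copy so later changes to the caller's array are not visible.
            bucket(bucket).put(key, data.clone());
        }

        @Override
        public byte[] getObject(String bucket, String key) {
            byte[] data = bucket(bucket).get(key);
            if (data == null) {
                throw new NoSuchElementException("no such key: " + key);
            }
            return data.clone();
        }

        @Override
        public void deleteObject(String bucket, String key) {
            // Deleting a missing key succeeds silently, as S3 DeleteObject does.
            bucket(bucket).remove(key);
        }

        @Override
        public List<String> listObjects(String bucket, String prefix) {
            List<String> keys = new ArrayList<>();
            // Keys sharing a prefix are contiguous in sorted order, starting at the prefix.
            for (String key : bucket(bucket).tailMap(prefix, true).keySet()) {
                if (!key.startsWith(prefix)) {
                    break;
                }
                keys.add(key);
            }
            return keys;
        }
    }
}
```

A real contract would also need metadata, pagination, error mapping to S3 status codes, and streaming bodies rather than `byte[]`; those are omitted here to keep the shape of the interface visible.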

We might be able to start with the AWS SDK provider because no one relies on it yet. Once it works, the other providers can probably be migrated quite quickly using AI tooling such as Claude Code. The test coverage is quite good thanks to s3-tests, so I wouldn't be too worried about the migration of individual providers.

Do you see any other path towards getting rid of jclouds anytime soon?

klaudworks, Oct 18 '25 16:10

@gaul I will look into this issue now and likely implement a solution.

klaudworks, Dec 11 '25 08:12