s3fs icon indicating copy to clipboard operation
s3fs copied to clipboard

Switch to v2 of the aws sdk

Open guenhter opened this issue 3 years ago • 15 comments

Amazon has a version 2 of the api which is already the official predecessor of v1.

The main advantages of the new API are performance improvements, easier error handling, easier credentials handling, ...

https://aws.amazon.com/blogs/developer/client-updates-in-the-preview-version-of-the-aws-sdk-for-go-v2/

guenhter avatar Feb 07 '22 13:02 guenhter

This is of course a breaking change.

guenhter avatar Feb 07 '22 13:02 guenhter

According to pkg.go.dev, as of today v1 is imported by 4672 pkgs while v2 by only 462. S3FS in particular won't benefit from the updated lib that much so I don't think it makes sense to break current users code just to update the lib version. It's too early, we can revisit this once the adoption of v2 is higher, or v1 of s3 becomes deprecated (currently it's still maintained)

jszwec avatar Feb 07 '22 17:02 jszwec

Fair enough. I'll keep this MR open if you don't mind ... just in case.

guenhter avatar Feb 08 '22 05:02 guenhter

This can either just be published as a new major version (not breaking any current users) or be published as a separate package, maybe by @guenhter just like aws-go-sdk-v2 is.

thomasf avatar Feb 10 '22 21:02 thomasf

Yes it could be published as a v2, but then v1 is far from being deprecated due to the fact how popular aws sdk v1 is. This also means that there will be two versions of the library to maintain. Considering the current popularity of aws sdk v1, is it really worth it?

jszwec avatar Feb 10 '22 22:02 jszwec

@jszwec The v2 of the api has some big benefits. It has a nicer configuration handling (not in regard to the go api but v2 allows e.g. to define the region in ~/.aws/config). So having v2 is usually a good way. I still see your point that v1 is still heavily used. Do you want me just to start off a own library (like @thomasf suggested) named s3fs-v2?

guenhter avatar Feb 15 '22 13:02 guenhter

@guenhter Since this is a breaking change I would suggest adding a context argument to s3fs.New while you are at it so we get better canceling.

thomasf avatar Feb 15 '22 14:02 thomasf

@thomasf Good idea. Done. Could you have another look if you like it that way.

guenhter avatar Feb 15 '22 15:02 guenhter

@guenhter

The v2 of the api has some big benefits. It has a nicer configuration handling (not in regard to the go api but v2 allows e.g. to define the region in ~/.aws/config).

You can already do it in v1: https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html

So having v2 is usually a good way.

I am not saying that we shouldn't use it, I am saying that we don't need to rush it, since it's not popular yet

@thomasf @guenhter What is the point of providing context to New? FS is supposed to be a long lived var, so it's not like you are going to be recreating FS for every request to have cancelations, and fs.FS doesn't support context per Read. Could you please elaborate?

jszwec avatar Feb 15 '22 16:02 jszwec

@thomasf @guenhter What is the point of providing context to New? FS is supposed to be a long lived var, so it's not like you are going to be recreating FS for every request to have cancellations, and fs.FS doesn't support context per Read. Could you please elaborate?

It doesn't have to be a long lived variable, a struct like this is cheap and can be instantiated every time a request is handled if you want all AWS client communication to cancel if an HTTP request is closed by a client in a request handler (or whatever). You are correct that it is a kind of a blunt tool because it doesn't work on a per file system request basis but it is still useful.

thomasf avatar Feb 15 '22 16:02 thomasf

Adding the context at FS creation was suggested as a good solution during the io/fs proposal https://github.com/golang/go/issues/41190#issuecomment-690819876

thomasf avatar Feb 15 '22 16:02 thomasf

I think you might to check the context for errors before new calls. Not sure if this is the best way to do it.

diff --git a/fs.go b/fs.go
index fe74df5..7ca2179 100644
--- a/fs.go
+++ b/fs.go
@@ -119,6 +119,13 @@ func (f *S3FS) Open(name string) (fs.File, error) {

 // Stat implements fs.StatFS.
 func (f *S3FS) Stat(name string) (fs.FileInfo, error) {
+       if err := f.context.Err(); err != nil {
+               return nil, &fs.PathError{
+                       Op:   "stat",
+                       Path: name,
+                       Err:  err,
+               }
+       }
        fi, err := stat(f.context, f.s3Api, f.bucket, name)
        if err != nil {
                return nil, &fs.PathError{
diff --git a/fs_integration_test.go b/fs_integration_test.go
index b0d0f49..ce43d6e 100644
--- a/fs_integration_test.go
+++ b/fs_integration_test.go
@@ -567,6 +567,32 @@ func TestFS(t *testing.T) {
                        testFn(t, f.s3fs)
                })
        }
+
+       t.Run("test cancel context", func(t *testing.T) {
+               ctx, cancel := context.WithCancel(context.Background())
+               fsys := s3fs.New(ctx, s3cl, *bucket)
+               {
+                       fi, err := fsys.Stat(".")
+                       if err != nil {
+                               t.Fatal(err)
+                       }
+                       if fi == nil {
+                               t.Fatal("file info is nil")
+                       }
+               }
+               cancel()
+               {
+                       fi, err := fsys.Stat(".")
+                       if err == nil {
+                               t.Fatal("expected error")
+                       }
+                       if fi != nil {
+                               t.Fatal(fi)
+                       }
+               }
+
+       })
+
 }

 func newClient(t *testing.T) s3fs.S3Api {

thomasf avatar Feb 15 '22 21:02 thomasf

fair enough

context errors dont need to be checked like you propose. S3 client should be responsible for doing that.

jszwec avatar Feb 15 '22 21:02 jszwec

Yeah I am not sure why the test doesn't fail unless I add it explicitly, maybe there is a problem somewhere else. It is probably a bug in the aws sdk or it is intended to work like that for some weird reason.

thomasf avatar Feb 15 '22 22:02 thomasf

DOH!. I arbitrarily tested a stat call for . which is the only hard coded return of a dir entry struct in fs.go. So no issue at all, works as expected when the code actually uses AWS API calls.

diff --git a/fs_integration_test.go b/fs_integration_test.go
index b0d0f49..d7356c0 100644
--- a/fs_integration_test.go
+++ b/fs_integration_test.go
@@ -567,6 +567,34 @@ func TestFS(t *testing.T) {
                        testFn(t, f.s3fs)
                })
        }
+
+       t.Run("test cancel context", func(t *testing.T) {
+               ctx, cancel := context.WithCancel(context.Background())
+               fsys := s3fs.New(ctx, s3cl, *bucket)
+               {
+                       fi, err := fsys.ReadDir(".")
+                       if err != nil {
+                               t.Fatal(err)
+                       }
+                       if fi == nil {
+                               t.Fatal("file info is nil")
+                       }
+               }
+               cancel()
+               {
+                       fi, err := fsys.ReadDir(".")
+                       if err == nil {
+                               t.Fatal("expected error")
+                       }
+                       if fi != nil {
+                               t.Fatal(fi)
+                       }
+               }
+
+       })
+
 }

thomasf avatar Feb 15 '22 22:02 thomasf