spring-cloud-aws
spring-cloud-aws copied to clipboard
Read bucket region from redirect header
:loudspeaker: Type of change
- [ ] Bugfix
- [ ] New feature
- [x] Enhancement
- [ ] Refactoring
:scroll: Description
Read the bucket region from the PermanentRedirect error message headers and cache a client for that region.
Unfortunately this was not doable with the LruCache, since it does not permit manual insertion of entries. If there is a better way i am happy to alter the impl.
:bulb: Motivation and Context
Fixes #452
:green_heart: How did you test it?
Running with local project using bucket in region without owner access.
:pencil: Checklist
- [x] I reviewed submitted code
- [x] I added tests to verify changes
- [ ] I updated reference documentation to reflect the change
- [x] All tests passing
- [x] No breaking changes
:crystal_ball: Next steps
I also noticed that i seem to have edited the generated version of the class instead of the template. When #456 is merged ill redo these changes on top of the template instead.
@maciejwalkowiak Unfortunately the ConcurrentLRUMap class is needed both by the spring-cloud-aws-s3-codegen and spring-cloud-aws-s3-cross-region-client projects in order to compile. How would you like to fix the project structure so this will work?
@fjmacagno can you bring back fallback when header is not present?
@maciejwalkowiak Have you considered getting rid of the autogen functionality? It is making what i am trying to do harder (since some client operations behave differently than others), and that combined with having to inline ConcurrentLruMap seems to suggest that it has outlived its usefulness.
Client generation was meant to make things easier - and it did for me when creating initial cross client implementation. What is the problem you're facing?
Well, since we know that the header will be present for certain client operations, we could make the assumption that HeadBucket and ListObject operations will always have the header, and throw an error otherwise. Then, if the operation being done is anything else, and the 301 error occurs and the header isnt present, we could just call HeadBucket on the bucket in use. If it returns successfully, we know that it was able to retrieve the region and cache the client, and we can continue. This way, we can still avoid explicitly doing the annoying GetBucketLocation operation which only bucket owners can do.
The complexity here is that in order to avoid infinite recursion, those special operations need to not call themselves on an error. The methods i see currently are:
- add a flag to
executeInBucketRegion
controlling the recursive call, defaulting to true and being false when called recursively - duplicate the error-code checking and header-retrieval logic into another method that just handles the HeadBucket part
Both of these solutions though could result in a duplicate head or list operation being called, since we cant tell what operation the passed function is. If the client code wasnt generated, we could have the head and list operations call a simplified version of executeInBucketRegion
which does not have the recursive logic, or even just pass a false flag from option 1, and so surface an exception faster.
A compromise option might be to have the code generation instead create an interface or abstract class with default methods for each of the S3Client methods, which would call the abstract method executeInBucketRegion
. Then, the actual implementation could have overrides of specific S3Client methods (the HeadBucket and ListObjects operations). This would also eliminate one of the mvn projects, and in principle the code generation could be an earlier maven step and not have to have the result committed, especially since the generated code would never change.
This is definitely not at all necessary, but i noticed what i was writing didn't feel right and i thought i would bring it up. i definitely may also be missing something obvious. I'll try to get one of simple options written as a reference.
~ Edited to correct HeadObject to HeadBucket
A compromise option might be to have the code generation instead create an interface or abstract class with default methods for each of the S3Client methods, which would call the abstract method executeInBucketRegion. Then, the actual implementation could have overrides of specific S3Client methods (the HeadBucket and ListObjects operations). This would also eliminate one of the mvn projects, and in principle the code generation could be an earlier maven step and not have to have the result committed, especially since the generated code would never change.
This sounds like a good way to go. Feel free to go this path and when I am back from holidays I'll dig deeper.
Thanks a lot for for analysis!
@fjmacagno great job thanks a lot! I did tiny polishing with reducing the surface area of public APIs.
One thing left that I think we should do but want to consult with you before - wouldn't renaming CrossRegionS3Client
to AbstractCrossRegionS3Client
and DefaultCrossRegionS3Client
to CrossRegionS3Client
make more sense? From the user point of view I think having DefaultCrossRegionS3Client
may suggest like there are possibilities for other implementations and I don't think it is true in the scope of Spring Cloud AWS at least.
Thanks!
I just always leave the possibility open out of habit, especially when working with Spring, just to leave things flexible; i would argue a user should just be concerned with having an instance of the abstract class, and the library can do the details behind the scenes. I'm happy to do this however you would prefer though.
edit: an option could be to rename DefaultCrossRegionS3Client
to something more evocative of this specific implementation, maybe ExceptionalCrossRegionS3Client
I made one more set of simplification changes. I could also fully get rid of the CrossRegionS3ClientTemplate if desired and just fully generate the class, but i dont know if that is a net-positive.
This is the cleanest i think i can make it without changing project structure.
especially when working with Spring, just to leave things flexible
That's true, Spring is very flexible, but we cannot afford such flexibility. For us hiding more means less effort to maintain it later (which is quite important for a side project).
i would argue a user should just be concerned with having an instance of the abstract class, and the library can do the details behind the scenes.
Users should ideally use S3Client
interface.
I think it's good as it is - except naming, but I'll leave it open till @MatejNedic is back from holidays only to agree on naming.
Makes sense to me, i've renamed the classes.
If you change your mind, i am happy to revert.
Kudos, SonarCloud Quality Gate passed!
0 Bugs
0 Vulnerabilities
0 Security Hotspots
2 Code Smells
No Coverage information
0.0% Duplication
I tried out some project changes in https://github.com/fjmacagno/spring-cloud-aws/pull/1 .