Refactor S3ProxyHandler into component parts
Breaking the almost 3,000-line S3ProxyHandler into component authentication, bucket, object, and multipart classes will make it easier to understand.
After adding V4 authentication, the complexity of its code scares me.
For the "embedding into Java applications" use case, it would be better to isolate the API implementation from the source of content and authorization. This would help when an application wants to supply its own source of blob data and run its own authorization logic. In that case, the dependency on jclouds would also become optional.
So my suggestion is to create a generic Java interface defining the contract between the S3 API implementation and the blob/authorization provider. The default would be the jclouds-based implementation, but it could be changed programmatically if required.
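To make the suggestion concrete, here is a minimal sketch of what such a contract could look like. Every name in it (`S3Backend`, `InMemoryBackend`, the method signatures) is hypothetical and does not exist in S3Proxy; payloads are plain byte arrays only to keep the example short, whereas a real contract would presumably stream content and carry metadata.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical contract between the S3 API layer and a pluggable provider.
// None of these names are part of S3Proxy.
interface S3Backend {
    /** Returns true if the identity may perform the action on the bucket. */
    boolean authorize(String identity, String action, String bucket);

    /** Returns the object's content, or empty if the object does not exist. */
    Optional<byte[]> getObject(String bucket, String key);

    /** Stores the object's content, replacing any existing object. */
    void putObject(String bucket, String key, byte[] content);
}

// Toy in-memory provider standing in for an application's own content source.
final class InMemoryBackend implements S3Backend {
    private final Map<String, byte[]> objects = new ConcurrentHashMap<>();

    @Override
    public boolean authorize(String identity, String action, String bucket) {
        // Illustrative rule only: an identity may access the bucket of the same name.
        return identity.equals(bucket);
    }

    @Override
    public Optional<byte[]> getObject(String bucket, String key) {
        byte[] data = objects.get(bucket + "/" + key);
        // Return a copy so callers cannot mutate stored content.
        return data == null ? Optional.empty() : Optional.of(data.clone());
    }

    @Override
    public void putObject(String bucket, String key, byte[] content) {
        objects.put(bucket + "/" + key, content.clone());
    }
}
```

Under this assumption, the default jclouds-based implementation would be just one more class implementing the same interface.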
You have done a similar thing with Jetty. IMHO, the jclouds connection should also be loosely coupled.
Is the idea worthwhile, and would you accept such a PR?
@szquadri This issue only deals with refactoring the internal S3Proxy APIs, but please open a new issue with your suggestions for embedding. Unfortunately most embedded uses seem to be closed-source so it is hard for me to imagine them, but I am open to making S3Proxy more useful in this way. I suspect that disentangling S3Proxy from jclouds would be less useful than embedding users implementing the BlobStore interface directly. Pluggable authentication seems well-scoped and could improve S3Proxy just by creating a cleaner separation of logic.
I will create a separate issue for the embedding use case.
In our use case, we have a SaaS document management (digital asset management) product, and we want to provide raw content/file/metadata access through an S3 interface, as it is the most popular blob API right now. The system is multi-tenant, so we need access to the servlet request to do custom authorization, bucket determination, etc.
For such a case, we do not need (or want) to implement a jclouds blob store and would be better off with a pure S3 implementation that simply uses the servlet request/response to interface with the outside world and a generic interface for the data store.
I think many people may have such a requirement, and loosely coupling jclouds would help.
Please open a separate issue for this. Note that decoupling jclouds is a large task given all the domain objects, Blob, BlobMetadata, etc. and you would need to demonstrate its value. For multi-tenancy you should look at the recent multi-config work via S3Proxy.setBlobStoreLocator.
Interesting hint. Is there any chance the locateBlobStore method of the BlobStoreLocator interface could be passed the current ServletRequest, so that it can decide based on factors like hostname/virtual host (in our case) or any other information in the request that a developer could use to select a BlobStore?
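As a rough illustration of the question, a request-aware locator might look something like the sketch below. This is not the actual BlobStoreLocator signature: `RequestInfo`, `RequestAwareLocator`, and `HostRoutingLocator` are hypothetical names, `RequestInfo` stands in for whatever parts of the servlet request would be exposed, and the store type is left generic so the example compiles without jclouds. The host parsing is simplified and does not handle bracketed IPv6 literals.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the parts of a servlet request a locator might inspect.
final class RequestInfo {
    private final String host;

    RequestInfo(String host) {
        this.host = host;
    }

    /** Value of the Host header, possibly including a port; may be null. */
    String host() {
        return host;
    }
}

// Hypothetical request-aware variant of a locator; S is the store type.
interface RequestAwareLocator<S> {
    Optional<S> locate(RequestInfo request, String identity, String container);
}

// Selects a store by virtual host, ignoring port and letter case.
final class HostRoutingLocator<S> implements RequestAwareLocator<S> {
    private final Map<String, S> storesByHost;

    HostRoutingLocator(Map<String, S> storesByHost) {
        this.storesByHost = Map.copyOf(storesByHost);
    }

    @Override
    public Optional<S> locate(RequestInfo request, String identity, String container) {
        String host = request.host();
        if (host == null) {
            return Optional.empty();
        }
        // Strip an optional ":port" suffix and normalize case before the lookup.
        int colon = host.indexOf(':');
        String name = (colon >= 0 ? host.substring(0, colon) : host).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(storesByHost.get(name));
    }
}
```

With something like this, each tenant's hostname could map to its own store, and an unknown host would yield an empty result that the caller could turn into an error response.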
We can consider broadening this interface. Let's track it in a separate issue or pull request.