
s3proxy for production use cases using NFS as backend

Open rajivml opened this issue 1 year ago • 2 comments

Hi @gaul,

Should we anticipate any issues using s3proxy with NFS as the backend in production for lightweight use cases (a few hundred GBs)? I have been burned by rook-ceph on Kubernetes as it's quite complicated to operate; it's very difficult to hide the complexity of rook-ceph in production, especially when it's running on Kubernetes.

I have deployed s3proxy with NFS as the backend and wrote a few hundred GBs of data without facing any issues, but I just want to get some expert opinion here on what kind of issues/challenges one should be aware of while using this combination.
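
For context, the setup is roughly the following s3proxy.conf, with the jclouds filesystem provider pointed at the NFS mount (the identity/credential and the mount path below are placeholders):

```properties
# S3 endpoint exposed by s3proxy
s3proxy.endpoint=http://0.0.0.0:8080
s3proxy.authorization=aws-v2-or-v4
s3proxy.identity=local-identity
s3proxy.credential=local-credential

# Back the blobstore with the local filesystem, i.e. the NFS mount
jclouds.provider=filesystem
jclouds.filesystem.basedir=/mnt/nfs/s3proxy
```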

Thanks

rajivml avatar Apr 29 '23 05:04 rajivml

Generally this will work but you may encounter two performance problems:

  • Large number of objects: S3Proxy 2.0.0 enumerates the entire bucket underneath a given subdirectory. The upcoming 2.1.0 release will fix this by including JCLOUDS-1371, but you can compile from source for now. This will only enumerate the children of a subdirectory, not all its grandchildren, great-grandchildren, etc.
  • Writing large multi-part uploads: S3Proxy maintains a 1:1 correspondence between an object and a file. When uploading an MPU with many parts, it must join all these parts to create the final file. Thus S3Proxy does 3x the I/O: it writes all the parts, reads everything back, and rewrites them as one file. This can make the final CompleteMultipartUpload time out for some clients (see the sketch after this list).
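
If large multi-part uploads are unavoidable, one client-side mitigation is to use fewer, larger parts and a generous timeout so CompleteMultipartUpload has time to finish the re-join. A rough boto3 sketch, with the endpoint and credentials as placeholders:

```python
import boto3
from boto3.s3.transfer import TransferConfig
from botocore.config import Config

# Placeholder endpoint/credentials for an S3Proxy instance backed by NFS.
s3 = boto3.client(
    "s3",
    endpoint_url="http://s3proxy.example.internal:8080",
    aws_access_key_id="local-identity",
    aws_secret_access_key="local-credential",
    # Long read timeout so CompleteMultipartUpload can finish re-joining parts.
    config=Config(read_timeout=600, retries={"max_attempts": 3}),
)

# Fewer, larger parts mean less work when S3Proxy joins them into the final file.
transfer = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # switch to MPU above 64 MB
    multipart_chunksize=256 * 1024 * 1024,  # 256 MB parts
)

s3.upload_file("backup.tar", "my-bucket", "backup.tar", Config=transfer)
```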

gaul avatar May 19 '23 03:05 gaul

Thanks a lot for your reply, @gaul.

Regarding #1, I can build it from source, not a problem.

Regarding #2, we don't have objects greater than 8 GB in size, and uploads of such objects are not that frequent. Would it still be a problem? At least in my load test I haven't noticed any problems.

One other issue that's stopping me from considering it for production is encryption at rest. I tried the transparent encryption feature but I find it buggy: for small uploads it works fine, but for files greater than 100 MB the uploads always fail, and if I disable transparent encryption everything is absolutely fine.
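
For reference, I enabled it with properties roughly like the following; I may be misremembering the exact property names, and the password/salt values are placeholders:

```properties
# Transparent encryption at rest (placeholder secrets)
s3proxy.encrypted-blobstore=true
s3proxy.encrypted-blobstore-password=change-me
s3proxy.encrypted-blobstore-salt=change-me-too
```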

Another problem I noticed is that failed multipart uploads aren't cleaned up automatically, which risks filling up the NFS server. Of course this can be solved by writing a Kubernetes cronjob to clean them up (see the sketch below), so this is not a P0, but transparent encryption with large file uploads/downloads is definitely a concern.
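
Something like this boto3 sketch, run on a schedule, is what I have in mind; the endpoint, credentials and bucket name are placeholders:

```python
import datetime

import boto3

# Placeholder endpoint/credentials for the S3Proxy instance.
s3 = boto3.client(
    "s3",
    endpoint_url="http://s3proxy.example.internal:8080",
    aws_access_key_id="local-identity",
    aws_secret_access_key="local-credential",
)

BUCKET = "my-bucket"                  # placeholder bucket name
MAX_AGE = datetime.timedelta(days=1)  # abort uploads older than this
now = datetime.datetime.now(datetime.timezone.utc)

# Enumerate in-progress multipart uploads and abort the stale ones so the
# orphaned parts stop accumulating on the NFS backend.
paginator = s3.get_paginator("list_multipart_uploads")
for page in paginator.paginate(Bucket=BUCKET):
    for upload in page.get("Uploads", []):
        if now - upload["Initiated"] > MAX_AGE:
            s3.abort_multipart_upload(
                Bucket=BUCKET,
                Key=upload["Key"],
                UploadId=upload["UploadId"],
            )
```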

rajivml avatar May 21 '23 18:05 rajivml