s3proxy
s3proxy for production use cases using NFS as backend
Hi @gaul,
Should we anticipate any issues using s3proxy with NFS as the backend in production for lightweight use cases (a few hundred GB)? I have burned my hands with rook-ceph on Kubernetes: it is quite complicated to operate, and it is very difficult to hide that complexity in production, especially when it runs on Kubernetes.
I have deployed s3proxy with NFS as the backend and written a few hundred GB of data without facing any issues, but I would like an expert opinion on what kinds of issues or challenges one should be aware of when using this combination.
Thanks
Generally this will work but you may encounter two performance problems:
- Large number of objects: S3Proxy 2.0.0 enumerates the entire bucket underneath a given subdirectory. The upcoming 2.1.0 release will fix this by including JCLOUDS-1371, but you can compile from source for now. With the fix, S3Proxy only enumerates the children of a subdirectory, not all its grandchildren, great-grandchildren, etc.
- Writing large multi-part uploads: S3Proxy maintains a 1:1 correspondence between an object and a file. When uploading an MPU with many parts, it must join all these parts to create the final file. Thus S3Proxy will do 3x the IO: write all the parts, read everything back, and rewrite as one file. This can make the final CompleteMultipartUpload time out for some clients.
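As a rough illustration of the 3x figure above, the following Python sketch estimates the total IO for one multi-part upload and how long the final join step might take. The function name, the 64 MiB part size, and the 100 MiB/s backend throughput are all hypothetical values chosen for illustration; real NFS throughput will vary.

```python
import math

GIB = 1024 ** 3
MIB = 1024 ** 2


def mpu_io_estimate(object_bytes, part_bytes, backend_bytes_per_sec):
    """Estimate IO for one multi-part upload on a file-per-object backend.

    Assumes the three passes described above: parts are written as they
    arrive, then the complete step reads every part back and rewrites
    the data as a single file.
    """
    parts = math.ceil(object_bytes / part_bytes)
    upload_io = object_bytes        # pass 1: write all the parts
    complete_io = 2 * object_bytes  # passes 2 and 3: read back, rewrite
    return {
        "parts": parts,
        "total_io_bytes": upload_io + complete_io,
        # Time the client waits on the complete call, ignoring overhead.
        "complete_seconds": complete_io / backend_bytes_per_sec,
    }


# Hypothetical inputs: 8 GiB object, 64 MiB parts, 100 MiB/s backend.
estimate = mpu_io_estimate(8 * GIB, 64 * MIB, 100 * MIB)
print(estimate)
```

If the estimated `complete_seconds` is close to or above the client's request timeout, raising that timeout or using larger parts is worth testing before going to production.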
Thanks a lot for your reply, @gaul.
Regarding #1, I can build it from source, not a problem.
Regarding #2, we don't have objects larger than 8 GB, and uploads of such objects are not that frequent. Would it still be a problem? At least in my load test I haven't noticed any problems.
One other issue that's stopping me from considering it for production is encryption at rest. I tried the transparent encryption feature, but I find it buggy: small uploads work fine, but uploads of files larger than 100 MB always fail. If I disable transparent encryption, everything works.
Another problem I noticed is that failed multi-part uploads aren't cleaned up automatically, which risks filling up the NFS server. Of course this can be solved by writing a Kubernetes CronJob to clean them up, so it is not a P0, but transparent encryption with large file uploads/downloads is definitely a concern.
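For the cleanup job, a minimal Python sketch could look like the following. It assumes a boto3-style S3 client pointed at the S3Proxy endpoint, and it assumes S3Proxy answers ListMultipartUploads for this backend, which I have not verified; if it does not, the cleanup would have to work on the NFS directory directly. The function name and the one-day threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def abort_stale_uploads(client, bucket, max_age=timedelta(days=1), now=None):
    """Abort multi-part uploads initiated more than max_age ago.

    `client` is assumed to be a boto3-style S3 client, e.g. one created
    with boto3.client("s3", endpoint_url=...) against S3Proxy.
    Returns the (key, upload_id) pairs that were aborted.
    """
    now = now or datetime.now(timezone.utc)
    aborted = []
    markers = {}
    while True:
        page = client.list_multipart_uploads(Bucket=bucket, **markers)
        for upload in page.get("Uploads", []):
            if now - upload["Initiated"] > max_age:
                client.abort_multipart_upload(
                    Bucket=bucket,
                    Key=upload["Key"],
                    UploadId=upload["UploadId"],
                )
                aborted.append((upload["Key"], upload["UploadId"]))
        if not page.get("IsTruncated"):
            return aborted
        # Continue listing from where the previous page stopped.
        markers = {
            "KeyMarker": page["NextKeyMarker"],
            "UploadIdMarker": page["NextUploadIdMarker"],
        }
```

Run from a CronJob, this keeps recent uploads that may still be in progress and aborts only the ones older than the threshold.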