
Make fewer `lookup` requests when inode type might be known

Open jamesbornholt opened this issue 2 years ago • 3 comments

Right now our lookup implementation makes two concurrent ListObjectsV2 requests. List requests are expensive, both in cost and in performance. We should think about a better strategy here, perhaps starting with a HeadObject request. If there's an existing inode, we could also use it as a hint about which request is most likely to succeed (only as a hint, because the bucket might have changed).

jamesbornholt avatar Oct 29 '22 19:10 jamesbornholt

#69 improved this by replacing one ListObjects with a HeadObject. What's left to do is to use cached state as a hint to optimize the requests here—if we already suspect something is a file (or directory), we can start with the HeadObject (or ListObjects) and only do the other request if it fails.
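A minimal sketch of that hint-driven ordering (hypothetical types, not Mountpoint's actual code): order the two S3 requests so the one the cached inode kind suggests goes first, falling back to the other only if it fails.

```rust
// Hypothetical sketch: choose which S3 request to try first based on what
// we previously believed this inode to be. Names are illustrative.
#[derive(Clone, Copy, PartialEq, Debug)]
enum InodeKind {
    File,      // previously resolved via HeadObject
    Directory, // previously resolved via ListObjectsV2
}

#[derive(Clone, Copy, PartialEq, Debug)]
enum S3Request {
    HeadObject,
    ListObjectsV2,
}

/// Order the lookup requests so the one most likely to succeed goes first.
/// The cached kind is only a hint: if the first request misses, the caller
/// still issues the second one, because the bucket may have changed.
fn lookup_order(hint: Option<InodeKind>) -> [S3Request; 2] {
    match hint {
        Some(InodeKind::Directory) => [S3Request::ListObjectsV2, S3Request::HeadObject],
        // A file hint, or no hint at all, starts with the cheaper HeadObject.
        _ => [S3Request::HeadObject, S3Request::ListObjectsV2],
    }
}
```

Here the win is that a correct hint makes the common case a single request instead of two concurrent ones.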

jamesbornholt avatar Feb 07 '23 01:02 jamesbornholt

Not sure if this is applicable, as I'm not familiar with the internals of FUSE, but:

In mount.fuse(8) I see two options:

entry_timeout=T
       The timeout in seconds for which name lookups will be cached. The default is 1.0 second. For all the timeout options, it is possible to give fractions of a second as well (e.g. entry_timeout=2.8)
attr_timeout=T
       The timeout in seconds for which file/directory attributes are cached.  The default is 1.0 second.

Both of these sound excellent; could mount-s3 set/use them?

We have a very static object structure, so caching would be great and work very well. I did some measurements, and for one test 30-50% of the requests were list-type, which are costly both in round-trip time and in money :-)

(Thanks for a very exciting and promising project btw!)

plundra avatar Sep 06 '23 15:09 plundra

Yeah, we actually do set those timeouts, here:

https://github.com/awslabs/mountpoint-s3/blob/b632bbe9645f1f6af26ed839e791b8a34ab74b36/mountpoint-s3/src/fs.rs#L201-L216

We set them very low because we want to preserve S3's strong consistency model by default. But we know for some workloads, the bucket doesn't change very much/at all, and so we could cache that metadata much longer. We're tracking that as a roadmap item in #255.
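The trade-off described above could be sketched like this (illustrative field and function names, not Mountpoint's actual configuration): a very short kernel TTL by default so lookups re-validate against S3, and a longer one only when the user opts in to metadata caching.

```rust
use std::time::Duration;

// Hypothetical sketch of the consistency/caching trade-off. Names are
// illustrative and do not mirror Mountpoint's real config.
struct CacheConfig {
    // None: preserve S3's strong consistency with a near-zero TTL.
    // Some(ttl): the user has opted in to caching metadata for this long.
    metadata_ttl: Option<Duration>,
}

impl CacheConfig {
    /// TTL handed to the kernel in FUSE entry/attr replies.
    fn kernel_ttl(&self) -> Duration {
        match self.metadata_ttl {
            Some(ttl) => ttl,
            // Very short default so the kernel re-validates almost every
            // name lookup against S3 instead of trusting stale metadata.
            None => Duration::from_millis(100),
        }
    }
}
```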

This issue is tracking a smaller improvement we could make: if we've listed a file/directory previously, then when the cache expires we could speculate that it's still a file/directory when we try to look it up from S3 again. I think with the way that lookup works right now, that would allow us to skip some HeadObject requests, but probably not any ListObjects requests.
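The speculation above could look roughly like this (a sketch under the assumption that lookup boils down to a HeadObject probe and a ListObjectsV2 probe; the closures stand in for those S3 calls): try the request the expired cache entry suggests first, and only fall back to the other on a miss.

```rust
// Hypothetical sketch of speculative lookup after a cache entry expires.
// The closures stand in for HeadObject / ListObjectsV2 round trips.
#[derive(Debug, PartialEq)]
enum Kind {
    File,
    Directory,
}

fn speculative_lookup(
    last_known: Option<Kind>,
    head_object: impl Fn() -> Option<Kind>,  // Some(File) if the key exists
    list_objects: impl Fn() -> Option<Kind>, // Some(Directory) if the prefix exists
) -> Option<Kind> {
    match last_known {
        // We last saw a directory: a ListObjects hit lets us skip the
        // HeadObject request entirely.
        Some(Kind::Directory) => list_objects().or_else(head_object),
        // A file hint (or no hint) starts with HeadObject.
        _ => head_object().or_else(list_objects),
    }
}
```

As the comment above notes, this mostly saves HeadObject round trips: a directory hit skips HeadObject, but resolving a directory still needs the ListObjects either way.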

jamesbornholt avatar Sep 06 '23 16:09 jamesbornholt