go-nfs icon indicating copy to clipboard operation
go-nfs copied to clipboard

Make the cookie verifier stable across multiple readdirplus calls

Open fwiesel opened this issue 3 years ago • 6 comments

At least one NFS client seems to expect the nfs verifier cookie to be stable accross more than one call as the current implemenentation does.

There are two options, the default one being using the stat call to the dir to get the modification time-stamp and use that as a verifier.

The advantage is that doesn't need to calculate any checksum, the downside is, that it breaks memfs, which returns always now() as the timestamp.

fwiesel avatar Sep 23 '22 14:09 fwiesel

This should fix #48

fwiesel avatar Sep 23 '22 14:09 fwiesel

How bad would it be to keep an LRU map of path->verifier? verifier could then just be a randomly chosen number but would be stable until either it falls out of cache, or a mutation to that path causes invalidation.

It would be great to have a verifier that always works and where we aren't worried about the introduced inefficiency.

willscott avatar Sep 23 '22 15:09 willscott

I do not see any correct and more efficient way of invalidating the cache than doing a checksum over the whole data without more support from the file-system layer. And the only option billy.FS seems to be provide seems to me the Stat call, but then we might as well use that directly.

Looking at the work in 9cea9c376f2ddeed2a0f81229c61699deb291c58, the MemFS used there does provide correct timestamps, so it might may make sense to build the timestamp based cookie code on top of it and rather do one breaking change instead of two.

fwiesel avatar Sep 27 '22 07:09 fwiesel

If this library keeps a cache, it could cache the previous read of the directory contents for the verifier. While that read stays in memory, the verifier remains valid for subsequent reads

Does that seem like a plausible path?

willscott avatar Oct 06 '22 01:10 willscott

Well, yes. The library could keep the directory contents and verifier in memory.

How to implement that greatly depends on the behavior of the client (or clients). Just to fix it for the most simple case, a LRU would definitely work.

But the size of the LRU would practically limit the number or ReadDir requests in flight to at most the size of the LRU. Practically even lower, as LRU is a very simple approach to estimate the operations in flight. Smarter eviction logic could improve the tradeoff of the memory overhead vs cache failures. At a certain failure rate, the whole thing would break down to what I am reporting, the clients wouldn't be able to read the directories, as they are invalidating the requests of "others".

While it would be a way to improve things over the billyFS interface, the question though is, how sensible it is to invest the required work on an "old" interface, considering the io.FS maps much more easily to NFSv3, and if the effort is not rather spend there on moving that forward. I am a bit unclear on what is missing for moving that forward and how desirable backward compatibility is for you.

For me, it isn't :)

fwiesel avatar Oct 06 '22 08:10 fwiesel

I believe there is a partial migration to the io.FS interface that's living in a branch: https://github.com/willscott/go-nfs/tree/feat/iofs

I think the main thing we ran into is that like billy, io.FS is an incomplete interface, especially on the mutation/metadata side of things, so there'll need to be some additional interface designed for what methods are needed there.

willscott avatar Oct 08 '22 20:10 willscott

Closing in favor of #52

fwiesel avatar Dec 05 '22 09:12 fwiesel