avoid unnecessary fsync when sync'ing extents with server
In unifyfs_sync_extents, the client fsyncs all data to disk before it calls the rpc to sync extents with the server:
https://github.com/LLNL/UnifyFS/blob/fddedf29335cf3359c51be47017616a61a1f51cd/client/src/unifyfs-fixed.c#L192
We sync data to disk by calling the logio_sync call here:
https://github.com/LLNL/UnifyFS/blob/fddedf29335cf3359c51be47017616a61a1f51cd/client/src/unifyfs-fixed.c#L226
Then we tell the server to copy extents from the client extent buffer with an rpc call here:
https://github.com/LLNL/UnifyFS/blob/fddedf29335cf3359c51be47017616a61a1f51cd/client/src/unifyfs-fixed.c#L235
There are two main reasons we sync extents with the server:
- the application has written enough extents that we've exhausted the extent buffer, in which case, we flush that buffer to the server so that the client can handle more writes
- the application has explicitly called something like fsync() or fflush()
In case 1), when we're using a POSIX filesystem as our backing store, which will be the case for tmpfs and the file system on most SSDs, then we can avoid syncing data to disk completely. POSIX ensures that the server process will read the correct data even though it was written by another process, without requiring an fsync.
I think we could similarly avoid fsync'ing data to disk on calls to close(). We currently sync on close, but that's mainly to send extents to the server, not really to store data to disk.
Even in case 2), we could likely sync data to disk after we call the rpc to expose the extents to the server. That might allow for some overlap in letting the server fetch the extents while data is being sync'd to disk so that those two operations can happen in parallel. It may require adding a second rpc to actually allow overlap.
The fix for case 1) to avoid calling fsync at all should speed up UnifyFS performance for applications that write lots of extents and rarely explicitly call fsync.
To support this, we likely need to add another argument to unifyfs_sync_extents that tells the function whether an fsync is needed. Then all callers need to be updated to add that argument in their call. The implementation could use the new unifyfs client library, which has split those into separate calls.
Addressed by #651?
@MichaelBrim , here is the write up on what I was thinking. This is saying the client only needs to call fsync on logio when the user explicitly calls fsync (or something equivalent). If the user does not call fsync, then we can avoid calling fsync in the client. This should help in apps that write lots of data and/or extents but never call fsync.
Additionally, when we do have to call fsync, we could flush the extents to the server in parallel with the fsync. That would be a smaller improvement, and it would likely require us to split the extent flush into two parts. That is, change what we had:
fsync logio
rpc to tell server to copy extents
to something like:
rpc to tell server to start copying extents
fsync logio (while server is copying extents in background)
wait for server to tell us it's done copying the extents
Or we could spawn a thread on the client to do the fsync call or something.
The other item we might consider here is to provide a setting that lets a user disable our fsync of logio. That would correspond to the user saying something like "even though my app calls fsync(), please act like it didn't".
That would be an easy way for a user to get improved performance if they happen to be running an app that has a bunch of fsync calls, but they don't actually care about data being sync'd to disk. Flip a switch to get faster performance without having to modify the code and rebuild the app.
I can easily add the user configuration option. Splitting the sync-extents code into two parts is a little more difficult given the current abstraction at the fid layer. However, it's fairly easy to have the logio fsync() run in a ULT to make sure that doesn't slow the client.
Thanks, @MichaelBrim . I think coding the logio fsync to happen in the background with the extent flush will be much less significant performance-wise than simply avoiding the logio fsync when it's not required. We could tackle the background fsync as a separate item.
@adammoody, do you know if this was resolved?
Yes, I think this is resolved. We can close it out.