buildkit
buildkit copied to clipboard
allows setting volatile overlay mount option
given that buildkit is generally very IO intensive, some preliminary tests have shown that adding the volatile option could yield up to a 15% performance improvements in some cases.
ref: https://docs.kernel.org/filesystems/overlayfs.html#volatile-mount
Signed-off-by: Marcos Lilljedahl [email protected]
Bit confused. What sets the volatile option for these mounts? Additionally it looks like containerd libs have some code to intentionally remove it before mounting atm. https://github.com/containerd/containerd/blob/v1.7.22/mount/temp.go#L66
Do we capture fsync error on unmount for this? If upperdir fails to sync, what's the point where the error would appear?
What sets the
volatileoption for these mounts?
You can set this in containerd via snapshotter options.
[plugins."io.containerd.snapshotter.v1.overlayfs"]
root_path = ""
upperdir_label = false
mount_options = ["volatile"]
Additionally it looks like containerd libs have some code to intentionally remove it before mounting atm. https://github.com/containerd/containerd/blob/v1.7.22/mount/temp.go#L66
IIUC that's only for temp mounts, I see that WriteUpperDir uses those https://github.com/moby/buildkit/blob/master/util/overlay/overlay_linux.go#L115. Having said that, when running some black-box performance tests with Dagger. I was getting consistent ~+15% performance increase in build times when using the volatile option. Could it be possible that some other non-temp overlay mounts could be benefited from this setting?
I could have also been bad measure from the benchmark even though I ran the same test multiple times and took the p99 value out of all samples. I'll try re-running the tests again to see if I still get the same results.
You can set this in containerd via snapshotter options.
If we think this is safe(for buildkit's use-cases) then we should add it to all our overlay mounts, not only with containerd worker, and not requiring custom config from the user.
Still, this would need some clarification:
Do we capture fsync error on unmount for this? If upperdir fails to sync, what's the point where the error would appear?
If we think this is safe(for buildkit's use-cases) then we should add it to all our overlay mounts, not only with containerd worker, and not requiring custom config from the user.
The scope of this change is much smaller and just allows Dagger to even try this in the first place without a full fork of buildkit. All it's doing is stopping one part of code that's looking for upperdir/lowerdir in buildkit from erroring out if it sees volatile in the list of options. The change is not turning volatile on anywhere, just removing an error from happening if a mount happens to have it.
Do we capture fsync error on unmount for this? If upperdir fails to sync, what's the point where the error would appear?
From the kernel docs Marcos linked to:
If any writeback error occurs on the upperdir’s filesystem after a volatile mount takes place, all sync functions will return an error. Once this condition is reached, the filesystem will not recover, and every subsequent sync call will return an error, even if the upperdir has not experience a new error since the last sync call.
@sipsma I get that but I much more like that we implement a feature in buildkit instead of half-complete updates that only work if library is called in specific context and can easily break as there isn't anything in BuildKit actually using it. If you want to split work up into multiple PRs and create follow-up issues then that fine if properly documented this way.
From the kernel docs Marcos linked to:
Do I understand correctly from this that in order for volatile option to be safe for buildkit we need to manually call fsync before we call umount for these mounts as that is the place where we can capture the error if some writes were broken. This shouldn't affect performance as afaik umount implicitly calls fsync anyway (but wouldn't let us capture the error).