
Feature: Merging / flattening Filesystems into each other

Open bengsparks opened this issue 2 years ago • 3 comments

I have been trying to use this library for transactional filesystem operations. Specifically, when my project runs with stdlib filesystem operations, intermediate artifacts can be left behind on the physical filesystem if an error occurs.

To avoid this, I use the vfs::OverlayFS with a read-only vfs::PhysicalFS and a vfs::MemoryFS with read/write access. This is great because all modifications to the filesystem end up in the vfs::MemoryFS, i.e. all changes induced by the operation can be found there. Ultimately, these changes have to be applied to the vfs::PhysicalFS for persistence.

let phys_loc: PathBuf = ...;
let (update_fs, read_fs) = (
    vfs::MemoryFS::new(), 
    vfs::PhysicalFS::new(&phys_loc)
);
let proj_fs = vfs::OverlayFS::new(&[
    update_fs.into(), 
    read_fs.into()
]);

{
    // Transaction begins here.
    // A failure takes the error path and ends the transaction before the next call.
    // On success, the expected changes have been written into `update_fs` by virtue of `vfs::OverlayFS`.
    fsys_ops1(&proj_fs)?;
    fsys_ops2(&proj_fs)?;
    fsys_ops3(&proj_fs)?;

    // When we reach here, all operations were successful, and the
    // results should now be persisted.
    update_fs.persist_into(&read_fs)?;
}

The final method call, persist_into, is what this feature request is all about: being able to persist the contents of a memory-only filesystem into a physical one. Perhaps this could even be extended to a trait implementation, e.g.

pub trait FileSystem {
    fn persist_into(self, fs: &vfs::PhysicalFS) -> VfsResult<()>;
}

Apologies if this falls outside of the scope of this library :) I hope I have adequately described my idea for a feature request.

bengsparks, Oct 17 '22 22:10

Hi there, very interesting use case. To be honest, I am not sure the library is quite up to the task. Ideally the persist_into() would be atomic, to prevent partial updates (i.e. when encountering a full disk or when the process is terminated abruptly). I am not quite sure how to make that work. An intent-log that could be replayed in case anything goes wrong maybe?

For now I think this is probably out of scope for this project, but I'd be very interested in hearing if you develop this idea further.

AFAICT persist_into could be implemented as a standalone function. This might be a viable course of action. Let me know if you hit any snags implementing it that way.
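As a rough illustration of the standalone-function route, here is a minimal sketch using only the standard library. It is not part of the vfs crate: `StagedFiles` is a hypothetical stand-in for the contents of the in-memory layer (relative path to bytes), and the function name simply mirrors the one proposed in this issue. It is deliberately not atomic, so an error part-way through leaves earlier files in place.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the in-memory layer: relative path -> file contents.
pub type StagedFiles = BTreeMap<PathBuf, Vec<u8>>;

/// Write every staged file below `target_root`, creating parent directories as needed.
/// Not atomic: if a write fails, files written earlier remain on disk.
pub fn persist_into(staged: &StagedFiles, target_root: &Path) -> io::Result<()> {
    for (rel_path, contents) in staged {
        // `join` would discard `target_root` for an absolute path, so reject those.
        if rel_path.is_absolute() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "staged paths must be relative",
            ));
        }
        let dest = target_root.join(rel_path);
        if let Some(parent) = dest.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&dest, contents)?;
    }
    Ok(())
}
```

With the vfs crate itself, the same loop would presumably walk the memory layer and copy each file across, but that part is left out here because it depends on details not discussed in this thread.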

manuel-woelker, Oct 20 '22 17:10

Yes, the obstacle I ran into when mulling this over was that, should the persistence operation fail, the original FS state must be restored:

Should creating a file fail, nothing needs to be undone for that file itself, but a rollback must occur: files created earlier in the same operation would have to be removed. This sounds relatively simple.

However, restoring the state of files that have been deleted or updated is not so simple. In order to restore files, their original contents must exist in memory, effectively forcing the contents of the PhysicalFS into memory, which is precisely what I have been trying to avoid.
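The simpler creation-only case described above can be sketched roughly as follows, using only the standard library. The function name `create_all_or_nothing` and the helper `rollback` are hypothetical, and the sketch assumes every path is meant to be a new file. It does not attempt the harder case of restoring deleted or updated files.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::PathBuf;

/// Best-effort removal of the files created so far, newest first.
fn rollback(created: &[&PathBuf]) {
    for path in created.iter().rev() {
        let _ = fs::remove_file(path);
    }
}

/// Create each `(path, contents)` pair as a new file. On the first failure,
/// remove the files created so far and return the error.
/// `create_new` refuses to touch files that already exist.
pub fn create_all_or_nothing(files: &[(PathBuf, Vec<u8>)]) -> io::Result<()> {
    let mut created: Vec<&PathBuf> = Vec::new();
    for (path, contents) in files {
        let mut file = match OpenOptions::new().write(true).create_new(true).open(path) {
            Ok(file) => file,
            Err(err) => {
                rollback(&created);
                return Err(err);
            }
        };
        // The file now exists, so it must be cleaned up if anything later fails.
        created.push(path);
        let written = file.write_all(contents);
        drop(file); // close the handle before any removal
        if let Err(err) = written {
            rollback(&created);
            return Err(err);
        }
    }
    Ok(())
}
```

Note that the rollback itself is only best effort: if the process is killed mid-way, the already created files stay behind.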

bengsparks, Oct 23 '22 12:10

Could the merge tool from the overlayfs-tools project help here?

They apparently use mv -T upper_dir lower_dir. Since mv is based on the rename() syscall (which is atomic unless the system crashes) it might be worth looking into the rename() syscall for merging the filesystems?
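For a single file, the rename idea might look like the sketch below, which relies only on `std::fs::rename` from the standard library. The function name `replace_via_rename` and the `.tmp` suffix are hypothetical choices for illustration. The atomic-replace behaviour assumes a POSIX-like platform with the temporary file on the same filesystem as the destination. This covers one file at a time, not a whole directory merge like `mv -T`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Replace `dest` with `contents` by writing a sibling temporary file and
/// renaming it over `dest`, so readers see either the old or the new file.
pub fn replace_via_rename(dest: &Path, contents: &[u8]) -> io::Result<()> {
    let mut tmp_name = dest
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "destination has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp"); // hypothetical suffix; a real tool would pick a unique name
    let tmp = dest.with_file_name(tmp_name);

    // Write the new contents next to the destination, on the same filesystem.
    fs::write(&tmp, contents)?;

    // Swap the temporary file into place; clean it up if the rename fails.
    if let Err(err) = fs::rename(&tmp, dest) {
        let _ = fs::remove_file(&tmp);
        return Err(err);
    }
    Ok(())
}
```

If the write to the temporary file fails, the destination is never touched, which sidesteps the need to keep the original contents in memory for that file.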

DrRuhe, Oct 25 '22 14:10