snapmergepuppy: Avoid opened files 2.0

Open rizalmart opened this issue 2 years ago • 11 comments

Avoid opened files, based on @dimkr's recommendation in the previous PR #3359.

rizalmart avatar Sep 01 '22 13:09 rizalmart

@01micko I see nothing wrong with this second revision of this PR, but I'm not sure I understand the motivation for this, as it doesn't solve #3187 but slows down snapmergepuppy a bit.

dimkr avatar Sep 04 '22 05:09 dimkr

I'll wait for @rizalmart's response as to whether to merge or not.

01micko avatar Sep 04 '22 06:09 01micko

@01micko I see nothing wrong with this second revision of this PR, but I'm not sure I understand the motivation for this, as it doesn't solve #3187 but slows down snapmergepuppy a bit.

Just a thought here: since aufs and overlayfs create the modified versions of files in the writable layer/upper folder, it is much better to avoid actively opened files when moving modified files to the save file/folder, so as to avoid I/O errors between the process and the target file. It's almost the same concern as dimkr's #3187.

rizalmart avatar Sep 04 '22 09:09 rizalmart

I think I proposed this in V1, but why not just copy the file to pup_ro1 without deleting it from pup_rw, if a process is writing to it? This way, if the file is corrupt because it was copied in some intermediate state (before all changes are flushed), it will get "fixed" in a future snapmergepuppy run.

#3187 happens not because files are copied while a process is writing to them, but because they're deleted while a process is writing to them. snapmergepuppy copies file revision x to pup_ro1 while the process is writing revision x+1, then snapmergepuppy deletes the file, the data written to the file is discarded (so revision x+1 is lost), and next time snapmergepuppy runs it doesn't have a file to copy. Copying a file while a process is writing to it can produce a corrupt file, but that corruption is only temporary if the file is kept in pup_rw and gets copied later.

dimkr avatar Sep 04 '22 09:09 dimkr
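As an illustration of the sequence dimkr describes for #3187, here is a minimal sketch that replays it on an ordinary temporary directory. It does not touch aufs, overlayfs, pup_rw or pup_ro1, and the function name `demo_lost_write` and the file names `file` and `saved` are made up for the demo.

```shell
#!/bin/sh
# Illustration only: runs on a plain directory, not on a real pup_rw layer.
# demo_lost_write, "file" and "saved" are hypothetical names.
demo_lost_write() {
    dir=$1
    exec 3>"$dir/file"              # a "process" opens the file for writing
    echo "revision x" >&3           # ...and writes revision x
    cp "$dir/file" "$dir/saved"     # the merge copies revision x
    rm "$dir/file"                  # ...and then deletes the original
    echo "revision x+1" >&3         # the write succeeds, but into an unlinked file
    exec 3>&-                       # once the descriptor closes, that data is gone
    echo "saved copy: $(cat "$dir/saved")"
    if [ -e "$dir/file" ]; then
        echo "original: still present"
    else
        echo "original: gone"
    fi
}

demo_lost_write "$(mktemp -d)"
```

The saved copy ends up holding only revision x, and there is no file left for a later run to pick up, which is the loss described above. Leaving out the `rm` line keeps the file in place, so a later copy would pick up revision x+1.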

but why not just copy the file to pup_ro1 without deleting it from pup_rw, if a process is writing to it? This way, if the file is corrupt because it was copied in some intermediate state (before all changes are flushed), it will get "fixed" in a future snapmergepuppy run.

The problem with this approach is that it will fill the ramdisk space quite easily.

rizalmart avatar Sep 04 '22 11:09 rizalmart

#3187 happens not because files are copied while a process is writing to them, but because they're deleted while a process is writing to them. snapmergepuppy copies file revision x to pup_ro1 while the process is writing revision x+1, then snapmergepuppy deletes the file, the data written to the file is discarded (so revision x+1 is lost), and next time snapmergepuppy runs it doesn't have a file to copy. Copying a file while a process is writing to it can produce a corrupt file, but that corruption is only temporary if the file is kept in pup_rw and gets copied later.

That's why a file that is currently opened by a process should be skipped in the first place when copying files to the save file.

rizalmart avatar Sep 04 '22 11:09 rizalmart

The problem with this approach is that it will fill the ramdisk space quite easily.

I'm not sure, because the ramdisk already holds all files opened for writing. And you don't save memory by deleting files if they're currently open: they can't be re-opened, but they remain there (although they're inaccessible) until all currently running processes close them.

dimkr avatar Sep 04 '22 11:09 dimkr

I'm not sure, because the ramdisk already holds all files opened for writing.

Probably not: aufs and overlayfs only keep modified/altered/deleted files on the writable layer/upperdir.

And you don't save memory by deleting files if they're currently open: they can't be re-opened, but they remain there (although they're inaccessible) until all currently running processes close them.

What about inert added/modified files which are not opened by any process? They consume ramdisk space.

rizalmart avatar Sep 04 '22 11:09 rizalmart

What about inert added/modified files which are not opened by any process? They consume ramdisk space.

They won't appear in the output of lsof, so you'll just copy and delete them.

EDIT: in case my proposal wasn't clear, I propose to copy and delete files that don't appear in lsof output, and copy the files that do appear there without deleting them.

dimkr avatar Sep 04 '22 11:09 dimkr
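dimkr's proposal can be sketched roughly as follows. This is not snapmergepuppy's actual code: `merge_layer` is a hypothetical function, and it assumes the set of open files is already available as a plain text file with one absolute path per line (the thread suggests deriving it from lsof output, which is not shown here). It also ignores whiteouts, directories, symlinks and file names containing newlines.

```shell
#!/bin/sh
# Hypothetical sketch of "copy everything, delete only what is not open".
# merge_layer SRC DST OPEN_LIST
#   SRC        writable layer (pup_rw in the thread)
#   DST        save layer (pup_ro1 in the thread)
#   OPEN_LIST  assumed text file: one absolute path of an open file per line
merge_layer() {
    src=$1 dst=$2 open_list=$3
    find "$src" -type f | while IFS= read -r f; do
        rel=${f#"$src"/}                      # path relative to the writable layer
        mkdir -p "$dst/$(dirname "$rel")"
        cp -a "$f" "$dst/$rel" || continue    # always copy; skip deletion if the copy failed
        if grep -Fxq -- "$f" "$open_list"; then
            :                                 # still open: keep it so a later run copies it again
        else
            rm -f "$f"                        # not open: safe to free the ramdisk space
        fi
    done
}
```

With this split, idle files still leave the ramdisk after each run, while files that are open stay in the writable layer and are simply copied again next time.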

They won't appear in the output of lsof, so you'll just copy and delete them.

EDIT: in case my proposal wasn't clear, I propose to copy and delete files that don't appear in lsof output, and copy the files that do appear there without deleting them.

Oh, I get it. However, the concern here is that the file is currently open. There is a chance that the file copied to the save file/folder might be corrupted, since the process is still actively using it and the copy might accidentally happen in the middle of one of its I/O operations on that file.

rizalmart avatar Sep 04 '22 11:09 rizalmart

There is a chance that the file copied to the save file/folder might be corrupted

Exactly, and if you don't delete the file from pup_rw, the next snapmergepuppy run will copy it again. If this time it's not corrupted, then this second run will fix the corruption in pup_ro1.

dimkr avatar Sep 04 '22 11:09 dimkr