Archivation seems to send the file multiple times via rsync

Open ceecko opened this issue 3 years ago • 4 comments

Describe the bug:

list_k32_plots in plot_util.py returns a list of completed k32 plots. It decides that a plot is complete by checking that file_size > 0.95 * k32_size, see https://github.com/ericaltendorf/plotman/blob/main/src/plotman/plot_util.py#L55

Since up to 5% (~5GB) may still remain to be written, Chia can still be writing to the file. The rsync process is started regardless of whether Chia is still writing. With --remove-source-files, rsync does not remove the source file if it changed during the transfer. In a simulation, I ran the rsync command 3 times in a row and got the following output:

$ rsync -P -r --bwlimit=10000 --remove-source-files fromdir/ todir/
sending incremental file list
a.txt
    107,028,047 100%    9.84MB/s    0:00:10 (xfr#1, to-chk=0/2)
rsync: read errors mapping "/tmp/rsync/fromdir/a.txt": No data available (61)
WARNING: a.txt failed verification -- update discarded (will try again).
a.txt
    107,028,059 100%    9.75MB/s    0:00:10 (xfr#2, to-chk=0/2)
ERROR: Skipping sender remove for changed file: a.txt
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1207) [sender=3.1.3]

$ rsync -P -r --bwlimit=10000 --remove-source-files fromdir/ todir/
sending incremental file list
a.txt
    107,028,059 100%    9.79MB/s    0:00:10 (xfr#1, to-chk=0/2)

$ rsync -P -r --bwlimit=10000 --remove-source-files fromdir/ todir/
sending incremental file list

It appears the file is transferred 3 times, which is quite a lot for a 108GB file: twice during the first rsync, and once more during the second rsync because the first rsync did not delete the source file.

This ticket is more about understanding why the 0.95 constant is used. Maybe there's a very good reason that I'm not aware of.

Expected behavior:

Rsync should transfer the file only once.

System setup:

  • OS: Ubuntu 20.04
  • rsync

ceecko, May 16 '21 08:05

Closing, not a bug.

ceecko, May 16 '21 10:05

Why is this not a bug? Based on your explanation it strikes me as one.

I didn't write the 0.95 but in my head it was 'the plot should be done by the time we get around to finishing the transfer'. But, then you point out the nuance about rsync detecting the change and not deleting the source. Combine this with the fact we use --partial and it becomes an extra mess.

I feel like we could add a check that the file is both sufficiently sized (not all of them are exactly the same size down to the byte, afaik) and that no process has the file open. There's work elsewhere on improving the estimation of the file size, so we could additionally tighten the threshold to something closer to 0.99, perhaps. Together it seems like we could avoid a lot more false 'the plot is done' checks.
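As a rough illustration of the combined check described above (size threshold plus "no process has the file open"), here is a minimal sketch. It is not plotman's code: the names looks_complete and is_open_by_any_process, the 108 GB expected size, and the 0.99 threshold are all assumptions made for the example, and the open-file check relies on Linux's /proc layout.

```python
import os

# Assumption: a k32 plot is roughly 108 GB (the figure quoted in this thread).
ASSUMED_K32_SIZE = 108 * 10**9


def is_open_by_any_process(path, proc_root="/proc"):
    """Return True if any visible process holds `path` open (Linux only).

    Processes owned by other users may be invisible without privileges.
    """
    target = os.path.realpath(path)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or we lack permission
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    return True
            except OSError:
                continue  # descriptor closed while we were looking
    return False


def looks_complete(path, expected_size=ASSUMED_K32_SIZE, threshold=0.99,
                   is_open=is_open_by_any_process):
    """Hypothetical check: large enough AND not open in any process."""
    big_enough = os.path.getsize(path) > threshold * expected_size
    return big_enough and not is_open(path)
```

The is_open parameter is injectable so the size logic can be exercised without touching /proc; on other platforms a different open-file check would have to be swapped in.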

I'm going to reopen this to maybe help avoid it getting lost if it is really a bug. If I'm missing something in my understanding of your explanation, let me know. @ericaltendorf, was my assumption about the intent here correct or is there more detail you can bring to this discussion?

@ceecko, thanks for raising the issue along with the detailed analysis and explanation. This saves me a lot of time figuring out the underlying issue and lets us get straight to discussing the fix.

altendky, May 16 '21 15:05

After a while I figured out that Chia renames the plot from <filename>.plot.2.tmp to <filename>.plot as the last step. This way the .plot file is always complete and no further writes occur.

If this is correct, the 0.95 check is not needed. It only helps detect corrupt files. If a file is corrupted and below 0.95, I don't expect Chia to write to it anymore, so the file is going to stay there stuck either way.
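To illustrate the idea, here is a minimal sketch of selecting finished plots purely by name, assuming the rename-as-last-step behavior described above holds. The function name list_finished_plots is hypothetical and is not part of plotman.

```python
from pathlib import Path


def list_finished_plots(directory):
    """Return files whose final suffix is .plot, sorted by path.

    Temporary files such as <filename>.plot.2.tmp end in .tmp and are skipped.
    """
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == ".plot"
    )
```

Because Path.suffix only looks at the last extension, a name like example.plot.2.tmp is excluded without any size comparison.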

ceecko, May 16 '21 18:05

Ah, ok. If that's the case for all combinations of tmp/tmp2/dst then it seems we are ok.

altendky, May 17 '21 00:05