plotman
Archiving seems to send the file multiple times via rsync
Describe the bug
plot_util.py returns a list of completed k32 plots from list_k32_plots. It decides that a plot is complete by checking whether file_size > 0.95 * k32_size, in https://github.com/ericaltendorf/plotman/blob/main/src/plotman/plot_util.py#L55
Since up to 5% (~5GB) of the plot may remain to be written when the check passes, Chia can still be writing to the file.
The rsync process is nevertheless started, regardless of whether Chia is still writing.
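To make the race concrete, here is a minimal sketch of a size-only check like the one described above. The names `K32_SIZE_BYTES`, `THRESHOLD` and `looks_complete` are hypothetical, and the 108 GB figure is an assumed round number for illustration, not plotman's actual constant.

```python
# Hypothetical constants for illustration; not taken from plotman.
K32_SIZE_BYTES = 108 * 10**9   # assumed approximate size of a finished k32 plot
THRESHOLD = 0.95               # the fraction discussed in this issue


def looks_complete(file_size: int, k32_size: int = K32_SIZE_BYTES) -> bool:
    """Return True when the file passes the size-only check."""
    return file_size > THRESHOLD * k32_size


# A plot that is still being written, with 4% of its data missing,
# already passes the check and would be handed to rsync.
in_progress_size = int(0.96 * K32_SIZE_BYTES)
print(looks_complete(in_progress_size))
```

The check says nothing about whether a writer still has the file open, which is why rsync can start while the file is still changing.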
When rsync runs with --remove-source-files, it does not remove the source file if the file changed during the transfer.
In a simulation, I ran the rsync command 3 times in a row, which produced the following output:
$ rsync -P -r --bwlimit=10000 --remove-source-files fromdir/ todir/
sending incremental file list
a.txt
107,028,047 100% 9.84MB/s 0:00:10 (xfr#1, to-chk=0/2)
rsync: read errors mapping "/tmp/rsync/fromdir/a.txt": No data available (61)
WARNING: a.txt failed verification -- update discarded (will try again).
a.txt
107,028,059 100% 9.75MB/s 0:00:10 (xfr#2, to-chk=0/2)
ERROR: Skipping sender remove for changed file: a.txt
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1207) [sender=3.1.3]
$ rsync -P -r --bwlimit=10000 --remove-source-files fromdir/ todir/
sending incremental file list
a.txt
107,028,059 100% 9.79MB/s 0:00:10 (xfr#1, to-chk=0/2)
$ rsync -P -r --bwlimit=10000 --remove-source-files fromdir/ todir/
sending incremental file list
It appears the file is transferred 3 times, which is a lot for a 108GB file: twice during the first rsync run, and once more during the second run because the first run did not delete the source file.
This ticket is more about understanding why the 0.95 constant is being used. Maybe there's a very good reason that I'm not aware of.
Expected behavior
Rsync should transfer the file only once.
System setup:
- OS: Ubuntu 20.04
- rsync
Closing, not a bug.
Why is this not a bug? Based on your explanation it strikes me as one.
I didn't write the 0.95 check, but in my head the idea was 'the plot should be done by the time we get around to finishing the transfer'. But then you point out the nuance about rsync detecting the change and not deleting the source. Combine this with the fact that we use --partial and it becomes an extra mess.
I feel like we could add a check that the file is both sufficiently sized (not all of them are exactly the same size down to the byte, afaik) and not held open by any process. There's work elsewhere on improving the file size estimate, so we could additionally tighten the threshold to 0.99 or so. Together, these should avoid many of the false 'the plot is done' results.
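A rough sketch of that combined check, assuming Linux and using only the standard library. The function names `is_open_by_any_process` and `ready_to_archive` are hypothetical, not existing plotman functions, and the 0.99 default is only the value floated above. Scanning /proc is best-effort: file descriptors of processes owned by other users may not be readable, so a writer could be missed.

```python
import os


def is_open_by_any_process(path: str) -> bool:
    """Best-effort, Linux-only check: scan /proc/<pid>/fd for links to path."""
    target = os.path.realpath(path)
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        fd_dir = os.path.join("/proc", pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            # The process exited, or we lack permission to inspect it.
            continue
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    return True
            except OSError:
                # The descriptor was closed while we were scanning.
                continue
    return False


def ready_to_archive(path: str, expected_size: int, threshold: float = 0.99) -> bool:
    """Require both a plausible size and no process holding the file open."""
    big_enough = os.path.getsize(path) > threshold * expected_size
    return big_enough and not is_open_by_any_process(path)
```

Either condition alone can give a false positive; requiring both narrows the window in which a half-written plot is picked up.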
I'm going to reopen this to maybe help avoid it getting lost if it is really a bug. If I'm missing something in my understanding of your explanation, let me know. @ericaltendorf, was my assumption about the intent here correct or is there more detail you can bring to this discussion?
@ceecko, thanks for raising the issue along with the detailed analysis and explanation. This saves me a lot of time figuring out the underlying issue and lets us get straight to discussing the fix.
After a while I figured out that Chia renames the plot from <filename>.plot.2.tmp to <filename>.plot as the last step.
This way the .plot file is always complete and no more writes should occur.
If this is correct, the 0.95 check is not needed. It only helps detect corrupt files. And if a file is corrupt and below 0.95, I don't expect Chia to write to it anymore, so the file is going to stay there, stuck, either way.
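If the rename really is the last step, completed plots could be selected by name alone. A minimal sketch, with the hypothetical helper `list_finished_plots` (not a plotman function); it assumes the final rename is atomic, which may not hold if the move crosses filesystems and turns into a copy.

```python
from pathlib import Path
from typing import List


def list_finished_plots(directory: str) -> List[str]:
    """Return files whose name ends in .plot; *.plot.2.tmp files are skipped."""
    return sorted(
        str(p)
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == ".plot"
    )
```

A file named example.plot.2.tmp has the suffix .tmp, so it is excluded without any size comparison.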
Ah, ok. If that's the case for all combinations of tmp/tmp2/dst then it seems we are ok.