When generating multiple checksums, read the files only once and compute all checksums simultaneously to speed up checksumming.
Say I tell Clonezilla to generate md5sum, sha256sum and b2sum checksums of the individual files on a drive containing 2 partitions. As of clonezilla-live-3.1.1-27-amd64.iso, this is how generating multiple checksums in Clonezilla works:
- Clone Partition 1.
- Generate md5sum of files in partition 1 by reading the files.
- Generate sha256sum of files in partition 1 by reading the files AGAIN.
- Generate b2sum of files in partition 1 by reading the files AGAIN.
- Clone Partition 2.
- Generate md5sum of files in partition 2 by reading the files.
- Generate sha256sum of files in partition 2 by reading the files AGAIN.
- Generate b2sum of files in partition 2 by reading the files AGAIN.
Clonezilla reads the same files three times, in steps 2, 3, 4 and again in steps 6, 7, 8.
This significantly increases wear on the disk.
The disk could fail during step 2, 3 or 4, in which case steps 5-8 would never run at all.
The way it should work is...
- Clone Partition 1.
- Clone Partition 2.
- Generate md5sum, sha256sum and b2sum of files in partition 1 simultaneously.
- Generate md5sum, sha256sum and b2sum of files in partition 2 simultaneously.
Why not just read the files once and generate multiple checksums simultaneously using cat and tee?
First, generate a list of the files (with full paths) in a given partition and store it in /tmp/list_of_files_in_dev_sda1.txt
I'm not a shell expert, but something like this?
$ while IFS= read -r f ; do
    # Read each file once; tee feeds the same bytes to all three hashers.
    cat -- "$f" | tee >(md5sum >> /tmp/md5sum_of_files_in_dev_sda1.txt) >(sha256sum >> /tmp/sha256sum_of_files_in_dev_sda1.txt) | b2sum >> /tmp/b2sum_of_files_in_dev_sda1.txt
  done < /tmp/list_of_files_in_dev_sda1.txt
The above command will not write the names of the files into the checksum lists (each hash is followed by "-" because the data arrives on stdin), but I'm sure there's a way in shell to get the file names into the lists as well.
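One possible way to get the file names into the lists while still reading each file only once is sketched below. It assumes a POSIX shell with GNU coreutils (md5sum, sha256sum, b2sum, tee, mkfifo). The function name multi_sum, the FIFO names and the output file names are illustrative choices, not how Clonezilla does it, and file names containing newlines or backslashes are not handled.

```shell
#!/bin/sh
# multi_sum FILE OUTDIR  (hypothetical helper, not Clonezilla code)
# Reads FILE once and appends a "HASH  FILE" line to OUTDIR/MD5SUMS,
# OUTDIR/SHA256SUMS and OUTDIR/B2SUMS.
multi_sum() {
  file=$1
  outdir=$2
  [ -r "$file" ] || return 1
  tmp=$(mktemp -d) || return 1
  mkfifo "$tmp/md5.fifo" "$tmp/sha256.fifo" || return 1

  # Two hashers run in the background, each reading from its own FIFO.
  md5sum < "$tmp/md5.fifo" > "$tmp/md5" &
  pid_md5=$!
  sha256sum < "$tmp/sha256.fifo" > "$tmp/sha256" &
  pid_sha256=$!

  # The only read of FILE: tee copies the bytes into both FIFOs, and its
  # stdout goes to b2sum, so no file content reaches the terminal.
  tee "$tmp/md5.fifo" "$tmp/sha256.fifo" < "$file" | b2sum > "$tmp/b2"

  # Wait for the background hashers before using their output.
  wait "$pid_md5" "$pid_sha256"

  # The hashers read stdin, so they printed "HASH  -"; keep the hash field
  # and pair it with the real file name.
  printf '%s  %s\n' "$(cut -d' ' -f1 "$tmp/md5")" "$file" >> "$outdir/MD5SUMS"
  printf '%s  %s\n' "$(cut -d' ' -f1 "$tmp/sha256")" "$file" >> "$outdir/SHA256SUMS"
  printf '%s  %s\n' "$(cut -d' ' -f1 "$tmp/b2")" "$file" >> "$outdir/B2SUMS"
  rm -rf "$tmp"
}
```

With the file list from above, this could be driven by a loop such as: while IFS= read -r f ; do multi_sum "$f" /tmp ; done < /tmp/list_of_files_in_dev_sda1.txt. FIFOs plus an explicit wait are used here instead of >( ) process substitution so that the hashes are guaranteed to be complete before they are appended.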
This would significantly speed up checksumming and the overall cloning process and most importantly minimize wear on the disk.
Also related #126
Thanks!
Thanks for this idea. However, I believe you really only need to choose one of the checksum methods, and I suggest that b2sum is good enough. Of course, this can be improved. We will try to do that in the future.
Steven
Thanks for your suggestion. This feature has been implemented in Clonezilla live >= 3.2.0-27 or 20241213-*: https://clonezilla.org/downloads.php Let us know the results if you test that. Thanks.
Steven
Tested clonezilla-live-3.2.0-32-amd64.iso.
I selected md5sum and b2sum in expert mode.
After the cloning process completed successfully, it started catting files (including binaries!) into the terminal.
Please fix. Thanks!
Could you please show the files list in your image dir by running: ls -lh /home/image/IMAGE (replace IMAGE with your image name). Thanks.
Steven
$ ls -lh
total 463G
-rwxrwxrwx+ 1 Administrators Administrators 979 Jan 12 14:08 B2SUMS
-rwxrwxrwx+ 1 Administrators Administrators 1.3K Jan 12 14:08 blkdev.list
-rwxrwxrwx+ 1 Administrators Administrators 943 Jan 12 14:08 blkid.list
-rwxrwxrwx+ 1 Administrators Administrators 222 Jan 12 12:03 dev-fs.list
-rwxrwxrwx+ 1 Administrators Administrators 4 Jan 12 14:08 disk
-rwxrwxrwx+ 1 Administrators Administrators 13 Jan 12 14:08 dmraid.table
-rwxrwxrwx+ 1 Administrators Administrators 307 Jan 12 14:08 MD5SUMS
-rwxrwxrwx+ 1 Administrators Administrators 20 Jan 12 14:08 parts
-rwxrwxrwx+ 1 Administrators Administrators 33 Jan 12 09:39 sda1.info
-rwxrwxrwx+ 1 Administrators Administrators 26M Jan 12 09:39 sda1.ntfs-ptcl-img.uncomp
-rwxrwxrwx+ 1 Administrators Administrators 48G Jan 12 10:06 sda2.ntfs-ptcl-img.uncomp
-rwxrwxrwx+ 1 Administrators Administrators 208G Jan 12 12:03 sda3.ntfs-ptcl-img.uncomp
-rwxrwxrwx+ 1 Administrators Administrators 512 Jan 12 14:08 sda4-ebr
-rwxrwxrwx+ 1 Administrators Administrators 208G Jan 12 14:08 sda5.ntfs-ptcl-img.uncomp
-rwxrwxrwx+ 1 Administrators Administrators 37 Jan 12 14:08 sda-chs.sf
-rwxrwxrwx+ 1 Administrators Administrators 1.0M Jan 12 14:08 sda-hidden-data-after-mbr
-rwxrwxrwx+ 1 Administrators Administrators 512 Jan 12 14:08 sda-mbr
-rwxrwxrwx+ 1 Administrators Administrators 535 Jan 12 14:08 sda-pt.parted
-rwxrwxrwx+ 1 Administrators Administrators 458 Jan 12 14:08 sda-pt.parted.compact
-rwxrwxrwx+ 1 Administrators Administrators 381 Jan 12 14:08 sda-pt.sf
Also
$ cat B2SUMS
bXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX2 blkdev.list
2XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX67 blkid.list
4XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX21 dev-fs.list
7XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXf3 disk
9XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX9 dmraid.table
8XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8f parts
6XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXff sda1.info
$ cat MD5SUMS
8XXXXXXXXXXXXXXXXXXXXXXXXX6 blkdev.list
aXXXXXXXXXXXXXXXXXXXXXXXXX1 blkid.list
3XXXXXXXXXXXXXXXXXXXXXXXXX5 dev-fs.list
8XXXXXXXXXXXXXXXXXXXXXXXXXb disk
fXXXXXXXXXXXXXXXXXXXXXXXXX6 dmraid.table
kXXXXXXXXXXXXXXXXXXXXXXXXX5 parts
1XXXXXXXXXXXXXXXXXXXXXXXXX7a sda1.info
Both checksum files stop at sda1.info, so the binary catting appears to have happened while reading the *uncomp files; that's when I panicked and powered off the computer.
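For reference, this kind of symptom can arise in any tee-based pipeline: tee always copies its input to its own stdout in addition to the files (or FIFOs) named as arguments, so if nothing consumes or discards that copy, the raw content is printed to the terminal. Whether that is what happened in 3.2.0-32 is only a guess, not a confirmed diagnosis. The sketch below shows the difference, using hypothetical function names and a regular file standing in for whatever feeds a hasher.

```shell
#!/bin/sh
# Hypothetical demo functions, not Clonezilla code.
# SINK stands for whatever feeds a hasher (a regular file here, a FIFO in practice).

# tee_leaky INPUT SINK: tee's stdout is left alone, so INPUT is also echoed
# to wherever the caller's stdout points (for example the terminal).
tee_leaky() {
  tee "$2" < "$1"
}

# tee_quiet INPUT SINK: same copy into SINK, but tee's stdout is discarded.
tee_quiet() {
  tee "$2" < "$1" > /dev/null
}
```

In both cases SINK receives an identical copy of INPUT; only tee_leaky also writes the content to stdout.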
Please give Clonezilla live >= 3.2.0-33 or 20250114-* a try: https://clonezilla.org/downloads.php This issue should have been fixed. If you test it, please let us know the results. Thanks.
Steven