
When generating multiple checksums, read files only once and generate the checksums simultaneously to speed up checksumming.

Open barkoder opened this issue 1 year ago • 6 comments

Say I tell Clonezilla to generate md5, sha256 and b2sum checksums of the individual files on a drive containing 2 partitions. As of clonezilla-live-3.1.1-27-amd64.iso, this is how generating multiple checksums in Clonezilla works.

  1. Clone Partition 1.
  2. Generate md5sum of files in partition 1 by reading the files.
  3. Generate sha256sum of files in partition 1 by reading the files AGAIN.
  4. Generate b2sum of files in partition 1 by reading the files AGAIN.
  5. Clone Partition 2.
  6. Generate md5sum of files in partition 2 by reading the files.
  7. Generate sha256sum of files in partition 2 by reading the files AGAIN.
  8. Generate b2sum of files in partition 2 by reading the files AGAIN.

Clonezilla is reading the same files three times in steps 2, 3, 4 and again in steps 6, 7, 8.

This significantly increases wear on the disk.

The disk could fail at step 2, 3 or 4, causing steps 5-8 to fail altogether.

The way it should work is...

  1. Clone Partition 1.
  2. Clone Partition 2.
  3. Generate md5sum, sha256sum, b2sum of files in partition 1 simultaneously.
  4. Generate md5sum, sha256sum, b2sum of files in partition 2 simultaneously.

Why not just read the files once and generate multiple checksums simultaneously using cat and tee?

First, generate a list of the files in a given partition, with full paths, and store it in /tmp/list_of_files_in_dev_sda1.txt

I'm not a shell expert, but something like this?

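# Read the list line by line, so paths are neither word-split nor glob-expanded.
while IFS= read -r f; do
	# tee copies each file's bytes to md5sum and sha256sum; b2sum reads the same stream.
	cat "$f" | tee >(md5sum >> /tmp/md5sum_of_files_in_dev_sda1.txt) >(sha256sum >> /tmp/sha256sum_of_files_in_dev_sda1.txt) | b2sum >> /tmp/b2sum_of_files_in_dev_sda1.txt
done < /tmp/list_of_files_in_dev_sda1.txt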

The above command will not write the names of the files themselves into the checksum lists. But I'm sure there's a way in shell to get the file names into the lists as well.

This would significantly speed up checksumming and the overall cloning process and most importantly minimize wear on the disk.

Also related #126

Thanks!

barkoder avatar Oct 24 '24 19:10 barkoder

Thanks for this idea. However, I believe you should really just choose one of the checksum methods; I suggest that b2sum is good enough. Of course, this can be improved. We will try to do that in the future.

Steven

stevenshiau avatar Dec 08 '24 12:12 stevenshiau

Thanks for your suggestion. This feature has been implemented in Clonezilla live >= 3.2.0-27 or 20241213-*: https://clonezilla.org/downloads.php Let us know the results if you test that. Thanks.

Steven

stevenshiau avatar Dec 14 '24 01:12 stevenshiau

Tested clonezilla-live-3.2.0-32-amd64.iso.

I selected md5sum and b2sum in expert mode. After successful completion of the cloning process, it started catting files (including binaries!) into the terminal.

Please fix. Thanks!

barkoder avatar Jan 12 '25 18:01 barkoder

Could you please show the file list in your image dir by running: ls -lh /home/image/IMAGE (replace IMAGE with your image name). Thanks.

Steven

stevenshiau avatar Jan 12 '25 23:01 stevenshiau

$ ls -lh

total 463G
-rwxrwxrwx+ 1 Administrators Administrators  979 Jan 12 14:08 B2SUMS
-rwxrwxrwx+ 1 Administrators Administrators 1.3K Jan 12 14:08 blkdev.list
-rwxrwxrwx+ 1 Administrators Administrators  943 Jan 12 14:08 blkid.list
-rwxrwxrwx+ 1 Administrators Administrators  222 Jan 12 12:03 dev-fs.list
-rwxrwxrwx+ 1 Administrators Administrators    4 Jan 12 14:08 disk
-rwxrwxrwx+ 1 Administrators Administrators   13 Jan 12 14:08 dmraid.table
-rwxrwxrwx+ 1 Administrators Administrators  307 Jan 12 14:08 MD5SUMS
-rwxrwxrwx+ 1 Administrators Administrators   20 Jan 12 14:08 parts
-rwxrwxrwx+ 1 Administrators Administrators   33 Jan 12 09:39 sda1.info
-rwxrwxrwx+ 1 Administrators Administrators  26M Jan 12 09:39 sda1.ntfs-ptcl-img.uncomp
-rwxrwxrwx+ 1 Administrators Administrators  48G Jan 12 10:06 sda2.ntfs-ptcl-img.uncomp
-rwxrwxrwx+ 1 Administrators Administrators 208G Jan 12 12:03 sda3.ntfs-ptcl-img.uncomp
-rwxrwxrwx+ 1 Administrators Administrators  512 Jan 12 14:08 sda4-ebr
-rwxrwxrwx+ 1 Administrators Administrators 208G Jan 12 14:08 sda5.ntfs-ptcl-img.uncomp
-rwxrwxrwx+ 1 Administrators Administrators   37 Jan 12 14:08 sda-chs.sf
-rwxrwxrwx+ 1 Administrators Administrators 1.0M Jan 12 14:08 sda-hidden-data-after-mbr
-rwxrwxrwx+ 1 Administrators Administrators  512 Jan 12 14:08 sda-mbr
-rwxrwxrwx+ 1 Administrators Administrators  535 Jan 12 14:08 sda-pt.parted
-rwxrwxrwx+ 1 Administrators Administrators  458 Jan 12 14:08 sda-pt.parted.compact
-rwxrwxrwx+ 1 Administrators Administrators  381 Jan 12 14:08 sda-pt.sf

Also

$ cat B2SUMS

bXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX2  blkdev.list
2XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX67  blkid.list
4XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX21  dev-fs.list
7XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXf3  disk
9XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX9  dmraid.table
8XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8f  parts
6XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXff  sda1.info

$ cat MD5SUMS

8XXXXXXXXXXXXXXXXXXXXXXXXX6  blkdev.list
aXXXXXXXXXXXXXXXXXXXXXXXXX1  blkid.list
3XXXXXXXXXXXXXXXXXXXXXXXXX5  dev-fs.list
8XXXXXXXXXXXXXXXXXXXXXXXXXb  disk
fXXXXXXXXXXXXXXXXXXXXXXXXX6  dmraid.table
kXXXXXXXXXXXXXXXXXXXXXXXXX5  parts
1XXXXXXXXXXXXXXXXXXXXXXXXX7a  sda1.info

The binary catting appears to have happened while reading the *uncomp files, because that's when I panicked and powered off the computer.

barkoder avatar Jan 13 '25 12:01 barkoder
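The B2SUMS and MD5SUMS listings above use the usual "digest, two spaces, file name" layout, so they can presumably be checked with the standard tools' `-c` mode. A minimal sketch, assuming that layout holds for the whole file; `verify_sums` is a hypothetical helper name, not a Clonezilla command:

```bash
# verify_sums <image-dir>: check every checksum list that exists in the
# directory against the files next to it. Returns non-zero on any mismatch.
verify_sums() {
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$1" || exit 1
        status=0
        for pair in md5sum:MD5SUMS b2sum:B2SUMS; do
            tool=${pair%%:*}   # text before the colon: the checksum program
            list=${pair##*:}   # text after the colon: the list file name
            if [ -f "$list" ]; then
                # --quiet prints only the files that fail verification
                "$tool" -c --quiet "$list" || status=1
            fi
        done
        exit "$status"
    )
}

# Usage (IMAGE is a placeholder for the image name):
#   verify_sums /home/image/IMAGE
```

Only files named in the lists are checked. In the listings above the lists stop at sda1.info, so the partition images would not be covered by such a check.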

Please give testing Clonezilla live >= 3.2.0-33 or 20250114-* a try: https://clonezilla.org/downloads.php This issue should have been fixed. If you test, please let us know the results. Thanks.

Steven

stevenshiau avatar Jan 15 '25 10:01 stevenshiau