
lib: replace xbps_file_hash_check_dictionary with transaction_file lookup

Open Duncaen opened this issue 4 years ago • 2 comments

xbps_file_hash_check_dictionary was called for every file being unpacked. Because the files list is an array, each call iterated over the whole array to find the matching file. With a lot of files this is really slow, and a lot of time was spent locking the proplib array and iterating over it.

Timing on my Ryzen 3700X system with an NVMe disk, updating texlive-fontsextra (86375 files):

time xbps-install -y texlive-fontsextra
    6m34.61s real     6m27.71s user     0m03.95s system

And with this patch:

time xbps-install -y texlive-fontsextra
    0m08.40s real     0m07.34s user     0m00.98s system
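
To illustrate the difference described above, here is a minimal sketch, not the code from this patch. It contrasts a per-file linear scan over an array with a lookup keyed by path. All names (`file_entry`, `file_index`, `lookup_linear`, `index_build`, `index_lookup`, `index_free`) are hypothetical, and the chained hash table with an FNV-1a hash is an assumption made for the example; the actual transaction_file lookup may be implemented differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one file of a package and its expected hash. */
struct file_entry {
	const char *path;
	const char *sha256;
	struct file_entry *next; /* next entry in the same hash bucket */
};

/* Old pattern: scan the whole array for every unpacked file, O(n) per call. */
static const struct file_entry *
lookup_linear(const struct file_entry *files, size_t n, const char *path)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(files[i].path, path) == 0)
			return &files[i];
	}
	return NULL;
}

/* 32-bit FNV-1a string hash (an assumption; any decent hash would do). */
static uint32_t
hash_path(const char *s)
{
	uint32_t h = 2166136261u;
	for (; *s != '\0'; s++) {
		h ^= (unsigned char)*s;
		h *= 16777619u;
	}
	return h;
}

/* Hypothetical index: buckets of entries chained through file_entry.next. */
struct file_index {
	struct file_entry **buckets;
	size_t nbuckets;
};

/* Build the index once, O(n) in total. Returns 0 on success, -1 on ENOMEM. */
static int
index_build(struct file_index *idx, struct file_entry *files, size_t n)
{
	assert(idx != NULL);
	idx->nbuckets = n > 0 ? n * 2 : 1;
	idx->buckets = calloc(idx->nbuckets, sizeof(*idx->buckets));
	if (idx->buckets == NULL)
		return -1;
	for (size_t i = 0; i < n; i++) {
		size_t b = hash_path(files[i].path) % idx->nbuckets;
		files[i].next = idx->buckets[b];
		idx->buckets[b] = &files[i];
	}
	return 0;
}

/* New pattern: hash the path and walk one short chain, O(1) on average. */
static const struct file_entry *
index_lookup(const struct file_index *idx, const char *path)
{
	size_t b = hash_path(path) % idx->nbuckets;
	for (const struct file_entry *e = idx->buckets[b]; e != NULL; e = e->next) {
		if (strcmp(e->path, path) == 0)
			return e;
	}
	return NULL;
}

/* Release the bucket array; the entries themselves are owned by the caller. */
static void
index_free(struct file_index *idx)
{
	free(idx->buckets);
	idx->buckets = NULL;
	idx->nbuckets = 0;
}
```

With 86375 files, scanning the array once per unpacked file adds up to billions of string comparisons in the worst case, whereas building an index once and doing one keyed lookup per file stays roughly linear in the number of files.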

Duncaen avatar Dec 12 '20 22:12 Duncaen

For the record, a good test case here seems to be updating papirus-icon-theme. It took 2m37s going from 20201001_1 to 20210201_1, all of it inside xbps-install, not any hooks.

I will build this locally and see how much better it is.

EDIT: I was testing on a loaded machine, so it isn't a clean test, but this PR still took 52s. Without load, 31s.

ericonr avatar Feb 10 '21 04:02 ericonr

This should greatly reduce the time needed. I'm not sure why the example with more files is faster; maybe it just has a lot more symlinks, or smaller files that need to be checksummed.

For the implementation, I think I would rename a few things. xbps_transaction_file_new does not really represent what the function actually does, and maybe it would make sense to call the structure xbps_file instead of xbps_transaction_file, as the struct could be reused for other things in the future. It might also make sense to move some of the fields, like "bool update", back into the transaction_files item structure, as some of them are not important outside the context of finding conflicts in the transaction.

Duncaen avatar Feb 10 '21 14:02 Duncaen