optimise calls to iter_objects (repo v2)

Open ThomasWaldmann opened this issue 3 years ago • 1 comment

If a repo has only PUT2 entries (and no PUT entries any more), callers such as replay_segments should call iter_objects(..., read_data=False).

With PUT tags, the CRC32 check was only possible when reading the data (even if we didn't really need the data). With PUT2, the CRC32 check is possible without reading the data, so this can be much faster.
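To illustrate the difference, here is a minimal sketch. It uses a simplified, hypothetical entry layout, not borg's actual on-disk format: the `HEADER` struct, the tag values, and the `pack_put`, `pack_put2`, and `read_entry` helpers are all assumptions made for this example, and BLAKE2b stands in for whatever data digest the real format uses. The sketch assumes a PUT entry's CRC32 covers header plus data, while a PUT2 entry's CRC32 covers only the header plus a separate data digest.

```python
import hashlib
import io
import struct
import zlib

# Hypothetical, simplified layout (NOT borg's real segment format):
#   crc32 (4 bytes) | size, tag, key | [data digest, PUT2 only] | data
HEADER = struct.Struct("<IB32s")  # data size, tag, 32-byte key
DIGEST_SIZE = 8
TAG_PUT, TAG_PUT2 = 0, 3  # illustrative tag values


def data_digest(data: bytes) -> bytes:
    # Stand-in digest over the data only.
    return hashlib.blake2b(data, digest_size=DIGEST_SIZE).digest()


def pack_put(key: bytes, data: bytes) -> bytes:
    # PUT: the CRC32 covers header AND data.
    body = HEADER.pack(len(data), TAG_PUT, key) + data
    return struct.pack("<I", zlib.crc32(body)) + body


def pack_put2(key: bytes, data: bytes) -> bytes:
    # PUT2: the CRC32 covers header + data digest, but not the data itself.
    head = HEADER.pack(len(data), TAG_PUT2, key) + data_digest(data)
    return struct.pack("<I", zlib.crc32(head)) + head + data


def read_entry(f, read_data: bool = True):
    """Verify one entry read from f; return (key, data_bytes_read)."""
    (crc,) = struct.unpack("<I", f.read(4))
    head = f.read(HEADER.size)
    size, tag, key = HEADER.unpack(head)
    if tag == TAG_PUT:
        # The data must be read even if the caller does not want it,
        # because the CRC32 cannot be checked without it.
        data = f.read(size)
        if zlib.crc32(head + data) != crc:
            raise ValueError("PUT entry failed CRC32 check")
        return key, size
    if tag == TAG_PUT2:
        digest = f.read(DIGEST_SIZE)
        if zlib.crc32(head + digest) != crc:
            raise ValueError("PUT2 header failed CRC32 check")
        if not read_data:
            f.seek(size, io.SEEK_CUR)  # skip the data without reading it
            return key, 0
        data = f.read(size)
        if data_digest(data) != digest:
            raise ValueError("PUT2 data failed digest check")
        return key, size
    raise ValueError("unknown tag")
```

In this sketch, a header-only pass over a PUT2 entry reads a few dozen bytes regardless of the data size, whereas a PUT entry forces a read of the whole entry even when `read_data=False` is requested.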

replay_segments is called e.g. if there is:

  • an index_transaction N
  • a segments_transaction M
  • N > M
  • that means: some segment files at the end were deleted (I just did that for a repo on a disk with I/O errors; in that case it will completely read ALL the segment files).

Besides replay_segments, also check all other callers of iter_objects that do not already pass read_data=False; maybe they could.

ThomasWaldmann, Apr 11 '22 15:04

Guess we need to shift that to the N+1 release (not 2.0), because 2.0 still has to deal with old repos (which do not have PUT2 yet).

ThomasWaldmann, Jul 27 '22 21:07