
Include (un)changed file stats in --stats output

Open Arcovion opened this issue 6 years ago • 2 comments

It would be nice to see the total number of files cache hits in the --stats output, so you could tell at a glance how many files were (not) changed. Currently there is no indication of whether the files cache is active unless you use --list and see that files are marked as unchanged. When --no-files-cache is used, it could show something like "disabled" instead of 0, again giving more insight into what's going on behind the scenes.

...
Duration: 10 minutes 48.32 seconds
Number of files: 154658
File cache hits: 154658
Utilization of max. archive size: 0%
...

Would mean 0 changed files.

Arcovion, Feb 15 '19 03:02

cache hits != changed files. the cache lookup is via the file name, so all files that have been seen before are a hit.

what you want is that it tells the number of files with potentially changed content.
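To illustrate the difference between a cache hit and an unchanged file, here is a minimal sketch. It is a simplified model, not borg's actual code: the `Entry` class, the `classify` function and the plain dict used as a cache are hypothetical names made up for this example. It only assumes what is said above (the lookup goes by file name) plus the documented default of `--files-cache`, which compares ctime, size and inode.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Entry:
    """Metadata remembered for a file (hypothetical, simplified)."""
    ctime: int
    size: int
    inode: int


def classify(cache: Dict[str, Entry], path: str, current: Entry,
             mode: Tuple[str, ...] = ("ctime", "size", "inode")) -> str:
    """Classify a file as new, unchanged or potentially changed."""
    known = cache.get(path)
    if known is None:
        # Cache miss: the path has never been seen before.
        return "new"
    # Cache hit: the path is known, but that alone says nothing about
    # the content. Only the metadata fields selected by `mode` decide.
    for field in mode:
        if getattr(known, field) != getattr(current, field):
            return "potentially changed"
    return "unchanged"


cache = {"docs/a.txt": Entry(ctime=100, size=10, inode=1)}
print(classify(cache, "docs/a.txt", Entry(100, 10, 1)))  # hit, same metadata
print(classify(cache, "docs/a.txt", Entry(200, 10, 1)))  # hit, ctime differs
print(classify(cache, "docs/b.txt", Entry(100, 10, 2)))  # miss
```

Both of the first two calls are cache hits, yet only the first one would count as unchanged, which is why a plain hit counter would not answer the original question.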

ThomasWaldmann, Feb 15 '19 06:02

Strongly in favor of this!

> what you want is that it tells the number of files with potentially changed content.

Also only half true, since the meaning of "potentially changed" depends on the actual --files-cache setting and on how well the reader understands it.

This is actually part two of the "making the files cache more transparent" thoughts I currently have.

I have been using borgbackup for a year now, and the existence of the files cache and the way borgbackup detects a change went unnoticed. Adding a stats entry would make it more obvious what is going on.

The name of the metric should make clear to the reader that these files were not examined in depth for a content change. In the end it is not only a performance issue but also a data safety issue.

Ideas for the metric name:

  • skipped files (via files cache)
  • seen before (in files cache)

or

  • chunked files (not in files cache)
  • files with detected modification

cruftex, Apr 03 '19 08:04

Current master branch (borg2):

% export BORG_REPO=/tmp/b2
% borg rcreate -e none

% borg create arch docs --stats
Repository: /tmp/b2
Archive name: arch
Archive fingerprint: c78bab78f32b9fc0b6650cc9e741966d9809c7b0a64bff6f2c178487f725269f
Time (start): Thu, 2023-02-02 23:06:14 +0100
Time (end):   Thu, 2023-02-02 23:06:14 +0100
Duration: 0.12 seconds
Number of files: 458
Original size: 15.73 MB
Deduplicated size: 14.63 MB
Time spent in hashing: 0.01 seconds
Time spent in chunking: 0.05 seconds
*** Added files: 458
*** Unchanged files: 0
*** Modified files: 0
*** Error files: 0
Bytes read from remote: 0
Bytes sent to remote: 0

% borg create arch2 docs --stats
Repository: /tmp/b2
Archive name: arch2
Archive fingerprint: 06b2e87642407f79dd4b1d2e9a51454d35ca41d21cdd93456acc140baf3df2a6
Time (start): Thu, 2023-02-02 23:06:37 +0100
Time (end):   Thu, 2023-02-02 23:06:37 +0100
Duration: 0.03 seconds
Number of files: 458
Original size: 15.73 MB
Deduplicated size: 475 B
Time spent in hashing: 0.00 seconds
Time spent in chunking: 0.00 seconds
*** Added files: 1
*** Unchanged files: 457
*** Modified files: 0
*** Error files: 0
Bytes read from remote: 0
Bytes sent to remote: 0

So, guess we have this already.
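For scripting, the file counters shown above could be pulled out of the --stats text roughly like this. This is only a sketch: `file_counters` is a hypothetical helper, the line format is taken from the output pasted above, and the leading `***` markers are treated as optional on the assumption that they are only highlighting added in this comment.

```python
import re

# Matches lines such as "Added files: 1" or "*** Unchanged files: 457".
STAT_LINE = re.compile(
    r"^(?:\*+\s*)?(Added|Unchanged|Modified|Error) files:\s*(\d+)$"
)


def file_counters(stats_text: str) -> dict:
    """Return the per-category file counts found in --stats text."""
    counters = {}
    for line in stats_text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            counters[match.group(1).lower()] = int(match.group(2))
    return counters


# Excerpt of the second run (arch2) from the output above.
sample = """\
Number of files: 458
*** Added files: 1
*** Unchanged files: 457
*** Modified files: 0
*** Error files: 0
"""

print(file_counters(sample))
# {'added': 1, 'unchanged': 457, 'modified': 0, 'error': 0}
```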

ThomasWaldmann, Feb 02 '23 22:02