Improve `borg2 check` reporting of missing chunks
/kind enhancement
Suggestion:
Currently Borg (both Borg 1 and Borg 2 actually) reports missing chunks in an archive - file - chunk relation, like the following:
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/repo/objects/49/2e580bd7ac5842e9bfe34a382520bfd2190b08759adc52b837014a631cc4df.file: Missing file chunk detected (Byte 0-823, Chunk 6e99d76a3a8a8f7c42cd10a3c6abdc9b6ef62d2f02e6c7dd27583d7c66b81e1b).
I was thinking about whether this relation could be inverted: chunk - file - archive.
From a user's perspective, if chunks are missing, I primarily want to know which files are affected. For example, chunk 6e99d76a3a8a8f7c42cd10a3c6abdc9b6ef62d2f02e6c7dd27583d7c66b81e1b above isn't just missing in the Tuxedo-InfinityBook-14-2022-03-31T20:05:51 archive, but actually in 11 archives spanning from 2021-09-30 to 2022-07-31.
A report like the following would give me a better overview of what is going on:
(borg_venv) [root@68f3930f72fc /]# borg2 --repo=/borg2_repo/ check -v --show-rc --progress
Starting full repository check
…
Starting archive consistency check...
Checking archives 0.0%
Analyzing archive Dell-Vostro-2016-12-31T12:02:02 2016-12-31 11:02:03.396484+00:00 a8ee20e75b699e69ea0cc6d1112355c367d66a92f4ad5f5448a4da455731350e (1/135)
…
Archive consistency check complete, problems found.
The following chunks are missing in the repository:
- Chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce, 553,688 bytes
  - var/lib/flatpak/repo/objects/2c/626568776d516cd6ba4b71ff899b8748964baeb6800a3ad37fec1f0b3652be.file: Tuxedo-InfinityBook-14-2022-03-31T20:05:51, Tuxedo-InfinityBook-14-2022-04-29T10:36:23, Tuxedo-InfinityBook-14-2022-05-31T11:08:41
  - var/lib/flatpak/runtime/org.gnome.Platform/x86_64/42/00a7680bc472e0205934c70b2bf6fecfb5a085cc1792dad7958e5cd07b648a70/files/bin/gpgsm: Tuxedo-InfinityBook-14-2022-03-31T20:05:51
  - var/lib/flatpak/runtime/org.gnome.Platform/x86_64/41/07e2f19d660a4bf5520857f2eea3aa7005d0e320b2ee5ec7c02a5ea95d2123a3/files/bin/gpgsm: Tuxedo-InfinityBook-14-2022-03-31T20:05:51, Tuxedo-InfinityBook-14-2022-04-29T10:36:23
  - …
- Chunk c757e091ab9fbbd6d01d73cc0672d5ef322e21bdbbdf4953450369d2f9ea3155, 337,731 bytes
  - var/lib/flatpak/repo/objects/06/8644aaa4d3eca45d513b467982d9fd4ca39606f9eb050c74c431732da66a6b.file: Tuxedo-InfinityBook-14-2021-09-30T10:16:11, Tuxedo-InfinityBook-14-2021-10-31T15:49:04, Tuxedo-InfinityBook-14-2021-11-30T12:18:48, Tuxedo-InfinityBook-14-2021-12-31T10:50:36, Tuxedo-InfinityBook-14-2022-01-31T16:48:07, Tuxedo-InfinityBook-14-2022-02-28T10:56:58, Tuxedo-InfinityBook-14-2022-03-31T20:05:51, Tuxedo-InfinityBook-14-2022-04-29T10:36:23
  - …
- Chunk 6e99d76a3a8a8f7c42cd10a3c6abdc9b6ef62d2f02e6c7dd27583d7c66b81e1b, 823 bytes
  - …
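To illustrate the inversion I have in mind, here is a rough Python sketch that regroups the existing "Missing file chunk detected" lines into a chunk - file - archive report. It is not Borg code and doesn't touch Borg internals; `invert_missing_chunks` and `format_report` are hypothetical names. It assumes the line format quoted at the top of this issue, that archive names don't contain ": ", and that the chunk size can be derived from the reported byte range (end minus start).

```python
import re
from collections import defaultdict

# Matches lines like:
#   ARCHIVE: PATH: Missing file chunk detected (Byte START-END, Chunk ID).
# Assumption: the archive name itself does not contain ": ".
LINE_RE = re.compile(
    r"^(?P<archive>.+?): (?P<path>.+): Missing file chunk detected "
    r"\(Byte (?P<start>\d+)-(?P<end>\d+), Chunk (?P<chunk>[0-9a-f]{64})\)\.$"
)


def invert_missing_chunks(lines):
    """Group check output as chunk -> (size, {path: [archives]})."""
    sizes = {}
    report = defaultdict(lambda: defaultdict(list))
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # progress lines, "Analyzing archive ...", etc.
        chunk = m["chunk"]
        # Assumption: chunk size equals the length of the reported byte range.
        sizes[chunk] = int(m["end"]) - int(m["start"])
        archives = report[chunk][m["path"]]
        if m["archive"] not in archives:
            archives.append(m["archive"])
    return {
        chunk: (sizes[chunk], {path: list(a) for path, a in files.items()})
        for chunk, files in report.items()
    }


def format_report(inverted):
    """Render the inverted mapping in the layout suggested above."""
    out = ["The following chunks are missing in the repository:"]
    for chunk, (size, files) in inverted.items():
        out.append(f"- Chunk {chunk}, {size:,} bytes")
        for path, archives in files.items():
            out.append(f"  - {path}: {', '.join(archives)}")
    return "\n".join(out)
```

Feeding it the captured output of a check run (e.g. `print(format_report(invert_missing_chunks(open("check.log"))))`, where `check.log` is just a placeholder file name) lists each missing chunk once, with every affected path and the archives it appears in.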
Background:
I'm suggesting this now because Borg 2 no longer assumes a "hard link master", which causes borg2 check to report the same missing chunk over and over again if that chunk happens to be missing in a file with many hard links. Borg 1.4, on the other hand, reported missing chunks for the "hard link master" only, i.e., once.
So, borg2 check repeating the same missing chunk for all hard linked files is expected behaviour. It's not a bug; borg2 check is working perfectly fine. I don't want to suggest changing hard link handling. What I want to discuss is whether borg2 check should report it this way (i.e., this issue is about changing the UI only, not the backend).
However, the repeated reports of the same chunk make it very hard to get an idea of what's going on:
(borg_venv) [root@68f3930f72fc /]# borg2 --repo=/borg2_repo/ check -v --show-rc --progress
Starting full repository check
…
Starting archive consistency check...
Checking archives 0.0%
Analyzing archive Dell-Vostro-2016-12-31T12:02:02 2016-12-31 11:02:03.396484+00:00 a8ee20e75b699e69ea0cc6d1112355c367d66a92f4ad5f5448a4da455731350e (1/135)
…
Checking archives 8.9%
Analyzing archive Tuxedo-InfinityBook-14-2022-03-31T20:05:51 2022-03-31 18:05:57.879279+00:00 a836feadc6d2ce36965360db1637ecdd60e10e49e8697e4158010d9accc39a7f (13/135)
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/repo/objects/06/8644aaa4d3eca45d513b467982d9fd4ca39606f9eb050c74c431732da66a6b.file: Missing file chunk detected (Byte 0-337731, Chunk c757e091ab9fbbd6d01d73cc0672d5ef322e21bdbbdf4953450369d2f9ea3155).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/repo/objects/49/2e580bd7ac5842e9bfe34a382520bfd2190b08759adc52b837014a631cc4df.file: Missing file chunk detected (Byte 0-823, Chunk 6e99d76a3a8a8f7c42cd10a3c6abdc9b6ef62d2f02e6c7dd27583d7c66b81e1b).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/repo/objects/2c/626568776d516cd6ba4b71ff899b8748964baeb6800a3ad37fec1f0b3652be.file: Missing file chunk detected (Byte 0-553688, Chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/runtime/org.gnome.Platform/x86_64/42/00a7680bc472e0205934c70b2bf6fecfb5a085cc1792dad7958e5cd07b648a70/files/bin/gpgsm: Missing file chunk detected (Byte 0-553688, Chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/runtime/org.gnome.Platform/x86_64/41/07e2f19d660a4bf5520857f2eea3aa7005d0e320b2ee5ec7c02a5ea95d2123a3/files/bin/gpgsm: Missing file chunk detected (Byte 0-553688, Chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08/e5aff027f1cfc1b950d476dab5159628f077a4ca114e87160a123df9cce700bf/files/bin/gpgsm: Missing file chunk detected (Byte 0-553688, Chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/runtime/org.kde.Platform/x86_64/5.15-21.08/77bad058026209fa1bb4509ce155a93be883dbcdca01ef1eb4f65fe1f7ffe877/files/bin/gpgsm: Missing file chunk detected (Byte 0-553688, Chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/runtime/org.freedesktop.Sdk/x86_64/21.08/1a972810d883f9721f90b87e092d282184b8a95fc10fc72068723cd54f893b22/files/bin/gpgsm: Missing file chunk detected (Byte 0-553688, Chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/runtime/org.freedesktop.Sdk/x86_64/21.08/1a972810d883f9721f90b87e092d282184b8a95fc10fc72068723cd54f893b22/files/lib/gcc/x86_64-unknown-linux-gnu/11.2.0/plugin/include/cp/cp-tree.h: Missing file chunk detected (Byte 0-337731, Chunk c757e091ab9fbbd6d01d73cc0672d5ef322e21bdbbdf4953450369d2f9ea3155).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/runtime/org.freedesktop.Sdk/x86_64/21.08/1a972810d883f9721f90b87e092d282184b8a95fc10fc72068723cd54f893b22/files/lib/x86_64-linux-gnu/ruby/gems/3.0.0/gems/rbs-1.0.4/sig/method_types.rbs: Missing file chunk detected (Byte 0-823, Chunk 6e99d76a3a8a8f7c42cd10a3c6abdc9b6ef62d2f02e6c7dd27583d7c66b81e1b).
Tuxedo-InfinityBook-14-2022-03-31T20:05:51: var/lib/flatpak/.removed/org.freedesktop.Platform-fcf3dbb56c117835216058acfa99ae280d713029f8d674e1b56e019a20ed082d/files/bin/gpgsm: Missing file chunk detected (Byte 0-553688, Chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce).
…
Note that chunk a8675a3a8a8a655fa32acda74241e95d46912b88b1864d8022a3e71761195dce is reported as missing for seven paths. The other two reported chunks (c757e091ab9fbbd6d01d73cc0672d5ef322e21bdbbdf4953450369d2f9ea3155 and 6e99d76a3a8a8f7c42cd10a3c6abdc9b6ef62d2f02e6c7dd27583d7c66b81e1b) are reported as missing twice each.
Borg 1.4, on the other hand, reports each missing chunk just once (i.e., for a single path each).
Moving this to a separate report section as suggested above would IMHO allow for an even better overview of the repo's problems.
Additional notes:
I'm running Borg 2.0.0b20.dev213+gb43c3adab
Even if this suggestion is considered beneficial, I believe it's rather low priority. Since it's a UI change only, no breaking changes are expected, so there's no need to implement this with Borg 2.0; it could be implemented at any later point.
WDYT?