lost index.xxxx after a backup and don't understand why

Open PhilippeAccorsi opened this issue 3 weeks ago • 2 comments

Hi, on one of our backups we have some integrity errors. After some searching and testing, we saw that index.xxxx is missing from the repository. We investigated because the command borg list /path/repo takes too long and always ends, after 15-20 min, with this kind of error: borg.helpers.errors.IntegrityError: Data integrity error: Segment entry checksum mismatch [segment 1234, offset 137810265]. Also, the borg command does not ask for a "password" to list backups.

How we back up:

The backup server is configured as append-only, with restricted ssh access. The production server pushes backups to the backup server with the borgmatic command over an ssh connection.

The backup server is a basic computer (without ECC RAM), with a RAID5 (4 disks) and Debian 13. We initialized the repository directly on this server. Borg/borgbackup version: 1.4.0

We have tested the RAM with memtest86 (no errors). No errors were seen on the disks with a smartctl short test.

We do not understand why we lost index.xxxx, or why we have not seen this before (we have 3 backup servers and 10-15 repositories for different kinds of production servers). So we think the file got corrupted and Debian 13 no longer shows it after the last backup.

To solve the problem and repair the repository, we used this "solution" https://github.com/borgbackup/borg/issues/7673#issuecomment-1605659580 : delete the 3 files with .XXXX and run borg check --debug --repair --progress /backup/nas-backup. After that, the 3 files were recreated and borg list works correctly again. We saw some log lines like this:

Data integrity error: Segment entry checksum mismatch [segment 77, offset 13183780]
attempting to recover /backup/nas-backup/data/0/77

But at the end, there was no error about lost files.

Has anyone seen a problem like this before? Do you have an idea whether it's a HW or SW problem? Is there another way to recreate index.xxxx, and/or do you know when this file is created/modified during the backup process? Do you think this kind of corruption/integrity error could come from power supply instability?

Regards, Philippe

PhilippeAccorsi avatar Dec 04 '25 10:12 PhilippeAccorsi

we saw that index.xxxx is missing from the repository.

That's unusual. borg 1.x works with transactions: it either finishes a transaction successfully by writing an updated index, or it can roll back the incomplete transaction, going back to the index as it was at transaction start.
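To illustrate the commit-or-rollback idea in general terms, here is a minimal sketch of the common "write a temporary file, then rename" pattern. It is not borg's actual code; the function name `commit_index` and the temporary file prefix are hypothetical, chosen only for this example.

```python
import os
import tempfile


def commit_index(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` in an all-or-nothing way.

    Generic illustration only (not borg's implementation): the new content
    goes to a temporary file first, and only a successful rename makes it
    visible under the final name.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".index-tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        # Atomic on POSIX filesystems: readers see the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # "Rollback": drop the incomplete temp file, keep the old index.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With such a scheme, a crash in the middle leaves the previous file in place, which is why a completely missing index points more towards the filesystem or hardware than towards an interrupted backup. A production version would typically also fsync the containing directory.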

Maybe a filesystem issue? fsck?

Data integrity error: Segment entry checksum mismatch [segment 1234, offset 137810265]

That means you have corruption in the repo files. The crc32 did not match anymore. The crc32 is computed on the borg repo server ("borg serve" side) just before it writes an entry into a segment file (usually a new data chunk).

The backup server is configured as append-only.

That means that borg only writes new data to new segment files. As long as "append only" is active, it will never "touch" old segment files (e.g. for compaction).

We have tested the RAM with memtest86 (no errors). No errors were seen on the disks with a smartctl short test.

Good. That would have been my first suggestion otherwise.

There can be other hw issues, of course.

borg check --repair REPO

That will build a completely fresh repo index from the information contained in the segment files (which is the authoritative source of truth). But, as you have corrupted segment files, it can only use the stuff that is not corrupted.

attempting to recover /backup/nas-backup/data/0/77

That means it reads the segment file, iterating over all its entries and only keeps the non-corrupt entries.
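The idea of scanning a segment, keeping only entries whose checksum still matches and rebuilding an index from them, can be sketched as follows. This is a simplified illustration under stated assumptions: the entry layout (CRC32, payload length, 8-byte key, payload) and the names `make_entry` and `rebuild_index` are hypothetical and are not borg's real on-disk format or API.

```python
import struct
import zlib

# Hypothetical entry layout, for illustration only:
#   4-byte CRC32 | 4-byte payload length | 8-byte key | payload
# The CRC32 covers everything after the CRC field itself.
HEADER = struct.Struct("<II8s")


def make_entry(key: bytes, payload: bytes) -> bytes:
    """Build one entry; the checksum is computed just before 'writing'."""
    body = struct.pack("<I8s", len(payload), key) + payload
    return struct.pack("<I", zlib.crc32(body)) + body


def rebuild_index(segment: bytes) -> dict:
    """Map key -> offset for every entry whose checksum still matches.

    Assumes only payload bytes may be damaged; a real tool also has to
    resynchronise when the length field itself is corrupted.
    """
    index = {}
    offset = 0
    while offset + HEADER.size <= len(segment):
        stored_crc, length, key = HEADER.unpack_from(segment, offset)
        end = offset + HEADER.size + length
        if end > len(segment):
            break  # truncated tail: nothing more to recover
        body = segment[offset + 4:end]
        if zlib.crc32(body) == stored_crc:
            index[key] = offset  # intact entry: keep it
        # a corrupt entry is simply skipped, so it never enters the index
        offset = end
    return index
```

In this sketch, an entry with a damaged payload simply does not show up in the rebuilt index; whether that matters depends on whether anything still refers to it.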

If the archives part of "borg check --repair" does not complain after that, you may be lucky and the corrupt entry was not used anymore. Otherwise (still used), the archives part should log some warnings.

Do you have an idea whether it's a HW or SW problem?

Can be both, but the issue is (or was) likely located on the borg repo server, e.g.:

  • fs corrupted by power outage, malfunctioning power supply
  • other hard crashes of the system, e.g. kernel panic
  • some other hw issue
  • some low level OS, fs or driver issue

A bug in borg causing this is very unlikely or we would see it all over the place.

ThomasWaldmann avatar Dec 04 '25 16:12 ThomasWaldmann

Doing regular borg check is helpful for such situations, so you will notice when a problem appears.

If you don't do that, a recent borg check could be finding issues that were caused long ago but went unnoticed until now. It could even be that the root cause of the problem was already fixed and can't be seen anymore.
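A regular check could be scripted along these lines. This is only a sketch: the helper names are hypothetical, the repository paths are placeholders, and the exit code handling assumes borg's default exit codes (0 = success, 1 = warning), treating everything else as an error.

```python
import subprocess


def build_check_command(repo: str, verify_data: bool = False) -> list:
    """Build a `borg check` command line for one repository."""
    cmd = ["borg", "check"]
    if verify_data:
        cmd.append("--verify-data")  # slower, also verifies the chunk data
    cmd.append(repo)
    return cmd


def describe_exit(returncode: int) -> str:
    """Assumption: default exit codes, 0 = success, 1 = warning."""
    if returncode == 0:
        return "ok"
    if returncode == 1:
        return "warning"
    return "error"


def check_repos(repos, run=subprocess.run) -> dict:
    """Run `borg check` on each repo and summarise the outcome.

    `run` is injectable so the logic can be tried without borg installed.
    """
    results = {}
    for repo in repos:
        proc = run(build_check_command(repo))
        results[repo] = describe_exit(proc.returncode)
    return results
```

Called from cron or a systemd timer with something like `check_repos(["/backup/nas-backup"])`, any result other than "ok" could trigger a mail or a monitoring alert, so corruption is noticed close to the time it happens.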

ThomasWaldmann avatar Dec 04 '25 16:12 ThomasWaldmann

@PhilippeAccorsi could you solve it?

ThomasWaldmann avatar Dec 11 '25 17:12 ThomasWaldmann

After the repair, yes. For the moment everything works well. We do not understand what went wrong in the past. We are crossing our fingers that we do not see this again :)

I am closing this issue.

PhilippeAccorsi avatar Dec 11 '25 17:12 PhilippeAccorsi

@ThomasWaldmann I had not seen your first reply at the beginning, when I saw the other messages (I only saw your first message just now) :(

Thanks a lot for all the answers \o/ We experienced some power outages in the past, so maybe the error occurred at that time. We saw no problem after the power outages, but we did not check for the presence/absence of files and did not force a check on the repositories. I will now keep in mind to check all repos after this kind of HW problem.

PhilippeAccorsi avatar Dec 11 '25 17:12 PhilippeAccorsi