
Feature: Filename index

Open xeruf opened this issue 4 years ago • 9 comments

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

ISSUE

Describe the problem you're observing.

When I want to find a backed-up file, I might not know which archive it is in. It would be handy if borg created an index (like updatedb, or Everything on Windows) that could be searched.

xeruf avatar Nov 30 '21 18:11 xeruf

You can use borg mount REPO and then use find or whatever tool you prefer.
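As a rough illustration of the mount-and-search approach, here is a minimal Python sketch that does what `find -name` would do on a mounted repository. It assumes the repo has already been mounted with `borg mount REPO /mnt/borg`; the mount point `/mnt/borg` and the helper name `find_in_mount` are made up for this example.

```python
import fnmatch
import os


def find_in_mount(mount_point, pattern):
    """Yield paths under mount_point whose file name matches a glob pattern."""
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Hypothetical mount point; run `borg mount REPO /mnt/borg` beforehand.
    for hit in find_in_mount("/mnt/borg", "*.odt"):
        print(hit)
```

When the whole repo is mounted, each archive shows up as a top-level directory, so the archive name is the first path component of every hit.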

I once briefly experimented with an additional index for borg, but in the end didn't find it useful enough, considering the space and time that maintaining such an index needs.

ThomasWaldmann avatar Nov 30 '21 23:11 ThomasWaldmann

Actually, with the right flags this could be handled well by locate. How about subcommands like borg updatedb and borg locate that use locate with appropriate flags under the hood? I would be open to figuring out the needed flags.

xeruf avatar Mar 26 '22 08:03 xeruf

Not sure what you mean / how that would be useful.

ThomasWaldmann avatar Mar 26 '22 12:03 ThomasWaldmann

Do you know locate? It does one thing and does it well: indexing filesystems. It can use custom databases, so borg could take care of it internally, mounting new backups and indexing them, and exposing a customised locate subcommand that searches the indexes of all archives.

xeruf avatar Mar 28 '22 12:03 xeruf

borg runs on a lot of platforms, so I guess introducing external tools that are not present on all of them is not desirable.

Currently borg is mostly used on POSIX platforms like Linux, BSD, macOS, and OpenIndiana, but work to port it to Windows (and maybe even Haiku) is ongoing...

ThomasWaldmann avatar Mar 28 '22 12:03 ThomasWaldmann

BTW, I experimented with an index based on Whoosh, a pure-Python indexing library, long ago. But as it took quite a lot of time to update those indexes and the indexes also took quite a lot of disk space, I did not find it worth it overall.

ThomasWaldmann avatar Mar 28 '22 12:03 ThomasWaldmann

Index update time and index space do not really matter to me. What matters is the frequent operation I want to complete quickly: I am looking for a file, can't find it on my local machine, and want to quickly find out whether it is in one of my backups.

The space aspect is curious, as locate is able to index my whole filesystem (half a TB) in under 100MB.

xeruf avatar Jul 10 '22 18:07 xeruf

When I tried whoosh, I noticed:

  • takes additional time at / after each backup
  • consumes space per backup (not just once, as in the case of locate)
  • can be replaced by keeping borg create --list output, so one can grep in there later
  • one can also just borg mount the whole repo and search in there

That's why I did not develop it any further: I didn't find any good reason to do so, but there was quite some "cost" (even when ignoring the dev and maintenance time).
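To illustrate the `borg create --list` alternative from the list above, here is a minimal Python sketch that searches saved listings. It assumes each backup run's `--list` output was redirected into one file per archive named `<archive>.list` in a single directory, and that each line consists of a one-character status, a space, and the path. The directory layout and the helper name `grep_listings` are assumptions for this example, not something borg sets up itself.

```python
from pathlib import Path


def grep_listings(log_dir, needle):
    """Return (archive, path) pairs whose path contains needle."""
    hits = []
    for log_file in sorted(Path(log_dir).glob("*.list")):
        with open(log_file, encoding="utf-8", errors="replace") as f:
            for line in f:
                # Assumed line shape: one status character, a space, the path.
                status, sep, path = line.rstrip("\n").partition(" ")
                if sep and len(status) == 1 and needle in path:
                    hits.append((log_file.stem, path))
    return hits


if __name__ == "__main__":
    # Hypothetical directory holding the saved listings.
    for archive, path in grep_listings("/var/log/borg-lists", "report"):
        print(f"{archive}: {path}")
```

Lines that do not match the assumed shape (for example other log messages mixed into the file) are skipped rather than reported.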

ThomasWaldmann avatar Jul 11 '22 10:07 ThomasWaldmann

Right, I guess one can sort of mimic this functionality by listing the files for each backup with borg, which shouldn't be too bad in terms of speed...
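A minimal sketch of that idea in Python, assuming the borg 1.x command-line syntax (`borg list --short REPO` for archive names, `borg list --format '{path}{NL}' REPO::ARCHIVE` for file paths). The helper names `run_borg` and `search_archives` are made up for this example, and paths containing newlines are not handled.

```python
import subprocess


def run_borg(args):
    """Run borg with the given arguments and return its stdout lines."""
    result = subprocess.run(
        ["borg", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.splitlines()


def search_archives(repo, needle, run=run_borg):
    """Return (archive, path) pairs for paths containing needle."""
    hits = []
    # Assumes borg 1.x syntax: archive names first, then one listing per archive.
    for archive in run(["list", "--short", repo]):
        listing = run(["list", "--format", "{path}{NL}", f"{repo}::{archive}"])
        hits.extend((archive, path) for path in listing if needle in path)
    return hits


if __name__ == "__main__":
    # Hypothetical repository path.
    for archive, path in search_archives("/path/to/repo", "report.pdf"):
        print(f"{archive}: {path}")
```

The `run` parameter exists only so the search logic can be tried without a real repository. Every search lists every archive again, so this trades query time for not having to maintain an index.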

xeruf avatar Jul 11 '22 13:07 xeruf