
Is zoekt indexing crash consistent?

Open shubham149 opened this issue 1 year ago • 1 comments

I wanted to check whether zoekt indexing is crash consistent. If not, will re-indexing all the repositories on startup (after a crash) ensure that the index is in a consistent state?

shubham149 avatar Dec 18 '23 07:12 shubham149

cc @sourcegraph/search-platform

keegancsmith avatar Dec 18 '23 15:12 keegancsmith

@shubham149 I'm sorry for the very slow response here! Confirming I understand: when you say "crash consistent", you mean snapshot consistency. For example, if Zoekt was indexing a new commit and the machine crashed at any point during indexing, the disk would still represent a valid data snapshot and wouldn't contain inconsistencies (like a mix of old and new commits, some data missing, etc.).

Zoekt is not guaranteed to be crash consistent. However, the indexing strategy is quite simple and it would be possible to make it consistent. Details:

  • Each index is a collection of immutable files
  • Each file represents a shard for a single repo
  • To reindex, we create a temporary directory and rename the files once all repo shards are finished

Let us know if you have other questions or are interested in contributing improvements.

jtibshirani avatar May 13 '24 18:05 jtibshirani

Thanks @jtibshirani for the updates

shubham149 avatar May 14 '24 07:05 shubham149