zoekt
Is zoekt indexing crash consistent?
I wanted to check whether zoekt indexing is crash consistent. If not, will re-indexing all the repositories on startup (after a crash) ensure that the index is in a consistent state?
cc @sourcegraph/search-platform
@shubham149 I'm sorry for the very slow response here! Confirming I understand: when you say "crash consistent", you mean snapshot consistency. For example, if Zoekt was indexing a new commit and the machine crashed at any point during indexing, the disk would still hold a valid data snapshot with no inconsistencies (such as a mix of old and new commits, or missing data).
Zoekt is not guaranteed to be crash consistent. However, the indexing strategy is quite simple and it would be possible to make it consistent. Details:
- Each index is a collection of immutable files
- Each file represents a shard for a single repo
- To reindex, we create a temporary directory and rename the files once all repo shards are finished
Let us know if you have other questions or are interested in contributing improvements.
Thanks @jtibshirani for the updates.