split index support

Open • SidneyDouw opened this issue 3 years ago • 1 comment

What is a split index?

When a split index is in use, the index is split up into two separate files:

  • the split index at $GIT_DIR/index
  • the shared index at $GIT_DIR/sharedindex.<SHA-1>

The shared index holds the full set of entries as of the last time it was written, while the split index accumulates the changes made since then. These changes are occasionally written into the shared index, either "automatically" based on config settings or by the git-update-index command. "Automatically" here means that the check runs every time the index is read or updated.

Link Extension

The link extension stores two bitmaps that record how the changes stored in the split index should be merged with the shared index.

The replace bitmap stores which entries in the shared index should be replaced by entries stored in the split index, e.g. [0, 1, 1, 0, 1] -> replace shared index entries at index 1, 2, 4 with split index entries at 0, 1, 2. Any additional entries in the split index should be added to the shared index.

The delete bitmap stores which entries in the shared index should be deleted, e.g. [1, 0, 0, 1] -> delete entries at index 0 and 3.
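
To make the two bitmaps concrete, here is a minimal sketch of the merge as described above. It is not gitoxide's or git's implementation: the `Entry` type and the `merge_split_into_shared` function are hypothetical, the bitmaps are plain `bool` slices rather than the compressed on-disk form, and it assumes that positions past the end of a bitmap count as unset and that the merged result is re-sorted by path.

```rust
/// Hypothetical, simplified index entry; `id` stands in for the object id and stat data.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    path: String,
    id: u32,
}

/// Apply the changes recorded in a split index on top of the shared index entries.
/// Both bitmaps are indexed by position in `shared`.
fn merge_split_into_shared(
    shared: &[Entry],
    split: &[Entry],
    replace: &[bool],
    delete: &[bool],
) -> Result<Vec<Entry>, String> {
    let mut split_entries = split.iter();
    let mut merged = Vec::with_capacity(shared.len() + split.len());

    for (pos, entry) in shared.iter().enumerate() {
        // Assumption: positions beyond the end of a bitmap count as "not set".
        let replaced = replace.get(pos).copied().unwrap_or(false);
        let deleted = delete.get(pos).copied().unwrap_or(false);

        // Every set bit in the replace bitmap consumes the next split entry, in order.
        let current = if replaced {
            split_entries
                .next()
                .ok_or_else(|| format!("no split entry left to replace position {}", pos))?
                .clone()
        } else {
            entry.clone()
        };

        if !deleted {
            merged.push(current);
        }
    }

    // Whatever is left in the split index is an addition.
    merged.extend(split_entries.cloned());
    // Assumption of this sketch: restore path order after adding new entries.
    merged.sort_by(|a, b| a.path.cmp(&b.path));
    Ok(merged)
}
```

With the bitmaps from the examples above, five shared entries, and a split index holding three replacements plus one new entry, positions 0 and 3 are dropped, positions 1, 2, 4 are replaced, and the new entry is appended.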

Tasks

  • [x] check git source code to learn more about how split index works and update this issue
  • [x] #655
  • [ ] writing
    • [ ] write only the split index without ever merging / updating the shared index
      • [ ] if this turns out to be too much work just write a regular index and discard the split / shared index files
    • [ ] merge split and shared indexes (based on maxPercentChange) - source code reference
  • [ ] update active shared index modification time every time the split index is being read / updated to prevent automatic deletion

Config Settings

  • core.splitIndex enables the use of a split index
  • splitIndex.maxPercentChange The percentage threshold of entries in the split index (compared to the shared index) that triggers a write to the shared index. It defaults to 20.
  • splitIndex.sharedIndexExpire If the modification time of a shared index is older than this value it will be deleted. Takes values like "now", "never", "2.weeks.ago" (default). This needs to be parsed.
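
As a rough illustration of the maxPercentChange check, the sketch below decides whether a new shared index should be written. The function name and parameters are hypothetical. It assumes the percentage is taken against the total number of entries in both indexes, and that 0 means "always write" and 100 means "never write", which is how I read the git-config documentation; this should be verified against the git source.

```rust
/// Hypothetical helper: has the split index grown enough to warrant a new shared index?
/// `split_entries` is the number of entries held only by the split index,
/// `total_entries` the number of entries in split and shared index combined (assumption).
fn should_write_shared_index(split_entries: u64, total_entries: u64, max_percent_change: u8) -> bool {
    match max_percent_change {
        // Assumed special case: 0 always writes a new shared index.
        0 => true,
        // Assumed special case: 100 (or more) never writes one.
        p if p >= 100 => false,
        // Otherwise compare the share of split entries against the threshold,
        // using integer math to avoid floating point.
        p => split_entries * 100 > total_entries * u64::from(p),
    }
}
```

With the default of 20, a split index holding 21 of 100 entries would trigger a write, while one holding exactly 20 would not.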

Notes

  • split index and sparse index are incompatible in git
  • git parses these dates in date.c
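
To give an idea of what parsing sharedIndexExpire involves, here is a small sketch that handles only "now", "never" and the "<N>.<unit>.ago" shape. It is nowhere near what date.c accepts; the `Expiry` type and both functions are hypothetical, and the list of units is my own assumption.

```rust
use std::time::Duration;

/// Hypothetical representation of a parsed splitIndex.sharedIndexExpire value.
#[derive(Debug, PartialEq)]
enum Expiry {
    /// Never delete shared index files.
    Never,
    /// Every shared index file counts as expired.
    Now,
    /// Files whose modification time is older than this are expired.
    OlderThan(Duration),
}

/// Parse a tiny subset of git's date syntax: "now", "never", "<N>.<unit>.ago".
fn parse_expiry(value: &str) -> Option<Expiry> {
    match value {
        "never" => return Some(Expiry::Never),
        "now" => return Some(Expiry::Now),
        _ => {}
    }
    let mut parts = value.split('.');
    let count: u64 = parts.next()?.parse().ok()?;
    let unit = parts.next()?;
    // The third part must be "ago" and nothing may follow it.
    if parts.next()? != "ago" || parts.next().is_some() {
        return None;
    }
    // Accept singular and plural unit names, e.g. "week" and "weeks".
    let secs_per_unit = match unit.trim_end_matches('s') {
        "second" => 1,
        "minute" => 60,
        "hour" => 3_600,
        "day" => 86_400,
        "week" => 604_800,
        _ => return None,
    };
    Some(Expiry::OlderThan(Duration::from_secs(count.checked_mul(secs_per_unit)?)))
}

/// Decide whether a shared index whose mtime lies `age` in the past is expired.
fn is_expired(expiry: &Expiry, age: Duration) -> bool {
    match expiry {
        Expiry::Never => false,
        Expiry::Now => true,
        Expiry::OlderThan(limit) => age > *limit,
    }
}
```

Keeping the parsed value as an enum separates parsing from the deletion decision, so a later check that protects the actively used shared index can be added in `is_expired` or around it.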

Questions

  • Does the split index get emptied / recreated from scratch when writing its changes to the shared index?
  • I assume the actively used shared index can and should never be deleted, but based only on the sharedIndexExpire setting it could be

SidneyDouw, Dec 06 '22 11:12

Thanks for summing up this feature; this issue shall be the authoritative reference for it.

Besides the need to be able to read the shared index and process it so the index can be used, I have a strong feeling that this capability is legacy already. With sparse indices and the index DIR entry, I think there isn't going to be a problem with index sizes anymore. As far as I know, and this might be very wrong, split index support isn't really available for plenty of features or interactions, particularly with sparse indices. I guess they will add it when these sparse indices get so large that a split index is reasonable, and I think gitoxide wouldn't be the first to implement this one.

My stance here is to add only minimal support and be OK with 'unsplitting' the index when writing by dropping the extension (and writing the whole index), and then wait and see if writing gives any trouble.

I am curious as to what you discover as you dig into the git source and the history of the split index feature :).

Byron, Dec 06 '22 15:12