gitoxide icon indicating copy to clipboard operation
gitoxide copied to clipboard

The slotmap turned out to be too small

Open FoxLightning opened this issue 11 months ago • 22 comments

Current behavior 😯

gix is not working with repository ~2M files + LFS

Expected behavior 🤔

work as git

Git behavior

Image

Steps to reproduce 🕹

git status git log

FoxLightning avatar Jan 20 '25 15:01 FoxLightning

Thanks for reporting!

This shouldn't be happening as it starts out the slotmap with num_files_to_open * 2, so there is always twice as much as it would need to map everything.

Could you list the files in .git/objects/pack/?

Byron avatar Jan 20 '25 15:01 Byron

/usr/bin/ls 00 0a 13 1c 24 2c 34 3d 45 4d 57 60 69 73 7e 88 90 98 a0 a9 b3 bd c6 cf db e5 ee f8 info 01 0b 14 1d 25 2d 35 3e 46 4e 58 61 6a 75 81 89 91 99 a1 aa b4 be c7 d0 dc e6 ef f9 pack 03 0c 15 1e 26 2e 36 3f 47 4f 59 62 6b 76 82 8a 92 9a a3 ab b5 bf c8 d2 dd e7 f0 fa 04 0d 16 1f 27 2f 37 40 48 50 5b 64 6c 77 83 8b 93 9b a4 ac b6 c0 c9 d4 de e9 f1 fb 05 0e 17 20 28 30 38 41 49 51 5c 65 6d 78 84 8c 94 9c a5 ae b7 c1 ca d6 e0 ea f2 fc 07 10 18 21 29 31 39 42 4a 52 5d 66 6e 79 85 8d 95 9d a6 af b8 c2 cb d7 e1 eb f4 fd 08 11 1a 22 2a 32 3a 43 4b 53 5e 67 6f 7a 86 8e 96 9e a7 b0 b9 c3 cc d8 e3 ec f5 fe 09 12 1b 23 2b 33 3c 44 4c 54 5f 68 70 7d 87 8f 97 9f a8 b2 bc c4 ce d9 e4 ed f7 ff

FoxLightning avatar Jan 20 '25 16:01 FoxLightning

its on windows

FoxLightning avatar Jan 20 '25 16:01 FoxLightning

Image 190 items

FoxLightning avatar Jan 20 '25 16:01 FoxLightning

Thanks, it really looks like this is a bug where it can't properly scale beyond 32 anymore.

Byron avatar Jan 20 '25 16:01 Byron

What about the progress of the task? How can I help you? Perhaps you need additional information?

FoxLightning avatar Jan 26 '25 11:01 FoxLightning

This issue is top of my list, and still I didn't get to it - special times. Indeed, I'd appreciate your help, thanks for asking.

https://github.com/GitoxideLabs/gitoxide/blob/dbf079f4b70db01ae4f850796ae6006d45d3f99c/gix/tests/gix/remote/fetch.rs#L88-L90

The test above could be adapted to create more than 32 packs in the local repository by fetching repeatedly. Then a default repository can be opened there, with the assumption that this reproduces the issue you are seeing after accessing the head_id().

If this doesn't reproduce, and as an immediate fix, you could also try to set the default slotmap size to 96, which would allow 96 indices and packs to be opened at the same time (i.e. nearly 200 files) - that way it should just work without truly fixing the bug. If it still doesn't work though, this would certainly also tell us something about the nature of the issue.

Byron avatar Jan 27 '25 07:01 Byron

Apologies for the long wait.

However, I did change the test in the linked PR to the point where it should be able to reproduce the issue.

The hypothesis was that if there are many packs, it's unable properly configure its slotmap to be able to load all packs if needed. However, int he screenshot it seemingly failed to do that.

A test was created to create a repository that would be above a set minimal amount of packs it could read, to validate it dynamically adjusts that value. And the test works as expected.

Further, I have a repository with 145 packs, and gix status works fine there as well.

Could you retry this with the latest version of gix (CLI)?

My assumption would be that it is still reproducing for you, but ideally we could work on debugging this together to figure out what the problem is. In the screenshot, it clearly didn't see all packs and initialized the slotmap with the minimum of 32 slots, even though 95 packs (the slotmap should be 105 slots large then). It's notable that the algorithm to figure out how many slots are needed is filtering the directory (.git/objects/packs) contents quite a bit, and that would be another reason that it can't see files that are seemingly there.

https://github.com/GitoxideLabs/gitoxide/blob/52ff68bff1114619f7d2dfbe228e00b4608c0dfe/gix-odb/src/store_impls/dynamic/load_index.rs#L457-L460

Thanks in advance for your help.

Byron avatar Apr 19 '25 14:04 Byron

Perhaps a different cause, but I can consistently hit "slotmap too small" when a concurrent 'git fetch' brings down new pack files and I reuse an existing/stale gix::Repository object.

anguslees avatar May 06 '25 01:05 anguslees

Thanks for sharing! It's good that this sounds like parallelism isn't the issue here, but merely the local state of another gix::Repository. Does that means that you'd think the following triggers the issue?

let fetch_repo = gix::open(path);
let other_repo = fetch_repo.clone();

let (packs, fetched_refs) = fetch_repo.fetch();
assert!(packs > 0);

// This works
fetched_refs[0].attach(&fetched_repo).id().object();
// slotmap too small
fetched_refs[0].attach(&other_repo).id().object();

If not, maybe you can update the example to something that resembles your code better (or share your code), so I can see if this reproduces in a test. Thanks again.

Byron avatar May 06 '25 05:05 Byron

Does that means that you'd think the following triggers the issue

Yes, I think this matches what I suspect.

or share your code

I'm using a PoC/hack to make jj work with git blobless partial clones (git clone --filter blob:none). PR is https://github.com/jj-vcs/jj/pull/6451. The code is real basic and just catches NotFound error from git_repo.find_object(), synchronously calls git cat-file -e $id, and then tries git_repo.find_object() again. git cat-file $id will (iiuc) download a pack file with 1 blob, so I can suddenly end up with a lot of new small pack files (until the next git gc).

I see errors like:

Internal error: Failed to check out commit 3e995d23ed96a25844facbd02f6fc7887c81eb77
Caused by:
1: Internal backend error
2: Error when reading object 35889d0083e24107b7cb1eba9d6935d5bf7540f1 of type file
3: The slotmap turned out to be too small with 36 entries, would need 1 more

Notably they're always "need 1 more", which I think strongly suggests the issue is tickled by my git cat-file $id branch. I have not debugged into specific code paths further than this observation.

Oh, and fwiw, I think my build of jj is using (not modified in my PR):

gix = { version = "0.71.0", default-features = false, features = [
    "attributes",
    "blob-diff",
    "index",
    "max-performance-safe",
    "zlib-rs",
] }

anguslees avatar May 06 '25 23:05 anguslees

That is great to hear that this is a PoC which basically dialled pack creation up to 11 :)!

I presume this won't happen anymore if the gix::Repository is re-instantiated every now and then, or after each potential pack fetch.

Right now there is no dynamic growth of the fixed slotmap, and one has to instantiate it so that it can accommodate the anticipated maximum amount of packs. The default minimal size is 36 which assumes a worst-case geometric pack, but it will be what it needs to be +10% safety margin when instantiating a new gix::Repository with gix::open().

Now I am thinking that maybe one way to improve the situation for most is to just put it to 256 or 512 even to cover even extreme real-world cases - for you that is still easily exhausted unless measures are taken to counter it.

Ideally, it would dynamically grow, but getting this to work in this highly-concurrent portion of the codebase that is under-tested will be very hard and risky (things could break only when things get truly multi-threaded).

Byron avatar May 07 '25 05:05 Byron

for some reason I bump into this on somewhat-big repo, but no issues on small one.

# gix --version
gix 0.46.0

arogozhnikov avatar Nov 21 '25 04:11 arogozhnikov

Does this happen when you use gix? Or is this part of a long-running program?

Byron avatar Nov 22 '25 08:11 Byron

I used gix directly

arogozhnikov avatar Nov 22 '25 10:11 arogozhnikov

Is there a way to reproduce this? gix clone <your-repo> && gix … maybe? Thanks again.

Byron avatar Nov 22 '25 10:11 Byron

@Byron I'm getting this error in docs.rs, though crates-index-diff.

The slotmap turned out to be too small with 32 entries, would need 1 more

It seems like not all the time when we're fetching?

also, not sure if it makes a difference?

$ cargo tree -i [email protected]
gix v0.72.1
└── crates-index-diff v28.0.0
    └── docs-rs v0.6.0 (/Users/syphar/src/rust-lang/docs.rs)

syphar avatar Nov 23 '25 11:11 syphar

I think there is a couple of interleaved threads going on here :D, but to sum it up:

  • with gix, the CLI, there should be no such issue. But if it's reproducible a fix should be straightforward. Maybe something about git-alternates, but… for all I know these are handled.
  • with long-running programs that use gix (the crate) and that fetch a lot, this error is expected. The static nature of the map isn't suited for an ever-increasing number of packs. The repo should be re-opened regularly, or after querying object db statistics that would show how 'full' the slotmap is.

The version of gix (v0.72) should be recent enough.

Byron avatar Nov 23 '25 12:11 Byron

with long-running programs that use gix (the crate) and that fetch a lot, this error is expected. The static nature of the map isn't suited for an ever-increasing number of packs. The repo should be re-opened regularly, or after querying object db statistics that would show how 'full' the slotmap is.

What would you recommend for us in docsrs? we fetch every minute or so. How would we query the statistics?

syphar avatar Nov 23 '25 16:11 syphar

repo.objects.store_ref().metrics().unused_slots should do the trick. Since each non-empty fetch creates a new small pack, you should see this value go down by one each time. In theory, compacting these small packs into one should free up slots, but in practice that might not work as I don't think I have a test for that yet.

Byron avatar Nov 24 '25 04:11 Byron

thanks! I'll check it out, and use it as a trigger to reopen the repo,

syphar avatar Nov 24 '25 06:11 syphar

Great, please let me know how it goes. And lastly, I recommend making reasonably sure that the newly opened repo won't be used before the old one is dropped so all system resources are released before new ones are possibly acquired. Oh, and the size of the slotmap (or the amount of spare slots) can also be configured when opening - this would allow you to tune how often you'd want to reopen it.

Maybe it's also a good indicator of when the repo wants to be repacked.

Byron avatar Nov 24 '25 06:11 Byron