cargo
`object not found` when fetching git dependency
Problem
When building a crate that depends on other crates via git, sometimes you get an error like this:
Caused by:
failed to load source for dependency `processor`
Caused by:
Unable to update https://github.com/aptos-labs/aptos-indexer-processors.git?rev=d44b2d209f57872ac593299c34751a5531b51352#d44b2d20
Caused by:
object not found - no match for id (d44b2d209f57872ac593299c34751a5531b51352); class=Odb (9); code=NotFound (-3)
You can see an instance of this failure here: https://github.com/Homebrew/homebrew-core/pull/165260/files. In case the error fails to show up at that link, it comes from this job: https://github.com/Homebrew/homebrew-core/actions/runs/8180427193/job/22368538918?pr=165260.
This only happens sometimes. It seems to happen more often in CI environments; I'm not sure I've ever seen it happen locally.
Steps
Repro is challenging because the failure is intermittent. If you look at any of the past brew version bumps for the aptos formula, you'll see that one of the CI steps fails at least once every time due to this issue: https://github.com/Homebrew/homebrew-core/pulls?q=is%3Apr+aptos+is%3Aclosed.
The dep in question comes from here: https://github.com/aptos-labs/aptos-indexer-processors. That repo itself has git submodules, so that could be a factor.
Possible Solution(s)
I'm not sure how to fix this. For now we've mostly just been adding retries to get around it.
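The retry workaround mentioned above can be sketched roughly as follows. This is only an illustration, not what the Homebrew CI actually runs: the helper name `retry`, the attempt count, and the delay are all assumptions.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: run `op` until it succeeds or `max_attempts` is reached.
/// `op` always runs at least once. The last error is returned if every attempt fails.
fn retry<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Otherwise wait a little and try again.
            Err(_) => {
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

In a CI script, `op` could wrap something like `std::process::Command::new("cargo").arg("fetch").status()` and treat a non-zero exit status as an error. This only hides the intermittent failure; it does not address the cause.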
Notes
No response
Version
This is the cargo version we're using for the binary in question.
cargo 1.75.0 (1d8b05cdd 2023-11-20)
release: 1.75.0
commit-hash: 1d8b05cdd1287c64467306cf3ca2c8ac60c11eb0
commit-date: 2023-11-20
host: aarch64-apple-darwin
libgit2: 1.7.1 (sys:0.18.1 vendored)
libcurl: 8.4.0 (sys:0.4.68+curl-8.4.0 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1u 30 May 2023
os: Mac OS 14.3.1 [64-bit]
From my understanding, a retry in the same CI environment seems to work? That's odd. Is there any custom Git config in the CI environment?
I've noticed this happening in GitHub Actions, but notably on a variety of different runners, workflows, etc.
You'll see in the original error btw that the commit of the repo it says it can't find does indeed exist: https://github.com/aptos-labs/aptos-indexer-processors/commit/d44b2d209f57872ac593299c34751a5531b51352. But it's an orphan, not sure if that is relevant.
Indeed, orphans can be a problem, but usually only when the `rev` is not specified. That does not appear to be the case here.
One thing to investigate is that it appears https://github.com/aptos-labs/aptos-indexer-processors.git is specified with three different commit revs. I don't recall how cargo handles fetching those separate revs. If it is randomly fetching one of them, and the others aren't "reachable", then that could be a problem. It could also depend on how github's servers decide what to send, since they don't always send a minimal set, and it might change depending on which server is accessed or the phase of the moon.
> One thing to investigate is that it appears https://github.com/aptos-labs/aptos-indexer-processors.git is specified with three different commit revs
Where can you see that? I can try to fix that to reduce the odds of this problem happening.
Here are the three commits:
- https://github.com/aptos-labs/aptos-core/tree/aptos-cli-v3.0.1/ecosystem/indexer-grpc/indexer-grpc-parser -> 6fdc6f31fcc494d4ba77ca3bdc8c9eb6b3fc1acb
- https://github.com/aptos-labs/aptos-core/blob/aptos-cli-v3.0.1/crates/aptos/Cargo.toml#L83 -> d44b2d209f57872ac593299c34751a5531b51352
- https://github.com/aptos-labs/aptos-core/blob/aptos-cli-v3.0.1/Cargo.toml#L458 -> 4801acae7aea30d7e96bbfbe5ec5b04056dfa4cf
I haven't had time to create a minimal reproducer with a similar layout, though.
@ehuss Would it be the case that `git gc` removed unreachable objects?
Oh I see what you mean. Unfortunately this is intentional, we have something of a... complicated versioning scheme at the moment.
> @ehuss Would it be the case that `git gc` removed unreachable objects?
Not directly, I don't think. The actions linked above seem to have some caching, but not of the cargo directory that I can see (and it would likely be too large for the cache anyway). Cargo won't run `git gc` for a long time.
GitHub runs `git gc` internally, and that can affect things. I believe they use some heuristics for how frequently it runs. Referring to orphans runs the risk that GitHub will remove them completely (and that is partially why I said it may contribute to the randomness, since from what I've seen GitHub has mirrors that are not consistent with one another in terms of how they compress and gc their pack files).
I believe this orphan has been this way for quite a while, so it seems strange that at this point it'd be in this partially available state. I guess some bad state sharding on their side or something?
So I suppose depending on a non-orphan commit will improve our odds.
On the cargo side, does cargo retry in this situation?
OK, I think I see one possibility of what is happening.
When fetching a repo, cargo doesn't know if a `rev` refers to a commit or a tag. It can resolve this via `github_fast_path`, which uses the GitHub API to determine the specific OID from the `rev`.
However, if you have exceeded the API rate limit, that function gets a 403 HTTP response and returns `FastPathRev::Indeterminate`. The fetch function then uses a refspec of `["+refs/heads/*:refs/remotes/origin/*", "+HEAD:refs/remotes/origin/HEAD"]`, fetching all branches in the hope that the commit lives on one of them. However, the commit d44b2d209f57872ac593299c34751a5531b51352 is only reachable from a PR, and GitHub PRs are only accessible via `pull/ID/head` refs.
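To make the failure mode concrete, here is a minimal sketch of the decision described above. It is not cargo's actual code: `FastPathOutcome` and `refspecs_for_rev` are hypothetical names, and the single-commit refspec shape is an assumption based on the `refs/commit/{0}` pattern mentioned later in this thread.

```rust
/// Hypothetical, simplified stand-in for the result of the GitHub fast path.
#[derive(Debug, Clone, PartialEq)]
enum FastPathOutcome {
    /// The GitHub API resolved `rev` to a full object ID.
    Resolved(String),
    /// The API could not answer, for example a 403 caused by rate limiting.
    Indeterminate,
}

/// Pick the refspecs to fetch for a `rev` dependency (illustrative only).
fn refspecs_for_rev(outcome: &FastPathOutcome) -> Vec<String> {
    match outcome {
        // With a known OID, ask for exactly that commit.
        FastPathOutcome::Resolved(oid) => vec![format!("+{0}:refs/commit/{0}", oid)],
        // Otherwise fall back to all branches plus HEAD. Nothing under
        // `pull/ID/head` is requested, so a PR-only commit is never fetched.
        FastPathOutcome::Indeterminate => vec![
            "+refs/heads/*:refs/remotes/origin/*".to_string(),
            "+HEAD:refs/remotes/origin/HEAD".to_string(),
        ],
    }
}
```

Under these assumptions, the same dependency yields a different fetch depending only on whether the API call succeeded, which would explain why a retry sometimes works.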
I have opened #13563 to include the GITHUB_TOKEN to avoid the API rate limit on CI.
Another option is that cargo could assume anything consisting of 40 hexadecimal characters is a commit hash, and not a tag, and use the single-commit refspec. I'm not entirely certain about that, but it seems relatively safe? @weihanglo WDYT?
Generally, though, I would strongly recommend against using commits from PRs.
Nice finding!
> Another option is that cargo could assume anything consisting of 40 hexadecimal characters is a commit hash, and not a tag, and use the single-commit refspec. I'm not entirely certain about that, but it seems relatively safe? @weihanglo WDYT?
In https://github.com/rust-lang/cargo/pull/10807, we expect Git will eventually move to support SHA256 at some point for future compatibility. Not sure if we want to go back to assuming it is always 40 characters.
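For illustration, a length-based check that also allows for SHA-256 object IDs might look like the sketch below. The function name is hypothetical and this is not cargo's implementation. It is only a heuristic, since a tag or branch could in principle be named with 40 or 64 hexadecimal characters.

```rust
/// Hypothetical heuristic: does `rev` have the shape of a full Git object ID?
/// Accepts 40 hex digits (SHA-1) or 64 hex digits (SHA-256).
fn looks_like_full_object_id(rev: &str) -> bool {
    matches!(rev.len(), 40 | 64) && rev.bytes().all(|b| b.is_ascii_hexdigit())
}
```

With this check, `d44b2d209f57872ac593299c34751a5531b51352` would be treated as a commit, while an abbreviated hash such as `d44b2d20` or a name such as `main` would still need to be resolved some other way.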
I don't know if this is the same case/bug or not, but I wanted to comment here and see before opening a new issue. We have a fairly repeatable issue with the same "object not found - no match for id" error that presents under CI when attempting to resolve `Cargo.toml` dependencies specified as a (GitHub) git url and a `rev` value, but the weird thing is that the error only occurs under macOS and not under any of the other runner images.
It's only recently popped up, but it now happens more often than not, always for the same URL/SHA pair. You can see the most recent CI failure here: https://github.com/fish-shell/fish-shell/actions/runs/8988940232/job/24690902288
The repo and revision in question is this one: https://github.com/meh/rust-terminfo/commits/7259f5aa5786a9d396162da0d993e268f6163fb2/
The error happens after several other git-resident dependencies have been fetched and used OK as part of the build process. Using `CARGO_NET_GIT_FETCH_WITH_CLI=true` did not make any difference (I understand why not, this is just for posterity).
Do you have any pointers on what we can try or any additional information we can provide?
@mqudsi
I'm really not sure. In https://github.com/rust-lang/cargo/pull/13718/ the affected project was using some orphan commits. Perhaps fetching orphan commits needs more requests, making it easier to hit the rate limit, because they might have been gc'd? This thought came to mind because the linked commit of fish-shell's `terminfo` dependency is also an orphan commit. Maybe change to https://github.com/meh/rust-terminfo/commit/708facf665ccfcb567c83b338fe295f4258e7857 and see if it happens again?
> In #10807, we expect Git will eventually move to support SHA256 at some point for future compatibility. Not sure if we want to go back to assuming it is always 40 characters.
It looks like this is the change that introduced this bug, because it changed the logic for whether to add `refs/commit/{0}` to the list of refspecs to fetch from "is it GitHub and is this a 40-character commit hash" to "did `github_fast_path` succeed", which implicitly relies on your ability to access api.github.com (and on not being rate-limited). This is pretty unfortunate.