please
please copied to clipboard
github_repo will fail confusingly if the given revision downloads a file with a different prefix.
If a github_repo
rule is used for a repo and the revision argument is set to something that doesnt exist, please will still download whatever the main branch is and try to run arcat
on it with the -s {expected but different prefix}
flag set.
However, this makes arcat
fail silently and some subsequent rule will then fail stating the directory that should have been given by arcat
was not found as it wasnt created in the first place.
Ideally either arcat
should report that the prefix didnt exist, or there should be a check to see if downloaded file name doesnt match what was expected.
I wasn't able to reproduce this with the following MWE:
github_repo(
name = "repo",
repo = "thought-machine/please",
revision = "0123456789abcdef0123456789abcdef01234567",
)
because GitHub (correctly) returns a 404 instead of an archive:
$ plz-out/bin/src/please build //tests:repo
Build stopped after 1.38s. 1 target failed:
//tests:_repo#download
1 error occurred:
* Error retrieving https://github.com/thought-machine/please/archive/0123456789abcdef0123456789abcdef01234567.zi
p: 404 Not Found
The same happens when revision
is set to something that doesn't look like a commit hash (e.g. a non-existent branch name).
The thing about arcat not erroring if the value of -s
isn't present at the front of any of the archive member names is indeed a known problem, though - I ran into it in #2829.
Ok, so I looked into it a bit more.
The issue I was facing was with mysql-connector-cpp.
I had incorrectly set please to use the master
branch which does not exist (or no longer exists?) as they use trunk
instead.
So it was trying to fetch https://github.com/mysql/mysql-connector-cpp/archive/master.zip
, which appears to fetch the correct trunk
branch from github instead, everything named with the trunk
prefix.
For the record, the 'actual' url is supposed to be https://github.com/mysql/mysql-connector-cpp/archive/trunk.zip
as expected, which works fine in please.
I also checked some other common branch names, such as main
, which shows the correct 404 response as you said.
I would still say that please should confirm if the downloaded file actually matches, but this seems to be more related with some (intentional?) inconsistencies with github.
Right - I suspect what happened here is that the owner(s) of mysql-connector-cpp
renamed the default branch, which - as you suggested - will set up a redirection of requests for the original branch name to the new branch name (although this happens for all renamed branches, not just the default branch).
The downloading of the archive in github_repo
is ultimately handed off to remote_file
, so this is really outside the scope of the subrepo rules. Perhaps it makes sense to add a new follow_redirects
parameter to remote_file
, defaulting to True
for backwards compatibility but allowing for problems like this to be surfaced in a more obvious way by setting it to False
. I think this is a better solution than trying to hack in GitHub-specific workarounds in github_repo
(especially because any hashes
specified on the github_repo
target will no longer be valid for the downloaded archive after the branch is renamed, which IMO dooms any attempt to handle the branch rename automatically and transparently). I'll have a chat with the Please folks tomorrow and see what they think.