godot-ci
Gitlab CI Jobs randomly fail
Did you ever encounter weird phenomena like these:
Running with gitlab-runner 13.8.0 (775dd39d)
on newest xxxxxx
Resolving secrets
00:00
Preparing the "docker" executor
...
Preparing environment
00:01
Running on runner-xxxxxxxx-project-169-concurrent-0 via c9efa222138c...
Getting source from Git repository
00:04
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /builds/<group>/<project>/.git/
fatal: shallow file has changed since we read it
Cleaning up file based variables
00:02
ERROR: Job failed: exit code 1
?
It happens randomly to different jobs (mostly the Linux and Windows export stages): sometimes files go missing, sometimes I get this. It's frustrating.
EDIT: I am running a docker-in-docker runner configuration and set concurrent=4 in the runner toml file, so there should never be more jobs than this setting allows. I was able to work around this issue by adding retries to the jobs inside the .gitlab-ci.yml file. I'm unsure whether this is a GitLab, Godot command line, or godot-ci-specific bug.
I recently started getting these errors for Windows builds as well.
There's a link to https://api.itch.io/wharf/builds/... in the log; here are its contents:
{
"errors": [
"method not supported"
],
"details": "The HTTP method you used (GET) does not work with this API endpoint"
}
I ran into the same issue today. For me it turned out that I had 2 gitlab-runner processes using the same runner config:
- One that I'd started after registering a runner using gitlab-runner run
- And another running in the background from the gitlab-runner.service. I didn't think I would have that one since I never explicitly ran gitlab-runner install/start.
I also noticed that something weird was going on: even though I had set concurrent = 1, two jobs were being picked up at the same time on my custom runner.
I finally figured it out when I tried to create the service and saw that it was already there :man_facepalming:
The issue was resolved after killing the manually started runner ^ ^
TLDR
If you have manually started a runner (e.g. using gitlab-runner run), make sure that no other runner instance is running against the same config. It could be a manual one or a service one, e.g. if systemctl status gitlab-runner.service shows the service as active.
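For anyone who wants to check this quickly, here is a rough sketch of the check described above. The function names count_runner_processes and check_runners are made up for this example and are not part of gitlab-runner. It also assumes a Linux host where ps -eo args prints full command lines. It only counts processes; it does not prove that they share a config file, so treat a warning as a hint to look closer.

```shell
#!/bin/sh
# Sketch: detect more than one gitlab-runner process on a host.
# The function names below are hypothetical helpers for this example.

# Count "gitlab-runner ... run" command lines read from stdin.
# Input is expected to be one full command line per line, for example
# the output of `ps -eo args`.
count_runner_processes() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under `set -e`.
    grep -E -c '(^|/)gitlab-runner( .*)? run( |$)' || true
}

# Inspect the live system and report what was found.
check_runners() {
    running=$(ps -eo args | count_runner_processes)
    echo "gitlab-runner processes in run mode: $running"

    # systemctl only exists on systemd hosts, so guard the call.
    if command -v systemctl >/dev/null 2>&1; then
        echo "gitlab-runner.service: $(systemctl is-active gitlab-runner.service)"
    fi

    if [ "$running" -gt 1 ]; then
        echo "warning: more than one runner process, check their configs" >&2
        return 1
    fi
}
```

After sourcing the snippet, call check_runners on the runner host. If it reports more than one process, stop either the manually started one or the service so that only one instance is left.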
This also happens even if you don't accidentally have two gitlab-runner instances running with the same config.