Improve CI pipeline and integration tests
I have opened a few PRs to move from Travis CI to GitHub Actions and to update the integration tests.
The following is a proposed order for processing the PRs.
Independent of this, new feature:
- #898
- #909
Independent of this, fix docs:
- #900
- #904
- #919
bring back linter:
- #899
- ~later some PR fix linting issues~
- #922
- ~bring back linter for windows~
update integration test docker images:
- #907
- #902
- #906
- #908
- new PR to remove old images
- #901
- new PR to use the new Docker images from ghcr (incl. new MD5 checksums to bypass the build in CI)
- decide when to build new docker test images
After that it is possible to move from Travis CI to Github Actions.
- #928
- #946
- #916
- ~update call of linter in makefile (example)~ (#922)
- rebuild or remove the scripts in the ci folder
- review whether codeclimate is still needed
- #933
It may be useful to provide all CIs with a uniform wording and an optimized sequence at the end.
You've been busy, love it!
I'll go through these PRs and start reviewing/merging the ones with the fewest dependencies.
Two questions regarding integration-tests:
- Did you ever find a fix for the systemd in docker thing?
- I haven't looked at the PRs yet, do they leverage the make targets?
- If not, I want to make sure it's reproducible locally outside of CI if possible.
- There's also act - written by a former colleague of mine, one of the best engineers I had the pleasure of working with. I've never personally used it.
> Did you ever find a fix for the systemd in docker thing?
The topic is very annoying. Docker is simply not made for running multiple processes and supervising them from a separate process (systemd). There are ways to start multiple processes in a container, but they don't help here, because the container is used to simulate a full operating system. The only real solution is to call docker run with the correct parameters.
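As a rough illustration of what "the correct parameters" can look like, here is a minimal Python sketch that only assembles a `docker run` command line for a systemd-based container. The flag set below is a commonly used combination and an assumption on my part, not the set used by the PRs in this thread. The image name, container name, and the `/sbin/init` path are hypothetical, and hosts using cgroup v2 may need different options.

```python
import shlex


def systemd_docker_run_args(image: str, name: str) -> list[str]:
    """Assemble (but do not execute) a docker run command for a systemd container.

    Assumption: the flags below are one commonly used combination for running
    systemd as PID 1; they are not taken from the goss repository.
    """
    return [
        "docker", "run",
        "--detach",
        "--name", name,
        "--privileged",                # systemd needs broad access to the host kernel interfaces
        "--tmpfs", "/run",             # systemd expects writable tmpfs mounts here
        "--tmpfs", "/run/lock",
        "--volume", "/sys/fs/cgroup:/sys/fs/cgroup:ro",
        image,
        "/sbin/init",                  # hypothetical init path; depends on the image
    ]


if __name__ == "__main__":
    # Hypothetical image and container names, for illustration only.
    args = systemd_docker_run_args("example/systemd-test:latest", "goss_int_test")
    print(shlex.join(args))
```

Building the argument list separately from executing it keeps the sketch easy to inspect and to compare against whatever the make targets actually pass.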
> I haven't looked at the PRs yet, do they leverage the make targets?

All these changes run with Travis CI, with the GitHub workflow, and locally on my Ubuntu machine (make test-all).
One issue remains open from my point of view. The test that checks which runlevel a service runs in is not compatible with current Linux versions. My knowledge of Go is not good enough to investigate this.
@dklimpel random question, are you on gophers slack by any chance?
Is there a dependency between #900 and #904, or an order in which to merge them? Both have failing builds currently.
After those are merged, then the long awaited journey to move off of Travis begins.
They have different failing jobs, and each fixes different jobs. I would start with #900 and then merge main into #904. After that, #904 should no longer fail.
#900 merged, #904 updated
> One issue remains open from my point of view. The test that checks which runlevel a service runs in is not compatible with current Linux versions. My knowledge of Go is not good enough to investigate this.

Which Linux (branch) is failing? I can check out that branch and investigate next week. I wonder if it's a Goss bug, or working as intended.
> Which Linux (branch) is failing? I can check out that branch and investigate next week. I wonder if it's a Goss bug, or working as intended.
It is a problem with Debian and Ubuntu:
- #902
- #908
Also on my local Ubuntu machine.
Okay, thanks. I'll check them both out.
Which PRs are next for documentation?
https://github.com/goss-org/goss/pull/919?
I have tried to put it into a sorted list above.
#904 fixes the linting issues; after that, the documentation should be valid. #919 makes sure the documentation pipeline is triggered by upcoming changes.
GitHub automatically creates a workflow on every push to master: https://github.com/goss-org/goss/actions/workflows/pages/pages-build-deployment
IMHO this is failing because it is a default job that uses the Jekyll theme. It can probably be disabled in the project settings: https://github.com/goss-org/goss/settings/pages (change the source to "GitHub Actions"). Readthedocs should not need it, because I think it works via triggers. GitHub Pages is not used here.
Changed, I guess next PR to be merged will validate this?
Also, please continue to use this issue to let me know the next PR in the chain. I find it a lot easier to track on here.
This is an amazing level of work by the way, much appreciated. It's something I've wanted for a long time. Unfortunately, due to limited time I never got around to it, my focus tends to be:
- Bugs/security findings
- Features
- Everything else (refactor, CI, etc..)
> Changed, I guess next PR to be merged will validate this?

Yes, it will.
My suggestion for the next steps.
- #919 to finish the docs topic and get a reliable pipeline. It does not change any program code.
- #899 to bring back linting to prevent bugs
Hey @dklimpel , if you don't mind. Let me know on here the next PR that's ready and I'll review.
This is an awesome amount of work you put in, it's greatly appreciated!
To improve the code:
- #922 or bring tests up to date
- #907
Holding off on merging more PRs until Travis oss credits are replenished.
Don't want CI to get in a broken state with unclear traceability on what caused it.
This is an awesome amount of work. Can't thank you enough for taking the time to do this.
> Holding off on merging more PRs until Travis oss credits are replenished.
Ok.
I think #928 can help. It enables running the unit tests in a GitHub Actions pipeline.
Using this as a central coordination point: which PRs are ready for merge, so I can start going through them?
I would recommend finalizing a few topics before we take a look at the docker images.
- bring unit tests to GitHub Actions: #928
- remove codeclimate, which is no longer needed: #933
- update docs and close #931 #934
- #925
- new release to fix #941
> We are unable to start your build at this time. You exceeded the number of users allowed for your plan. Please review your plan details and follow the steps to resolution.
Dealing with more travis-ci issues, waiting on support. Figured it's relevant to this issue given the work-effort here =)
If you would like to run the integration tests with GHA:
- #946
I have added a replacement of CentOS in:
- #906
The creation of binaries for tags / releases is included here:
- #916
IMHO this would be the last step to replace the functionality of Travis CI.
After that, there will certainly be some clean-up work to do, such as renaming variables like "$TRAVIS_TAG".
Travis-ci seems to be failing on rockylinux9 build, but I can't figure out the reason to be honest.
Last good build: https://app.travis-ci.com/github/goss-org/goss/builds/271468694?serverType=git
failing build: https://app.travis-ci.com/github/goss-org/goss/builds/271478309?serverType=git
The git commit SHA is the same. Do you know of anything else that could have changed on the CI side between a regular build and a release build?
Also, I noticed this issue with GHA on releases: https://github.com/goss-org/goss/actions/runs/9982852891
Just putting all issues I've encountered so far in one comment.
> Travis-ci seems to be failing on rockylinux9 build, but I can't figure out the reason to be honest.
There is something going wrong with the user agent.
HTTP: http://httpbin/headers: Body:
Expected
"object: *http.cancelTimerBody"
to have patterns
["\"Foo\": \"bar\"","\"User-Agent\": \"goss/0.0.0\""]
the missing elements were
["\"User-Agent\": \"goss/0.0.0\""]
I can take a deeper look in the next few days.
> failing build: https://app.travis-ci.com/github/goss-org/goss/builds/271478309?serverType=git

It is either flaky or a problem with Travis. I ran a current build in #950 and it was successful: https://app.travis-ci.com/github/goss-org/goss/jobs/624212980
Odd, I reran the failed test like 5 times and the same error every time. Let me try pushing a new tag.. I wonder if travis has some logic to re-use the same flaky instance on retries.
Oh, I see what the issue is! It's not Travis. The test is written in a way that assumes goss is always a dev build, i.e. version 0.0.0, so it fails on a real release build, which has a real version.
> The test is written in a way that assumes goss is always a dev build, i.e. version 0.0.0, so it fails on a real release build, which has a real version.
Good catch!
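One possible way to make such a check independent of the build version is to match the User-Agent with a pattern instead of the literal `goss/0.0.0`. The Python sketch below only illustrates the idea and is not the fix that was applied in goss. The function names, the release version string `v0.4.8`, and the optional `v` prefix are assumptions for illustration.

```python
import json
import re

# Assumption: release builds report something like "goss/v0.4.8" or "goss/0.4.8",
# while dev builds report "goss/0.0.0". The pattern accepts all of these.
USER_AGENT_PATTERN = re.compile(r'"User-Agent": "goss/v?\d+\.\d+\.\d+[^"]*"')


def body_has_goss_user_agent(body: str) -> bool:
    """Return True if the body reports a goss User-Agent of any version."""
    return USER_AGENT_PATTERN.search(body) is not None


def fake_headers_body(version: str) -> str:
    """Build a body shaped like the /headers response quoted in the thread."""
    headers = {"Foo": "bar", "User-Agent": f"goss/{version}"}
    return json.dumps({"headers": headers}, indent=2)


if __name__ == "__main__":
    # "v0.4.8" is a hypothetical release version used only for this demo.
    for version in ("0.0.0", "v0.4.8"):
        print(version, body_has_goss_user_agent(fake_headers_body(version)))
```

A version-tolerant pattern keeps the test meaningful (the header is still set by goss) without tying it to the dev build.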