Proposal: Providing a Consistent CI/CD Experience

Open Brandon-Kimberly opened this issue 4 years ago • 28 comments

Introduction

Across all OpenTelemetry repositories there are currently five different, active CI providers. Each of these providers has its own way of executing tests, interacting with the user, and publishing test results. This can make it difficult for newcomers to contribute to multiple OpenTelemetry repositories.

Current Landscape

| Repository | CI Provider | Automated Build and Test | Code Coverage | Automated Performance Testing | Automated Deployment | Automated Docs Deployment |
|---|---|---|---|---|---|---|
| Collector | CircleCI | [x] | [x] | [x] | [x] | [] |
| C++ | GHA | [x] | [x] | [x] | [] | [] |
| JavaScript | CircleCI/GHA | [x] | [x] | [] | [] | [x] |
| .NET | Azure | [x] | [x] | [] | [] | [] |
| PHP | Travis | [x] | [x] | [] | [] | [] |
| Java | CircleCI | [x] | [x] | [x] | [x] | [x] |
| Python | Travis/CircleCI | [x] | [x] | [] | [x] | [x] |
| Ruby | CircleCI | [x] | [] | [] | [x] | [] |
| Go | CircleCI | [x] | [x] | [] | [] | [] |
| Swift | GHA/Scope | [x] | [] | [] | [] | [] |
| Rust | CircleCI | [x] | [x] | [] | [] | [x] |
| Erlang | CircleCI | [x] | [x] | [x] | [] | [] |

Proposal

I propose that all languages consider using the same CI provider. This would create a more consistent development process and make it easier for developers to contribute to multiple language libraries.

We suggest that provider be GitHub Actions. Here’s why:

Ease-of-Use

CircleCI and Travis will automatically run when pull requests and commits are issued against the repository. But if a contributor forks the repository, unless they set up an account with the CI provider and link it to their forked repository, CI will not be activated and tests will not be run automatically.

In contrast, GitHub Actions works out of the box on a forked repository and can be easily configured to run a test workflow each time a commit is issued. This would help individual contributors test their code and ensure code quality before submitting a pull request against the repository.

Transparency

Current CI providers such as CircleCI and Travis allow anyone to view the console output when building and running tests, but the test results cannot be seen anywhere on the GitHub repository. To view this testing output, you need to go to a different website, navigate a different user interface, and then sift through thousands of lines of console output. This is not a seamless developer experience.

In contrast, using GitHub Actions would provide all testing output directly on the repository’s GitHub page, which would help contributors to find, read, and use the test output to maintain code quality.

Control

GitHub Actions’ integration with other GitHub features means you can have finer control over the CI pipeline. For example, certain workflows can be set to only run on a new release. Workflows can even be used to close stale issues and pull requests.

Recommendation

I recommend that we consider using one consistent CI provider, GitHub Actions, which provides an integrated and seamless developer experience for all contributors.

Example

Please see this example that the C++ repository has adopted for the above reasons.

Brandon-Kimberly avatar Jun 25 '20 01:06 Brandon-Kimberly

Thank you for the analysis. The last time this question was discussed, we intentionally left the choice of CI pipeline to the maintainers of individual repositories. Given all the advantages you mentioned, there may be enough incentive for maintainers to switch. If not, I'm not sure how we can weigh the benefit of making it easier for newcomers to contribute to many repositories against the potential issues and overhead for maintainers of switching to GitHub Actions.

If you want to take the lead on helping individual repositories with this switch, that would be great.

Also, for the standardization effort I'd advocate for even greater transparency and suggest we make builds fully containerized. This will make builds even more transparent and easier to try locally. It will also ensure that anybody can release a version without depending on GitHub.

SergeyKanzhelev avatar Jun 25 '20 19:06 SergeyKanzhelev

For Java, we get automatic Javadoc publishing via javadoc.io. https://www.javadoc.io/doc/io.opentelemetry I'm not sure if the docs-deployment was another thing, though.

jkwatson avatar Jun 29 '20 17:06 jkwatson

For Java, we get automatic Javadoc publishing via javadoc.io. https://www.javadoc.io/doc/io.opentelemetry I'm not sure if the docs-deployment was another thing, though.

Good catch! Updated.

Brandon-Kimberly avatar Jun 29 '20 17:06 Brandon-Kimberly

hey @Brandon-Kimberly! in the Java Instrumentation repository, we are currently using CircleCI's xlarge instances (8 cores, 16 GB). I tried super unsuccessfully a while back to fit into the CircleCI free tier (2 GB), but I think it's much more likely we could fit into the Github Actions runner (2 cores, 7 GB). We would need to reduce parallelism within each job, which would likely bump build time from ~20 min to ~1 hour, so we'll probably also need to split out into more parallel jobs in order to keep the build time reasonable (well, i don't want to imply ~20 min build time is reasonable, but at least not worse than that?)
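
A rough way to sanity-check those numbers is the Python sketch below. It assumes build time scales inversely with core count and that the work splits evenly across jobs, which is an idealization. The function names and the linear model are illustrative assumptions, not measurements from the repository.

```python
import math


def estimated_minutes(baseline_minutes, baseline_cores, runner_cores, parallel_jobs=1):
    """Idealized estimate: total core-minutes stay constant and divide
    evenly across the cores of each runner and across parallel jobs."""
    core_minutes = baseline_minutes * baseline_cores
    return core_minutes / (runner_cores * parallel_jobs)


def jobs_needed(baseline_minutes, baseline_cores, runner_cores, target_minutes):
    """Smallest number of parallel jobs that meets the target under the same model."""
    core_minutes = baseline_minutes * baseline_cores
    return math.ceil(core_minutes / (runner_cores * target_minutes))


# Figures from the comment: ~20 min on 8 cores, moving to 2-core runners.
single_job = estimated_minutes(20, 8, 2)   # one job on a 2-core runner
jobs = jobs_needed(20, 8, 2, 20)           # jobs needed to stay near 20 min
print(single_job, jobs)
```

Under this simplified model a single 2-core job comes out at 80 minutes, the same ballpark as the ~1 hour estimate, and about four parallel jobs would be needed to stay near 20 minutes. Real builds rarely scale this cleanly, so treat the result as an order-of-magnitude guide only.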

also, it's news to me that i'm the mentor for this project, i don't object though 😄

trask avatar Jun 29 '20 22:06 trask

We are pretty heavily invested in CircleCI in the Collector. Unless someone volunteers to migrate all of it, moving to GitHub Actions or anywhere else is going to be problematic.

tigrannajaryan avatar Jun 30 '20 21:06 tigrannajaryan

I agree that CircleCI seems to be pretty broadly supported/common for Go projects, and thus that we as the Go developers would probably prefer to use Circle, and we've heard above that the Collector wants to remain on CircleCI as well...

lizthegrey avatar Jul 06 '20 17:07 lizthegrey

Updated list: .NET repo migrated from Azure pipelines to Github actions.

cijothomas avatar Jul 09 '20 17:07 cijothomas

I'm closing this issue. Please re-open if there are more arguments for aligning CI tools.

SergeyKanzhelev avatar Jul 09 '20 17:07 SergeyKanzhelev

I feel like we need more discussion on this. Maybe not all projects will be able to migrate to GH actions, but some could? Maybe @trask still wants to go that way.

In any case, we should make it clear that we have overall guidelines, and the conclusions from this ticket could help.

PS - @trask no worries, you won't be a mentor, we only needed your feedback 😃 (unless you have free cycles to mentor this).

carlosalberto avatar Jul 09 '20 17:07 carlosalberto

What kind of conclusion do you feel would be useful beyond declaring that it's at the maintainers' discretion, but we recommend GitHub Actions?

BTW, as I mentioned in my comment above, do we want to push for fully dockerized build definitions?

SergeyKanzhelev avatar Jul 09 '20 17:07 SergeyKanzhelev

beyond declaring that it's at the maintainers' discretion, but we recommend GitHub Actions

I think this would be a good thing, yes (in case we reach an agreement). In this case, we would get a little bit of uniformity (hopefully we will only be using GH Actions + CircleCI).

carlosalberto avatar Jul 09 '20 17:07 carlosalberto

we only needed your feedback

I started converting Java Instrumentation repo to Github Actions.

In GitHub Actions we need to parallelize across lots of jobs to get performance similar to what we were getting from CircleCI, where we are parallelizing both across jobs and within jobs (by using larger hardware on a paid plan).

But Github Actions has max 20 parallel builds across the whole org (we can bump that up with a paid plan).

Also, there's not a configuration to auto-cancel old builds when you push updates to a PR, which ends up really clogging those 20 parallel builds for a long time when people append PRs (which seems to happen fairly often, given that our builds take a long time in the first place). There are some custom Github Actions to auto cancel old builds (https://github.com/marketplace?type=actions&query=cancel) but they rely on getting a chance to run, which they don't when your whole build queue is clogged.
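
To illustrate what "auto-cancel old builds" means in practice, here is a minimal Python sketch of the selection logic only. The `Run` record and its fields are hypothetical and invented for this example. It does not call the GitHub API, and it assumes run IDs increase with each new run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    run_id: int      # hypothetical: assumed to increase with each new run
    pr_number: int   # pull request the run belongs to
    status: str      # "queued", "in_progress", or "completed"


def superseded_runs(runs):
    """Return unfinished runs that are older than the newest run of the same PR."""
    newest = {}
    for run in runs:
        if run.pr_number not in newest or run.run_id > newest[run.pr_number]:
            newest[run.pr_number] = run.run_id
    return [
        run for run in runs
        if run.status != "completed" and run.run_id < newest[run.pr_number]
    ]


runs = [
    Run(1, 10, "in_progress"),
    Run(2, 10, "queued"),
    Run(3, 11, "queued"),
    Run(4, 10, "queued"),   # latest push to PR 10 supersedes runs 1 and 2
]
to_cancel = [run.run_id for run in superseded_runs(runs)]
print(to_cancel)
```

Here runs 1 and 2 would be selected for cancellation, since run 4 is the newest for the same PR. As noted above, logic like this only helps if the step that performs the cancellation gets a runner slot in the first place.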

trask avatar Jul 09 '20 19:07 trask

We had a brief chat with @trask regarding the CI tools for OpenTelemetry, and I would definitely suggest migrating to the GitHub Actions.

CNCF has already established agreements and generous plans with Azure Pipelines and GitHub Actions, and we already have a good experience of other CNCF project migration to these solutions.

But Github Actions has max 20 parallel builds across the whole org (we can bump that up with a paid plan).

Eg, this is not an issue with the GHA offering for CNCF.

idvoretskyi avatar Jul 30 '20 19:07 idvoretskyi

Great to hear that the GHA limitations don't apply to the OT account. @Brandon-Kimberly can you post an update on the coverage we've completed for various repos :-)

alolita avatar Jul 31 '20 01:07 alolita

This is the current landscape of CI/CD in OpenTelemetry:

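| Repository | CI/CD Provider | Using GitHub Actions as primary CI? | Migrated In Last 6 Weeks? |
|---|---|---|---|
| Collector | CircleCI | | |
| Python | CircleCI/GHA | | |
| JS | CircleCI/GHA | | |
| C++ | GHA | | |
| .NET | Azure/GHA | | |
| Go | CircleCI | | |
| PHP | GHA | | |
| Java | CircleCI | | |
| Rust | GHA | | |
| Swift | GHA/Scope | | |
| Ruby | GHA | | |
| Erlang | CircleCI/GHA | | |
| Java-Instr | CircleCI | | |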

Brandon-Kimberly avatar Jul 31 '20 01:07 Brandon-Kimberly

Hey @Brandon-Kimberly, can you add the Java Instrumentation repo to your table (not that it's green, but so we track it)? thx!

trask avatar Jul 31 '20 01:07 trask

We had a brief chat with @trask regarding the CI tools for OpenTelemetry, and I would definitely suggest migrating to the GitHub Actions.

CNCF has already established agreements and generous plans with Azure Pipelines and GitHub Actions, and we already have a good experience of other CNCF project migration to these solutions.

But Github Actions has max 20 parallel builds across the whole org (we can bump that up with a paid plan).

Eg, this is not an issue with the GHA offering for CNCF.

What would be the actual limit on max parallel builds? Is it 60, or unlimited?

cijothomas avatar Jul 31 '20 02:07 cijothomas

Right now we are on the Team subscription, with a limit of 60 concurrent jobs. We received it by asking GitHub directly. @idvoretskyi if there is an agreement between CNCF and GitHub that enables a bigger limit, we will definitely benefit from it, as there are simply too many active groups working on different language SDKs and other components.

SergeyKanzhelev avatar Jul 31 '20 08:07 SergeyKanzhelev

@SergeyKanzhelev Should not be an issue from the billing standpoint, but let me double-check if there are any technical limitations on the GHA side.

idvoretskyi avatar Jul 31 '20 09:07 idvoretskyi

All the Go repos have automatic docs because of godoc.

bogdandrutu avatar Aug 13 '20 18:08 bogdandrutu

Should not be an issue from the billing standpoint, but let me double-check if there are any technical limitations on the GHA side.

Any news on this, @idvoretskyi ?

iNikem avatar Sep 14 '20 19:09 iNikem

Should not be an issue from the billing standpoint, but let me double-check if there are any technical limitations on the GHA side.

Any news on this, @idvoretskyi ?

The current status is that we are on the Team subscription and watching whether it turns out not to be enough. If we hit the limit, we will continue the conversation. Are there any concerns about the current number of jobs, or anticipated problems that are blocking something?

SergeyKanzhelev avatar Sep 14 '20 19:09 SergeyKanzhelev

Not any immediate problems, but I am thinking if it is a good idea to bring a massively parallel job to Java instrumentation repo.

iNikem avatar Sep 14 '20 19:09 iNikem

@iNikem we are in the process of exploring expanded options, but as @SergeyKanzhelev mentioned, it's not an immediate need AFAIK.

idvoretskyi avatar Sep 14 '20 19:09 idvoretskyi

Not any immediate problems, but I am thinking if it is a good idea to bring a massively parallel job to Java instrumentation repo.

If there are known numbers that wouldn't fit within the current limits and are blocking migration, let's discuss.

SergeyKanzhelev avatar Sep 14 '20 19:09 SergeyKanzhelev

Rust currently has code coverage as well as automated docs deployment

jtescher avatar Nov 03 '20 20:11 jtescher

@Brandon-Kimberly, can you open the similar "Proposal: Use GitHub Actions for CI/CD" issue in https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation repo? Or alternatively can I copy-paste the text of your issues and do it myself? :)

iNikem avatar Apr 01 '21 15:04 iNikem

@Brandon-Kimberly, can you open the similar "Proposal: Use GitHub Actions for CI/CD" issue in https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation repo? Or alternatively can I copy-paste the text of your issues and do it myself? :)

@Brandon-Kimberly, scratch that, they beat me to it :)

iNikem avatar Apr 01 '21 15:04 iNikem

I think this can be closed as a great success! Thanks everyone who contributed to the effort of converting to GitHub Actions!

trask avatar Dec 06 '22 05:12 trask