feat: support additional services for blue-green. fixes #451
Add support for additional preview and active services for the BlueGreen strategy.
Checklist:
- [x] Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this is a chore.
- [x] The title of the PR is (a) conventional with a list of types and scopes found here, (b) states what changed, and (c) suffixes the related issues number. E.g. "fix(controller): Updates such and such. Fixes #1234".
- [x] I've signed my commits with DCO
- [x] I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
- [ ] My builds are green. Try syncing with master if they are not.
- [ ] My organization is added to USERS.md.
This is my first contribution to Argo Rollouts, and I feel like #451 isn't really closed if this doesn't encompass the Canary strategy. I want to work on it, but first I wanted to write E2E tests for this feature. I'm rather new to Go, so I'm not sure if I'm doing this right. Still combing through the available tests to see if I can find a good example to follow.
Support for multiple services is added as two new optional fields,
additionalPreviewServices and additionalActiveServices. When populated with
service names, these lists cause the listed services to be updated as
though they were the preview service or the active service.
Help with this would be much appreciated.
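To make the intended behavior concrete, here is a minimal Go sketch of how the primary and additional Services could be gathered into one list of update targets. It is an illustration under assumptions, not the code in this PR: the `BlueGreenServices` struct is a hypothetical, trimmed-down stand-in for the real strategy type, the helper names are invented, and the Service names in `main` are placeholders.

```go
package main

import "fmt"

// BlueGreenServices is a hypothetical, trimmed-down mirror of the BlueGreen
// strategy fields discussed in this PR. It is NOT the real Argo Rollouts type.
type BlueGreenServices struct {
	ActiveService             string
	PreviewService            string
	AdditionalActiveServices  []string
	AdditionalPreviewServices []string
}

// collectTargets returns the primary Service first, followed by the
// additional ones. Empty names are skipped.
func collectTargets(primary string, additional []string) []string {
	var out []string
	if primary != "" {
		out = append(out, primary)
	}
	for _, name := range additional {
		if name != "" {
			out = append(out, name)
		}
	}
	return out
}

// activeTargets lists every Service that should be switched together with
// the active Service (assumed behavior, based on the description above).
func activeTargets(s BlueGreenServices) []string {
	return collectTargets(s.ActiveService, s.AdditionalActiveServices)
}

// previewTargets does the same for the preview side.
func previewTargets(s BlueGreenServices) []string {
	return collectTargets(s.PreviewService, s.AdditionalPreviewServices)
}

func main() {
	// Placeholder Service names, chosen only for illustration.
	spec := BlueGreenServices{
		ActiveService:             "app-active",
		PreviewService:            "app-preview",
		AdditionalActiveServices:  []string{"app-active-udp"},
		AdditionalPreviewServices: []string{"app-preview-udp"},
	}
	fmt.Println(activeTargets(spec))  // [app-active app-active-udp]
	fmt.Println(previewTargets(spec)) // [app-preview app-preview-udp]
}
```

Keeping the primary Service at index 0 means the singular field stays the obvious reference point, while the additional entries are only ever updated alongside it.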
Kudos, SonarCloud Quality Gate passed! 
0 Bugs
0 Vulnerabilities
0 Security Hotspots
1 Code Smell
No Coverage information
0.0% Duplication
Go Published Test Results
1 984 tests 1 984 :heavy_check_mark: 2m 37s :stopwatch: 118 suites 0 :zzz: 1 files 0 :x:
Results for commit 0e94be5f.
:recycle: This comment has been updated with latest results.
E2E Tests Published Test Results
4 files 4 suites 3h 34m 19s :stopwatch: 96 tests 80 :heavy_check_mark: 5 :zzz: 11 :x: 400 runs 364 :heavy_check_mark: 20 :zzz: 16 :x:
For more details on these failures, see this check.
Results for commit 0e94be5f.
Would it make sense to add activeServices/previewServices, and just make them mutually exclusive?
> Would it make sense to add activeServices/previewServices, and just make them mutually exclusive?
If I understand your suggestion correctly, I don't think so. My original use case was that I had two Services for a specific service: one attached to a Network Load Balancer, redirecting UDP traffic, and a simple ClusterIP Service referenced by an Ingress. That meant two active and two preview Services.
There are more use cases reported in #451. The problem with just using a list is it's not explicit which Service might be the "origin of the truth" in the case of recovering from a crash. Not that that behavior is really made explicit here.
@d3adb5 Sorry, I realize on re-reading that I wasn't clear there. I didn't mean to make them mutually exclusive with each other (I know you need both active and preview).
I meant to make the plural and the singular (e.g. activeService and activeServices) mutually exclusive.
I now realize this is similar to the suggestion @huikang made on #451 (https://github.com/argoproj/argo-rollouts/issues/451#issuecomment-933017263), though I don't know whether they meant to make them mutually exclusive (basically, you can use only one: the singular or the plural), with the plural preferred going forward since it works with one or more services.
As for the source of truth, I would make the first in the list the SoT, and document that.
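A minimal Go sketch of this alternative, assuming a plural activeServices field alongside the existing singular one: validation rejects specs that set both, and the first list entry is treated as the source of truth. The struct, function, and error names are hypothetical, and rejecting a spec with neither field set is an assumption made for the example rather than something decided in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// ActiveServiceFields is a hypothetical stand-in for the two ways of naming
// active Services discussed above. It is not a real Argo Rollouts type.
type ActiveServiceFields struct {
	ActiveService  string   // existing singular field
	ActiveServices []string // proposed plural field
}

var (
	errBothSet = errors.New("activeService and activeServices are mutually exclusive")
	// Assumption for this sketch: at least one active Service must be named.
	errNoneSet = errors.New("one of activeService or activeServices is required")
)

// resolveActive returns the source-of-truth Service and the full list of
// Services to update. With the plural field, the first entry is the SoT.
func resolveActive(f ActiveServiceFields) (string, []string, error) {
	switch {
	case f.ActiveService != "" && len(f.ActiveServices) > 0:
		return "", nil, errBothSet
	case f.ActiveService != "":
		return f.ActiveService, []string{f.ActiveService}, nil
	case len(f.ActiveServices) > 0:
		return f.ActiveServices[0], f.ActiveServices, nil
	default:
		return "", nil, errNoneSet
	}
}

func main() {
	// Placeholder Service names, chosen only for illustration.
	sot, all, err := resolveActive(ActiveServiceFields{
		ActiveServices: []string{"app-active", "app-active-udp"},
	})
	fmt.Println(sot, all, err)

	_, _, err = resolveActive(ActiveServiceFields{
		ActiveService:  "app-active",
		ActiveServices: []string{"app-active-udp"},
	})
	fmt.Println(err)
}
```

The trade-off is that the source of truth depends on list order, so that rule would have to be documented for users.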
This PR is stale because it has been open 90 days with no activity.
I realize it's been a while since I touched this PR. Life's been busy, now there are even merge conflicts. I'll get to it sooner or later.
> As far as source of truth, i would make the first in the list the SoT, and document that.
My goal with the additional* parameters was to avoid having to document which one will be the SoT. My hope was that reading the spec would make it immediately clear which Service would be used in case of recovery and which ones would only receive secondary updates, rather than leaving the user to guess which one is the SoT and read the documentation to confirm their assumptions.
Kudos, SonarCloud Quality Gate passed! 
0 Bugs
0 Vulnerabilities
0 Security Hotspots
1 Code Smell
No Coverage information
11.4% Duplication
Codecov Report
Patch coverage: 39.39% and project coverage change: -0.07 :warning:
Comparison is base (6ac1533) 81.68% compared to head (0e94be5) 81.61%.
Additional details and impacted files
@@ Coverage Diff @@
## master #2603 +/- ##
==========================================
- Coverage 81.68% 81.61% -0.07%
==========================================
Files 133 133
Lines 20178 20211 +33
==========================================
+ Hits 16483 16496 +13
- Misses 2843 2857 +14
- Partials 852 858 +6
| Impacted Files | Coverage Δ | |
|---|---|---|
| rollout/bluegreen.go | 65.94% <21.42%> (-2.36%) | :arrow_down: |
| rollout/service.go | 76.82% <52.63%> (-2.15%) | :arrow_down: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
This PR is stale because it has been open 90 days with no activity.
Hello @d3adb5 any updates on this? :) We are in need of the same feature and wonder if we can help contribute and push for the feature.
Hey @fdfzcq, I haven't had the time to look back at this, and since I've moved on from the project where the use case came up it took a hit in my list of priorities. Last I remember I was a little confused about how to add E2E tests for this and still wondering if I'm going about the type change the right way.
I have a rough week or two ahead of me at work right now, but afterwards I'll be able to commit myself to this. If you want, you can always take over this implementation. Want to explain your use case here or elsewhere so I get a clearer picture of what you guys need?
@d3adb5 Our use case is similar to yours. We want to use Argo Rollouts in a situation where we route traffic to the same deployment through multiple Service resources. We need multiple Service resources because each Service defines a different protocol (UDP and TCP) for the same L4 pass-through LB (GKE docs). Multiple protocols on a single Service resource are not supported if we want GKE to manage a corresponding L4 pass-through LB for us, which is why we're hoping that Argo Rollouts might support multiple Service resources instead.
So, it has been a while since there were changes made here. @d3adb5 if you are still on this, please let me/us know if/how I/we can support.
This is a feature that I would really love to see in Argo, and that would solve a big pain point for me.
If you cannot find time to contribute to this any more, that is fair. Life moves on and so do we. Just let us know, so that the rest of the community can take over. And thanks for creating this PR in the first place.
Sorry for the delay getting back to everyone. @eduardOrthopy, it would help me quite a bit to understand how to properly run and write E2E tests for this. For some reason E2E tests with a kind or k3d cluster on my machine are always failing intermittently. I'm new to Go, too, so any feedback on what I have here would be well appreciated.
I rebased on master and solved conflicts, ran make codegen again, and will be looking into this during my spare time. Please add as much feedback as you like here! If anybody wants to push to my branch, I'm fine with that too. :thinking:
Published E2E Test Results
4 files 4 suites 3h 9m 4s ⏱️ 113 tests 101 ✅ 7 💤 5 ❌ 460 runs 425 ✅ 28 💤 7 ❌
For more details on these failures, see this check.
Results for commit 4cd5ad08.
Published Unit Test Results
2 296 tests 2 296 ✅ 2m 59s ⏱️ 128 suites 0 💤 1 files 0 ❌
Results for commit 4cd5ad08.
Quality Gate passed
Issues
1 New issue
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
@d3adb5 sorry for the delay on my side this time. As I said, it is open source, and you have a life, as do all others that work on this. Priorities change over time.
Thanks for letting the community know how you feel about joint work on this topic.
I will check out the branch and look at the IT E2E testing part. Maybe I can get some clarity into that.