
Adds Statefulset Support

Open rvc-mf opened this issue 2 years ago • 34 comments

Description

Adds support for statefulset workloads.

This pull request introduces the following changes:

  • Adds an optional field called workloadType to values.yaml, where the default value is deployment but statefulset is also an option.
  • Adds a _statefulset_spec.tpl template which gets used when workloadType is set to statefulset. In turn, _deployment_spec.tpl only gets used when workloadType is set to deployment.
  • Adds an optional updateStrategy field to values.yaml which gets injected into updateStrategy on statefulset workloads (workloadType: statefulset). deploymentStrategy continues to be injected into strategy on deployment workloads.
  • Adds an optional name field under service, since StatefulSet workloads require a governing Service. service can still have enabled: false if the Service resource is created outside of the chart. (See the values sketch after this list.)
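A minimal values.yaml sketch of the new options (key names match the list above; the defaults shown and the example service name are illustrative):

  # Opt in to a StatefulSet instead of the default Deployment.
  workloadType: statefulset   # default: deployment

  # Injected into the StatefulSet's spec.updateStrategy when
  # workloadType is statefulset; deploymentStrategy still maps to
  # spec.strategy on deployment workloads.
  updateStrategy:
    type: RollingUpdate

  service:
    enabled: true
    # StatefulSets need a governing Service for spec.serviceName,
    # hence the new optional name field.
    name: my-app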

None of these changes are breaking, as statefulset support is optional and off by default. Existing users of the chart should be able to upgrade to this version without encountering issues. Similarly, I don't expect any downtime from the changes.

Documentation

I have updated the README in /charts/k8s-service to indicate that statefulsets are also deployable by the chart. I have also added proper documentation on each field that was added/edited in the chart's values.yaml.

I have also added three tests that verify the correctness of statefulset canary deployments, mirroring how deployment workloads are tested. You can find the output here: https://gist.github.com/rvc-mf/c2c4c82713cd8da4a95a3bf079d43cb4
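They run the same way as the existing suite, for example:

  # test name shown is illustrative
  $ go test -v -tags all -timeout 60m -run TestK8SServiceStatefulSetCanaryDeployment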

Furthermore, I have successfully deployed several statefulset applications using these changes (previously made in a "private" fork) in other projects.

TODOs

Please ensure all of these TODOs are completed before asking for a review.

  • [x] Ensure the branch is named correctly with the issue number, e.g. feature/new-vpc-endpoints-955 or bug/missing-count-param-434.
  • [x] Update the docs.
  • [x] Keep the changes backward compatible where possible.
  • [x] Run the pre-commit checks successfully.
  • [x] Run the relevant tests successfully.

Related Issues

Addresses #136

rvc-mf avatar May 19 '22 18:05 rvc-mf

Bumping this up as it's been a couple of months with no response. @autero1 Is there anything missing from this PR that I should work on before it's ready for review?

rvc-mf avatar Sep 29 '22 19:09 rvc-mf

It is a useful addition indeed 👍

mloskot avatar Apr 11 '23 19:04 mloskot

@yorinasub17 this is a good feature.

rimoore avatar May 18 '23 18:05 rimoore

Hiya! Responding because I was tagged - unfortunately, I am no longer with Gruntwork so I can't do anything to move this along 😞 Sorry I can't be of help here!

yorinasub17 avatar May 18 '23 19:05 yorinasub17

Bumping this again. @rhoboat @zackproser @autero1

rvc-mf avatar Aug 14 '23 16:08 rvc-mf

Hi @rvc-mf - none of the folks you've pinged here work at Gruntwork any longer.

zackproser avatar Aug 20 '23 16:08 zackproser

AFAICT, this project is orphaned and should be archived to indicate the unmaintained state.

mloskot avatar Aug 20 '23 19:08 mloskot

AFAICT, this project is orphaned and should be archived to indicate the unmaintained state.

How is it an orphaned project if it's actively being used by the k8s-service terraform module, one that was released not two weeks ago?

brandon-langley-mf avatar Aug 21 '23 19:08 brandon-langley-mf

Hi @rvc-mf - none of the folks you've pinged here work at Gruntwork any longer.

@zackproser I pinged the people who were assigned to review the PR automatically. Where can I find a list of people who can look at these changes?

rvc-mf avatar Aug 21 '23 19:08 rvc-mf

Hi @rvc-mf, I'll make sure this PR gets routed to the right SME for review.

arsci avatar Aug 21 '23 20:08 arsci

AFAICT, this project is orphaned and should be archived to indicate the unmaintained state.

How is it an orphaned project if it's actively being used by the k8s-service terraform module, one that was released not two weeks ago?

I use the charts/k8s-service in my own IaC, and I've been watching this project for quite a while. Based on at least two observations:

  • this fairly trivial, non-intrusive PR has been dangling for almost 1.5 years without any conclusive decision
  • my #156, about quite a serious issue (at least from the point of view of AKS users), has been dangling without a single word of response for half a year now

the gut feeling from my 20+ years of experience as an open source developer and contributor tells me that this project has been orphaned.

mloskot avatar Aug 21 '23 21:08 mloskot

Hi folks, Gruntwork co-founder and Chief Product Officer here. I'm sorry for the delays and lack of response from our team. As a few folks have mentioned, all those Grunts subscribed to this thread happen to be no longer with the company, so this escaped our radar screen.

But as of today, it escapes us no more. As @arsci said, we'll route this to the right SMEs and get an official response on priority here.

josh-padnick avatar Aug 21 '23 22:08 josh-padnick

Hey all, as @josh-padnick mentioned, apologies for the lack of traction on this one. I just got this added to our project board to get prioritized for a review. Feel free to tag me for anything related to this repo as I am now one of the maintainers of this project. Review soon to come!

ryehowell avatar Aug 21 '23 22:08 ryehowell

Awesome news, thank you!

mloskot avatar Aug 21 '23 23:08 mloskot

@ryehowell , I believe I've accounted for all the changes you've mentioned in your comment, but I'm running into issues with the testing suite. I can't seem to get it to work, even for tests that cover files that I didn't change. See the error log below (this is on a file I have not changed):

$ go test -v -tags all -timeout 60m -run TestK8SServiceCanaryDeployment
# (...)
TestK8SServiceCanaryDeployment 2023-09-12T07:36:48-06:00 logger.go:66: Error: INSTALLATION FAILED: services "k8s-service-canary-ykmv5h-canary-test" already exists
=== NAME  TestK8SServiceCanaryDeployment
    install.go:16: 
                Error Trace:    $HOME/projects/helm-kubernetes-services/test/install.go:16
                                                        $HOME/projects/helm-kubernetes-services/test/k8s_service_canary_deployment_test.go:53
                Error:          Received unexpected error:
                                error while running command: exit status 1; Error: INSTALLATION FAILED: services "k8s-service-canary-ykmv5h-canary-test" already exists
                Test:           TestK8SServiceCanaryDeployment
TestK8SServiceCanaryDeployment 2023-09-12T07:36:48-06:00 logger.go:66: Running command helm with args [delete --namespace k8s-service-canary-ykmv5h k8s-service-canary-ykmv5h]
TestK8SServiceCanaryDeployment 2023-09-12T07:36:49-06:00 logger.go:66: release "k8s-service-canary-ykmv5h" uninstalled
TestK8SServiceCanaryDeployment 2023-09-12T07:36:49-06:00 client.go:42: Configuring Kubernetes client using config file $HOME/.kube/config with context 
--- FAIL: TestK8SServiceCanaryDeployment (0.47s)
FAIL
exit status 1
FAIL    github.com/gruntwork-io/helm-kubernetes-services/test   0.488s

I tried debugging the error and it seems like the error happens at the helm module ("github.com/gruntwork-io/terratest/modules/helm") in one of the RunCommand steps. Here's an example command the helm.Install code generates:

Running command helm with args [install --namespace k8s-service-ykmv5h -f /home/rogvc/projects/helm-kubernetes-services/charts/k8s-service/linter_values.yaml -f /home/rogvc/projects/helm-kubernetes-services/test/fixtures/main_statefulset_values.yaml k8s-service-ykmv5h /home/rogvc/projects/helm-kubernetes-services/charts/k8s-service]

Any tips on how to proceed?

rvc-mf avatar Sep 12 '23 13:09 rvc-mf

Hey @rvc-mf , apologies for the delay here. We've been working on getting some new features out for our EKS module. I'll do another review of this and take a look at the failing test as well.

@mateimicu, would you be available to help review this? I've got this in my queue, but wanted to see if you would like to take a look at this as well 😄 to cast a wider net on this.

ryehowell avatar Oct 06 '23 02:10 ryehowell

Hey @rvc-mf , apologies for the delay here. We've been working on getting some new features out for our EKS module. I'll do another review of this and take a look at the failing test as well.

@mateimicu, would you be available to help review this? I've got this in my queue, but wanted to see if you would like to take a look at this as well 😄 to cast a wider net on this.

@ryehowell sure :D, I will take a look today

mateimicu avatar Oct 06 '23 08:10 mateimicu

FYI, although I'm in no position to provide a technical review of this PR, I would like to endorse it as a user of this project. I have been using the change from this PR in a setup of a few AKS clusters for 6 months now, without any issues. Thank you @rvc-mf and others involved.

mloskot avatar Oct 06 '23 08:10 mloskot

@ryehowell , I believe I've accounted for all the changes you've mentioned in your comment, but I'm running into issues with the testing suite. I can't seem to get it to work, even for tests that cover files that I didn't change.

Hi @rvc-mf, Thanks for the PR. I will need some time to give it a good read.

Let's first try and unblock you.

I managed to run the command you provided locally, go test -v -tags all -timeout 60m -run TestK8SServiceCanaryDeployment from the main branch.

I think this fails because there is a name collision. The original Service's name is defined here as name: {{ include "k8s-service.fullname" . }}. The default configuration for the headless service you introduced is defined here and, unless overridden by .Values.headlessService.name, collides with the original one:

  {{- if .Values.headlessService.name }}
  name: {{ .Values.headlessService.name }}
  {{- else }}
  name: {{ include "k8s-service.fullname" . }}
  {{- end }}
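One way to avoid the collision would be to give the headless Service a distinct default name, e.g. by suffixing the fullname (just a sketch, not necessarily how it should be resolved in this PR):

  {{- if .Values.headlessService.name }}
  name: {{ .Values.headlessService.name }}
  {{- else }}
  {{- /* hypothetical default that cannot collide with the main Service */}}
  name: {{ printf "%s-headless" (include "k8s-service.fullname" .) }}
  {{- end }}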

mateimicu avatar Oct 06 '23 23:10 mateimicu

@mateimicu Thanks for the reply. I was able to find the issue and solve it in my latest changes. I'm now running into another issue that I think has more to do with my docker + minikube environment than with the code changes.

I checked out the main version of the repository and I'm trying to run the tests to get my environment up to speed before retesting my new changes, but I am still getting errors when doing so.

Has anyone in your team seen a similar error to this when running the testing suite?

Error we get when we add 'minikube' to /etc/hosts:

=== NAME  TestK8SServiceCanaryDeployment
    k8s_service_example_test_helpers.go:234:
                Error Trace:    /home/rogvc/temp/helm-kubernetes-services/test/k8s_service_example_test_helpers.go:234
                                /home/rogvc/temp/helm-kubernetes-services/test/retry.go:68
                                /home/rogvc/temp/helm-kubernetes-services/test/retry.go:93
                                /home/rogvc/temp/helm-kubernetes-services/test/retry.go:68
                                /home/rogvc/temp/helm-kubernetes-services/test/retry.go:57
                                /home/rogvc/temp/helm-kubernetes-services/test/k8s_service_example_test_helpers.go:232
                                /home/rogvc/temp/helm-kubernetes-services/test/k8s_service_canary_deployment_test.go:59
                Error:          Received unexpected error:
                                Get "http://minikube:32085": dial tcp: lookup minikube on 15.122.222.53:53: no such host
                Test:           TestK8SServiceCanaryDeployment
TestK8SServiceCanaryDeployment 2023-10-24T11:45:27-06:00 logger.go:66: Running command helm with args [delete --namespace k8s-service-canary-hkznz5 k8s-service-canary-hkznz5]
TestK8SServiceCanaryDeployment 2023-10-24T11:45:27-06:00 logger.go:66: release "k8s-service-canary-hkznz5" uninstalled
TestK8SServiceCanaryDeployment 2023-10-24T11:45:27-06:00 client.go:42: Configuring Kubernetes client using config file /home/rogvc/.kube/config with context
--- FAIL: TestK8SServiceCanaryDeployment (26.76s)
FAIL

Error we get when we don't add it:

=== NAME  TestK8SServiceCanaryDeployment
    k8s_service_example_test_helpers.go:234:
                Error Trace:    /home/csmcclain/temp/helm-kubernetes-services/test/k8s_service_example_test_helpers.go:234
                                /home/csmcclain/temp/helm-kubernetes-services/test/retry.go:68
                                /home/csmcclain/temp/helm-kubernetes-services/test/retry.go:93
                                /home/csmcclain/temp/helm-kubernetes-services/test/retry.go:68
                                /home/csmcclain/temp/helm-kubernetes-services/test/retry.go:57
                                /home/csmcclain/temp/helm-kubernetes-services/test/k8s_service_example_test_helpers.go:232
                                /home/csmcclain/temp/helm-kubernetes-services/test/k8s_service_canary_deployment_test.go:59
                Error:          Received unexpected error:
                                Get "http://minikube:30961": dial tcp: lookup minikube on 127.0.0.53:53: server misbehaving
                Test:           TestK8SServiceCanaryDeployment
TestK8SServiceCanaryDeployment 2023-10-24T13:28:40-06:00 logger.go:66: Running command helm with args [delete --namespace k8s-service-canary-tomrer k8s-service-canary-tomrer]
TestK8SServiceCanaryDeployment 2023-10-24T13:28:40-06:00 logger.go:66: release "k8s-service-canary-tomrer" uninstalled
TestK8SServiceCanaryDeployment 2023-10-24T13:28:40-06:00 client.go:42: Configuring Kubernetes client using config file /home/csmcclain/.kube/config with context
--- FAIL: TestK8SServiceCanaryDeployment (15.92s)
FAIL

If y'all don't use Docker + Minikube to run tests, how do you set up a kubernetes cluster to run the test suite?

rvc-mf avatar Oct 24 '23 17:10 rvc-mf

Tried running the main testing suite using EKS+Fargate, a local rancher cluster, and minikube + qemu2 but still got similar issues.

Would appreciate some direction on how to set up a proper testing environment to get unblocked on this PR.

rvc-mf avatar Nov 01 '23 16:11 rvc-mf

Update: got test suite running by giving minikube a static ip on start (thanks @brandon-langley-mf ).
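Roughly the invocation that worked (the IP is an example; --static-ip requires a recent minikube with the docker or podman driver):

  # example static IP from minikube's default docker subnet
  minikube start --driver=docker --static-ip=192.168.49.2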

See gist for updated test results: https://gist.github.com/rvc-mf/c2c4c82713cd8da4a95a3bf079d43cb4

PR is ready for code review once again 👍🏼

rvc-mf avatar Nov 16 '23 17:11 rvc-mf

Tried running the main testing suite using EKS+Fargate, a local rancher cluster, and minikube + qemu2 but still got similar issues.

Would appreciate some direction on how to set up a proper testing environment to get unblocked on this PR.

Hi @rvc-mf,

minikube definitely works on macOS / Linux / Windows. The test makes an HTTP request to the service exposed as a NodePort, so how you reach it depends on your Kubernetes setup.

This is what I use on macOS.

Start minikube like this: minikube start --driver=docker --extra-config=apiserver.service-node-port-range=32760-32767 --ports=127.0.0.1:32760-32767:32760-32767 (see the consolidated commands after the list). This will:

  • limit the port range a nodePort object can have, using the API Server configuration
  • expose the exact same range locally
  • finally, add 127.0.0.1 minikube to /etc/hosts
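Putting it together:

  minikube start --driver=docker \
    --extra-config=apiserver.service-node-port-range=32760-32767 \
    --ports=127.0.0.1:32760-32767:32760-32767
  echo "127.0.0.1 minikube" | sudo tee -a /etc/hosts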

How to debug this in general

To check how to do this in your setup, you can:

  1. Keep the helm release after the test finishes. You can do this by removing the line here.
  2. Now check how you can access the Service (see the commands below). You can find it by running kubectl get svc -A; it is the one whose name starts with k8s-service-canary-.
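With the release kept around, something like this should show where to point the HTTP request (the curl target is an example):

  kubectl get svc -A | grep k8s-service-canary-
  # then hit the NodePort it reports, e.g.:
  curl http://minikube:32760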

mateimicu avatar Nov 17 '23 15:11 mateimicu

Update: got test suite running by giving minikube a static ip on start (thanks @brandon-langley-mf ).

Awesome 🥳

Let me pull the latest and test it

mateimicu avatar Nov 17 '23 15:11 mateimicu

Update: got test suite running by giving minikube a static ip on start (thanks @brandon-langley-mf ).

Awesome 🥳

Let me pull the latest and test it

@mateimicu fyi, we found two tests that are failing, and they also fail on the release tag. They look like examples that are unfortunately tagged in such a way that they get pulled into the 'all' and 'integration' test runs. We did not adjust them because we weren't sure whether this was intentional. Is this something we should open an issue for?
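For context, the examples appear to carry build tags along these lines, which is why they get swept into those runs (a sketch; the exact tags in the repo may differ):

  //go:build all || integration
  // +build all integration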

brandon-langley-mf avatar Nov 17 '23 16:11 brandon-langley-mf

Hi @brandon-langley-mf

If you have some details, a new issue to track it would be awesome.

mateimicu avatar Nov 17 '23 16:11 mateimicu

@mateimicu, @ryehowell, any updates on this one?

brandon-langley-mf avatar Dec 15 '23 14:12 brandon-langley-mf

Sorry @brandon-langley-mf this slipped on my side. Let me jump on it 👀

mateimicu avatar Dec 15 '23 16:12 mateimicu