aws-load-balancer-controller

Update go to v1.22, controller-runtime dependency to v0.18.2, and kubernetes libs to v0.30.0

Open larntz opened this issue 1 year ago • 7 comments

Issue

https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/3649

Description

There is a similar PR open for the Kubernetes libs, but it hasn't had any updates in a few weeks.

A comment on that PR says:

yes we plan to upgrade the controller-runtime to v0.15+, but it has breaking changes and may need more time and efforts to investigate and implement corresponding changes. I think we need to do it sooner than later. Any community contribution is very welcome. Thanks.

This PR addresses the k8s 1.29 dependency updates and the refactoring required (the bulk of the changes) to update controller-runtime to v0.15+. It also sets the Go version to 1.22.

Two tests related to decoder injection were removed because decoder injection is no longer possible in new versions of controller-runtime. Two other tests were removed that checked that ingresses were deleted when there was a deletion timestamp with no finalizer. That scenario is no longer possible because controller-runtime sets the DeletionTimestamp to nil on Create(). For the same reason, the DeletionTimestamps were also removed from the deleted-ingress-with-finalizer tests.
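As a rough illustration of why those fixtures can no longer be built, here is a minimal sketch in plain Go. It is a toy model, not controller-runtime code: the `store` and `object` types, the `markedForDeletion` helper, and the finalizer name are all hypothetical stand-ins. It only imitates two behaviours: Create() drops any DeletionTimestamp supplied by the caller, and Delete() removes an object outright unless a finalizer is holding it.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// object is a stripped-down, hypothetical stand-in for Kubernetes object metadata.
type object struct {
	Name              string
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// store is a toy in-memory object store. It is NOT the controller-runtime
// client; it only imitates the behaviour described in the PR text.
type store struct {
	objects map[string]object
}

func newStore() *store { return &store{objects: map[string]object{}} }

// markedForDeletion builds an object that already carries a deletion
// timestamp, which is what the removed tests used as a fixture.
func markedForDeletion(name string, finalizers ...string) object {
	now := time.Now()
	return object{Name: name, Finalizers: finalizers, DeletionTimestamp: &now}
}

// Create stores the object but ignores any caller-supplied DeletionTimestamp.
func (s *store) Create(o object) error {
	if _, exists := s.objects[o.Name]; exists {
		return errors.New("already exists: " + o.Name)
	}
	o.DeletionTimestamp = nil
	s.objects[o.Name] = o
	return nil
}

// Delete removes the object immediately when it has no finalizers;
// otherwise it only marks the object as being deleted.
func (s *store) Delete(name string) error {
	o, ok := s.objects[name]
	if !ok {
		return errors.New("not found: " + name)
	}
	if len(o.Finalizers) == 0 {
		delete(s.objects, name)
		return nil
	}
	now := time.Now()
	o.DeletionTimestamp = &now
	s.objects[name] = o
	return nil
}

// Get returns the stored object and whether it exists.
func (s *store) Get(name string) (object, bool) {
	o, ok := s.objects[name]
	return o, ok
}

func main() {
	s := newStore()

	// The timestamp set on the fixture does not survive Create().
	_ = s.Create(markedForDeletion("no-finalizer"))
	o, _ := s.Get("no-finalizer")
	fmt.Println("timestamp kept after Create:", o.DeletionTimestamp != nil)

	// With a finalizer, Delete() marks the object instead of removing it.
	_ = s.Create(object{Name: "with-finalizer", Finalizers: []string{"example.com/hypothetical-finalizer"}})
	_ = s.Delete("with-finalizer")
	o, ok := s.Get("with-finalizer")
	fmt.Println("still present:", ok, "marked for deletion:", o.DeletionTimestamp != nil)
}
```

In this model a "deleting" object can only come from calling Delete() on an object that has a finalizer, which matches the reasoning given above for dropping the no-finalizer tests.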

Checklist

  • [x] Added tests that cover your change (if possible)
  • [x] Added/modified documentation as required (such as the README.md, or the docs directory)
  • [x] Manually tested
  • [x] Made sure the title of the PR is a good description that can go into the release notes

BONUS POINTS checklist: complete for good vibes and maybe prizes?! :exploding_head:

  • [ ] Backfilled missing tests for code in same general area :tada:
  • [ ] Refactored something and made the world a better place :star2:

larntz avatar May 17 '24 21:05 larntz

Welcome @larntz!

It looks like this is your first PR to kubernetes-sigs/aws-load-balancer-controller 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/aws-load-balancer-controller has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar May 17 '24 21:05 k8s-ci-robot

Hi @larntz. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar May 17 '24 21:05 k8s-ci-robot

@aaroniscode and @oliviassss This is related to PR 3675 and issue 3649. I can't push changes to that PR, so I created this one.

larntz avatar May 17 '24 21:05 larntz

/ok-to-test

shraddhabang avatar May 17 '24 23:05 shraddhabang

/retest

larntz avatar May 19 '24 05:05 larntz

@larntz, thanks for the contribution! I'm reviewing this PR, but I just noticed the e2e test is still failing on this test suite:

   STEP: expect dns name from Ingresses be non-empty @ 05/20/24 20:37:19.058
  STEP: expect dns name eventually be available: k8s-e2egroupa3a6c647-827aa94e50-910226086.us-west-2.elb.amazonaws.com @ 05/20/24 20:37:19.246
  STEP: expect http://k8s-e2egroupa3a6c647-827aa94e50-910226086.us-west-2.elb.amazonaws.com/path-a returns backend-a @ 05/20/24 20:40:50.701
  [FAILED] in [It] - /home/prow/go/src/github.com/kubernetes-sigs/aws-load-balancer-controller/test/e2e/ingress/multi_path_backend_test.go:201 @ 05/20/24 20:40:50.811
• [FAILED] [334.566 seconds]
test ingresses with multiple path and backends with podReadinessGate enabled [It] IngressGroup across namespaces should behaves correctly
/home/prow/go/src/github.com/kubernetes-sigs/aws-load-balancer-controller/test/e2e/ingress/multi_path_backend_test.go:101
  [FAILED] Unexpected error:
      <*utils.MultiError | 0xc0016a27f8>: 
      multiple error: [Response Body mismatches, diff:   []uint8{
      + 	0x62,
      + 	0x61,
      + 	0x63,
      + 	0x6b,
      + 	0x65,
      + 	0x6e,
      + 	0x64,
      + 	0x2d,
      + 	0x61,
        }
      ]
      {
          errs: [
              <*errors.fundamental | 0xc0016a2780>{
                  msg: "Response Body mismatches, diff: \u00a0\u00a0[]uint8{\n+\u00a0\t0x62,\n+\u00a0\t0x61,\n+\u00a0\t0x63,\n+\u00a0\t0x6b,\n+\u00a0\t0x65,\n+\u00a0\t0x6e,\n+\u00a0\t0x64,\n+\u00a0\t0x2d,\n+\u00a0\t0x61,\n\u00a0\u00a0}\n",
                  stack: [0x37dc114, 0x37dc4d6, 0x385f1b4, 0x378e015, 0x379ed16, 0x385e53a, 0x3782633, 0x37966ed, 0x162a9a1],
              },
          ],
      }
  occurred
  In [It] at: /home/prow/go/src/github.com/kubernetes-sigs/aws-load-balancer-controller/test/e2e/ingress/multi_path_backend_test.go:201 @ 05/20/24 20:40:50.811 
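The byte diff above is easier to read as text. A minimal Go sketch, assuming only the byte values copied from the "+" lines of the log (the names `diffBytes` and `decodeBody` are hypothetical):

```go
package main

import "fmt"

// diffBytes holds the byte values copied from the "+" lines of the diff.
var diffBytes = []byte{0x62, 0x61, 0x63, 0x6b, 0x65, 0x6e, 0x64, 0x2d, 0x61}

// decodeBody converts the raw bytes into printable text.
func decodeBody(b []byte) string {
	return string(b)
}

func main() {
	fmt.Printf("%q\n", decodeBody(diffBytes)) // prints "backend-a"
}
```

The bytes spell `backend-a`, the body the test step expects from `/path-a`. The diff shows nothing on the other side, so one of the two compared bodies appears to have been empty; the log alone does not say which side the "+" lines represent.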

While I'm checking the code changes, can you try running the failed test suites locally for quicker debugging? You can use the ginkgo command with the --focus option to focus on specific suites. Something like:

  ginkgo -focus "<test description>" -timeout 2h -v -r test/e2e -- \
    --kubeconfig=${CLUSTER_KUBECONFIG} \
    --cluster-name=${CLUSTER_NAME} \
    --aws-region=${AWS_REGION} \
    --aws-vpc-id=${cluster_vpc_id} \
    --helm-chart=${HELM_DIR}/aws-load-balancer-controller \
    --controller-image=${CONTROLLER_IMAGE_NAME}
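Ginkgo treats the focus value as a regular expression matched against the spec's description, so it can help to check a candidate expression before starting a two-hour run. A minimal sketch using only the Go standard library; `focusMatches` is a hypothetical helper, and `specText` assumes the container and It texts from the log above are joined with a single space:

```go
package main

import (
	"fmt"
	"regexp"
)

// specText is the failing spec's description as it appears in the log,
// assuming the container and It texts are joined with one space.
const specText = "test ingresses with multiple path and backends with podReadinessGate enabled " +
	"IngressGroup across namespaces should behaves correctly"

// focusMatches reports whether a focus expression, compiled as a regular
// expression, matches the given spec text.
func focusMatches(focus, text string) (bool, error) {
	re, err := regexp.Compile(focus)
	if err != nil {
		return false, err
	}
	return re.MatchString(text), nil
}

func main() {
	candidates := []string{
		"IngressGroup across namespaces",
		"podReadinessGate enabled",
		"some other suite",
	}
	for _, focus := range candidates {
		ok, err := focusMatches(focus, specText)
		if err != nil {
			fmt.Printf("%q is not a valid expression: %v\n", focus, err)
			continue
		}
		fmt.Printf("%q matches: %v\n", focus, ok)
	}
}
```

Keeping the expression inside a single description fragment, such as `IngressGroup across namespaces`, avoids depending on how the fragments are joined.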

oliviassss avatar May 20 '24 21:05 oliviassss

@oliviassss yes, I will work on that this evening.

larntz avatar May 20 '24 21:05 larntz

I was able to test this some. I found on this failing test that two ingresses get created in one namespace and one or both are missing the group ingress finalizer metadata. Not sure yet if that's a symptom or cause of the failing test. I will continue troubleshooting, but wanted to provide an update.

larntz avatar May 21 '24 00:05 larntz

@oliviassss I think this is ready for a more in-depth review now. If you need anything else from me, let me know. Thanks!

larntz avatar May 21 '24 16:05 larntz

@larntz, thanks! I left some comments

oliviassss avatar May 21 '24 19:05 oliviassss

/lgtm
sending to @M00nF1sh for second opinions
/assign @M00nF1sh

oliviassss avatar May 22 '24 20:05 oliviassss

@larntz Thanks for making this change, it's very helpful and looks pretty solid. However, I noticed two changes:

  1. The lease was changed to lease only instead of configmaplease. Can we revert this change and do it in a separate PR (maybe it can be deferred to a minor version release)?
  2. The rtCfg.Namespace is not passed to the caches.

M00nF1sh avatar May 22 '24 22:05 M00nF1sh

New changes are detected. LGTM label has been removed.

k8s-ci-robot avatar May 23 '24 01:05 k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: larntz
Once this PR has been reviewed and has the lgtm label, please ask for approval from m00nf1sh. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar May 23 '24 01:05 k8s-ci-robot

I tested this locally last night but now realize I didn't update the image tag. I'll work on fixing this today. Apologies for the noise.

larntz avatar May 23 '24 10:05 larntz

/ok-to-test

M00nF1sh avatar May 23 '24 17:05 M00nF1sh

Merging this PR right now. We need to conduct thorough tests on it before shipping it to users. For example, we need to test across k8s versions and AWS regions, as well as the features this PR amended, like ingressclassparam, webhookcerts, tgb, etc.

oliviassss avatar May 24 '24 17:05 oliviassss