karmada icon indicating copy to clipboard operation
karmada copied to clipboard

test/e2e/karmadactl_test.go: test register command

Open mohamedawnallah opened this issue 1 year ago • 7 comments

What type of PR is this?

/kind failing-test

Which issue(s) this PR fixes:

Part of #4544

Context

Dependent on #5429

Does this PR introduce a user-facing change?:

NONE

mohamedawnallah avatar Aug 09 '24 07:08 mohamedawnallah

Currently this test case should register a cluster with CA verification fails with the following error:

Main error:

Unable to connect to the server: dial tcp: lookup member-e2e-gfj-control-plane on 127.0.0.53:53: server misbehaving

Full trace:

  [FAILED] Unexpected error:
      <exec.CodeExitError>:
      error running /home/karmada/go/bin/karmadactl --kubeconfig=/home/karmada/.kube/member-e2e-gfj.config register 172.18.0.8:6443 --token bxfkm7.0xx94rzy1mda3iun --discovery-token-ca-cert-hash sha256:49514bb5d379ef5940a6e22709bfba9085950a0e54d2811c642fb75e4bdfd76c --cluster-name member-e2e-gfj --karmada-agent-image docker.io/karmada/karmada-agent:latest --ca-cert-path /tmp/karmada-e2e-301700272/ca.crt:
      Command stdout:
      [preflight] Running pre-flight checks
      [preflight] All pre-flight checks were passed
      [karmada-agent-start] Waiting to perform the TLS Bootstrap
      [karmada-agent-start] Waiting to construct karmada-agent kubeconfig

      stderr:
      Unable to connect to the server: dial tcp: lookup member-e2e-gfj-control-plane on 127.0.0.53:53: server misbehaving

      error:
      exit status 1
      {
          Err: <*errors.errorString | 0xc000062060>{
              s: "error running /home/karmada/go/bin/karmadactl --kubeconfig=/home/karmada/.kube/member-e2e-gfj.config register 172.18.0.8:6443 --token bxfkm7.0xx94rzy1mda3iun --discovery-token-ca-cert-hash sha256:49514bb5d379ef5940a6e22709bfba9085950a0e54d2811c642fb75e4bdfd76c --cluster-name member-e2e-gfj --karmada-agent-image docker.io/karmada/karmada-agent:latest --ca-cert-path /tmp/karmada-e2e-301700272/ca.crt:\nCommand stdout:\n[preflight] Running pre-flight checks\n[preflight] All pre-flight checks were passed\n[karmada-agent-start] Waiting to perform the TLS Bootstrap\n[karmada-agent-start] Waiting to construct karmada-agent kubeconfig\n\nstderr:\nUnable to connect to the server: dial tcp: lookup member-e2e-gfj-control-plane on 127.0.0.53:53: server misbehaving\n\nerror:\nexit status 1",
          },
          Code: 1,
      }
  occurred

Not 100% sure if this is dependent on this issue #4262 ?

mohamedawnallah avatar Aug 09 '24 07:08 mohamedawnallah

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 47.96%. Comparing base (9b71510) to head (74522cd). Report is 14 commits behind head on master.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5334      +/-   ##
==========================================
+ Coverage   47.88%   47.96%   +0.08%     
==========================================
  Files         676      676              
  Lines       55974    55964      -10     
==========================================
+ Hits        26802    26842      +40     
+ Misses      27421    27351      -70     
- Partials     1751     1771      +20     
Flag Coverage Δ
unittests 47.96% <ø> (+0.08%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov-commenter avatar Aug 09 '24 07:08 codecov-commenter

Thanks @mohamedawnallah The E2E seems to have failed for all versions. /assign @zhzhuang-zju

XiShanYongYe-Chang avatar Aug 09 '24 07:08 XiShanYongYe-Chang

I've checked the logs, and there is an issue with creating the KarmadaDir at /etc/karmada while running register command because this directory is owned by root and requires sudo privileges. I've bypassed this locally by running:

sudo -E env "PATH=$PATH" ginkgo -focus "Karmadactl register testing"

This allows the test cases to bypass the issue, but I think there are a couple of problems with this approach. Currently, running Ginkgo with sudo in the CI environment could lead to issues later on?

A potential improvement could be to add a flag, something like --karmada-directory, to the karmadactl register command, similar to the --ca-cert-path flag. This would provide more flexibility for the end-user by allowing the use of a temporary directory during testing, thereby solving the issue out of the box. WDYT?

mohamedawnallah avatar Aug 09 '24 19:08 mohamedawnallah

Hi @mohamedawnallah based on your previous E2E and UT, I think your test organization ideas are clear and your code is clean and easy to read. I look forward to your continued contributions. ^-^

XiShanYongYe-Chang avatar Aug 15 '24 08:08 XiShanYongYe-Chang

A potential improvement could be to add a flag, something like --karmada-directory, to the karmadactl register command, similar to the --ca-cert-path flag. This would provide more flexibility for the end-user by allowing the use of a temporary directory during testing, thereby solving the issue out of the box. WDYT?

that make sense. Users might have the same issue under a non-root user account.

@mohamedawnallah Would you be willing to work on this issue?

zhzhuang-zju avatar Aug 19 '24 04:08 zhzhuang-zju

Hi @mohamedawnallah, I understand that the associated blockage has been resolved by #5862 . Do you have time to push this pr forward?

XiShanYongYe-Chang avatar Feb 19 '25 08:02 XiShanYongYe-Chang

Hi @mohamedawnallah, I understand that the associated blockage has been resolved by #5862 . Do you have time to push this pr forward?

Thanks for reminding this, I almost forgot about this.

zhzhuang-zju avatar Feb 19 '25 08:02 zhzhuang-zju

Hi @mohamedawnallah, I understand that the associated blockage has been resolved by https://github.com/karmada-io/karmada/pull/5862 . Do you have time to push this pr forward?

Thanks, @XiShanYongYe-Chang, for bringing this to my attention. I've updated the test cases following the resolution of PR #5862. The remaining issue for this PR seems to be the error: Unable to connect to the server: dial tcp: lookup member-e2e-jnpl9-control-plane on 127.0.0.53:53: server misbehaving, which occurs when registering a cluster using a bootstrap token. I'm not sure if this is a known issue with the karmada register operation or something similar to what was discussed in #4262? I would love to hear your thoughts all.

mohamedawnallah avatar Feb 20 '25 08:02 mohamedawnallah

Ask @zhzhuang-zju to help take a look~

XiShanYongYe-Chang avatar Feb 20 '25 08:02 XiShanYongYe-Chang

I‘ll take a look when the ci is completed

zhzhuang-zju avatar Feb 20 '25 08:02 zhzhuang-zju

The remaining issue for this PR seems to be the error: Unable to connect to the server: dial tcp: lookup member-e2e-jnpl9-control-plane on 127.0.0.53:53: server misbehaving, which occurs when registering a cluster using a bootstrap token. I'm not sure if this is a known issue with the karmada register operation or something similar to what was discussed in https://github.com/karmada-io/karmada/issues/4262? I would love to hear your thoughts all.

I've solved that issue. I found that I was using the member cluster config to create the bootstrap token instead of the Karmada APIServer config file. Changing this resolved the problem.

error validating "/tmp/deploy-4jdtb.yaml": error validating data: [apiVersion not set, kind not set]; if you choose to ignore these errors, turn validation off with --validate=false

The e2e tests for this registering operation pass locally, but the error occurred in the CI tests and seems unrelated to the changes proposed in this PR. I've rebased onto the master branch since this PR has been open for a while, but the issue still persists. Any thoughts, @zhzhuang-zju?

mohamedawnallah avatar Feb 21 '25 01:02 mohamedawnallah

s: "error running /home/runner/go/bin/karmadactl --kubeconfig=/home/runner/.kube/karmada.config register 172.18.0.4:6443 --token 2mnk2l.4segshmnxwcyshc9 --discovery-token-ca-cert-hash sha256:3477db6b4299125dc026573a04ebba3ebad161593c11498a7b66e3a150685ef2 --kubeconfig=/home/runner/.kube/member-e2e-9kc55.config --cluster-name member-e2e-9kc55 --karmada-agent-image docker.io/karmada/karmada-agent:latest:\nCommand stdout:\n[preflight] Running pre-flight checks\n[preflight] All pre-flight checks were passed\n[karmada-agent-start] Waiting to perform the TLS Bootstrap\n\nstderr:\nUnable to connect to the server: dial tcp: lookup karmada-host-control-plane on 127.0.0.53:53: server misbehaving\n\nerror:\nexit status 1",

⏳ It looks like the end-to-end test cases fail due to interference with other members. Running the register test case sequentially instead of in parallel may resolve the issue.

EDIT: ⏳ It looks like there are issues connecting to the Karmada API server during CI E2E tests. However, the tests pass locally when running ./hack/run-e2e.sh. I'll revisit this issue later with fresh eyes. In the meantime, feel free to share any input you may have regarding this issue.

EDIT: ✅ Resolved the connection issue to the Karmada API server. The e2e tests for the register command assumed the current-context in the Karmada host kubeconfig was karmada-apiserver, but it defaults to karmada-host. Updating this fixed the issue.

mohamedawnallah avatar Feb 21 '25 19:02 mohamedawnallah

Hey @zhzhuang-zju, How am I supposed to move this PR forward? I've already addressed the feedback and all the test cases pass :)

mohamedawnallah avatar Apr 02 '25 09:04 mohamedawnallah

Hey @zhzhuang-zju, How am I supposed to move this PR forward? I've already addressed the feedback and all the test cases pass :)

Sorry for the delay, I'll take a look ASAP

zhzhuang-zju avatar Apr 02 '25 09:04 zhzhuang-zju

Good job! Isn't it okay for us to merge it?

XiShanYongYe-Chang avatar Apr 02 '25 09:04 XiShanYongYe-Chang

Hey @zhzhuang-zju, How am I supposed to move this PR forward? I've already addressed the feedback and all the test cases pass :)

Sorry for the delay, I'll take a look ASAP

Never mind! I understand you've been busy. Just ping me if you have any additional feedback, and I'd be happy to address it :)

Isn't it okay for us to merge it?

I think so but pending zhzhuang-zju feedback if any.

mohamedawnallah avatar Apr 02 '25 09:04 mohamedawnallah

Great job~ thanks @mohamedawnallah /lgtm leave Approve to @XiShanYongYe-Chang to take a look

zhzhuang-zju avatar Apr 07 '25 01:04 zhzhuang-zju

/lgtm cc @XiShanYongYe-Chang

zhzhuang-zju avatar Apr 09 '25 02:04 zhzhuang-zju

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: XiShanYongYe-Chang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

karmada-bot avatar Apr 09 '25 02:04 karmada-bot