test/e2e/karmadactl_test.go: test register command
What type of PR is this?
/kind failing-test
Which issue(s) this PR fixes:
Part of #4544
Context
Dependent on #5429
Does this PR introduce a user-facing change?:
NONE
Currently this test case should register a cluster with CA verification fails with the following error:
Main error:
Unable to connect to the server: dial tcp: lookup member-e2e-gfj-control-plane on 127.0.0.53:53: server misbehaving
Full trace:
[FAILED] Unexpected error:
<exec.CodeExitError>:
error running /home/karmada/go/bin/karmadactl --kubeconfig=/home/karmada/.kube/member-e2e-gfj.config register 172.18.0.8:6443 --token bxfkm7.0xx94rzy1mda3iun --discovery-token-ca-cert-hash sha256:49514bb5d379ef5940a6e22709bfba9085950a0e54d2811c642fb75e4bdfd76c --cluster-name member-e2e-gfj --karmada-agent-image docker.io/karmada/karmada-agent:latest --ca-cert-path /tmp/karmada-e2e-301700272/ca.crt:
Command stdout:
[preflight] Running pre-flight checks
[preflight] All pre-flight checks were passed
[karmada-agent-start] Waiting to perform the TLS Bootstrap
[karmada-agent-start] Waiting to construct karmada-agent kubeconfig
stderr:
Unable to connect to the server: dial tcp: lookup member-e2e-gfj-control-plane on 127.0.0.53:53: server misbehaving
error:
exit status 1
{
Err: <*errors.errorString | 0xc000062060>{
s: "error running /home/karmada/go/bin/karmadactl --kubeconfig=/home/karmada/.kube/member-e2e-gfj.config register 172.18.0.8:6443 --token bxfkm7.0xx94rzy1mda3iun --discovery-token-ca-cert-hash sha256:49514bb5d379ef5940a6e22709bfba9085950a0e54d2811c642fb75e4bdfd76c --cluster-name member-e2e-gfj --karmada-agent-image docker.io/karmada/karmada-agent:latest --ca-cert-path /tmp/karmada-e2e-301700272/ca.crt:\nCommand stdout:\n[preflight] Running pre-flight checks\n[preflight] All pre-flight checks were passed\n[karmada-agent-start] Waiting to perform the TLS Bootstrap\n[karmada-agent-start] Waiting to construct karmada-agent kubeconfig\n\nstderr:\nUnable to connect to the server: dial tcp: lookup member-e2e-gfj-control-plane on 127.0.0.53:53: server misbehaving\n\nerror:\nexit status 1",
},
Code: 1,
}
occurred
Not 100% sure if this is dependent on this issue #4262 ?
:warning: Please install the to ensure uploads and comments are reliably processed by Codecov.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 47.96%. Comparing base (
9b71510) to head (74522cd). Report is 14 commits behind head on master.
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## master #5334 +/- ##
==========================================
+ Coverage 47.88% 47.96% +0.08%
==========================================
Files 676 676
Lines 55974 55964 -10
==========================================
+ Hits 26802 26842 +40
+ Misses 27421 27351 -70
- Partials 1751 1771 +20
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 47.96% <ø> (+0.08%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Thanks @mohamedawnallah The E2E seems to have failed for all versions. /assign @zhzhuang-zju
I've checked the logs, and there is an issue with creating the KarmadaDir at /etc/karmada while running register command because this directory is owned by root and requires sudo privileges. I've bypassed this locally by running:
sudo -E env "PATH=$PATH" ginkgo -focus "Karmadactl register testing"
This allows the test cases to bypass the issue, but I think there are a couple of problems with this approach. Currently, running Ginkgo with sudo in the CI environment could lead to issues later on?
A potential improvement could be to add a flag, something like --karmada-directory, to the karmadactl register command, similar to the --ca-cert-path flag. This would provide more flexibility for the end-user by allowing the use of a temporary directory during testing, thereby solving the issue out of the box. WDYT?
Hi @mohamedawnallah based on your previous E2E and UT, I think your test organization ideas are clear and your code is clean and easy to read. I look forward to your continued contributions. ^-^
A potential improvement could be to add a flag, something like
--karmada-directory, to thekarmadactl registercommand, similar to the--ca-cert-pathflag. This would provide more flexibility for the end-user by allowing the use of a temporary directory during testing, thereby solving the issue out of the box. WDYT?
that make sense. Users might have the same issue under a non-root user account.
@mohamedawnallah Would you be willing to work on this issue?
Hi @mohamedawnallah, I understand that the associated blockage has been resolved by #5862 . Do you have time to push this pr forward?
Hi @mohamedawnallah, I understand that the associated blockage has been resolved by #5862 . Do you have time to push this pr forward?
Thanks for reminding this, I almost forgot about this.
Hi @mohamedawnallah, I understand that the associated blockage has been resolved by https://github.com/karmada-io/karmada/pull/5862 . Do you have time to push this pr forward?
Thanks, @XiShanYongYe-Chang, for bringing this to my attention. I've updated the test cases following the resolution of PR #5862. The remaining issue for this PR seems to be the error: Unable to connect to the server: dial tcp: lookup member-e2e-jnpl9-control-plane on 127.0.0.53:53: server misbehaving, which occurs when registering a cluster using a bootstrap token. I'm not sure if this is a known issue with the karmada register operation or something similar to what was discussed in #4262? I would love to hear your thoughts all.
Ask @zhzhuang-zju to help take a look~
I‘ll take a look when the ci is completed
The remaining issue for this PR seems to be the error: Unable to connect to the server: dial tcp: lookup member-e2e-jnpl9-control-plane on 127.0.0.53:53: server misbehaving, which occurs when registering a cluster using a bootstrap token. I'm not sure if this is a known issue with the karmada register operation or something similar to what was discussed in https://github.com/karmada-io/karmada/issues/4262? I would love to hear your thoughts all.
I've solved that issue. I found that I was using the member cluster config to create the bootstrap token instead of the Karmada APIServer config file. Changing this resolved the problem.
error validating "/tmp/deploy-4jdtb.yaml": error validating data: [apiVersion not set, kind not set]; if you choose to ignore these errors, turn validation off with --validate=false
The e2e tests for this registering operation pass locally, but the error occurred in the CI tests and seems unrelated to the changes proposed in this PR. I've rebased onto the master branch since this PR has been open for a while, but the issue still persists. Any thoughts, @zhzhuang-zju?
s: "error running /home/runner/go/bin/karmadactl --kubeconfig=/home/runner/.kube/karmada.config register 172.18.0.4:6443 --token 2mnk2l.4segshmnxwcyshc9 --discovery-token-ca-cert-hash sha256:3477db6b4299125dc026573a04ebba3ebad161593c11498a7b66e3a150685ef2 --kubeconfig=/home/runner/.kube/member-e2e-9kc55.config --cluster-name member-e2e-9kc55 --karmada-agent-image docker.io/karmada/karmada-agent:latest:\nCommand stdout:\n[preflight] Running pre-flight checks\n[preflight] All pre-flight checks were passed\n[karmada-agent-start] Waiting to perform the TLS Bootstrap\n\nstderr:\nUnable to connect to the server: dial tcp: lookup karmada-host-control-plane on 127.0.0.53:53: server misbehaving\n\nerror:\nexit status 1",
⏳ It looks like the end-to-end test cases fail due to interference with other members. Running the register test case sequentially instead of in parallel may resolve the issue.
EDIT:
⏳ It looks like there are issues connecting to the Karmada API server during CI E2E tests. However, the tests pass locally when running ./hack/run-e2e.sh. I'll revisit this issue later with fresh eyes. In the meantime, feel free to share any input you may have regarding this issue.
EDIT:
✅ Resolved the connection issue to the Karmada API server. The e2e tests for the register command assumed the current-context in the Karmada host kubeconfig was karmada-apiserver, but it defaults to karmada-host. Updating this fixed the issue.
Hey @zhzhuang-zju, How am I supposed to move this PR forward? I've already addressed the feedback and all the test cases pass :)
Hey @zhzhuang-zju, How am I supposed to move this PR forward? I've already addressed the feedback and all the test cases pass :)
Sorry for the delay, I'll take a look ASAP
Good job! Isn't it okay for us to merge it?
Hey @zhzhuang-zju, How am I supposed to move this PR forward? I've already addressed the feedback and all the test cases pass :)
Sorry for the delay, I'll take a look ASAP
Never mind! I understand you've been busy. Just ping me if you have any additional feedback, and I'd be happy to address it :)
Isn't it okay for us to merge it?
I think so but pending zhzhuang-zju feedback if any.
Great job~ thanks @mohamedawnallah /lgtm leave Approve to @XiShanYongYe-Chang to take a look
/lgtm cc @XiShanYongYe-Chang
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: XiShanYongYe-Chang
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~test/OWNERS~~ [XiShanYongYe-Chang]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment