:bug: Panic in `APIServer.Stop` when `Authn == nil`
A test using this repo failed with an error like:
ERROR controller-runtime.test-env unable to start the controlplane {"tries": 4, "error": "timeout waiting for process etcd to start successfully (it may have failed to start, or stopped unexpectedly before becoming ready)"}
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).startControlPlane
~/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:330
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).Start
~/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:260
…
I then saw a further error:
Test Panicked
runtime error: invalid memory address or nil pointer dereference
Full Stack Trace
sigs.k8s.io/controller-runtime/pkg/internal/testing/controlplane.(*APIServer).Stop(0x14)
~/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/testing/controlplane/apiserver.go:425 +0x8f
sigs.k8s.io/controller-runtime/pkg/internal/testing/controlplane.(*ControlPlane).Stop(0xc00001a000)
~/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/testing/controlplane/plane.go:87 +0x3d
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).Stop(0xc00001a000)
~/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:194 +0x114
…
This seems to be similar to #1724 (merged toward 0.11.0). Maybe related to #1750?
Untested: I do not know how to reproduce the original etcd error.
Hi @jglick. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
Can we add a test case as well?
Probably above my skill level.
It looks to me like a side-effect of a more concerning bug. Why is it even nil?
The documentation says a default value should be used if it's empty: https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/internal/testing/controlplane/apiserver.go#L51
This line sets that default when it is empty: https://github.com/kubernetes-sigs/controller-runtime/blob/5636d975d88e2072884fd82c75b5d3bacf274919/pkg/internal/testing/controlplane/apiserver.go#L264
@DirectXMan12 I think you wrote this, can you help us clarify?
This is just a side effect of some other (properly reported) error: there is code which tries to clean up by stopping a service which in this case had not been fully initialized.
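To illustrate the failure mode described above, here is a minimal sketch of a `Stop` that tolerates a server whose start-up never got far enough to populate `Authn`. The `apiServer` and `authn` types are hypothetical stand-ins written for this example; they are not the real controller-runtime types, and the actual patch may look different.

```go
package main

import "fmt"

// authn is a hypothetical stand-in for the API server's authentication
// helper. It is NOT the real controller-runtime type.
type authn struct{ stopped bool }

func (a *authn) Stop() error {
	a.stopped = true
	return nil
}

// apiServer is a hypothetical stand-in. Authn is assumed to be defaulted
// only during start-up, so it stays nil if start-up fails early (for
// example because etcd never became ready).
type apiServer struct {
	Authn   *authn
	running bool
}

// Stop is safe to call on a server that was never fully initialised.
func (s *apiServer) Stop() error {
	s.running = false
	if s.Authn == nil {
		// Nothing was set up, so there is nothing to tear down.
		return nil
	}
	return s.Authn.Stop()
}

func main() {
	// Simulates cleanup after a failed start: Authn was never defaulted.
	half := &apiServer{}
	fmt.Println("stop on uninitialised server:", half.Stop())

	// Normal case: the helper exists and gets stopped.
	full := &apiServer{Authn: &authn{}, running: true}
	fmt.Println("stop on initialised server:", full.Stop())
}
```

Without the nil check, the first call would dereference a nil pointer, which matches the kind of panic shown in the stack trace.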
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
I think this remains valid.
/lifecycle rotten
/assign @hoegaarden
@jglick the tests need to pass before we can merge this. Maybe try rebasing?
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: AlmogBaku, jglick
Once this PR has been reviewed and has the lgtm label, please ask for approval from hoegaarden by writing /assign @hoegaarden in a comment. For more information see: The Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Rebased as suggested. https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_controller-runtime/1785/pull-controller-runtime-test-master/1485692352398888960 is opaque to me.
/lgtm
Can we add a test case as well?
Probably above my skill level.
There isn't much of a point in merging a fix without a test, the next change might just break it again.
/close
@k8s-triage-robot: Closed this PR.
In case somebody else lands on this page while running a controller test following this guide: https://book.kubebuilder.io/reference/envtest.html. For me the issue was that `cfg, err = testEnv.Start()` returned an error about a missing /usr/local/kubebuilder/bin/etcd binary. We solved it by pointing to the binaries with the env variable:
KUBEBUILDER_ASSETS=__PROJECT_PATH__/bin/k8s/1.25.0-linux-amd64
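A quick way to catch this before the test environment starts is to check that the directory actually contains the binaries. The sketch below uses only the standard library. The `missingAssets` helper is hypothetical (it is not part of envtest), and the list of binary names and the fallback directory are assumptions based on the error message above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// missingAssets is a hypothetical helper: it returns the names of the
// control-plane binaries that are absent from dir. The names checked here
// are an assumption for illustration.
func missingAssets(dir string) []string {
	var missing []string
	for _, name := range []string{"etcd", "kube-apiserver"} {
		info, err := os.Stat(filepath.Join(dir, name))
		if err != nil || info.IsDir() {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	dir := os.Getenv("KUBEBUILDER_ASSETS")
	if dir == "" {
		// Assumed fallback, taken from the path in the error message above.
		dir = "/usr/local/kubebuilder/bin"
	}
	if missing := missingAssets(dir); len(missing) > 0 {
		fmt.Printf("missing %v in %s; set KUBEBUILDER_ASSETS to the directory holding them\n", missing, dir)
		os.Exit(1)
	}
	fmt.Println("test binaries found in", dir)
}
```

Running a check like this at the top of a test suite turns the etcd start-up timeout into an immediate, readable error.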