Fleet-Agent fails to transition from 'fleet-agent-bootstrap' to 'fleet-agent' Secret after Rancher update
Is there an existing issue for this?
- [x] I have searched the existing issues
Current Behavior
After updating Rancher from 2.7.5 to 2.8.5, some imported RKE2 clusters are displayed as offline in the Rancher UI. Upon investigation, the issue seems related to the Fleet-Agent remaining stuck in a "bootstrap" state. Specifically, the Fleet-Agent continues to use the fleet-agent-bootstrap secret and fails to generate the fleet-agent secret. This issue only occurs for certain clusters, while others work as expected.
Expected Behavior
The Fleet-Agent should transition from using the fleet-agent-bootstrap secret to creating and using the fleet-agent secret, completing the registration process.
Steps To Reproduce
- Update Rancher Management Server from version 2.7.5 to 2.8.5.
- Ensure there are imported downstream clusters (e.g., v1.26.16+rke2r1).
- Check the logs of the fleet-agent on an affected cluster:
  kubectl -n cattle-fleet-system logs -l app=fleet-agent
  Example error logs:
  time="2025-01-27T07:48:46Z" level=error msg="Failed to register agent: registration failed: cannot create clusterregistration on management cluster for cluster id 'some_random_id': Unauthorized"
  time="2025-01-27T07:49:46Z" level=warning msg="Cannot find fleet-agent secret, running registration"
  time="2025-01-27T07:49:46Z" level=info msg="Creating clusterregistration with id 'some_random_id' for new token"
- Compare the cattle-fleet-system namespace secrets:
  - Working clusters have a fleet-agent secret.
  - Affected clusters only have a fleet-agent-bootstrap secret.
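The secret comparison in the last step can be scripted. This is only a minimal sketch, not part of Fleet or Rancher: `fleet_secret_state` is a hypothetical helper name, and it assumes its input is the output of `kubectl get secrets -o name` (one `secret/<name>` entry per line).

```shell
# Hypothetical helper (not part of Fleet or Rancher): classify the
# registration state of a cluster from a list of secret names on stdin.
# Accepts plain names or the "secret/<name>" form printed by
# `kubectl get secrets -o name`.
fleet_secret_state() {
  has_agent=0
  has_bootstrap=0
  # The extra test keeps a final line that lacks a trailing newline.
  while IFS= read -r name || [ -n "$name" ]; do
    case "${name#secret/}" in
      fleet-agent) has_agent=1 ;;
      fleet-agent-bootstrap) has_bootstrap=1 ;;
    esac
  done
  if [ "$has_agent" -eq 1 ]; then
    echo "registered"
  elif [ "$has_bootstrap" -eq 1 ]; then
    echo "bootstrap-only"
  else
    echo "no-fleet-secrets"
  fi
}

# Usage against a live cluster (requires kubectl access):
#   kubectl -n cattle-fleet-system get secrets -o name | fleet_secret_state
```

Running it against each downstream cluster's kubeconfig should separate the working clusters ("registered") from the affected ones ("bootstrap-only").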
Environment
- Architecture: x86
- Fleet Version: v0.9.5
- Cluster:
  - Provider: RKE2
  - Options:
  - Kubernetes Version: v1.26.16
Logs
<details> <summary>cattle-fleet-system logs</summary>
time="2025-01-27T09:12:49Z" level=error msg="Failed to register agent: registration failed: cannot create clusterregistration on management cluster for cluster id '66t7lwf9r6gpwljrk5swl9hdxgt5bkxc756nvl7hbtd75r2zfrgm9v': Unauthorized"
time="2025-01-27T09:13:49Z" level=warning msg="Cannot find fleet-agent secret, running registration"
time="2025-01-27T09:13:49Z" level=info msg="Creating clusterregistration with id '66t7lwf9r6gpwljrk5swl9hdxgt5bkxc756nvl7hbtd75r2zfrgm9v' for new token"
</details>
<details> <summary>cattle-cluster-agent logs from a cluster that lost connection to rancher</summary>
kubectl -n cattle-system logs deployments/cattle-cluster-agent
Found 2 pods, using pod/cattle-cluster-agent-984568b5-cpsh7
Error: --namespace or env NAMESPACE is required to be set
Usage:
fleet-agent [flags]
Flags:
--agent-scope string An identifier used to scope the agent bundleID names, typically the same as namespace
--checkin-interval string How often to post cluster status
--debug Turn on debug logging
--debug-level int If debugging is enabled, set klog -v=X
-h, --help help for fleet-agent
--kubeconfig string kubeconfig file
--namespace string namespace to watch
-v, --version version for fleet-agent
time="2025-01-27T07:12:36Z" level=fatal msg="--namespace or env NAMESPACE is required to be set"
</details>
Anything else?
No response
Same here, the init container is stuck in the registration phase.
time="2025-03-01T07:35:01Z" level=warning msg="Cannot find fleet-agent secret, running registration"
time="2025-03-01T07:35:01Z" level=info msg="Creating clusterregistration with id '589kvb5962bmls2zgrbfkfd7jjmw2tb9fnbvb7vbxnr88s9bqtdb2c' for new token"
time="2025-03-01T07:35:01Z" level=error msg="Failed to register agent: registration failed: cannot create clusterregistration on management cluster for cluster id '589kvb5962bmls2zgrbfkfd7jjmw2tb9fnbvb7vbxnr88s9bqtdb2c': Unauthorized"
Here's a bit of debug information when running it manually:
$ fleetagent --debug --debug-level 9 register
I0301 07:33:23.769769 158 merged_client_builder.go:121] Using in-cluster configuration
2025-03-01T07:33:23Z INFO setup starting registration on upstream cluster {"namespace": "cattle-fleet-local-system"}
I0301 07:33:23.770730 158 round_trippers.go:466] curl -v -XGET -H "User-Agent: fleetagent/v0.0.0 (linux/amd64) kubernetes/$Format" -H "Authorization: Bearer <masked>" -H "Accept: application/json, */*" 'https://10.43.0.1:443/api/v1/namespaces/cattle-fleet-local-system/secrets/fleet-agent'
I0301 07:33:23.771490 158 round_trippers.go:510] HTTP Trace: Dial to tcp:10.43.0.1:443 succeed
I0301 07:33:23.779418 158 round_trippers.go:553] GET https://10.43.0.1:443/api/v1/namespaces/cattle-fleet-local-system/secrets/fleet-agent 404 Not Found in 8 milliseconds
I0301 07:33:23.779476 158 round_trippers.go:570] HTTP Statistics: DNSLookup 0 ms Dial 0 ms TLSHandshake 3 ms ServerProcessing 4 ms Duration 8 ms
I0301 07:33:23.779496 158 round_trippers.go:577] Response Headers:
I0301 07:33:23.779520 158 round_trippers.go:580] Audit-Id: 81920e58-4acb-4484-8fb1-14fd212f8e1d
I0301 07:33:23.779535 158 round_trippers.go:580] Cache-Control: no-cache, private
I0301 07:33:23.779545 158 round_trippers.go:580] Content-Type: application/json
I0301 07:33:23.779578 158 round_trippers.go:580] X-Kubernetes-Pf-Flowschema-Uid: 761e5877-c905-4b0f-bef4-fa21351f0054
I0301 07:33:23.779590 158 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid: af6a8b8a-b244-402d-ba08-54fb0cfcf0d0
I0301 07:33:23.779599 158 round_trippers.go:580] Content-Length: 196
I0301 07:33:23.779606 158 round_trippers.go:580] Date: Sat, 01 Mar 2025 07:33:23 GMT
I0301 07:33:23.779656 158 request.go:1351] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"secrets \"fleet-agent\" not found","reason":"NotFound","details":{"name":"fleet-agent","kind":"secrets"},"code":404}
WARN[0000] Cannot find fleet-agent secret, running registration
I0301 07:33:23.780034 158 round_trippers.go:466] curl -v -XGET -H "Accept: application/json, */*" -H "User-Agent: fleetagent/v0.0.0 (linux/amd64) kubernetes/$Format" -H "Authorization: Bearer <masked>" 'https://10.43.0.1:443/api/v1/namespaces/cattle-fleet-local-system/secrets/fleet-agent-bootstrap'
I0301 07:33:23.783016 158 round_trippers.go:553] GET https://10.43.0.1:443/api/v1/namespaces/cattle-fleet-local-system/secrets/fleet-agent-bootstrap 200 OK in 2 milliseconds
I0301 07:33:23.783069 158 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 2 ms Duration 2 ms
I0301 07:33:23.783084 158 round_trippers.go:577] Response Headers:
I0301 07:33:23.783095 158 round_trippers.go:580] Content-Length: 3915
I0301 07:33:23.783106 158 round_trippers.go:580] Date: Sat, 01 Mar 2025 07:33:23 GMT
I0301 07:33:23.783112 158 round_trippers.go:580] Audit-Id: 870f9c29-62c2-435d-8be5-58e712ea19cc
I0301 07:33:23.783117 158 round_trippers.go:580] Cache-Control: no-cache, private
I0301 07:33:23.783123 158 round_trippers.go:580] Content-Type: application/json
I0301 07:33:23.783132 158 round_trippers.go:580] X-Kubernetes-Pf-Flowschema-Uid: 761e5877-c905-4b0f-bef4-fa21351f0054
I0301 07:33:23.783138 158 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid: af6a8b8a-b244-402d-ba08-54fb0cfcf0d0
I0301 07:33:23.783258 158 request.go:1351] Response Body: {"kind":"Secret","apiVersion":"v1","metadata":{"name":"fleet-agent-bootstrap","namespace":"cattle-fleet-local-system","uid":"7795d125-ff8f-4f53-8fa7-3c5622832ad7","resourceVersion":"6250553","creationTimestamp":"2025-02-25T16:34:52Z","labels":{"objectset.rio.cattle.io/hash":"362023f752e7f1989d8b652e029bd2c658ae7c44"},"annotations":{"objectset.rio.cattle.io/applied":"H4sIAAAAAAAA/3yQS3PaMBSF/8tdAxXm7ZksGihQ1fYUGz/QTpYvibH8GOumQDL57x3D9LFJlpqje75zzhtkkiTYbyCbPMD2F7bLr2CDEzByguHSDzO+zx9XfsTDIBR8x9ahf9MYLYshD0JeYOGtdtH3q2DrURDyR8H0Enr/DEPfARvk1mdq606d6/ziLsfMvY7Pzunb1FvtXqEHSr8YwtaTJZpGKgQbRHnRItlRulmcDvH5AXpgroaw9PEpN9RKyuvq/4ODtWbZ5qKdWBgRR8yJPZMlHhMJf3USb6JGvk53D50R1QVWHSPRP+SGbw+W/inDZh8lDXfDSRMkHk8T/xwXHvcKbdRWn9zNZR2FfuBqQW7Cg2goWnjvQYkk/65YVTXdgpnuWacnVGSQBm1eD5Qk0jjI6y95BjYcNSL15RNW1E/rmrpKTf/+qX8Xda2k7t9bdyjV4s18n5doSJYN2NWL1j3QMkX9KfJZmmewYTS1mDU6ziYWzo7DxXyRzdPpxEJmLdLMUtPJXOJMjccdrZIlfpQT7vKf6T9J/f47AAD//xPZ5xRkAgAA","objectset.rio.cattle.io/id":"fleet-agent-bootstrap-cattle-fleet-local-system"},"managedFields":[{"manager":"fleetcontroller","operation":"Update","apiVersion":"v1","time":"2025-02-25T16:34:52Z","fieldsType":"FieldsV1","fieldsV1":{"f:data":{".":{},"f:apiServerCA":{},"f:apiServerURL":{},"f:clusterNamespace":{},"f:systemRegistrationNamespace":{},"f:token":{}},"f:metadata":{"f:annotations":{".":{},"f:objectset.rio.cattle.io/applied":{},"f:objectset.rio.cattle.io/id":{}},"f:labels":{".":{},"f:objectset.rio.cattle.io/hash":{}}},"f:type":{}}}]},"data":{"apiServerCA":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJkekNDQVIyZ0F3SUJBZ0lCQURBS0JnZ3Foa2pPUFFRREFqQWpNU0V3SHdZRFZRUUREQmhyTTNNdGMyVnkKZG1WeUxXTmhRREUzTXprek9EWXdNakF3SGhjTk1qVXdNakV5TVRnME56QXdXaGNOTXpVd01qRXdNVGcwTnpBdwpXakFqTVNFd0h3WURWUVFEREJock0zTXRjMlZ5ZG1WeUxXTmhRREUzTXprek9EWXdNakF3V1RBVEJnY3Foa2pPClBRSUJCZ2dxaGtqT1BRTUJCd05DQUFRVThqQ3ErVlA2eHVITERsQ2hvZFdwQ3grODVMaUxCS083L0NJMDh6SE0KMGgzbHRSYXNZeHNIUGFqUElzTVJaUHRRbG43bW4xeExFVjBzc2p
1NkwwSVBvMEl3UURBT0JnTlZIUThCQWY4RQpCQU1DQXFRd0R3WURWUjBUQVFIL0JBVXdBd0VCL3pBZEJnTlZIUTRFRmdRVUEyNDlYN0hJcFE2ZlUwYUdmTmQ4CmpkNTd5UzB3Q2dZSUtvWkl6ajBFQXdJRFNBQXdSUUloQU91UjNvZzVIdndCKzF0WWZBSm9jU2hKTmhHaWRpbUMKNGk4dStFQXZRU1VEQWlCQUlBWkMvVVJSb2VDeFN0SVNWVUV4bWpBWERER095ZVBNa2orTUwxMkFGZz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K","apiServerURL":"aHR0cHM6Ly8xMC40My4wLjE6NDQz","clusterNamespace":"ZmxlZXQtbG9jYWw=","systemRegistrationNamespace":"Y2F0dGxlLWZsZWV0LWNsdXN0ZXJzLXN5c3RlbQ==","token":"ZXlKaGJHY2lPaUpTVXpJMU5pSXNJbXRwWkNJNklscHljMGxFVURSMlZtMXJSV1ZrYW1kbmFXSnJlbXRQVVY4Mll6aFRjVGhaWTA1allVSlZkVFkyUzBVaWZRLmV5SnBjM01pT2lKcmRXSmxjbTVsZEdWekwzTmxjblpwWTJWaFkyTnZkVzUwSWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qYjNWdWRDOXVZVzFsYzNCaFkyVWlPaUptYkdWbGRDMXNiMk5oYkNJc0ltdDFZbVZ5Ym1WMFpYTXVhVzh2YzJWeWRtbGpaV0ZqWTI5MWJuUXZjMlZqY21WMExtNWhiV1VpT2lKcGJYQnZjblF0ZEc5clpXNHRiRzlqWVd3dE9HWXhaalV6TXpRdFpUTTNNQzAwTnpBekxXSXlNakF0WVdFd01UUTVaRFk0TVRSbExYUnZhMlZ1SWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qYjNWdWRDOXpaWEoyYVdObExXRmpZMjkxYm5RdWJtRnRaU0k2SW1sdGNHOXlkQzEwYjJ0bGJpMXNiMk5oYkMwNFpqRm1OVE16TkMxbE16Y3dMVFEzTURNdFlqSXlNQzFoWVRBeE5EbGtOamd4TkdVaUxDSnJkV0psY201bGRHVnpMbWx2TDNObGNuWnBZMlZoWTJOdmRXNTBMM05sY25acFkyVXRZV05qYjNWdWRDNTFhV1FpT2lKbU0yVXdaalU0Tmkxak16RXhMVFE1TURZdE9UVXpZUzFtWlRJMk5tVmtaVGMyWVRRaUxDSnpkV0lpT2lKemVYTjBaVzA2YzJWeWRtbGpaV0ZqWTI5MWJuUTZabXhsWlhRdGJHOWpZV3c2YVcxd2IzSjBMWFJ2YTJWdUxXeHZZMkZzTFRobU1XWTFNek0wTFdVek56QXRORGN3TXkxaU1qSXdMV0ZoTURFME9XUTJPREUwWlNKOS5DWkx5NWFtRE8tdnpFRU5FVldRMXJTdXV2ajB4UExPUms1WS1rX1ZoSEtzY0FEMUJOcmV0LXZGQkhzYzl2ZU1KV3ZsbTZDZnU4ZTdEb3FJQjJkU0RhV2tnc0FSMjhydkRVMW9WbUo3cEtQS25YaDhxM0lTNDlLalBDSXowazF1dGhVdktlVmZBZFRNOFJfeGRNT2lGck5YU1FHMXVFZ3lNMzdjTy02VVBSVF9vVGQ1RG54UnQ5dFdSY2NMdHFSZ1p0YkZFeTNMOXZFaXFfM1J0NEpoU2RHSFNxUHFWbVdEQ1JYLUJvYW44Mk45OG9DOUFYb21DaVY4X3k4YlVraHgxV3h5c2JkOWxFM3FuQkRpcmJtQ1hzSE5scG1iRTlETDQyaHBmS3B3NWs1cTc2NUpSM2FiOHpZSVN6eXplVWhnT3g2SDNVSm1LTFRwamxwcWFVV3dyX1E="},"type":"Opaque"}
I0301 07:33:23.783801 158 round_trippers.go:466] curl -v -XGET -H "Authorization: Bearer <masked>" -H "Accept: application/json, */*" -H "User-Agent: fleetagent/v0.0.0 (linux/amd64) kubernetes/$Format" 'https://10.43.0.1:443/api/v1/namespaces/cattle-fleet-local-system/configmaps/fleet-agent'
I0301 07:33:23.787477 158 round_trippers.go:553] GET https://10.43.0.1:443/api/v1/namespaces/cattle-fleet-local-system/configmaps/fleet-agent 200 OK in 3 milliseconds
I0301 07:33:23.787541 158 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 3 ms Duration 3 ms
I0301 07:33:23.787565 158 round_trippers.go:577] Response Headers:
I0301 07:33:23.787589 158 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid: af6a8b8a-b244-402d-ba08-54fb0cfcf0d0
I0301 07:33:23.787608 158 round_trippers.go:580] Content-Length: 1450
I0301 07:33:23.787630 158 round_trippers.go:580] Date: Sat, 01 Mar 2025 07:33:23 GMT
I0301 07:33:23.787646 158 round_trippers.go:580] Audit-Id: 10418836-3e6a-42cd-8002-9dda32d22697
I0301 07:33:23.787665 158 round_trippers.go:580] Cache-Control: no-cache, private
I0301 07:33:23.787682 158 round_trippers.go:580] Content-Type: application/json
I0301 07:33:23.787702 158 round_trippers.go:580] X-Kubernetes-Pf-Flowschema-Uid: 761e5877-c905-4b0f-bef4-fa21351f0054
I0301 07:33:23.787794 158 request.go:1351] Response Body: {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"fleet-agent","namespace":"cattle-fleet-local-system","uid":"bf3dffe9-71fb-4ed1-8a5e-4eec87e1bf56","resourceVersion":"214168","creationTimestamp":"2025-02-13T08:40:10Z","labels":{"objectset.rio.cattle.io/hash":"362023f752e7f1989d8b652e029bd2c658ae7c44"},"annotations":{"objectset.rio.cattle.io/applied":"H4sIAAAAAAAA/3yPvXKsMAyF30U1cLne5c9tqvQp3QgjFifGZixlZzLMvnvGkBRptpTOGX2fdphQEPQONobZ3UDDbgBvFORlIfvhwmsQSnf0BrSBmg0UBjyO5NmA3g0EXOnIfLS5VRjYUry7iVIFjwJWEvxlYAhRUFwMnMc4vpMVJqmSi5VFEU+Vi//cBBpmTyTlYVKOMQpLwq08S+UZHsSSv1hozSib6Dj+5lZiwXUDHT69L358nyEX5AU0XFpVq8vcNYq6+f/QD1M/to2iWg3jpGzb9EidvV4zLT/+1xPOJW9oc/LE9fEdAAD//6hbMZF4AQAA","objectset.rio.cattle.io/id":"fleet-agent-bootstrap-cattle-fleet-local-system"},"managedFields":[{"manager":"fleetcontroller","operation":"Update","apiVersion":"v1","time":"2025-02-13T08:40:10Z","fieldsType":"FieldsV1","fieldsV1":{"f:data":{".":{},"f:config":{}},"f:metadata":{"f:annotations":{".":{},"f:objectset.rio.cattle.io/applied":{},"f:objectset.rio.cattle.io/id":{}},"f:labels":{".":{},"f:objectset.rio.cattle.io/hash":{}}}}}]},"data":{"config":"{\"agentCheckinInterval\":\"0s\",\"labels\":{\"name\":\"local\",\"provider.cattle.io\":\"k3s\"},\"clientID\":\"589kvb5962bmls2zgrbfkfd7jjmw2tb9fnbvb7vbxnr88s9bqtdb2c\",\"bootstrap\":{},\"agentTLSMode\":\"strict\",\"gitClientTimeout\":\"0s\",\"garbageCollectionInterval\":\"15m0s\",\"agentWorkers\":{}}"}}
INFO[0000] Creating clusterregistration with id '589kvb5962bmls2zgrbfkfd7jjmw2tb9fnbvb7vbxnr88s9bqtdb2c' for new token
I0301 07:33:23.790161 158 request.go:1351] Request Body: {"kind":"ClusterRegistration","apiVersion":"fleet.cattle.io/v1alpha1","metadata":{"generateName":"request-","namespace":"fleet-local","creationTimestamp":null},"spec":{"clientID":"589kvb5962bmls2zgrbfkfd7jjmw2tb9fnbvb7vbxnr88s9bqtdb2c","clientRandom":"hmkbgq9gfkztn9m4jzfwnh9whq6zj4k4fnqbgv5x2xm2bf8vvrkng7","clusterLabels":{"fleet.cattle.io/created-by-agent-pod":"fleet-agent-0","name":"local","provider.cattle.io":"k3s"}},"status":{}}
I0301 07:33:23.790283 158 round_trippers.go:466] curl -v -XPOST -H "Content-Type: application/json" -H "User-Agent: fleetagent/v0.0.0 (linux/amd64) kubernetes/$Format" -H "Authorization: Bearer <masked>" -H "Accept: application/json, */*" 'https://10.43.0.1:443/apis/fleet.cattle.io/v1alpha1/namespaces/fleet-local/clusterregistrations'
I0301 07:33:23.794412 158 round_trippers.go:553] POST https://10.43.0.1:443/apis/fleet.cattle.io/v1alpha1/namespaces/fleet-local/clusterregistrations 401 Unauthorized in 4 milliseconds
I0301 07:33:23.794444 158 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 3 ms Duration 4 ms
I0301 07:33:23.794461 158 round_trippers.go:577] Response Headers:
I0301 07:33:23.794476 158 round_trippers.go:580] Date: Sat, 01 Mar 2025 07:33:23 GMT
I0301 07:33:23.794492 158 round_trippers.go:580] Audit-Id: a21f3f98-a89a-4ba1-add1-2c710d777f21
I0301 07:33:23.794505 158 round_trippers.go:580] Cache-Control: no-cache, private
I0301 07:33:23.794513 158 round_trippers.go:580] Content-Type: application/json
I0301 07:33:23.794525 158 round_trippers.go:580] Content-Length: 129
I0301 07:33:23.794566 158 request.go:1351] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}
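If the failing POST is authenticated with the token stored in fleet-agent-bootstrap (an assumption, since the log masks the bearer token), it may help to check which service account that token was issued for. A minimal sketch, assuming the token is a JWT as in the dump above and that `base64` supports `-d`; `jwt_payload` is a hypothetical helper name, not a Fleet command:

```shell
# Hypothetical helper (not a Fleet command): print the JSON claims of a JWT.
# Assumes a `base64` that supports -d (GNU coreutils, recent macOS).
jwt_payload() {
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the padding; restore it so base64 -d accepts the input.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}

# Usage against a live cluster (requires kubectl access). The secret stores
# the token base64-encoded, so it is decoded once before the JWT is split:
#   token=$(kubectl -n cattle-fleet-local-system get secret fleet-agent-bootstrap \
#     -o jsonpath='{.data.token}' | base64 -d)
#   jwt_payload "$token"
```

The other fields of the secret decode with a plain `base64 -d`; for example, the apiServerURL value in the dump above decodes to https://10.43.0.1:443 and clusterNamespace decodes to fleet-local.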
It might be related to https://github.com/rancher/rancher/issues/36117, but there's no real solution there, and the description of how it was resolved is very vague. Anyway, for clarification:
kubectl auth can-i get secret --as=system:serviceaccount:cattle-fleet-local-system:fleet-agent -n cattle-fleet-local-system
yes
Can you reproduce this with a newer Rancher version, e.g. 2.11.x? This may help us understand which resources need to be created and when.
Closing this issue in the absence of feedback.