Use agent token for service/check deregistration during anti-entropy
Description
This changes agent anti-entropy syncs to use only the agent token when deregistering services and checks.
The previous behavior had the agent attempt to use the "service" token (i.e. from the token
field in a service definition file) and, if that was not set, fall back to the agent token.
The previous behavior was problematic because, if the service token had been deleted, the deregistration request would fail. The agent would retry the deregistration during each anti-entropy sync, and the situation would never resolve.
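The failure mode can be illustrated with a small, self-contained Go sketch. This is not the agent's actual code: the `catalog` type, `legacyDeregToken`, `legacySyncFailures`, the placeholder secrets, and the error text are all hypothetical names chosen for illustration. It only models the described behavior, where the service token is preferred and the agent token is used solely when no service token is set.

```go
package main

import "fmt"

// catalog is a toy stand-in for the Consul servers (hypothetical type):
// it tracks which token secrets still exist and which services are registered.
type catalog struct {
	validTokens map[string]bool
	services    map[string]bool
}

// deregister fails when the token no longer exists, mimicking an ACL error.
func (c *catalog) deregister(serviceID, token string) error {
	if !c.validTokens[token] {
		return fmt.Errorf("token not found (illustrative error)")
	}
	delete(c.services, serviceID)
	return nil
}

// legacyDeregToken models the previous selection: prefer the service
// token and fall back to the agent token only when none is set.
func legacyDeregToken(serviceToken, agentToken string) string {
	if serviceToken != "" {
		return serviceToken
	}
	return agentToken
}

// legacySyncFailures retries deregistration once per simulated sync,
// up to maxSyncs times, and returns how many attempts failed.
func legacySyncFailures(c *catalog, serviceID, serviceToken, agentToken string, maxSyncs int) int {
	failures := 0
	for i := 0; i < maxSyncs; i++ {
		tok := legacyDeregToken(serviceToken, agentToken)
		if err := c.deregister(serviceID, tok); err != nil {
			failures++
			continue
		}
		break
	}
	return failures
}

func main() {
	c := &catalog{
		// Only the agent token exists; the service token was deleted.
		validTokens: map[string]bool{"agent-secret": true},
		services:    map[string]bool{"wumbo-id": true},
	}
	failures := legacySyncFailures(c, "wumbo-id", "service-secret", "agent-secret", 5)
	fmt.Println("failed syncs:", failures, "still registered:", c.services["wumbo-id"])
}
```

Because the service token is set but no longer valid, every simulated sync picks it and fails, so the service stays registered no matter how many syncs run.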
The new behavior is to always use the agent token for service/check deregistration during anti-entropy. This is:
- Simpler: No fallback logic to try different tokens
- Faster (slightly): No time spent attempting the service token
- Correct: The agent token is able to deregister services on that agent's node, because:
  - node:write permissions allow deregistration of services/checks on that node.
  - The agent token must have node:write permission, or else the agent would not be able to (de)register itself in the catalog.
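The reasoning in the list above can be sketched in Go as follows. This is a toy model under stated assumptions, not Consul's ACL implementation: `tokenACL`, `canDeregisterOnNode`, `antiEntropyDeregToken`, and the placeholder secrets are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// tokenACL is a toy model of a token's permissions (hypothetical type).
type tokenACL struct {
	nodeWrite map[string]bool // node name -> token has node:write on it
}

// canDeregisterOnNode applies the rule stated above: node:write on a node
// permits deregistering services/checks on that node. A token missing
// from the map (for example, a deleted one) is never allowed.
func canDeregisterOnNode(acls map[string]tokenACL, token, node string) bool {
	acl, ok := acls[token]
	return ok && acl.nodeWrite[node]
}

// antiEntropyDeregToken models the new selection: the service token is
// ignored and the agent token is always used.
func antiEntropyDeregToken(serviceToken, agentToken string) string {
	_ = serviceToken // intentionally unused: no fallback logic
	return agentToken
}

func main() {
	acls := map[string]tokenACL{
		"agent-secret": {nodeWrite: map[string]bool{"client1": true}},
		// "service-secret" is absent, as if it had been deleted.
	}
	tok := antiEntropyDeregToken("service-secret", "agent-secret")
	fmt.Println("deregistration allowed:", canDeregisterOnNode(acls, tok, "client1"))
}
```

In this model the state of the service token no longer matters, since the selection never consults it. Deregistration succeeds as long as the agent token holds node:write on the agent's own node.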
Testing & Reproduction steps
- Define a service definition like the following (named wumbo). This contains a service and check with their token field set.

      $ cat config/service.hcl
      service {
        name = "wumbo"
        id = "wumbo-id"
        token = "33333333-22b6-43f1-88f0-4c49d2a63554"
        check {
          name = "inline check"
          ttl = "9999h"
          status = "passing"
        }
      }

      check {
        name = "standalone check"
        ttl = "9999h"
        status = "passing"
        service_id = "wumbo-id"
        token = "33333333-22b6-43f1-88f0-4c49d2a63554"
      }
- Start a Consul agent

      $ cat config/agent.hcl
      log_level = "debug"
      node_name = "client1"
      leave_on_terminate = true
      acl = {
        default_policy = "deny"
        down_policy = "extend-cache"
        enable_token_persistence = true
        enabled = true
        tokens = {
          initial_management = "63fb8a77-22b6-43f1-88f0-4c49d2a63554"
          agent = "00000000-22b6-43f1-88f0-4c49d2a63554"
        }
      }

      $ consul agent -dev -config-dir ./config
- Create the service and agent tokens (taking advantage of client-provided IDs):

      export CONSUL_HTTP_TOKEN=63fb8a77-22b6-43f1-88f0-4c49d2a63554
      consul acl token create -service-identity wumbo -secret 33333333-22b6-43f1-88f0-4c49d2a63554 -accessor 087a8e18-21a7-41e1-b952-878b606e750a
      consul acl token create -node-identity client1:dc1 -secret 00000000-22b6-43f1-88f0-4c49d2a63554 -accessor 8afcfcf1-c5f5-4c92-9470-f3f4763b3fb2
- Verify the service is soon registered

      $ consul catalog services
      consul
      wumbo

  In the Consul agent logs, you should see:

      2023-01-27T13:26:44.963-0600 [INFO] agent: Synced node info
      2023-01-27T13:26:44.964-0600 [INFO] agent: Synced service: service=wumbo-id
      2023-01-27T13:26:44.964-0600 [DEBUG] agent: Check in sync: check=service:wumbo-id
      2023-01-27T13:26:44.964-0600 [DEBUG] agent: Check in sync: check="standalone check"
      2023-01-27T13:26:44.964-0600 [DEBUG] agent: Node info in sync
      2023-01-27T13:26:44.964-0600 [DEBUG] agent: Service in sync: service=wumbo-id
      2023-01-27T13:26:44.964-0600 [DEBUG] agent: Check in sync: check="standalone check"
      2023-01-27T13:26:44.964-0600 [DEBUG] agent: Check in sync: check=service:wumbo-id
- Delete the service token. The agent should not use this token for service/check deregistration, so deleting it ensures we'll see failures if the agent does use it.

      $ consul acl token delete -id 087a8e18-21a7-41e1-b952-878b606e750a
      Token "087a8e18-21a7-41e1-b952-878b606e750a" deleted successfully
- Deregister the service

      $ consul services deregister -id wumbo-id
      Deregistered service: wumbo-id
- Verify the service is deregistered

      $ consul catalog services
      consul

  In the agent logs, you should see:

      2023-01-27T13:35:23.064-0600 [DEBUG] agent: Node info in sync
      2023-01-27T13:35:23.065-0600 [INFO] agent: Deregistered service: service=wumbo-id
      2023-01-27T13:35:23.065-0600 [DEBUG] agent: Node info in sync
      2023-01-27T13:35:23.065-0600 [DEBUG] agent: removed check: check="standalone check"
      2023-01-27T13:35:23.065-0600 [DEBUG] agent: removed check: check=service:wumbo-id
      2023-01-27T13:35:23.065-0600 [DEBUG] agent: removed service: service=wumbo-id
      2023-01-27T13:35:23.065-0600 [DEBUG] agent: Node info in sync
Links
This is a replacement/alternative to https://github.com/hashicorp/consul/pull/14436
PR Checklist
- [ ] updated test coverage
- [ ] external facing docs updated
- [x] not a security concern