agent: Service deregistration blocked by ACLs
Overview of the Issue
After upgrading from version 1.14.4 to version 1.16.2, service health check failures occasionally appear for services registered on some nodes, and they return to normal after the Consul server is restarted.
Reproduction Steps
I don't know how to reproduce it; it occurs intermittently.
Consul info for both Client and Server
Client info
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 68f81912
    version = 1.16.2
    version_metadata =
consul:
    acl = enabled
    bootstrap = false
    known_datacenters = 1
    leader = false
    leader_addr = 10.60.112.132:8300
    server = true
raft:
    applied_index = 314534
    commit_index = 314534
    fsm_pending = 0
    last_contact = 70.541794ms
    last_log_index = 314534
    last_log_term = 37
    last_snapshot_index = 311338
    last_snapshot_term = 37
    latest_configuration = [{Suffrage:Voter ID:cc0f834e-8c67-d394-344f-ee5331ea663a Address:10.60.238.199:8300} {Suffrage:Voter ID:f5ec631a-488d-37de-b2b2-7914cd030996 Address:10.60.112.132:8300} {Suffrage:Voter ID:a72236a8-a966-bc19-578b-65021b8f12ca Address:10.60.97.199:8300}]
    latest_configuration_index = 0
    num_peers = 2
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Follower
    term = 37
runtime:
    arch = amd64
    cpu_count = 4
    goroutines = 144
    max_procs = 4
    os = linux
    version = go1.20.8
serf_lan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 9
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 77
    members = 3
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 40
    members = 3
    query_queue = 0
    query_time = 1
Server info
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 68f81912
    version = 1.16.2
    version_metadata =
consul:
    acl = enabled
    bootstrap = false
    known_datacenters = 1
    leader = false
    leader_addr = 10.60.112.132:8300
    server = true
raft:
    applied_index = 314534
    commit_index = 314534
    fsm_pending = 0
    last_contact = 70.541794ms
    last_log_index = 314534
    last_log_term = 37
    last_snapshot_index = 311338
    last_snapshot_term = 37
    latest_configuration = [{Suffrage:Voter ID:cc0f834e-8c67-d394-344f-ee5331ea663a Address:10.60.238.199:8300} {Suffrage:Voter ID:f5ec631a-488d-37de-b2b2-7914cd030996 Address:10.60.112.132:8300} {Suffrage:Voter ID:a72236a8-a966-bc19-578b-65021b8f12ca Address:10.60.97.199:8300}]
    latest_configuration_index = 0
    num_peers = 2
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Follower
    term = 37
runtime:
    arch = amd64
    cpu_count = 4
    goroutines = 144
    max_procs = 4
    os = linux
    version = go1.20.8
serf_lan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 9
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 77
    members = 3
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 40
    members = 3
    query_queue = 0
    query_time = 1
Operating system and Environment details
Deployed using consul-k8s.
# consul-k8s status
==> Consul Status Summary
Name    Namespace  Status    Chart Version  AppVersion  Revision  Last Updated
consul  consul     deployed  1.2.2          1.16.2      1         2023/10/12 10:32:19 CST
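For context on the ACL setup: with consul-k8s, the Helm chart can bootstrap the ACL system and provision agent tokens itself via `global.acls.manageSystemACLs`. A minimal sketch, assuming the release name and namespace from the status output above; verify the setting against the docs for your chart version (1.2.2 here):

# Hypothetical values.yaml fragment; only the ACL-related setting is shown.
cat > values.yaml <<'EOF'
global:
  acls:
    # Let the Helm chart bootstrap ACLs and create agent tokens
    # for every client and server automatically.
    manageSystemACLs: true
EOF

# Apply to the existing release (names taken from the status output above).
helm upgrade consul hashicorp/consul --namespace consul --values values.yaml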
Log Fragments
2023-11-21T09:07:08.346Z [WARN] agent: Coordinate update blocked by ACLs: accessorID="anonymous token"
2023-11-21T09:07:34.373Z [WARN] agent: Coordinate update blocked by ACLs: accessorID="anonymous token"
2023-11-21T09:07:51.336Z [WARN] agent: Coordinate update blocked by ACLs: accessorID="anonymous token"
2023-11-21T09:08:15.443Z [WARN] agent: Coordinate update blocked by ACLs: accessorID="anonymous token"
2023-11-21T09:08:28.816Z [WARN] agent: Node info update blocked by ACLs: node=cc0f834e-8c67-d394-344f-ee5331ea663a accessorID="anonymous token"
2023-11-21T09:08:28.817Z [WARN] agent: Service deregistration blocked by ACLs: service=xxx_10.60.131.181_80 accessorID="anonymous token"
2023-11-21T09:08:28.817Z [WARN] agent: Service deregistration blocked by ACLs: service=xxx_10.60.171.169_80 accessorID="anonymous token"
2023-11-21T09:08:28.817Z [WARN] agent: Service deregistration blocked by ACLs: service=xxx_10.60.171.140_80 accessorID="anonymous token"
2023-11-21T09:08:28.818Z [WARN] agent: Check deregistration blocked by ACLs: check=service:xxx_10.60.171.169_80 accessorID="anonymous token"
2023-11-21T09:08:28.818Z [WARN] agent: Check deregistration blocked by ACLs: check=service:xxx_10.60.171.140_80 accessorID="anonymous token"
2023-11-21T09:08:28.818Z [WARN] agent: Check deregistration blocked by ACLs: check=service:xxx_10.60.131.181_80 accessorID="anonymous token"
2023-11-21T09:08:33.416Z [WARN] agent: Coordinate update blocked by ACLs: accessorID="anonymous token"
2023-11-21T09:08:54.231Z [WARN] agent: Coordinate update blocked by ACLs: accessorID="anonymous token"
2023-11-21T09:09:23.554Z [WARN] agent: Coordinate update blocked by ACLs: accessorID="anonymous token"
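These warnings indicate the agent is performing its anti-entropy sync (node info updates, service/check deregistration) with the anonymous token, which a default-deny ACL policy rejects. A minimal sketch of the agent ACL stanza that sets a dedicated agent token instead; the config file path and placeholder token are illustrative:

cat > /etc/consul.d/acl.hcl <<'EOF'
acl {
  enabled        = true
  default_policy = "deny"
  tokens {
    # Token used for the agent's internal operations; if it is unset
    # or invalid, the agent falls back to the anonymous token and the
    # "blocked by ACLs" warnings above appear.
    agent = "REPLACE-WITH-AGENT-TOKEN-SECRET"
  }
}
EOF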
Just want to gather more info to help us reproduce the issue:
- Are the Consul agents running in VMs or K8s?
- Are the failed service instances (shown in the screenshot) registered on the server nodes?
> Just want to gather more info to help us reproduce the issue:
> - Are the Consul agents running in VMs or K8s?
> - Are the failed service instances (shown in the screenshot) registered on the server nodes?
- The agents run on K8s.
- Sorry, I got it wrong before. The exception occurs when a service has stopped, but the Consul server does not automatically deregister the stopped service.
@Rabbit-st
Thanks for the updated info. Consul-k8s should handle deregistering a service if you remove it with kubectl delete.
However, Consul will not deregister the stopped service automatically, since the service instance is stored in Consul's catalog. Consul won't route traffic to the failed instance, so connections from downstream will be directed to healthy instances of the service.
Could you provide more details about the situation of the stopped services? (Were they stopped due to a true alarm or a k8s node failure?)
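If a stale instance has to be removed before the root cause is fixed, it can be deregistered by hand against the agent that owns it. A sketch using one of the service IDs from the logs above; the token must have write access to the service:

# Via the CLI, against the local agent:
consul services deregister -id "xxx_10.60.131.181_80" -token "$CONSUL_HTTP_TOKEN"

# Or the equivalent agent HTTP API call:
curl -X PUT -H "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
  http://127.0.0.1:8500/v1/agent/service/deregister/xxx_10.60.131.181_80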
I have the same problem, and I fixed it: a service needs an agent token to register and deregister. https://developer.hashicorp.com/consul/docs/security/acl/tokens/create/create-an-agent-token
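For anyone else landing here, a sketch of the flow from that guide. Policy and file names are illustrative; note the guide itself grants service_prefix "read", while the deregistration warnings in this issue suggest "write" may be needed for anti-entropy to remove stale services, so treat that rule as an assumption to verify:

cat > agent-policy.hcl <<'EOF'
node_prefix "" {
  # Allows the agent to register its node and push node info updates.
  policy = "write"
}
service_prefix "" {
  # Assumption: "write" so anti-entropy can deregister stale services;
  # the linked guide itself uses "read".
  policy = "write"
}
EOF

consul acl policy create -name "agent-token" -rules @agent-policy.hcl
consul acl token create -description "Agent token" -policy-name "agent-token"
# Install the SecretID printed by the previous command on the agent:
consul acl set-agent-token agent "<SecretID>"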
> I have the same problem, and I fixed it: a service needs an agent token to register and deregister. https://developer.hashicorp.com/consul/docs/security/acl/tokens/create/create-an-agent-token
The token has been configured. This is a client issue: Consul 1.16.2 is not supported. Consul recovered after downgrading the version.
@Rabbit-st Downgraded to which version? We are facing similar issues. Also, please reopen this issue.