nacos
Exception logs when running a nacos cluster with docker
Version
- os: CentOS Linux release 7.9.2009 (Core)
- docker: Docker version 20.10.7, build f0df350
- nacos: nacos/nacos-server:v2.1.0
Operation
- Opened ports 8148/8248/8348 and 9148/9248/9348 with firewall-cmd --zone=public --add-port=xxx/tcp --permanent
- On the same host, ran docker compose -f docker-compose-nacos.yaml --env-file nacos.env up -d
- Checked the logs under the nacos/logs directory and found quite a few exceptions
- Checked the startup result with docker logs xxx, which shows: Nacos started successfully in cluster mode. use external storage
Code
version: "3"
services:
  nacos1:
    image: nacos/nacos-server:${NACOS_VERSION}
    hostname: nacos1
    container_name: nacos1
    environment:
      - MODE=cluster
      - PREFER_HOST_MODE=hostname
      - NACOS_SERVERS=nacos1:8848 nacos2:8848 nacos3:8848
      - SPRING_DATASOURCE_PLATFORM:mysql
      - MYSQL_SERVICE_HOST=${MYSQL_SERVICE_HOST}
      - MYSQL_SERVICE_PORT=${MYSQL_SERVICE_PORT}
      - MYSQL_SERVICE_USER=${MYSQL_SERVICE_USER}
      - MYSQL_SERVICE_PASSWORD=${MYSQL_SERVICE_PWD}
      - MYSQL_SERVICE_DB_NAME=${MYSQL_SERVICE_DB}
      - JVM_XMS=128m
      - JVM_XMX=128m
      - JVM_XMN=128m
    volumes:
      - ${NACOS_HOME}/nacos1/logs:/home/nacos/logs
      - ${NACOS_HOME}/nacos1/init.d:/home/nacos/init.d
    ports:
      - "8148:8848"
      - "9148:9848"
    privileged: true
    restart: on-failure
  nacos2:
    image: nacos/nacos-server:${NACOS_VERSION}
    hostname: nacos2
    container_name: nacos2
    environment:
      - MODE=cluster
      - PREFER_HOST_MODE=hostname
      - NACOS_SERVERS=nacos1:8848 nacos2:8848 nacos3:8848
      - SPRING_DATASOURCE_PLATFORM:mysql
      - MYSQL_SERVICE_HOST=${MYSQL_SERVICE_HOST}
      - MYSQL_SERVICE_PORT=${MYSQL_SERVICE_PORT}
      - MYSQL_SERVICE_USER=${MYSQL_SERVICE_USER}
      - MYSQL_SERVICE_PASSWORD=${MYSQL_SERVICE_PWD}
      - MYSQL_SERVICE_DB_NAME=${MYSQL_SERVICE_DB}
      - JVM_XMS=128m
      - JVM_XMX=128m
      - JVM_XMN=128m
    volumes:
      - ${NACOS_HOME}/nacos2/logs:/home/nacos/logs
      - ${NACOS_HOME}/nacos2/init.d:/home/nacos/init.d
    ports:
      - "8248:8848"
      - "9248:9848"
    privileged: true
    restart: on-failure
  nacos3:
    image: nacos/nacos-server:${NACOS_VERSION}
    hostname: nacos3
    container_name: nacos3
    environment:
      - MODE=cluster
      - PREFER_HOST_MODE=hostname
      - NACOS_SERVERS=nacos1:8848 nacos2:8848 nacos3:8848
      - SPRING_DATASOURCE_PLATFORM:mysql
      - MYSQL_SERVICE_HOST=${MYSQL_SERVICE_HOST}
      - MYSQL_SERVICE_PORT=${MYSQL_SERVICE_PORT}
      - MYSQL_SERVICE_USER=${MYSQL_SERVICE_USER}
      - MYSQL_SERVICE_PASSWORD=${MYSQL_SERVICE_PWD}
      - MYSQL_SERVICE_DB_NAME=${MYSQL_SERVICE_DB}
      - JVM_XMS=128m
      - JVM_XMX=128m
      - JVM_XMN=128m
    volumes:
      - ${NACOS_HOME}/nacos3/logs:/home/nacos/logs
      - ${NACOS_HOME}/nacos3/init.d:/home/nacos/init.d
    ports:
      - "8348:8848"
      - "9348:9848"
    privileged: true
    restart: on-failure
Exception
./naming-server.log
2022-08-10 14:44:36,216 WARN Exception while request: http://nacos2:8848/nacos/v1/ns/operator/cluster/state, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
java.io.IOException: failed to req API:http://nacos2:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
2022-08-10 14:44:38,963 WARN Exception while request: http://nacos3:8848/nacos/v1/ns/operator/cluster/state, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos3:8848 [nacos3/172.18.0.4] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
java.io.IOException: failed to req API:http://nacos3:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos3:8848 [nacos3/172.18.0.4] failed: Connection refused (Connection refused)
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
2022-08-10 14:44:40,927 WARN Exception while request: http://nacos2:8848/nacos/v1/ns/distro/datums, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
2022-08-10 14:44:40,928 WARN Exception while request: http://nacos3:8848/nacos/v1/ns/distro/datums, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos3:8848 [nacos3/172.18.0.4] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
2022-08-10 14:44:40,967 WARN Exception while request: http://nacos2:8848/nacos/v1/ns/operator/cluster/state, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
java.io.IOException: failed to req API:http://nacos2:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
2022-08-10 14:44:44,962 WARN Exception while request: http://nacos3:8848/nacos/v1/ns/operator/cluster/state, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos3:8848 [nacos3/172.18.0.4] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
java.io.IOException: failed to req API:http://nacos3:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos3:8848 [nacos3/172.18.0.4] failed: Connection refused (Connection refused)
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
2022-08-10 14:44:46,980 WARN Exception while request: http://nacos2:8848/nacos/v1/ns/operator/cluster/state, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
java.io.IOException: failed to req API:http://nacos2:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
2022-08-10 14:44:51,139 WARN Exception while request: http://nacos3:8848/nacos/v1/ns/operator/cluster/state, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos3:8848 [nacos3/172.18.0.4] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
java.io.IOException: failed to req API:http://nacos3:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos3:8848 [nacos3/172.18.0.4] failed: Connection refused (Connection refused)
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
2022-08-10 14:44:52,961 WARN Exception while request: http://nacos2:8848/nacos/v1/ns/operator/cluster/state, caused: {}
org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
java.io.IOException: failed to req API:http://nacos2:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
java.net.ConnectException: Connection refused
java.io.IOException: failed to req API:http://nacos3:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: caused: unable to find local peer: nacos3:8848, all peers: [];
java.io.IOException: failed to req API:http://nacos2:8848/nacos/v1/ns/operator/cluster/state. code:500 msg: caused: unable to find local peer: nacos2:8848, all peers: [];
./nacos.log
io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2Exception$StreamException: Received DATA frame for an unknown stream 3
    at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:147)
2022-08-10 14:44:50,008 INFO Creating filter chain: any request, [org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter@f096f37, org.springframework.security.web.context.SecurityContextPersistenceFilter@3d6a6bee, org.springframework.security.web.header.HeaderWriterFilter@30e6a763, org.springframework.security.web.csrf.CsrfFilter@fca387, org.springframework.security.web.authentication.logout.LogoutFilter@4b2e3e8f, org.springframework.security.web.savedrequest.RequestCacheAwareFilter@213c3543, org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter@3cff0139, org.springframework.security.web.authentication.AnonymousAuthenticationFilter@3effd4f3, org.springframework.security.web.session.SessionManagementFilter@732c9b5c, org.springframework.security.web.access.ExceptionTranslationFilter@3ae0b770]
java.lang.IllegalStateException: unable to find local peer: nacos1:8848, all peers: []
java.lang.IllegalStateException: unable to find local peer: nacos1:8848, all peers: []
./protocol-raft.log
java.lang.IllegalStateException: Fail to get leader of group naming_persistent_service
java.lang.IllegalStateException: Fail to get leader of group naming_persistent_service, Unknown leader, Unknown leader, Unknown leader
java.lang.IllegalStateException: Fail to get leader of group naming_persistent_service_v2, Fail to find node nacos3:7848 in group naming_persistent_service_v2, Unknown leader, Fail to find node nacos2:7848 in group naming_persistent_service_v2
java.lang.IllegalStateException: Fail to get leader of group naming_instance_metadata, Fail to find node nacos3:7848 in group naming_instance_metadata, Unknown leader, Fail to find node nacos2:7848 in group naming_instance_metadata
java.lang.IllegalStateException: Fail to get leader of group naming_service_metadata, Fail to find node nacos3:7848 in group naming_service_metadata, Unknown leader, Fail to find node nacos2:7848 in group naming_service_metadata
java.lang.IllegalStateException: Fail to get leader of group naming_persistent_service_v2, Unknown leader, Unknown leader, Unknown leader
java.lang.IllegalStateException: Fail to get leader of group naming_service_metadata, Unknown leader, Unknown leader, Unknown leader
java.lang.IllegalStateException: Fail to get leader of group naming_persistent_service_v2, Unknown leader, Unknown leader, Unknown leader
java.lang.IllegalStateException: Fail to get leader of group naming_instance_metadata, Unknown leader, Unknown leader, Unknown leader
java.lang.IllegalStateException: Fail to get leader of group naming_service_metadata, Unknown leader, Unknown leader, Unknown leader
java.lang.IllegalStateException: Fail to get leader of group naming_instance_metadata, Unknown leader, Unknown leader, Unknown leader
java.lang.IllegalStateException: Fail to get leader of group naming_persistent_service_v2, Unknown leader, Unknown leader, Unknown leader
java.lang.IllegalStateException: Fail to get leader of group naming_instance_metadata, Unknown leader, Unknown leader, Unknown leader
./protocol-distro.log
com.alibaba.nacos.core.distributed.distro.exception.DistroException: [DISTRO-EXCEPTION]Get snapshot from nacos2:8848 failed.
Caused by: java.io.IOException: failed to req API: http://nacos2:8848/nacos/v1/ns/distro/datums. code: 500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos2:8848 [nacos2/172.18.0.3] failed: Connection refused (Connection refused)
com.alibaba.nacos.core.distributed.distro.exception.DistroException: [DISTRO-EXCEPTION]Get snapshot from nacos3:8848 failed.
Caused by: java.io.IOException: failed to req API: http://nacos3:8848/nacos/v1/ns/distro/datums. code: 500 msg: org.apache.http.conn.HttpHostConnectException: Connect to nacos3:8848 [nacos3/172.18.0.4] failed: Connection refused (Connection refused)
com.alibaba.nacos.core.distributed.distro.exception.DistroException: [DISTRO-EXCEPTION][DISTRO-FAILED] Get distro snapshot failed!
Caused by: com.alibaba.nacos.api.exception.NacosException: No rpc client related to member: Member{ip='nacos2', port=8848, state=UP, extendInfo={raftPort=7848, readyToUpgrade=true}}
com.alibaba.nacos.core.distributed.distro.exception.DistroException: [DISTRO-EXCEPTION][DISTRO-FAILED] Get distro snapshot failed!
Caused by: com.alibaba.nacos.api.exception.NacosException: No rpc client related to member: Member{ip='nacos3', port=8848, state=UP, extendInfo={raftPort=7848, readyToUpgrade=true}}
./alipay-jraft.log
2022-08-10 14:44:31,313 ERROR Fail to connect nacos3:7848, remoting exception: java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 0.444879771s. [buffered_nanos=595476345, waiting_for_connection].
2022-08-10 14:44:32,583 ERROR Fail to connect nacos1:7848, remoting exception: java.util.concurrent.TimeoutException.
io.grpc.StatusRuntimeException: CANCELLED: call already cancelled
    at io.grpc.Status.asRuntimeException(Status.java:524)
2022-08-10 14:44:33,590 ERROR Fail to connect nacos2:7848, remoting exception: java.util.concurrent.TimeoutException.
2022-08-10 14:44:46,979 ERROR Fail to connect nacos3:7848, remoting exception: java.util.concurrent.TimeoutException.
2022-08-10 14:44:47,011 ERROR Fail to connect nacos3:7848, remoting exception: java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 0.999782639s. [buffered_nanos=740087279, remote_addr=nacos3/172.18.0.4:7848].
io.grpc.StatusRuntimeException: CANCELLED: call already cancelled
    at io.grpc.Status.asRuntimeException(Status.java:524)
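It's a network problem: port 8848 inside the containers is exposed as 8148/8248/8348. Either put these containers on the same network, or access them through the exposed external ports.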
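- All three nacos nodes run via docker on the same CentOS host, so 8148/8248/8348 map to 8848 inside the containers. That part should be fine, right?
- Regarding "access them through the exposed external ports": I had not opened the nacos console yet when the errors were logged. Afterwards, opening the console with the host IP plus 8148/8248/8348 worked for each node.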
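Does the cluster information in the console include all three servers? If it does, this is normal: your three nodes start at the same time, so the first node begins trying to connect to nacos2 and nacos3 before those two are ready, and error logs are bound to appear at that point.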
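The startup race described in this reply can be observed by polling a peer's port until it accepts connections. The following is only a minimal sketch, assuming Python 3 is available wherever it runs; wait_for_port is a hypothetical helper written for this illustration and is not part of Nacos or nacos-docker.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0,
                  interval_s: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused, timed out, or name not resolvable: peer is not ready yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For example, wait_for_port("nacos2", 8848) would be called from a container on the same Compose network, while from the host the mapped port would be used instead (host IP with 8248). If the call eventually returns True, the earlier "Connection refused" entries were most likely logged while the peer was still starting.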
- The node list in the console shows nacos1:8848, nacos2:8848 and nacos3:8848
- I compared the node metadata, and it is the same on every node:
{
  "lastRefreshTime": 1660118412368,
  "raftMetaData": {
    "metaDataMap": {
      "naming_instance_metadata": {
        "leader": "nacos2:7848",
        "raftGroupMember": [
          "nacos3:7848",
          "nacos1:7848",
          "nacos2:7848"
        ],
        "term": 2
      },
      "naming_persistent_service": {
        "leader": "nacos3:7848",
        "raftGroupMember": [
          "nacos3:7848",
          "nacos1:7848",
          "nacos2:7848"
        ],
        "term": 2
      },
      "naming_persistent_service_v2": {
        "leader": "nacos3:7848",
        "raftGroupMember": [
          "nacos3:7848",
          "nacos1:7848",
          "nacos2:7848"
        ],
        "term": 2
      },
      "naming_service_metadata": {
        "leader": "nacos3:7848",
        "raftGroupMember": [
          "nacos3:7848",
          "nacos1:7848",
          "nacos2:7848"
        ],
        "term": 2
      }
    }
  },
  "raftPort": "7848",
  "readyToUpgrade": true,
  "version": "2.1.0"
}
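One way to read this metadata is to check that every Raft group lists all three members and names one of them as leader, which would suggest the cluster did converge after the early errors. The sketch below is illustrative only: groups_without_leader is a hypothetical helper, and the sample is a trimmed copy (two of the four groups) of the JSON posted above.

```python
import json


def groups_without_leader(metadata: dict, expected_members: set) -> list:
    """Return names of Raft groups lacking a valid leader or the full member set."""
    groups = metadata["raftMetaData"]["metaDataMap"]
    bad = []
    for name, info in groups.items():
        leader = info.get("leader")
        members = set(info.get("raftGroupMember", []))
        # A healthy group has a leader drawn from exactly the expected members.
        if not leader or leader not in members or members != expected_members:
            bad.append(name)
    return sorted(bad)


# Trimmed copy of the node metadata from the thread.
sample = json.loads("""
{
  "raftMetaData": {
    "metaDataMap": {
      "naming_instance_metadata": {
        "leader": "nacos2:7848",
        "raftGroupMember": ["nacos3:7848", "nacos1:7848", "nacos2:7848"],
        "term": 2
      },
      "naming_persistent_service": {
        "leader": "nacos3:7848",
        "raftGroupMember": ["nacos3:7848", "nacos1:7848", "nacos2:7848"],
        "term": 2
      }
    }
  }
}
""")

expected = {"nacos1:7848", "nacos2:7848", "nacos3:7848"}
print(groups_without_leader(sample, expected))  # prints []
```

An empty list here means each group in the sample has a leader and the full member set, in contrast to the "Unknown leader" entries in protocol-raft.log from the startup window.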
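Connection refused (Connection refused)
This error means the port is not being listened on correctly. It is an environment problem, and you will need to troubleshoot it yourself.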
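My .yaml file (Code) is adapted from the official cluster-hostname.yaml; apart from mysql it is basically identical, and the environment steps are the ones listed under Operation. Am I missing a step here?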
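Judging from the errors, connections from the nacos1 node to port 8848 on the nacos2 and nacos3 nodes are being refused. Either those nodes started without listening on that port, or there is a problem in the docker configuration that makes access to the port fail.
Try following the nacos-docker steps exactly and see whether it works.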
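To tell these two causes apart, a one-shot probe can report why a connection fails. This is a hedged sketch rather than an official diagnostic: probe is a hypothetical helper, and it assumes Python 3 is available in a container attached to the same Compose network as the nacos nodes.

```python
import socket


def probe(host: str, port: int, timeout_s: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "open"
    except socket.gaierror:
        # The hostname could not be resolved (e.g. not on the same network).
        return "dns-failure"
    except ConnectionRefusedError:
        # The host is reachable but nothing is listening on the port.
        return "refused"
    except socket.timeout:
        # No answer at all, often a firewall or routing issue.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Ports taken from the logs above: 8848 (HTTP) and 7848 (Raft).
    for peer in ("nacos2", "nacos3"):
        for port in (8848, 7848):
            print(peer, port, probe(peer, port))
```

A "refused" result that persists after startup points to the node not listening on the port, while "dns-failure" or "timeout" points more toward the docker network configuration.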
No further response from the author; I think this is an environment problem.