TDengine
Helm-deployed TDengine 3.0.2.4 cluster (1 mnode, 3 dnodes, replica 3): the cluster cannot recover after the mnode pod is restarted or after the cluster is reinstalled
Additional Context: When running TDengine on Kubernetes, the pod hosting the mnode is restarted and the whole cluster is reinstalled fairly often, so both scenarios were tested. The database was created with "create database test replica 3;"; with replica 3, the TDengine cluster becomes unusable in both scenarios. Details follow.
Bug Description 1: Deploy a TDengine 3.0.2.4 cluster on Kubernetes with Helm (1 mnode, 3 dnodes, replica 3). After deleting the pod tdengine3-0 (the mnode) and waiting for the replacement pod to become ready, querying from the client shows tdengine3-1 and tdengine3-2 as offline, and queries fail with "DB error: Fail to get table info, error: Sync not leader".
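For reference, the test data used in both scenarios was created roughly as sketched below; only the database name, replica 3, and the table name demo appear in the report, so the column list is a hypothetical placeholder:
# Hedged sketch of the test setup; the schema of "demo" is an assumption
kubectl exec -it tdengine3-0 -- taos -s "create database if not exists test replica 3;"
kubectl exec -it tdengine3-0 -- taos -s "create table if not exists test.demo (ts timestamp, val double);"
kubectl exec -it tdengine3-0 -- taos -s "insert into test.demo values (now, 1.0);"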
To Reproduce. Steps to reproduce the behavior: simulate the restart of the pod that hosts the mnode in the Kubernetes cluster, as follows:
[root@node01 tdengine]# kubectl delete pod tdengine3-0
pod "tdengine3-0" deleted
[root@node01 tdengine]# kubectl get pod -w|grep tdengine3
tdengine3-0 0/1 Running 0 8s
tdengine3-1 1/1 Running 0 62s
tdengine3-2 1/1 Running 0 2m10s
tdengine3-0 1/1 Running 0 10s
[root@node01 tdengine]# kubectl exec -it tdengine3-0 -- /bin/bash
root@tdengine3-0:~# taos
Welcome to the TDengine Command Line Interface, Client Version:3.0.2.4
Copyright (c) 2022 by TDengine, all rights reserved.
****************************** Tab Completion **********************************
* The TDengine CLI supports tab completion for a variety of items, *
* including database names, table names, function names and keywords. *
* The full list of shortcut keys is as follows: *
* [ TAB ] ...... complete the current word *
* ...... if used on a blank line, display all valid commands *
* [ Ctrl + A ] ...... move cursor to the st[A]rt of the line *
* [ Ctrl + E ] ...... move cursor to the [E]nd of the line *
* [ Ctrl + W ] ...... move cursor to the middle of the line *
* [ Ctrl + L ] ...... clear the entire screen *
* [ Ctrl + K ] ...... clear the screen after the cursor *
* [ Ctrl + U ] ...... clear the screen before the cursor *
**********************************************************************************
Server is Community Edition.
taos> show dnodes;
id | endpoint | vnodes | support_vnodes | status | create_time | note |
=================================================================================================================================================
1 | tdengine3-0.tdengine3.defau... | 2 | 8 | ready | 2023-01-30 17:42:13.682 | |
2 | tdengine3-1.tdengine3.defau... | 2 | 0 | offline | 2023-01-30 17:43:35.428 | status not received |
3 | tdengine3-2.tdengine3.defau... | 2 | 0 | offline | 2023-01-30 17:44:50.947 | status not received |
Query OK, 3 row(s) in set (0.002463s)
taos> show databases;
name |
=================================
information_schema |
performance_schema |
test |
Query OK, 3 row(s) in set (0.002456s)
taos> use test;
Database changed.
taos> select * from demo;
DB error: Fail to get table info, error: Sync not leader (10.288545s)
taos> select * from demo;
DB error: Fail to get table info, error: Sync not leader (10.289980s)
Check the logs:
01/30 17:58:09.002209 00000083 SYN vgId:3, succeed to write raft store file:/var/lib/taos/vnode/vnode3/sync/raft_store.json, len:88
01/30 17:58:09.008271 00000083 SYN vgId:3, succeed to write raft store file:/var/lib/taos/vnode/vnode3/sync/raft_store.json, len:69
01/30 17:58:09.014683 00000083 SYN vgId:3, succeed to write raft store file:/var/lib/taos/vnode/vnode3/sync/raft_store.json, len:88
01/30 17:58:09.417848 00000105 MND dnode:2, in offline state
01/30 17:58:09.417876 00000105 MND dnode:3, in offline state
01/30 17:58:14.421721 00000105 MND dnode:2, in offline state
01/30 17:58:14.421751 00000105 MND dnode:3, in offline state
01/30 17:58:15.747127 00000084 SYN vgId:2, begin election, sync:candidate, term:89, commit-index:-1, first-ver:0, last-ver:4, min:-1, snap:-1, snap-term:0, elect-times:71, as-leader-times:0, cfg-ch-times:0, hb-slow:0, hbr-slow:0, aq-items:0, snaping:-1, replicas:3, last-cfg:-1, chging:0, restore:0, quorum:2, elect-lc-timer:72, hb:0, buffer:[-1 -1 4, 5), repl-mgrs:{0:0 [0 0, 0), 1:0 [0 0, 0), 2:0 [0 0, 0)}, members:{num:3, as:0, [tdengine3-0.tdengine3.default.svc.cluster.local:6030, tdengine3-1.tdengine3.default.svc.cluster.local:6030, tdengine3-2.tdengine3.default.svc.cluster.local:6030]}, hb:{0:1675072259030,1:1675072259030,2:1675072259030}, hb-reply:{0:1675072259030,1:1675072259030,2:1675072259030}
01/30 17:58:15.760348 00000084 SYN vgId:2, succeed to write raft store file:/var/lib/taos/vnode/vnode2/sync/raft_store.json, len:88
01/30 17:58:15.775121 00000084 SYN vgId:2, succeed to write raft store file:/var/lib/taos/vnode/vnode2/sync/raft_store.json, len:69
01/30 17:58:15.783892 00000084 SYN vgId:2, succeed to write raft store file:/var/lib/taos/vnode/vnode2/sync/raft_store.json, len:88
01/30 17:58:15.977109 00000083 SYN vgId:3, begin election, sync:candidate, term:89, commit-index:-1, first-ver:0, last-ver:7, min:-1, snap:-1, snap-term:0, elect-times:71, as-leader-times:0, cfg-ch-times:0, hb-slow:0, hbr-slow:0, aq-items:0, snaping:-1, replicas:3, last-cfg:-1, chging:0, restore:0, quorum:2, elect-lc-timer:72, hb:0, buffer:[-1 -1 7, 8), repl-mgrs:{0:0 [0 0, 0), 1:0 [0 0, 0), 2:0 [0 0, 0)}, members:{num:3, as:0, [tdengine3-0.tdengine3.default.svc.cluster.local:6030, tdengine3-1.tdengine3.default.svc.cluster.local:6030, tdengine3-2.tdengine3.default.svc.cluster.local:6030]}, hb:{0:1675072259030,1:1675072259030,2:1675072259030}, hb-reply:{0:1675072259030,1:1675072259030,2:1675072259030}
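The repeated elections above (elect-times:71, as-leader-times:0) together with the "Sync not leader" error suggest the 3-replica vgroups never regain a leader after the mnode pod restart. A hedged way to double-check dnode/mnode/vgroup state from the restarted pod (these commands exist in TDengine 3.0, but the output layout may vary by version):
# Hedged status checks; run from a node with kubectl access
kubectl exec -it tdengine3-0 -- taos -s "show dnodes;"
kubectl exec -it tdengine3-0 -- taos -s "show mnodes;"
kubectl exec -it tdengine3-0 -- taos -s "show test.vgroups;"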
Bug Description 2: After uninstalling the TDengine cluster with Helm and reinstalling it, the cluster cannot start.
To Reproduce. Steps to reproduce the behavior: simulate a reinstall of the cluster on Kubernetes, as follows:
[root@node01 tdengine]# helm uninstall tdengine3
release "tdengine3" uninstalled
[root@node01 tdengine]# helm install tdengine3 tdengine-3.0.2.tgz -f values.yaml
NAME: tdengine3
LAST DEPLOYED: Mon Jan 30 19:25:38 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
...
[root@node01 tdengine]# kubectl get pod |grep tdengine3
tdengine3-0 0/1 Error 2 29s
Check the logs:
[root@node01 tdengine]# kubectl logs --tail=300 tdengine3-0
01/30 19:26:37.914055 00000049 UTL default numOfCommitThreads 2
01/30 19:26:37.914057 00000049 UTL default numOfMnodeReadThreads 1
01/30 19:26:37.914059 00000049 UTL default numOfVnodeQueryThreads 8
01/30 19:26:37.914060 00000049 UTL default ratioOfVnodeStreamThrea 1.00
01/30 19:26:37.914062 00000049 UTL default numOfVnodeFetchThreads 4
01/30 19:26:37.914063 00000049 UTL default numOfVnodeRsmaThreads 4
01/30 19:26:37.914070 00000049 UTL default numOfQnodeQueryThreads 8
01/30 19:26:37.914071 00000049 UTL default numOfSnodeSharedThreads 2
01/30 19:26:37.914073 00000049 UTL default numOfSnodeUniqueThreads 2
01/30 19:26:37.914074 00000049 UTL default rpcQueueMemoryAllowed 1633705164
01/30 19:26:37.914075 00000049 UTL default syncElectInterval 25000
01/30 19:26:37.914077 00000049 UTL default syncHeartbeatInterval 1000
01/30 19:26:37.914078 00000049 UTL default syncHeartbeatTimeout 20000
01/30 19:26:37.914080 00000049 UTL default vndCommitMaxInterval 60000
01/30 19:26:37.914081 00000049 UTL env_var monitor 1
01/30 19:26:37.914083 00000049 UTL default monitorInterval 30
01/30 19:26:37.914084 00000049 UTL env_var monitorFqdn taoskeeper
01/30 19:26:37.914086 00000049 UTL default monitorPort 6043
01/30 19:26:37.914088 00000049 UTL default monitorMaxLogs 100
01/30 19:26:37.914089 00000049 UTL default monitorComp 0
01/30 19:26:37.914091 00000049 UTL default crashReporting 1
01/30 19:26:37.914092 00000049 UTL default telemetryReporting 1
01/30 19:26:37.914093 00000049 UTL default telemetryInterval 43200
01/30 19:26:37.914095 00000049 UTL default telemetryServer telemetry.taosdata.com
01/30 19:26:37.914097 00000049 UTL default telemetryPort 80
01/30 19:26:37.914098 00000049 UTL default transPullupInterval 2
01/30 19:26:37.914099 00000049 UTL default mqRebalanceInterval 2
01/30 19:26:37.914101 00000049 UTL default ttlUnit 86400
01/30 19:26:37.914102 00000049 UTL default ttlPushInterval 86400
01/30 19:26:37.914104 00000049 UTL default uptimeInterval 300
01/30 19:26:37.914105 00000049 UTL default queryRsmaTolerance 1000
01/30 19:26:37.914107 00000049 UTL default walFsyncDataSizeLimit 104857600
01/30 19:26:37.914108 00000049 UTL default udf 1
01/30 19:26:37.914110 00000049 UTL default udfdResFuncs
01/30 19:26:37.914111 00000049 UTL default udfdLdLibPath
01/30 19:26:37.914113 00000049 UTL default configDir /etc/taos/
01/30 19:26:37.914114 00000049 UTL default scriptDir /etc/taos/
01/30 19:26:37.914116 00000049 UTL default logDir /var/log/taos/
01/30 19:26:37.914118 00000049 UTL default minimalLogDirGB 1.00
01/30 19:26:37.914121 00000049 UTL default numOfLogLines 10000000
01/30 19:26:37.914123 00000049 UTL default asyncLog 1
01/30 19:26:37.914126 00000049 UTL default logKeepDays 0
01/30 19:26:37.914128 00000049 UTL env_var debugFlag 143
01/30 19:26:37.914130 00000049 UTL default simDebugFlag 143
01/30 19:26:37.914133 00000049 UTL default tmrDebugFlag 131
01/30 19:26:37.914139 00000049 UTL default uDebugFlag 143
01/30 19:26:37.914141 00000049 UTL default rpcDebugFlag 143
01/30 19:26:37.914142 00000049 UTL default jniDebugFlag 143
01/30 19:26:37.914144 00000049 UTL default qDebugFlag 143
01/30 19:26:37.914146 00000049 UTL default cDebugFlag 143
01/30 19:26:37.914147 00000049 UTL default dDebugFlag 143
01/30 19:26:37.914149 00000049 UTL default vDebugFlag 143
01/30 19:26:37.914150 00000049 UTL default mDebugFlag 143
01/30 19:26:37.914151 00000049 UTL default wDebugFlag 143
01/30 19:26:37.914153 00000049 UTL default sDebugFlag 143
01/30 19:26:37.914154 00000049 UTL default tsdbDebugFlag 143
01/30 19:26:37.914155 00000049 UTL default tqDebugFlag 143
01/30 19:26:37.914157 00000049 UTL default fsDebugFlag 143
01/30 19:26:37.914162 00000049 UTL default udfDebugFlag 143
01/30 19:26:37.914163 00000049 UTL default smaDebugFlag 143
01/30 19:26:37.914165 00000049 UTL default idxDebugFlag 143
01/30 19:26:37.914166 00000049 UTL default tdbDebugFlag 143
01/30 19:26:37.914168 00000049 UTL default metaDebugFlag 143
01/30 19:26:37.914169 00000049 UTL default timezone Asia/Shanghai (CST, +0800)
01/30 19:26:37.914171 00000049 UTL default locale en_US.UTF-8
01/30 19:26:37.914172 00000049 UTL default charset UTF-8
01/30 19:26:37.914174 00000049 UTL default assert 1
01/30 19:26:37.914175 00000049 UTL env_var enableCoreFile 1
01/30 19:26:37.914177 00000049 UTL default numOfCores 4.00
01/30 19:26:37.914178 00000049 UTL default SSE42 0
01/30 19:26:37.914180 00000049 UTL default AVX 0
01/30 19:26:37.914182 00000049 UTL default AVX2 0
01/30 19:26:37.914183 00000049 UTL default FMA 0
01/30 19:26:37.914185 00000049 UTL default SIMD-builtins 0
01/30 19:26:37.914186 00000049 UTL default openMax 1048576
01/30 19:26:37.914188 00000049 UTL default streamMax 16
01/30 19:26:37.914189 00000049 UTL default pageSizeKB 4
01/30 19:26:37.914191 00000049 UTL default totalMemoryKB 15954152
01/30 19:26:37.914192 00000049 UTL default os sysname Linux
01/30 19:26:37.914194 00000049 UTL default os nodename tdengine3-0
01/30 19:26:37.914195 00000049 UTL default os release 4.18.0-305.3.1.el8.x86_64
01/30 19:26:37.914197 00000049 UTL default os version #1 SMP Tue Jun 1 16:14:33 UTC 2021
01/30 19:26:37.914200 00000049 UTL default os machine x86_64
01/30 19:26:37.914207 00000049 UTL default version 3.0.2.4
01/30 19:26:37.914209 00000049 UTL default compatible_version 3.0.0.0
01/30 19:26:37.914211 00000049 UTL default gitinfo 03f21309e6594f9bb2d08d8739539f742f38e86d
01/30 19:26:37.914213 00000049 UTL default buildinfo Built at 2023-01-17 17:40
01/30 19:26:37.914215 00000049 UTL =================================================================
01/30 19:26:37.916583 00000049 DND start to init dnode env
01/30 19:26:37.917392 00000049 DND succceed to read mnode file /var/lib/taos//dnode/dnode.json
01/30 19:26:37.918001 00000049 DND succceed to read mnode file /var/lib/taos//mnode/mnode.json
01/30 19:26:37.918194 00000049 DND file:/var/lib/taos//qnode/qnode.json not exist
01/30 19:26:37.918203 00000049 DND file:/var/lib/taos//snode/snode.json not exist
01/30 19:26:37.918983 00000049 DND dnode env is initialized
01/30 19:26:37.918990 00000049 DND start to init service
01/30 19:26:37.919002 00000049 DND node:dnode, start to open
01/30 19:26:37.919009 00000049 UTL worker:dnode-mgmt is initialized, min:1 max:1
01/30 19:26:37.919047 00000049 UTL worker:dnode-mgmt:0 is launched, total:1
01/30 19:26:37.919052 00000049 UTL worker:dnode-mgmt, queue:0x2f64600 is allocated, ahandle:0x2f64430
01/30 19:26:37.919058 00000056 UTL worker:dnode-mgmt:0 is running, thread:00000056
01/30 19:26:37.919129 00000057 UDF start to init udfd
01/30 19:26:37.919144 00000057 UDF udfd LD_LIBRARY_PATH: ::/usr/lib
01/30 19:26:37.919796 00000057 UDF udfd is initialized
01/30 19:26:37.919827 00000049 DND node:dnode, has been opened
01/30 19:26:37.919843 00000049 DND node:mnode, start to open
01/30 19:26:37.919895 00000049 WAL wal module is initialized, rsetId:3
01/30 19:26:37.920846 00000049 DND succceed to read mnode file /var/lib/taos//mnode/mnode.json
01/30 19:26:37.921018 00000049 DND mnode start to open
01/30 19:26:37.921025 00000049 MND start to open mnode in /var/lib/taos//mnode
01/30 19:26:37.922159 00000049 WAL vgId:1, reset commitVer to -1
01/30 19:26:37.922176 00000049 MND mnode-wal is initialized
01/30 19:26:37.922185 00000049 MND start to init sdb in /var/lib/taos//mnode
01/30 19:26:37.922196 00000049 MND sdb init success
01/30 19:26:37.922199 00000049 MND mnode-sdb is initialized
01/30 19:26:37.922216 00000049 MND sdb table:trans is initialized
01/30 19:26:37.922222 00000049 MND mnode-trans is initialized
01/30 19:26:37.922235 00000049 MND sdb table:cluster is initialized
01/30 19:26:37.922242 00000049 MND mnode-cluster is initialized
01/30 19:26:37.922248 00000049 MND sdb table:mnode is initialized
01/30 19:26:37.922250 00000049 MND mnode-mnode is initialized
01/30 19:26:37.922255 00000049 MND sdb table:qnode is initialized
01/30 19:26:37.922256 00000049 MND mnode-qnode is initialized
01/30 19:26:37.922278 00000049 MND sdb table:snode is initialized
01/30 19:26:37.922283 00000049 MND mnode-snode is initialized
01/30 19:26:37.922287 00000049 MND sdb table:dnode is initialized
01/30 19:26:37.922288 00000049 MND mnode-dnode is initialized
01/30 19:26:37.922296 00000049 MND sdb table:user is initialized
01/30 19:26:37.922297 00000049 MND mnode-user is initialized
01/30 19:26:37.922300 00000049 MND mnode-grant is initialized
01/30 19:26:37.922302 00000049 MND mnode-privilege is initialized
01/30 19:26:37.922307 00000049 MND sdb table:acct is initialized
01/30 19:26:37.922312 00000049 MND mnode-acct is initialized
01/30 19:26:37.922319 00000049 MND sdb table:stream is initialized
01/30 19:26:37.922321 00000049 MND mnode-stream is initialized
01/30 19:26:37.922325 00000049 MND sdb table:topic is initialized
01/30 19:26:37.922329 00000049 MND mnode-topic is initialized
01/30 19:26:37.922332 00000049 MND sdb table:consumer is initialized
01/30 19:26:37.922355 00000049 MND mnode-consumer is initialized
01/30 19:26:37.922363 00000049 MND sdb table:subscribe is initialized
01/30 19:26:37.922364 00000049 MND mnode-subscribe is initialized
01/30 19:26:37.922367 00000049 MND sdb table:vgroup is initialized
01/30 19:26:37.922370 00000049 MND mnode-vgroup is initialized
01/30 19:26:37.922376 00000049 MND sdb table:stb is initialized
01/30 19:26:37.922383 00000049 MND mnode-stb is initialized
01/30 19:26:37.922387 00000049 MND sdb table:sma is initialized
01/30 19:26:37.922388 00000049 MND mnode-sma is initialized
01/30 19:26:37.922412 00000049 MND mnode-infos is initialized
01/30 19:26:37.922429 00000049 MND mnode-perfs is initialized
01/30 19:26:37.922439 00000049 MND sdb table:db is initialized
01/30 19:26:37.922441 00000049 MND mnode-db is initialized
01/30 19:26:37.922446 00000049 MND sdb table:func is initialized
01/30 19:26:37.922448 00000049 MND mnode-func is initialized
01/30 19:26:37.922452 00000049 MND start to reset sdb
01/30 19:26:37.922460 00000049 MND sdb:trans is reset
01/30 19:26:37.922497 00000049 MND sdb:cluster is reset
01/30 19:26:37.922503 00000049 MND sdb:mnode is reset
01/30 19:26:37.922504 00000049 MND sdb:qnode is reset
01/30 19:26:37.922507 00000049 MND sdb:snode is reset
01/30 19:26:37.922509 00000049 MND sdb:dnode is reset
01/30 19:26:37.922510 00000049 MND sdb:user is reset
01/30 19:26:37.922512 00000049 MND sdb:acct is reset
01/30 19:26:37.922517 00000049 MND sdb:stream is reset
01/30 19:26:37.922519 00000049 MND sdb:subscribe is reset
01/30 19:26:37.922520 00000049 MND sdb:consumer is reset
01/30 19:26:37.922522 00000049 MND sdb:topic is reset
01/30 19:26:37.922524 00000049 MND sdb:vgroup is reset
01/30 19:26:37.922526 00000049 MND sdb:sma is reset
01/30 19:26:37.922528 00000049 MND sdb:stb is reset
01/30 19:26:37.922530 00000049 MND sdb:db is reset
01/30 19:26:37.922533 00000049 MND sdb:func is reset
01/30 19:26:37.922538 00000049 MND sdb reset success
01/30 19:26:37.922540 00000049 MND start to read sdb file:/var/lib/taos//mnode/data/sdb.data
01/30 19:26:37.922780 00000049 MND db:1.test, tsdbPageSize set from 4 to default 4
01/30 19:26:37.922867 00000049 MND vgId:1, has ready mnode:1, status:ready
01/30 19:26:37.922873 00000049 MND vgId:1, ep:tdengine3-0.tdengine3.default.svc.cluster.local:6030 dnode:1
01/30 19:26:37.922883 00000049 MND vgId:1, mnode sync not reconfig since readyMnodes:1 updatingMnodes:0
01/30 19:26:37.922895 00000049 MND read sdb file:/var/lib/taos//mnode/data/sdb.data success, commit index:51 term:2 config:-1
01/30 19:26:37.923074 00000049 MND mnode-sdb is initialized
01/30 19:26:37.923192 00000049 MND mnode-profile is initialized
01/30 19:26:37.923234 00000049 MND mnode-show is initialized
01/30 19:26:37.923574 00000049 MND mnode-query is initialized
01/30 19:26:37.923588 00000049 MND vgId:1, start to open sync, replica:0 selfIndex:0
01/30 19:26:37.923866 00000049 SYN vgId:0, succceed to read sync cfg file /var/lib/taos//mnode/sync/raft_config.json
01/30 19:26:37.924909 00000049 SYN vgId:0, use sync config from sync cfg file
01/30 19:26:37.924918 00000049 SYN vgId:1, start to open sync node, replica:1 selfIndex:0
01/30 19:26:37.924928 00000049 SYN vgId:1, index:0 ep:tdengine3-0.tdengine3.default.svc.cluster.local:6030 dnode:1 cluster:8649067683950306017
01/30 19:26:37.925013 00000049 SYN vgId:1, sync addr:15864812462105690113, dnode:1 cluster:8649067683950306017 fqdn:tdengine3-0.tdengine3.default.svc.cluster.local ip:10.244.1.127 port:6030 ipv4:2130834442
01/30 19:26:37.925031 00000049 SYN vgId:1, sync addr:15864812462105690113, dnode:1 cluster:8649067683950306017 fqdn:tdengine3-0.tdengine3.default.svc.cluster.local ip:10.244.1.127 port:6030 ipv4:2130834442
01/30 19:26:37.925428 00000049 SYN vgId:1, succceed to read raft store file /var/lib/taos//mnode/sync/raft_store.json
01/30 19:26:37.925658 00000049 SYN vgId:1, sync node commitIndex initialized as 51
01/30 19:26:37.926415 00000049 SYN vgId:1, init sync log buffer. buffer: [51 51 57, 58)
01/30 19:26:37.926427 00000049 SYN vgId:1, sync open, node:0x2f9f820 electInterval:25000 heartbeatInterval:1000 heartbeatTimeout:20000, sync:follower, term:2, commit-index:51, first-ver:0, last-ver:57, min:-1, snap:51, snap-term:2, elect-times:0, as-leader-times:0, cfg-ch-times:0, hb-slow:0, hbr-slow:0, aq-items:-1, snaping:-1, replicas:1, last-cfg:-1, chging:0, restore:0, quorum:1, elect-lc-timer:0, hb:0, buffer:[51 51 57, 58), repl-mgrs:{0:0 [0 0, 0)}, members:{num:1, as:0, [tdengine3-0.tdengine3.default.svc.cluster.local:6030]}, hb:{0:1675077997925}, hb-reply:{0:1675077997925}
01/30 19:26:37.926434 00000049 MND mnode-sync is opened, id:2
01/30 19:26:37.926437 00000049 MND mnode-sync is initialized
01/30 19:26:37.926451 00000049 MND mnode-telem is initialized
01/30 19:26:37.926463 00000049 MND mnode open successfully
01/30 19:26:37.926469 00000049 UTL worker:mnode-query is initialized, min:4 max:4
01/30 19:26:37.926514 00000049 UTL worker:mnode-query:0 is launched, total:1
01/30 19:26:37.926528 00000066 UTL worker:mnode-query:0 is running, thread:00000066
01/30 19:26:37.926542 00000049 UTL worker:mnode-query:1 is launched, total:2
01/30 19:26:37.926551 00000067 UTL worker:mnode-query:1 is running, thread:00000067
01/30 19:26:37.926571 00000049 UTL worker:mnode-query:2 is launched, total:3
01/30 19:26:37.926581 00000068 UTL worker:mnode-query:2 is running, thread:00000068
01/30 19:26:37.926599 00000049 UTL worker:mnode-query:3 is launched, total:4
01/30 19:26:37.926603 00000049 UTL worker:mnode-query, queue:0x2fa5400 is allocated, ahandle:0x2f92f90
01/30 19:26:37.926607 00000069 UTL worker:mnode-query:3 is running, thread:00000069
01/30 19:26:37.926608 00000049 UTL worker:mnode-fetch is initialized, min:1 max:1
01/30 19:26:37.926894 00000049 UTL worker:mnode-fetch:0 is launched, total:1
01/30 19:26:37.926909 00000049 UTL worker:mnode-fetch, queue:0x2fa59e0 is allocated, ahandle:0x2f92f90
01/30 19:26:37.926910 00000070 UTL worker:mnode-fetch:0 is running, thread:00000070
01/30 19:26:37.926913 00000049 UTL worker:mnode-read is initialized, min:1 max:1
01/30 19:26:37.926953 00000049 UTL worker:mnode-read:0 is launched, total:1
01/30 19:26:37.926959 00000049 UTL worker:mnode-read, queue:0x2fa5c30 is allocated, ahandle:0x2f92f90
01/30 19:26:37.926963 00000071 UTL worker:mnode-read:0 is running, thread:00000071
01/30 19:26:37.926963 00000049 UTL worker:mnode-write is initialized, min:1 max:1
01/30 19:26:37.927024 00000049 UTL worker:mnode-write:0 is launched, total:1
01/30 19:26:37.927031 00000049 UTL worker:mnode-write, queue:0x2fa5e80 is allocated, ahandle:0x2f92f90
01/30 19:26:37.927035 00000049 UTL worker:mnode-sync is initialized, min:1 max:1
01/30 19:26:37.927037 00000072 UTL worker:mnode-write:0 is running, thread:00000072
01/30 19:26:37.927108 00000049 UTL worker:mnode-sync:0 is launched, total:1
01/30 19:26:37.927115 00000049 UTL worker:mnode-sync, queue:0x2fa60d0 is allocated, ahandle:0x2f92f90
01/30 19:26:37.927118 00000073 UTL worker:mnode-sync:0 is running, thread:00000073
01/30 19:26:37.927146 00000049 UTL worker:mnode-sync-ctrl is initialized, min:1 max:1
01/30 19:26:37.927184 00000049 UTL worker:mnode-sync-ctrl:0 is launched, total:1
01/30 19:26:37.927189 00000049 UTL worker:mnode-sync-ctrl, queue:0x2fa6320 is allocated, ahandle:0x2f92f90
01/30 19:26:37.927195 00000049 DND node:mnode, has been opened
01/30 19:26:37.927203 00000074 UTL worker:mnode-sync-ctrl:0 is running, thread:00000074
01/30 19:26:37.927205 00000049 DND node:vnode, start to open
01/30 19:26:37.928011 00000049 UTL worker:vnode-query is initialized, min:8 max:8
01/30 19:26:37.928023 00000049 UTL worker:vnode-stream is initialized as auto
01/30 19:26:37.928027 00000049 UTL worker:vnode-fetch is initialized, max:4
01/30 19:26:37.928029 00000049 UTL worker:vnode-mgmt is initialized, min:1 max:1
01/30 19:26:37.928060 00000049 UTL worker:vnode-mgmt:0 is launched, total:1
01/30 19:26:37.928062 00000049 UTL worker:vnode-mgmt, queue:0x2fa7c00 is allocated, ahandle:0x2fa64d0
01/30 19:26:37.928070 00000077 UTL worker:vnode-mgmt:0 is running, thread:00000077
01/30 19:26:37.928575 00000049 DND succceed to read vnode file /var/lib/taos//vnode/vnodes.json
01/30 19:26:37.928761 00000049 DND open 2 vnodes with 2 threads
01/30 19:26:37.928806 00000078 DND thread:0, start to open 1 vnodes
01/30 19:26:37.928837 00000079 DND thread:1, start to open 1 vnodes
01/30 19:26:37.939003 00000078 WAL vgId:2, reset commitVer to -1
01/30 19:26:37.940079 00000079 WAL vgId:3, reset commitVer to -1
01/30 19:26:37.955883 00000079 VND vgId:3, start to open sync, replica:3 selfIndex:0
01/30 19:26:37.955892 00000079 VND vgId:3, index:0 ep:tdengine3-0.tdengine3.default.svc.cluster.local:6030 dnode:1 cluster:8649067683950306017
01/30 19:26:37.955895 00000079 VND vgId:3, index:1 ep:tdengine3-1.tdengine3.default.svc.cluster.local:6030 dnode:2 cluster:8649067683950306017
01/30 19:26:37.955897 00000079 VND vgId:3, index:2 ep:tdengine3-2.tdengine3.default.svc.cluster.local:6030 dnode:3 cluster:8649067683950306017
01/30 19:26:37.956894 00000078 VND vgId:2, start to open sync, replica:3 selfIndex:0
01/30 19:26:37.956903 00000078 VND vgId:2, index:0 ep:tdengine3-0.tdengine3.default.svc.cluster.local:6030 dnode:1 cluster:8649067683950306017
01/30 19:26:37.956906 00000078 VND vgId:2, index:1 ep:tdengine3-1.tdengine3.default.svc.cluster.local:6030 dnode:2 cluster:8649067683950306017
01/30 19:26:37.956910 00000078 VND vgId:2, index:2 ep:tdengine3-2.tdengine3.default.svc.cluster.local:6030 dnode:3 cluster:8649067683950306017
01/30 19:26:37.957101 00000079 SYN vgId:0, succceed to read sync cfg file /var/lib/taos/vnode/vnode3/sync/raft_config.json
01/30 19:26:37.957260 00000078 SYN vgId:0, succceed to read sync cfg file /var/lib/taos/vnode/vnode2/sync/raft_config.json
01/30 19:26:37.957433 00000079 SYN vgId:0, use sync config from sync cfg file
01/30 19:26:37.957442 00000079 SYN vgId:3, start to open sync node, replica:3 selfIndex:0
01/30 19:26:37.957446 00000079 SYN vgId:3, index:0 ep:tdengine3-0.tdengine3.default.svc.cluster.local:6030 dnode:1 cluster:8649067683950306017
01/30 19:26:37.957448 00000079 SYN vgId:3, index:1 ep:tdengine3-1.tdengine3.default.svc.cluster.local:6030 dnode:2 cluster:8649067683950306017
01/30 19:26:37.957450 00000079 SYN vgId:3, index:2 ep:tdengine3-2.tdengine3.default.svc.cluster.local:6030 dnode:3 cluster:8649067683950306017
01/30 19:26:37.957527 00000079 SYN vgId:3, sync addr:15864812462105690113, dnode:1 cluster:8649067683950306017 fqdn:tdengine3-0.tdengine3.default.svc.cluster.local ip:10.244.1.127 port:6030 ipv4:2130834442
01/30 19:26:37.957579 00000078 SYN vgId:0, use sync config from sync cfg file
01/30 19:26:37.957584 00000078 SYN vgId:2, start to open sync node, replica:3 selfIndex:0
01/30 19:26:37.957587 00000078 SYN vgId:2, index:0 ep:tdengine3-0.tdengine3.default.svc.cluster.local:6030 dnode:1 cluster:8649067683950306017
01/30 19:26:37.957590 00000078 SYN vgId:2, index:1 ep:tdengine3-1.tdengine3.default.svc.cluster.local:6030 dnode:2 cluster:8649067683950306017
01/30 19:26:37.957592 00000078 SYN vgId:2, index:2 ep:tdengine3-2.tdengine3.default.svc.cluster.local:6030 dnode:3 cluster:8649067683950306017
01/30 19:26:37.957702 00000078 SYN vgId:2, sync addr:15864812462105690113, dnode:1 cluster:8649067683950306017 fqdn:tdengine3-0.tdengine3.default.svc.cluster.local ip:10.244.1.127 port:6030 ipv4:2130834442
01/30 19:26:37.961059 00000079 SYN ERROR failed to resolve ipv4 addr, fqdn: tdengine3-1.tdengine3.default.svc.cluster.local
01/30 19:26:37.961072 00000079 SYN ERROR vgId:3, failed to determine raft member id, peer:0
01/30 19:26:37.961079 00000079 SYN ERROR vgId:3, failed to open sync node
01/30 19:26:37.961125 00000079 VND ERROR vgId:3, failed to open sync since Invalid host name
01/30 19:26:37.961132 00000079 VND ERROR vgId:3, failed to open sync since Invalid host name
01/30 19:26:37.971332 00000079 DND ERROR vgId:3, failed to open vnode by thread:1
01/30 19:26:37.971367 00000079 DND thread:1, numOfVnodes:1, opened:0 failed:1
01/30 19:26:38.019628 00000078 SYN ERROR failed to resolve ipv4 addr, fqdn: tdengine3-1.tdengine3.default.svc.cluster.local
01/30 19:26:38.019657 00000078 SYN ERROR vgId:2, failed to determine raft member id, peer:0
01/30 19:26:38.019665 00000078 SYN ERROR vgId:2, failed to open sync node
01/30 19:26:38.019668 00000078 VND ERROR vgId:2, failed to open sync since Invalid host name
01/30 19:26:38.019670 00000078 VND ERROR vgId:2, failed to open sync since Invalid host name
01/30 19:26:38.033913 00000078 DND ERROR vgId:2, failed to open vnode by thread:0
01/30 19:26:38.033925 00000078 DND thread:0, numOfVnodes:1, opened:0 failed:1
01/30 19:26:38.033973 00000049 DND ERROR there are total vnodes:2, opened:0
01/30 19:26:38.033991 00000049 DND ERROR failed to open vnode since success
01/30 19:26:38.033994 00000049 DND ERROR failed to init vnodes-mgmt since success
01/30 19:26:38.033996 00000049 DND start to close all vnodes
01/30 19:26:38.034013 00000049 UTL worker:vnode-mgmt:0 is stopping
01/30 19:26:38.034022 00000077 UTL worker:vnode-mgmt:0 qset:0x2fa7b60, got no message and exiting, thread:00000077
01/30 19:26:38.034038 00000049 UTL worker:vnode-mgmt:0 is stopped
01/30 19:26:38.034044 00000049 UTL worker:vnode-mgmt is closed
01/30 19:26:38.034046 00000049 UTL worker:vnode-mgmt, queue:0x2fa7c00 is freed
01/30 19:26:38.034050 00000049 DND vnodes mgmt worker is stopped
01/30 19:26:38.034055 00000049 DND close 0 vnodes with 2 threads
01/30 19:26:38.034058 00000049 DND total vnodes:0 are all closed
01/30 19:26:38.034063 00000049 UTL worker:vnode-query is closed
01/30 19:26:38.034067 00000049 UTL worker:vnode-stream is closed
01/30 19:26:38.034069 00000049 UTL worker:vnode-fetch is closed
01/30 19:26:38.920492 00000049 WAL wal module is cleaned up
01/30 19:26:38.920521 00000049 DND ERROR node:vnode, failed to open since success
01/30 19:26:38.920525 00000049 DND ERROR node:vnode, failed to open since success
01/30 19:26:38.920527 00000049 DND ERROR failed to open nodes since success
01/30 19:26:38.920530 00000049 DND shutting down the service
01/30 19:26:39.122728 00000049 UDF udfd start to stop, need cleanup:1, spawn err:0
01/30 19:26:39.122858 00000049 UDF udfd is cleaned up
01/30 19:26:39.423915 00000049 DND dnode env is cleaned up
[root@node01 tdengine]# kubectl logs --tail=300 tdengine3-0|grep ERROR
01/30 19:27:29.071629 00000077 SYN ERROR failed to resolve ipv4 addr, fqdn: tdengine3-1.tdengine3.default.svc.cluster.local
01/30 19:27:29.071655 00000077 SYN ERROR vgId:2, failed to determine raft member id, peer:0
01/30 19:27:29.071666 00000077 SYN ERROR vgId:2, failed to open sync node
01/30 19:27:29.071718 00000077 VND ERROR vgId:2, failed to open sync since Invalid host name
01/30 19:27:29.071724 00000077 VND ERROR vgId:2, failed to open sync since Invalid host name
01/30 19:27:29.071817 00000078 SYN ERROR failed to resolve ipv4 addr, fqdn: tdengine3-1.tdengine3.default.svc.cluster.local
01/30 19:27:29.071831 00000078 SYN ERROR vgId:3, failed to determine raft member id, peer:0
01/30 19:27:29.071836 00000078 SYN ERROR vgId:3, failed to open sync node
01/30 19:27:29.071839 00000078 VND ERROR vgId:3, failed to open sync since Invalid host name
01/30 19:27:29.071840 00000078 VND ERROR vgId:3, failed to open sync since Invalid host name
01/30 19:27:29.083876 00000077 DND ERROR vgId:2, failed to open vnode by thread:0
01/30 19:27:29.084833 00000078 DND ERROR vgId:3, failed to open vnode by thread:1
01/30 19:26:37.961125 00000079 VND ERROR vgId:3, failed to open sync since Invalid host name
01/30 19:26:37.961132 00000079 VND ERROR vgId:3, failed to open sync since Invalid host name
Judging from the logs, this looks like a domain-name resolution (network configuration) problem.
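One hedged way to verify that is to check, from inside the failing pod, whether the peer FQDNs in the log actually resolve; if the chart's headless service (tdengine3 in the default namespace, per the log) is missing or has no endpoints yet, resolution fails exactly like this. Depending on what the image ships, getent may need to be replaced with nslookup or ping:
# Hedged DNS check; the FQDNs are taken verbatim from the error log above
kubectl exec -it tdengine3-0 -- getent hosts tdengine3-1.tdengine3.default.svc.cluster.local
kubectl exec -it tdengine3-0 -- getent hosts tdengine3-2.tdengine3.default.svc.cluster.local
kubectl get endpoints tdengine3 -n default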
Is the Helm chart you are using one you wrote yourself, or the one provided officially by TDengine?
We use the official charts. The first installation of the TDengine 3.x cluster works fine. The problem appears once databases and tables have been created and data inserted: after helm uninstall followed by helm install, the cluster fails to start.
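This is consistent with the startup log above: after the reinstall, the new pod still reads the old dnode.json, mnode.json and sdb.data from /var/lib/taos and therefore expects 3-replica peers whose FQDNs do not resolve yet. PVCs created from a StatefulSet's volumeClaimTemplates are not removed by helm uninstall, so the old state survives the reinstall. A hedged way to confirm (PVC names depend on the chart, so the grep pattern is an assumption):
# Hedged check for leftover state from the previous release
kubectl get pvc | grep tdengine3
kubectl exec -it tdengine3-0 -- ls -l /var/lib/taos/dnode /var/lib/taos/mnode
If the pod is crash-looping, the second command may not work; the PVC contents can then be inspected from a temporary debug pod mounting the same volume.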
Same problem here; it has been raised for years. We would also really like to deploy on Kubernetes.
Same problem: after a restart the other dnodes stay offline. Is there any workaround?
Hit the same problem: after the Kubernetes cluster recovered from an outage, TDengine fails to start with:
08/21 15:54:45.169462 00000066 MND ERROR failed to open mnode since Invalid host name
08/21 15:54:45.169472 00000066 MND start to close mnode
08/21 15:54:45.169474 00000066 MND mnode is closed
08/21 15:54:45.169476 00000066 DND ERROR failed to open mnode since Invalid host name
08/21 15:54:45.169478 00000066 DND ERROR node:mnode, failed to open since Invalid host name
08/21 15:54:45.169481 00000066 DND ERROR node:mnode, failed to open since Invalid host name
08/21 15:54:45.169482 00000066 DND ERROR failed to open nodes since Invalid host name
We have indeed seen the cluster state break after a restart.
For the open-source edition we do not recommend deploying on Kubernetes; it is difficult to maintain. We suggest deploying on virtual machines or physical machines instead. That said, we will keep tracking this issue and keep it open.