V1.2.0: deploying Doris with 3 FE nodes reports an error

Open laiwei1986 opened this issue 2 years ago • 1 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

Deploying a Doris cluster with multiple FE nodes fails: only one FE node starts successfully, and the other FE nodes are reported as failed. The task log is as follows:

[INFO] 2023-12-08 10:54:08 TaskLogLogger-DORIS-DorisFE:[176] - fe/bin/status_fe.sh: line 50: [: -eq: unary operator expected
[INFO] 2023-12-08 10:54:08 TaskLogLogger-DORIS-DorisFE:[176] - http request failed, return value is:
[INFO] 2023-12-08 10:54:08 TaskLogLogger-DORIS-DorisFE:[176] - fe is not ready
[INFO] 2023-12-08 10:54:13 TaskLogLogger-DORIS-DorisFE:[71] - check start result at times 19
[INFO] 2023-12-08 10:54:13 TaskLogLogger-DORIS-DorisFE:[175] - execute shell command : [bash, fe/bin/status_fe.sh, status, fe]
[INFO] 2023-12-08 10:54:13 TaskLogLogger-DORIS-DorisFE:[176] - pid is 98561
[INFO] 2023-12-08 10:54:14 TaskLogLogger-DORIS-DorisFE:[176] - fe/bin/status_fe.sh: line 50: [: -eq: unary operator expected
[INFO] 2023-12-08 10:54:14 TaskLogLogger-DORIS-DorisFE:[176] - http request failed, return value is:
[INFO] 2023-12-08 10:54:14 TaskLogLogger-DORIS-DorisFE:[176] - fe is not ready
[INFO] 2023-12-08 10:54:19 TaskLogLogger-DORIS-DorisFE:[71] - check start result at times 20
[INFO] 2023-12-08 10:54:19 TaskLogLogger-DORIS-DorisFE:[175] - execute shell command : [bash, fe/bin/status_fe.sh, status, fe]
[INFO] 2023-12-08 10:54:19 TaskLogLogger-DORIS-DorisFE:[176] - pid is 98561
[INFO] 2023-12-08 10:54:19 TaskLogLogger-DORIS-DorisFE:[176] - fe/bin/status_fe.sh: line 50: [: -eq: unary operator expected
[INFO] 2023-12-08 10:54:19 TaskLogLogger-DORIS-DorisFE:[176] - http request failed, return value is:
[INFO] 2023-12-08 10:54:19 TaskLogLogger-DORIS-DorisFE:[176] - fe is not ready
[INFO] 2023-12-08 10:54:24 TaskLogLogger-DORIS-DorisFE:[86] - start doris-1.2.6 timeout
[ERROR] 2023-12-08 10:54:24 TaskLogLogger-DORIS-DorisFE:[75] - slave fe start failed

The FE log on a failing node additionally shows:

2023-12-08 11:04:37,971 WARN (main|1) [Env.getClusterIdAndRole():1008] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.200.26.137:9010]
2023-12-08 11:04:42,975 WARN (main|1) [Env.getFeNodeTypeAndNameFromHelpers():1135] failed to get fe node type from helper node: 10.200.26.137:9010.
2023-12-08 11:04:42,976 WARN (main|1) [Env.getClusterIdAndRole():1008] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.200.26.137:9010]
2023-12-08 11:04:47,981 WARN (main|1) [Env.getFeNodeTypeAndNameFromHelpers():1135] failed to get fe node type from helper node: 10.200.26.137:9010.
2023-12-08 11:04:47,982 WARN (main|1) [Env.getClusterIdAndRole():1008] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.200.26.137:9010]

So far I have found that the status_fe.sh script decides whether the FE has started successfully by querying the FE's port 18030. However, while a follower has not yet been added to the cluster, port 18030 on that FE node is not open.
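
The `[: -eq: unary operator expected` message in the task log is what bash prints when an unquoted, empty variable is used in a numeric test. The contents of status_fe.sh are not shown in this issue, so the following is only a hypothetical sketch of that failure mode and of a more defensive check; the function names and the expected value `200` are assumptions, not the script's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction: the real status_fe.sh line 50 is not shown in this issue.

# Fragile form: with an empty $code the test expands to `[ -eq 200 ]`,
# which bash rejects with "unary operator expected".
is_ready_unquoted() {
  local code="$1"
  [ $code -eq 200 ]
}

# Defensive form: quote the variable and fall back to 0 when no
# response was received, so the test stays well-formed.
is_ready() {
  local code="${1:-0}"
  [ "$code" -eq 200 ]
}

# An empty value stands in for "no HTTP response from the FE".
if is_ready ""; then
  echo "fe is ready"
else
  echo "fe is not ready"
fi
```

Quoting only removes the shell error. The check would still report "not ready" for an FE whose HTTP port is closed, which matches the behaviour described above.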

What you expected to happen

/

How to reproduce

/

Anything else

No response

Version

main

Are you willing to submit PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

laiwei1986, Dec 08 '23 03:12

You first need to determine why the FE has not been added to the cluster.

datasophon, Dec 08 '23 03:12
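
One way to follow up on that suggestion is to inspect the FE list on the node that did start and, if the failing node is missing, register it as a follower. `SHOW FRONTENDS` and `ALTER SYSTEM ADD FOLLOWER` are Doris SQL statements. The host name, the query port placeholder, and the `root` user below are assumptions for illustration; port 9010 is taken from the helper-node address in the FE log. This is a sketch that only prints the statements, not what DataSophon itself runs.

```shell
#!/usr/bin/env bash
# Sketch only: host, ports and user are assumptions, not values confirmed in this issue.

# Print the statement that registers a new follower FE.
# $1 is "<fe_host>:<edit_log_port>" of the node that fails to start.
build_add_follower_sql() {
  printf 'ALTER SYSTEM ADD FOLLOWER "%s";\n' "$1"
}

# Hypothetical address of the failing FE; 9010 matches the helper-node port in the FE log.
NEW_FE="fe2.example.internal:9010"

# Statements to run against the FE that started, for example via the MySQL client:
#   mysql -h 10.200.26.137 -P <query_port> -uroot -e "SHOW FRONTENDS;"
echo 'SHOW FRONTENDS;'
build_add_follower_sql "$NEW_FE"
```

If the failing node already appears in the `SHOW FRONTENDS` output, the cause lies elsewhere and the second statement should not be run.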