datasophon
[Bug] [Module Name] Bug title
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
When deploying a cluster, the UI reports a distribution failure and says to check the agent-side logs.
What you expected to happen
[root@p-mn-01 logs]# cat datasophon-worker.log
[INFO] 2023-05-16 19:00:46 com.datasophon.common.utils.ShellUtils:[96] - 脚本返回的数据如下: x86_64
[INFO] 2023-05-16 19:00:46 com.datasophon.common.utils.ShellUtils:[169] - stopping node
[INFO] 2023-05-16 19:00:49 com.datasophon.common.utils.ShellUtils:[169] - End stop node.
[INFO] 2023-05-16 19:00:49 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:00:49 com.datasophon.worker.WorkerApplicationServer:[166] - Worker server stopped
[INFO] 2023-05-16 19:00:54 com.datasophon.common.utils.ShellUtils:[96] - 脚本返回的数据如下: x86_64
[INFO] 2023-05-16 19:00:54 akka.event.slf4j.Slf4jLogger:[92] - Slf4jLogger started
[INFO] 2023-05-16 19:00:54 akka.remote.Remoting:[83] - Starting remoting
[INFO] 2023-05-16 19:00:55 akka.remote.Remoting:[83] - Remoting started; listening on addresses :[akka.tcp://datasophon@p-mn-01:2552]
[INFO] 2023-05-16 19:00:55 akka.remote.Remoting:[83] - Remoting now listens on addresses: [akka.tcp://datasophon@p-mn-01:2552]
[INFO] 2023-05-16 19:00:55 com.datasophon.common.utils.ShellUtils:[169] - no node to stop
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - starting node, logging to /opt/datasophon/datasophon-worker/node/x86/logs/node-p-mn-01.out
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - nohup /opt/datasophon/datasophon-worker/node/x86/node_exporter > /opt/datasophon/datasophon-worker/node/x86/logs/node-p-mn-01.out 2>&1 &
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - End restart node.
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - uid=1003(hive) gid=1003(hadoop) 组=1003(hadoop)
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - usermod:无改变
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - uid=1004(elastic) gid=1004(elastic) 组=1004(elastic)
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - usermod:无改变
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - uid=1005(hdfs) gid=1003(hadoop) 组=1003(hadoop)
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - usermod:无改变
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - uid=1006(yarn) gid=1003(hadoop) 组=1003(hadoop)
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - usermod:无改变
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - uid=1007(mapred) gid=1003(hadoop) 组=1003(hadoop)
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - usermod:无改变
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - uid=1008(hbase) gid=1003(hadoop) 组=1003(hadoop)
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[169] - usermod:无改变
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:01:05 com.datasophon.common.utils.ShellUtils:[67] - 脚本返回的数据如下:{coreNum: 12, totalMem: 31.2607, totalDisk: 492.06}
[INFO] 2023-05-16 19:01:05 com.datasophon.worker.WorkerApplicationServer:[155] - host info collect result:com.datasophon.common.utils.ExecResult@7749bf93
[INFO] 2023-05-16 19:01:05 com.datasophon.worker.WorkerApplicationServer:[93] - start worker
[INFO] 2023-05-16 19:01:05 com.datasophon.worker.actor.RemoteEventActor:[39] - akka.tcp://datasophon@p-mn-01:2552-->akka.tcp://datasophon@p-mn-01:2551 associated
[WARN] 2023-05-16 19:01:05 akka.serialization.Serialization(akka://datasophon):[78] - Using the default Java serializer for class [com.datasophon.common.model.StartWorkerMessage] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
[INFO] 2023-05-16 19:03:53 com.datasophon.common.utils.ShellUtils:[96] - 脚本返回的数据如下: x86_64
[INFO] 2023-05-16 19:03:53 com.datasophon.common.utils.ShellUtils:[169] - stopping node
[INFO] 2023-05-16 19:03:56 com.datasophon.common.utils.ShellUtils:[169] - End stop node.
[INFO] 2023-05-16 19:03:56 com.datasophon.common.utils.ShellUtils:[145] - script execute success
[INFO] 2023-05-16 19:03:56 com.datasophon.worker.WorkerApplicationServer:[166] - Worker server stopped
How to reproduce
Follow the steps in this guide to deploy version 1.1.1:
https://datasophon.github.io/datasophon-website/docs/current/%E4%BD%BF%E7%94%A8%E6%89%8B%E5%86%8C/%E5%88%9B%E5%BB%BA%E9%9B%86%E7%BE%A4
Anything else
No response
Version
dev
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Thank you for your feedback. We have received your issue; please wait patiently for a reply.
- So that we can understand your request as quickly as possible, please provide detailed information, the version, or screenshots.
I saw the following messages in the log. See the issue description above for the complete log.
[INFO] 2023-05-16 19:01:05 com.datasophon.worker.actor.RemoteEventActor:[39] - akka.tcp://datasophon@p-mn-01:2552-->akka.tcp://datasophon@p-mn-01:2551 associated
[WARN] 2023-05-16 19:01:05 akka.serialization.Serialization(akka://datasophon):[78] - Using the default Java serializer for class [com.datasophon.common.model.StartWorkerMessage] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
The host name you entered does not match the machine's actual host name. Please read the deployment documentation before proceeding:
[INFO] 2023-05-16 19:01:05 com.datasophon.worker.actor.RemoteEventActor:[39] - akka.tcp://datasophon@p-mn-01:2552-->akka.tcp://datasophon@p-mn-01:2551 associated
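One way to check for a mismatch like this is to pull the host names out of the association line and compare them with the host name that was entered when adding the node. The following is only a rough sketch: the helper names `association_endpoints` and `host_matches` and the `entered_host` parameter are hypothetical and are not part of DataSophon; the regular expression simply assumes addresses of the form `akka.tcp://system@host:port` as they appear in the log above.

```python
import re

# Matches an Akka remoting address such as akka.tcp://datasophon@p-mn-01:2552
AKKA_ADDR = re.compile(
    r"akka\.tcp://(?P<system>[^@\s]+)@(?P<host>[^:\s]+):(?P<port>\d+)"
)


def association_endpoints(line):
    """Return [(host, port), ...] for every akka.tcp address found in a log line."""
    return [(m.group("host"), int(m.group("port"))) for m in AKKA_ADDR.finditer(line)]


def host_matches(line, entered_host):
    """True if the line has at least one address and all of them use entered_host.

    entered_host is a hypothetical parameter: the host name typed in during setup.
    """
    endpoints = association_endpoints(line)
    return bool(endpoints) and all(host == entered_host for host, _ in endpoints)


line = (
    "[INFO] 2023-05-16 19:01:05 com.datasophon.worker.actor.RemoteEventActor:[39] - "
    "akka.tcp://datasophon@p-mn-01:2552-->akka.tcp://datasophon@p-mn-01:2551 associated"
)
print(association_endpoints(line))  # [('p-mn-01', 2552), ('p-mn-01', 2551)]
```

On the affected machine, the extracted host could also be compared against `socket.gethostname()` or the output of the `hostname` command to see whether the log and the operating system agree.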