[BUG] The problem of batch adding cluster agent nodes

Open yhcloud123 opened this issue 11 months ago • 2 comments

Search before asking

  • [X] I had searched in the issues and found no similar feature requirement.

DeepFlow Component

Agent

What you expected to happen

The first cluster had 180 agents deployed at once, and all of them joined successfully after a short while, with the MySQL server behaving normally. When a second cluster was added and 130 agents were deployed in bulk, the agents kept waiting. During this time the 16-core MySQL server ran at full CPU. All of the statements were SELECTs, which drove up the MySQL load. After about half an hour the agents were added to the cluster and the load returned to normal.

How to reproduce

This issue occurs when agents from another cluster are added in bulk. The logic for adding nodes does not behave as expected; for example, it appears to issue queries in a loop.

DeepFlow version

6.4 LTS

DeepFlow agent list

No response

Kubernetes CNI

No response

Operation-System/Kernel version

No response

Anything else

No response

Are you willing to submit a PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

yhcloud123, Mar 07 '24 09:03

Hello, what is the current size of the k8s cluster? What is the configuration of the nodes? How many resources are allocated to deepflow-server and deepflow-agent? And what resources are allocated to the MySQL and ClickHouse instances connected to deepflow-server?

1473371932, Mar 11 '24 03:03

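@SongZhen0704 deepflow-server: dedicated physical machine, 80376. deepflow-agent: 12048. External MySQL: 16*32. The external ClickHouse cluster has sufficient resources. On the first agent deployment there was no concurrency problem, and everything was written in one pass. The problem appears the second time, when a large number of agents are added in bulk: SELECT statements run in a loop until the write finally succeeds. Adding a small batch of 40 agents from another cluster does not trigger the problem. From what I can see, the logic for adding a node agent first queries the table to check whether the IP already exists and only then writes the row. I don't know why it keeps querying for such a long time.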
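To make the "query the table for the IP first, then write" pattern concrete, here is a minimal sketch of it. It is not DeepFlow's actual code and does not reproduce the root cause: the `agent` table, the function names, and the polling loop are all hypothetical, and SQLite stands in for MySQL so the example runs on its own. It only compares how many statements a check-then-insert registration issues against a single atomic insert when 130 agents each retry a few times.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    # Hypothetical table for illustration; not DeepFlow's real schema.
    conn.execute("CREATE TABLE agent (ip TEXT PRIMARY KEY, cluster TEXT)")
    return conn


def register_check_then_insert(conn, ip, cluster):
    """SELECT first, INSERT only if the IP is absent. Returns statements issued."""
    row = conn.execute("SELECT 1 FROM agent WHERE ip = ?", (ip,)).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO agent (ip, cluster) VALUES (?, ?)", (ip, cluster)
        )
        return 2
    return 1


def register_atomic(conn, ip, cluster):
    """One statement; the PRIMARY KEY rejects duplicates. Returns statements issued."""
    # SQLite syntax; MySQL offers INSERT IGNORE for the same purpose.
    conn.execute(
        "INSERT OR IGNORE INTO agent (ip, cluster) VALUES (?, ?)", (ip, cluster)
    )
    return 1


def simulate(register, agents, polls_per_agent):
    """Every agent calls `register` once per poll round (assumed retry behaviour)."""
    conn = make_db()
    statements = 0
    for _ in range(polls_per_agent):
        for ip in agents:
            statements += register(conn, ip, "cluster-2")
    rows = conn.execute("SELECT COUNT(*) FROM agent").fetchone()[0]
    return statements, rows


# 130 agents, matching the size of the second batch in this report.
agents = [f"10.0.0.{i}" for i in range(1, 131)]
check_first = simulate(register_check_then_insert, agents, polls_per_agent=3)
atomic = simulate(register_atomic, agents, polls_per_agent=3)
print("check-then-insert (statements, rows):", check_first)
print("atomic insert     (statements, rows):", atomic)
```

Both variants end with one row per agent. The difference is that the first variant sends an extra SELECT per new agent and keeps sending SELECTs on every retry, so the read load grows with both the batch size and the retry count.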

yhcloud123, Mar 11 '24 09:03