clustershell
Command execution is slow in clush tree mode in a large-scale environment
Hi, I have a difficult problem with clush tree mode. I am using clush 1.9.1 with Python 3.8.13. My environment has 1200 nodes, and two of them are used as gateways to enable tree mode. With fanout set to 15, running the ls command takes three minutes. However, with tree mode disabled and fanout still set to 15, the same ls command takes only 48 s. I enabled debug mode and read the log, and found that a long time is spent on the _on_remote_node_close output. Can you tell me why execution becomes slower when tree mode is enabled, and what may cause this? Thank you very much. Log lines like the one below take a long time: DEBUG:ClusterShell.Worker.Tree:_on_remote_node_close computer66 666 via gw control1
After detailed analysis, we found that execution is slow with a large number of nodes because the node names are irregular and cannot be aggregated. Our node names look like a-1-b-2-c, a-5-b-7-c, a-2-b-3-c, and so on. By tracing the logs and walking through the code, we found that the len function in RangeSet.py was slow. I tried to optimize it. After the optimization, that function was no longer the bottleneck, but the total duration did not decrease: the time-consuming work had simply moved elsewhere. Since we do not fully understand the code base, reading the entire code is too difficult for us. Can you provide some optimization suggestions for this problem? Thank you very much.
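Since the hot spot moved after the first optimization, one way to see where the time goes now is to profile by cumulative time instead of fixing one function at a time. The following is only a minimal sketch using Python's standard cProfile and pstats modules. The names build_names, workload and profile_call are hypothetical helpers invented for this illustration, and workload is a stand-in that merely calls len on a growing set; it is not ClusterShell code. In a real investigation the stand-in would be replaced by the actual call path being measured.

```python
import cProfile
import io
import pstats


def build_names(count):
    """Generate hypothetical irregular node names such as a-1-b-2-c."""
    return ["a-%d-b-%d-c" % (i, (i * 7) % 13) for i in range(count)]


def workload(names):
    """Stand-in for the real hot path (an assumption, not ClusterShell code):
    call len() on a set that grows by one name per iteration."""
    seen = set()
    total = 0
    for name in names:
        seen.add(name)
        total += len(seen)
    return total


def profile_call(func, *args, top=10):
    """Run func under cProfile and return (result, text report).

    The report is sorted by cumulative time, so callers that merely
    delegate to a slow callee still show up near the top.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_call(workload, build_names(1200))
    print(report)
```

Sorting by cumulative time rather than by a single function's own time is a deliberate choice here: when a bottleneck has shifted, the new cost often sits in a caller higher up the stack, and a cumulative view makes that visible in one run.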