
Bug fix: WorkTree function exception with --rcopy

Open luxiaoyong opened this issue 2 years ago • 0 comments

I have a problem with the tree execution mode: when I use the --rcopy option to copy a big file (more than 12M) from two remote hosts to the local host, I get the error below:

command: clush -o -q -w host1,host2 -b -S --rcopy /home/collect.tar.gz --dest /home/tmp/

output:

Exception in thread Task-2:
Traceback (most recent call last):
  File "env/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "env/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "env/lib/python3.8/site-packages/ClusterShell/Task.py", line 390, in _thread_start
    self.excepthook(*sys.exc_info())
  File "env/lib/python3.8/site-packages/ClusterShell/CLI/Clush.py", line 822, in clush_excepthook
    raise exp
  File "env/lib/python3.8/site-packages/ClusterShell/Task.py", line 388, in _thread_start
    self._resume()
  File "env/lib/python3.8/site-packages/ClusterShell/Task.py", line 790, in _resume
    self._run(self.timeout)
  File "env/lib/python3.8/site-packages/ClusterShell/Task.py", line 403, in _run
    self._engine.run(timeout)
  File "env/lib/python3.8/site-packages/ClusterShell/Engine/Engine.py", line 723, in run
    self.runloop(timeout)
  File "env/lib/python3.8/site-packages/ClusterShell/Engine/EPoll.py", line 157, in runloop
    client._handle_read(sname)
  File "env/lib/python3.8/site-packages/ClusterShell/Worker/Exec.py", line 192, in _handle_read
    node_msgline(key, msg, sname) # handle full msg line
  File "env/lib/python3.8/site-packages/ClusterShell/Worker/Exec.py", line 166, in _on_nodeset_msgline
    self.worker._on_node_msgline(nodes, msg, sname)
  File "env/lib/python3.8/site-packages/ClusterShell/Worker/Worker.py", line 277, in _on_node_msgline
    self.eh.ev_read(self, node, sname, msg)
  File "env/lib/python3.8/site-packages/ClusterShell/Communication.py", line 258, in ev_read
    self.recv(msg)
  File "env/lib/python3.8/site-packages/ClusterShell/Propagation.py", line 270, in recv
    self.recv_ctl(msg)
  File "env/lib/python3.8/site-packages/ClusterShell/Propagation.py", line 376, in recv_ctl
    metaworker._on_remote_node_close(node, rc, self.gateway)
  File "env/lib/python3.8/site-packages/ClusterShell/Worker/Tree.py", line 459, in _on_remote_node_close
    bnode, len(tmptar.getmembers()),
  File "env/lib/python3.8/tarfile.py", line 1791, in getmembers
    self._load() # all members, we first have to
  File "env/lib/python3.8/tarfile.py", line 2379, in _load
    tarinfo = self.next()
  File "env/lib/python3.8/tarfile.py", line 2312, in next
    raise ReadError("unexpected end of data")
tarfile.ReadError: unexpected end of data

When files are copied from multiple remote nodes to the local node and the files are large (for example, 12M), the data is transferred in fragments, and the nodes do not all finish at the same time. When one node finishes, a RET message is received and the _on_remote_node_close function is triggered. At that point, files are also extracted for the nodes whose transfer has not finished yet, which leads to the tarfile.ReadError. I fixed this bug by modifying this code. It is not clear to me whether these modifications will cause other problems, so I hope you can help review the code. Thank you very much.
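
To illustrate the failure mode described above outside of ClusterShell, here is a minimal standalone sketch. It is not the ClusterShell code and not the proposed patch: RcopyCollector, on_chunk, on_node_close, count_members and make_tar are hypothetical names used only for this illustration, and it assumes plain uncompressed tar archives held in memory. It shows that reading a partially received archive raises tarfile.ReadError, while opening only the buffer of the node that actually closed works.

```python
import io
import tarfile


def make_tar(name, payload):
    """Build an uncompressed tar archive in memory holding a single file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def count_members(data):
    """Count archive members; raises tarfile.ReadError on truncated data."""
    with tarfile.open(fileobj=io.BytesIO(data)) as tar:
        return len(tar.getmembers())


class RcopyCollector:
    """Hypothetical stand-in for per-node receive buffers (not ClusterShell code)."""

    def __init__(self):
        self.buffers = {}  # node name -> bytearray of data received so far

    def on_chunk(self, node, chunk):
        """Append one received fragment to the buffer of its source node."""
        self.buffers.setdefault(node, bytearray()).extend(chunk)

    def on_node_close(self, node):
        """Open only the archive of the node that closed; return member names."""
        data = bytes(self.buffers.pop(node))
        with tarfile.open(fileobj=io.BytesIO(data)) as tar:
            return [member.name for member in tar.getmembers()]


if __name__ == "__main__":
    archive = make_tar("collect.tar.gz", b"x" * 65536)
    half = len(archive) // 2

    collector = RcopyCollector()
    collector.on_chunk("host1", archive)         # host1 is fully received
    collector.on_chunk("host2", archive[:half])  # host2 is still in flight

    print(collector.on_node_close("host1"))      # safe: host1 is complete

    try:
        # What happens if host2 is extracted when host1 closes:
        count_members(bytes(collector.buffers["host2"]))
    except tarfile.ReadError as exc:
        print("host2 not ready:", exc)

    collector.on_chunk("host2", archive[half:])  # remaining fragment arrives
    print(collector.on_node_close("host2"))      # now safe
```

The design point is simply that the close event of one node should only touch that node's buffer; whether the real fix in Worker/Tree.py should be structured this way is for the maintainers to judge.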

luxiaoyong, Nov 26 '23 03:11