
Set global JVM Socket Timeout for long-running tasks that rundeck can't seem to interrupt.

Open noahlz opened this issue 5 years ago • 1 comments

Our Rundeck is hanging on jobs for days, blocked in socketRead0 (see the thread dump below).

I'm going to add the following JVM system property to the Rundeck profile ERB template in an attempt to kill such hanging processes after, say, 12 hours (the timeout will be a cookbook attribute).

sun.net.client.defaultReadTimeout

https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html
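For reference, a minimal sketch of how the flag could be rendered from an attribute. The attribute name, the template line, and the use of `RDECK_JVM` here are assumptions for illustration, not the cookbook's actual template. The property takes milliseconds, so 12 hours is 43200000. Note that the Oracle page linked above describes this property as the default read timeout for the protocol handlers used by java.net.URLConnection, so it is uncertain whether it affects a read that goes through Apache HttpClient, as in the trace below.

```ruby
require 'erb'

# Hypothetical attribute value: read timeout in hours (name chosen for illustration).
read_timeout_hours = 12

# The JVM property expects milliseconds.
read_timeout_ms = read_timeout_hours * 60 * 60 * 1000

# Hypothetical profile template line; the real cookbook template may differ.
template = ERB.new(
  'RDECK_JVM="$RDECK_JVM -Dsun.net.client.defaultReadTimeout=<%= read_timeout_ms %>"'
)

# Render the line with the computed value passed in explicitly.
profile_line = template.result_with_hash(read_timeout_ms: read_timeout_ms)
puts profile_line
```

Keeping the attribute in hours and converting in one place avoids hand-typing a long millisecond value in the template.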

```
"WinRM output reader for command [2B856BD0-D831-4BB9-A154-184286605B6A]" daemon prio=10 tid=0x00007f5d4477c800 nid=0x1267 runnable [0x00007f5d7996d000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
        at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:712)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:517)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.doSendRequest(WinRmClient.java:421)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.access$100(WinRmClient.java:102)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient$PrivilegedSendMessage.run(WinRmClient.java:393)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient$PrivilegedSendMessage.run(WinRmClient.java:382)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.runPrivileged(WinRmClient.java:368)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.sendRequest(WinRmClient.java:352)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.receiveOutput(WinRmClient.java:191)
        at com.xebialabs.overthere.cifs.winrm.CifsWinRmConnection$2.run(CifsWinRmConnection.java:156)
```

noahlz, Jul 29 '19 15:07

Incidentally, the better approach would be for WinRmClient to support a configuration option for socket timeouts that we could set in Rundeck. I tried adding

        winrm-socket-timeout="43200000"

to the project's etc/resource.xml as a shot in the dark, but it didn't work (obviously).

https://github.com/xebialabs/overthere#cifs_troubleshooting

noahlz, Jul 29 '19 16:07