mina-sshd
mina-sshd copied to clipboard
Fix: close session when timeout during connect
Hi there,
We have encouraged an issue in which the SSHD doesn’t close the session when connecting timeout.
The code we are using is attached here. https://gist.github.com/onyas/fda8ccf2d373c743ccec8f648782d078
The TCP socket is still ESTABLISHED after sshClient.connect() threw an exception.

If I set the timeout to 1 millisecond, the sshClient.connnect() will throw an exception, since it’s timeout so the session is still null, and it doesn't help if we call session.close().
And we know we cloud call sshClient.stop(), but in our scenario, we reuse the same instance so we couldn’t do that.
After checking the source code, we find the SSHD didn’t close the session when timeout. So I created this pull request to help close the session.
Please help review and thanks.
As far as I see what happens is that the asynchronous connection attempt succeeds after the timeout.
I'd move the timeout flag and the isTimeout() method to DefaultSshFuture. I also think the flag must be set and read under lock, otherwise there can still be a race condition. And instead of
if (connectFuture.isTimeout()) {
session.close();
} else {
connectFuture.setSession(session);
}
there needs to be a mechanism to do this atomically. Something like
/** Like setValue; but sets the value only if not timed out yet. Throws TimeoutException otherwise. */
public void setValueIfNotTimedOut(Object value) throws TimeoutException {
synchronized(lock) {
if (result != null ) {
return;
}
if (isTimeout()) {
throw new TimeoutException();
}
result = (newValue != null) ? newValue : GenericUtils.NULL;
lock.notifyAll();
}
}
and then
private void setSessionSafely(ConnectFuture connectFuture, ClientSession session) throws IOException {
try {
connectFuture.setValueIfNotTimedOut(session);
} catch (TimeoutException e) {
session.close();
}
}
Finally, it would be really cool if this could be done before the ClientSession gets created at all (by closing the IoSession immediately). Creating the client session will already start the SSH protocol. But the ClientSession is created in AbstractSessionIoHandler.sessionCreated(IoSession), where we don't have access to the ConnectFuture, so I don't quite see how this could be done.
Hi, @tomaswolf Thank you for your insights, and I updated the code according to your suggestions. Please help review.
Thanks.
Hi, @tomaswolf Thank you for the suggestions, I updated the code, please help review.
Thanks.
Can we write a test for this?
I've created issue SSHD-1295 in our bug tracker for this, and have posted a test case there. Please incorporate that test into this change. (Note that it may need to be adapted if we never set the session on the future if the future had timed out before.)
Hi @tomaswolf All the suggestions are implemented, please take a look.
Thanks.
Looks good now. Is it OK for you if I squash this series and fix some of the javadoc comments, and then force-push onto this branch?
Sure, please go ahead.
@lgoldstein and @gnodet: I'd appreciate your opinions on this. Please also see my thoughts at SSHD-1295.
Hi, @tomaswolf Good day, another question, when is this version 2.9.2 planned to be released?
"2.9.2-SNAPSHOT" is just the current development version. I don't think we'll do another patch release for this. So most likely the change will come whenever 2.10.0 is released, which may be in a few months.
Just a heads up: the basic idea is going in the right direction, but cancelling a ConnectFuture has the same problem. I've come to the conclusion that cancellation has to be sorted out first, and then this time-out case can be based on that. (I.e., client code would get the possibility to specify in the verify() call whether to cancel a future automatically on a time-out.) That gives client code full control.
I'm working on this.
client code would get the possibility to specify in the verify() call whether to cancel a future automatically on a time-out
Let's keep backward compatibility as much as possible with existing API - i.e., keep the existing verify without the flag and delegate it to call the new verify using some default that you deem relevant.
BTW, @tomaswolf you are doing great work on the project (I really appreciate it) - I wish I had the time I used to, so I could match your prolific work both in quality and quantity.
keep the existing verify without the flag and delegate it to call the new verify using some default that you deem relevant
Yes, that was my thought exactly. The new default for the existing verify() would be to cancel to avoid this problem of having a "hidden" session established after the time-out that the user cannot really get to.
That would solve the reported problems; and I consider calling verify() on the same ConnectFuture in different threads so extremely borderline (unfortunately it's something that the current API has always allowed) that it is IMO acceptable to require client code that does that currently to be changed. A notice to that effect in the release notes will be needed.
The new default for the existing verify() would be to cancel to avoid this problem of having a "hidden" session established after the time-out that the user cannot really get to.
Seems reasonable - I trust your judgement on this.
"2.9.2-SNAPSHOT" is just the current development version. I don't think we'll do another patch release for this. So most likely the change will come whenever 2.10.0 is released, which may be in a few months.
Hi, @tomaswolf do we have a release date for 2.10.0?
do we have a release date for 2.10.0?
No.
Is there any possibility to release it early next year? our customer is eager to get this fixed. If not, can you provide any plan so that we can set the expectation for the customer.
I can't promise anything. All the maintainers in this project are unpaid volunteers doing this in their spare time. I wrote in October I'd be working on this, and I am, but progress depends on the amount of spare time I have and that I am willing to dedicate to this. (Evidently, I do many other things in my spare time, too.) So I can't give a deadline.
Got it. Thank you for your contribution.
Please see #315 for my take on this.
Superceded by #315.