ssh2
ssh2 copied to clipboard
Properly close connection on EOF
I ran into #1029 . The workaround in that issue works for most clients but fails with PuTTY scp (pscp
). Looking into it further, it seems that these clients send CHANNEL_EOF
when exit
, bye
, or quit
is executed. When the server does not close the connection, the session hangs. This PR addresses the issue by handling the eof
signal and properly closing the connection. This change has been tested to work for sftp
, pscp
, WinSCP
, and paramiko
.
I'm not completely sure that this is the correct place, but hopefully if it is not, it is helpful in pointing to the correct place to fix this.
I'm not sure that this is the correct behavior, especially if the local side still wants to send data. The EOF from the server just means it will send no more data.
If we send a channel close request to the server, it should send a close event back.... AFAIK that's what ssh2
should already be doing.
@mscdex This is changing the server response to the client. The client sends CHANNEL_EOF
and the server responds with closing the connection. Clients such as sftp
send CHANNEL_EOF
when the user runs exit
, quit
, or bye
. Currently, the server implementation causes sftp
to just hang. That's what was happening in the issue that I referenced. This change causes that client to exit successfully as well as many other clients that I tested. If the client has signaled it will no longer send data, I believe it would be correct for the server to close the connection. That also seems to be what all those clients are expecting.
There is definitely a bug in the server implementation. It is easy to reproduce as #1029 detailed. It's possible that this change is negatively affecting the client implementation as I have only been using this as a server. Perhaps there is a better place to handle this @mscdex?
This is the relevant part of the spec: https://datatracker.ietf.org/doc/html/rfc4254#section-5.3
I do not know why clients like sftp
send SSH_MSG_CHANNEL_EOF
instead of SSH_MSG_CHANNEL_CLOSE
. But, the client sending SSH_MSG_CHANNEL_EOF
specifically means that the party will no longer send more data to a channel
. At that point, the server initiating the close is valid according to the spec and all the clients I have tested with seem to expect that behavior and exit successfully.
@mscdex The change has been updated to only operate this way when the channel is SFTP
. That has been the only instance identified where this issue exists. The tests pass in this configuration
The master
branch currently has the lint error that these checks are indicating
Fixed.
On of my users of a project that I manage that is based on SSH2 just reported the same symptoms. This is just a "+1" on this issue. Will be watching here to review. Assuming that the code changes will be published as a new "npm" distribution when ready.
I'm the afformentioned user. I've manually applied the fix and will let you know if this is indeed the solution I needed.
@cabarnes - Lint says you've got an unused variable here: https://github.com/smartfile/ssh2/blob/fix-disconnect/lib/protocol/SFTP.js#L2596 which came from https://github.com/mscdex/ssh2/commit/f763271f41320e71d5cbee02ea5bc6a2ded3ca21
@jogoldberg I did not change that file in this PR. It was a lint error that was in master
when the tests were run. The error has since been fixed in master
, but the tests haven't been re-run
Ah. If you do
git fetch upstream
git merge upstream/master
git push
that should sort you.
If those commands don't work, you might first need to do:
git remote add upstream https://github.com/mscdex/ssh2.git
The tests won't run again until @mscdex runs them due to this: https://github.blog/changelog/2021-04-22-github-actions-maintainers-must-approve-first-time-contributor-workflow-runs/
The tests won't run again until @mscdex runs them due to this: https://github.blog/changelog/2021-04-22-github-actions-maintainers-must-approve-first-time-contributor-workflow-runs/
Yes, but when he does re-run the tests, the tests will be re-run against your fork. And your fork is 5 commits behind upstream master.
Done
@mscdex - I can now confirm that this solved my issue. Can you please merge/release this?
@mscdex - Just to add to the details, this solves the issue being experienced by client using OpenSSH_7.4p1
@mscdex - Can you please merge/release this? It is urgently needed for a production issue we're facing
@mscdex - This is urgently needed. Can you please have a look?
@jogoldberg I'd like to have a test for this first, I just haven't had time to write one. If @cabarnes can include one, that will help get this merged quicker. AFAIK this should be easily achievable using ssh2
SSH/SFTP implementations on both ends.
@mscdex I have added a test for this condition. It fails (hangs) without the change in this PR
@mscdex Requesting you to kindly approve. We too are in dire need of this in production
I should be able to take a look at it this weekend.
@mscdex I have it skipping the integration test on Windows like the other integration tests. I don't know why the Linux tests are failing. They pass for me on my Linux system with the same version of node and OpenSSH:
$ npm test
> [email protected] test
> node test/test.js
> Running test-exec.js ...
> Running test-integration-openssh-sftp.js ...
Testing with OpenSSH version: 8.2
> Running test-integration-openssh.js ...
Testing with OpenSSH version: 8.2
> Running test-misc-client-server.js ...
> Running test-openssh.js ...
> Running test-protocol-crypto.js ...
Crypto binding available
Testing cipher: null, mac: <none> (native encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (native encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (binding encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (native encrypt, binding decrypt) ...
Testing cipher: [email protected], mac: <implicit> (binding encrypt, binding decrypt) ...
Testing cipher: [email protected], mac: <implicit> (native encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (binding encrypt, native decrypt) ...
Testing cipher: [email protected], mac: <implicit> (native encrypt, binding decrypt) ...
Testing cipher: [email protected], mac: <implicit> (binding encrypt, binding decrypt) ...
Testing cipher: aes128-cbc, mac: [email protected] (native encrypt, native decrypt) ...
Testing cipher: aes128-cbc, mac: [email protected] (binding encrypt, native decrypt) ...
Testing cipher: aes128-cbc, mac: [email protected] (native encrypt, binding decrypt) ...
Testing cipher: aes128-cbc, mac: [email protected] (binding encrypt, binding decrypt) ...
Testing cipher: aes128-ctr, mac: hmac-sha1 (native encrypt, native decrypt) ...
Testing cipher: aes128-ctr, mac: hmac-sha1 (binding encrypt, native decrypt) ...
Testing cipher: aes128-ctr, mac: hmac-sha1 (native encrypt, binding decrypt) ...
Testing cipher: aes128-ctr, mac: hmac-sha1 (binding encrypt, binding decrypt) ...
Testing cipher: arcfour, mac: hmac-sha2-256-96 (native encrypt, native decrypt) ...
Testing cipher: arcfour, mac: hmac-sha2-256-96 (binding encrypt, native decrypt) ...
Testing cipher: arcfour, mac: hmac-sha2-256-96 (native encrypt, binding decrypt) ...
Testing cipher: arcfour, mac: hmac-sha2-256-96 (binding encrypt, binding decrypt) ...
> Running test-protocol-keyparser.js ...
> Running test-server-hostkeys.js ...
> Running test-sftp.js ...
> Running test-shell.js ...
> Running test-userauth-agent-openssh.js ...
> Running test-userauth-agent.js ...
> Running test-userauth.js ...
@cabarnes That's why I was hoping we could use ssh2
for both the client and server. AFAIK the client could just manually request the sftp subsystem and then call eof()
on the channel to trigger the newly added code path on the server?
@mscdex That should be able to be done. It would prove that if EOF
is sent that code will run. But, it doesn't prove that it fixes anything. The integration test was using the exact client that has the issue and proving that this change fixes it. I didn't think it would be an issue since I'm using Ubuntu with all the same versions as the test environment. However, something isn't working in that environment. I'm not sure how to debug it, so if you'd rather just have the unit test, I'll remove the integration test
@cabarnes I think a unit test will be the best way since we can control things at a lower level. Relying on OpenSSH to keep the same behavior probably isn't worth it, especially if there is problems with it on Windows.
@mscdex I have added the comment and the test. Let me know what you think
Hi @mscdex - cabarnes made your requested changes a little over a week ago. Do you think you might have time to review this today? :)
Hi @mscdex - Looks like the latest revision passed all your automated checks now. Can you please merge/release this fix? We've been operating based on a manual patching of the library now for quite some time. Thanks in advance! :)