Fix some EOF-handling issues in TLS
This fixes a pair of issues with EOF handling in TLSWrap. This originally came up because @codebytere noticed an incompatibility with Node and a recent BoringSSL change, but the root cause is a bug in Node that also impacts how it uses OpenSSL. I've split it into two commits in case you all want to backport only one of them, as the second changes how some errors are surfaced. Node was previously incorrectly representing a class of TLS errors are ECONNRESET, dropping the true error on the ground.
The changes are as follows:
tls: fix re-entrancy issue with TLS close_notify
Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted.
TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error.
In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?)
Instead, SSL_get_error and the error queue should be sampled earlier. Back in https://github.com/nodejs/node/pull/1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.)
I've done this by just moving the calls up a bit, to preserve the existing C++ => JS behavior, but that revealed a bigger issue about EmitRead(UV_EOF), which is the second commit.
tls: don't treat fatal TLS alerts as EOF
SSL_RECEIVED_SHUTDOWN means not just close_notify but also fatal alert. From what I can tell, treating fatal alert as EOF was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify.
This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node.
See the tests that had to be updated as part of this.
I should note, while I'm familiar with TLS and OpenSSL, I'm not at all familiar with Node's streams machinery or sockets APIs. Do you all actually expect to emit EOF on error? This change removes an EmitRead(UV_EOF) on fatal TLS alert. Is that consistent with JS state? I'm assuming so, because there are plenty of other fatal errors which TLSWrap did not emit EOF on (anything case where we reject something from the peer, rather than seeing an alert message, will not set SSL_RECEIVED_SHUTDOWN). So I assume this change just makes peer fatal alerts go through the same codepaths as other peer errors. But hopefully you all will be able to review this better.
I suspect there's also a way to construct a test which demonstrates the first fix standalone. We ran into it in BoringSSL because, to fix some other bugs in OpenSSL's error-handling, we had to treat SSL_ERROR_ZERO_RETURN more like the other SSL_ERROR_* constants. OpenSSL is still quite lax about it, which is masking this bug in the particular scenario where we hit it. But there's probably a more interesting case where reentrancy causes Node to lose a TLS alert. Except you all seem to already be losing TLS alerts until the second commit anyway, so I'm not sure. :-)
Review requested:
- [ ] @nodejs/crypto
CI: https://ci.nodejs.org/job/node-test-pull-request/46488/
(Updated the second commit to hopefully fix the lint errors.)
CI: https://ci.nodejs.org/job/node-test-pull-request/46639/
This should be reviewed by someone who is more familiar with this code and TLS. @bnoordhuis @jasnell?
Friendly ping. Is this waiting on anything from my end?
Is this waiting on anything from my end?
No. I'll try to ping @nodejs/crypto again.
Landed in ef8aa888669f96c5d091a0057964e3e0efae74f4...8b89d4d42156046326dffb32c51e256ee7045a53