nitox
nitox copied to clipboard
TLS client verification support?
It looks like Nitox doesn't support client verification as of now. I made some quick changes to pass TLS config to the native-tls
connector. When I tested this, I'm getting:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TlsError(Ssl(Error { code: ErrorCode(1), cause: Some(Ssl(ErrorStack([Error { code: 336031996, library: "SSL routines", function: "SSL23_GET_SERVER_HELLO", reason: "unknown protocol", file: "s23_clnt.c", line: 794 }]))) }, X509VerifyResult { code: 0, error: "ok" }))', src/libcore/result.rs:997:5
Apparently, the certificate is verified, but the connection is being cut off. I'm guessing I'm missing something in the NATS protocol? Either that, or something's wrong with how nitox does TLS. The other nats crate however supports this feature.
Hey!
The other crate supports it simply for the reason that the TLS implementation is based on OpenSSL and it is kinda broken, especially if your local libssl is old (crypto/tls/ssl is hard so I can't blame Frank).
From what I can see you're trying to connect through SSLv2 which is a widely deprecated and unsupported protocol. (Even OpenSSL dropped support for it)
As such, native-tls
does not support it as indicated by the Protocol
enum in the docs: https://docs.rs/native-tls/0.2.2/native_tls/enum.Protocol.html
So, it's not really nitox-related, you should just upgrade your security protocols :)
Also, quick note but when acting as a client, you don't need to provide a root cert, only the host name is needed. The current code works fine in production for months for us, given that your certificates business is setup correctly as well.
For reference, client-side implementation of native-tls is the first example: https://github.com/sfackler/rust-native-tls#usage
The other crate supports it simply for the reason that the TLS implementation is based on OpenSSL and it is kinda broken, especially if your local libssl is old (crypto/tls/ssl is hard so I can't blame Frank).
From what I can see you're trying to connect through SSLv2 which is a widely deprecated and unsupported protocol. (Even OpenSSL dropped support for it)
NATS enforces TLS v1.2, so if I had an outdated protocol, NATS would've rejected the connection.
Also, quick note but when acting as a client, you don't need to provide a root cert, only the host name is needed. The current code works fine in production for months for us, given that your certificates business is setup correctly as well.
For reference, client-side implementation of native-tls is the first example: https://github.com/sfackler/rust-native-tls#usage
I was talking about mutual authentication (i.e., server-side verification of certs presented by the clients before accepting a connection). Since we're using NATS internally, we're using self-signed certs for both server and the clients (that's why we needed support for adding root CA certs).
I agree that the other crate isn't good (especially because it's sync) but it actually works. And, we're trying to move to nitox, but we're getting blocked by TLS support.
It turned out that this has got to do with NATS protocol itself and how nitox currently deals with TLS.
When I tried debugging this with openssl
, I got:
$ openssl s_client -connect ${NATS_IP}:4444 -CAfile ~/Desktop/nats-certs/ca-cert.pem -cert ~/Desktop/nats-certs/client-cert.pem -key ~/Desktop/nats-certs/client-key.pem -state -debug
CONNECTED(00000003)
SSL_connect:before/connect initialization
write to 0x1155460 [0x115a8a0] (326 bytes => 326 (0x146))
[[ REMOVED RAW BYTES ]]
SSL_connect:SSLv2/v3 write client hello A
read from 0x1155460 [0x115fe00] (7 bytes => 7 (0x7))
0000 - 49 4e 46 4f 20 7b 22 INFO {"
SSL_connect:error in SSLv2/v3 read server hello A
140712761509528:error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol:s23_clnt.c:794:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 7 bytes and written 326 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
Protocol : TLSv1.2
Cipher : 0000
Session-ID:
Session-ID-ctx:
Master-Key:
Key-Arg : None
PSK identity: None
PSK identity hint: None
SRP username: None
Start Time: 1551798460
Timeout : 300 (sec)
Verify return code: 0 (ok)
---
So, NATS actually sends server INFO
in plaintext before the TLS handshake (which is ridiculous!). After some digging, I found this issue (still active!). Apparently, all clients have been implemented knowing this fact. They parse the INFO
value in plaintext and only then, they start the handshake. The other crate does the same thing.
So, I guess the current implementation doesn't actually work? Because, you're upgrading to TLS immediately following the TCP connection, rather than after parsing INFO
?
Oh, and one minor thing that struck me - Nitox is parsing the URL to get the host name when TLS is enabled. But, the previous step tries to parse the URL as SocketAddr
(or resolve the DNS).
Both the steps are mutually exclusive. You can't parse / resolve an address with protocol, and you can't parse the address as Url
without a protocol.
Wow indeed, just looked at the mechanism that gnatsd
is using and after looking around I didn't find any documentation related to this behavior (besides the issue you mentioned). (Ridiculous indeed!)
Alright, let's just consider then that NATS TLS is just unsupported for nitox
as of now. :(
I can't determine the effort needed (besides what you already did about TLS configuration) to implement it, since INFO
messages are asynchronous and can occur anytime. For instance, the cluster protocol informs clients of new addresses within a cluster through pushed INFO
messages and thus I'm just not sure if this particular message will be sent with or without TLS.
And indeed we upgrade the TCP to TLS immediately even before trying to talk to the server. I have an idea to make it work quickly (thanks to the work done on the verbose mode Ack mechanism) but I'm really not sure.
About the SocketAddr
/ Url
business, I'll check it out, but the convention in NATS is to give something like nats://myserver.my.tld:port
, or simply xxx.xxx.xxx.xxx:port
. It doesn't seem weird either. It's just that with TLS, you have to give a full URL with hostname or the client cert resolution will just fail since despite being able to resolve the IP from itself, it'll just fail at trying to extract the hostname from the given cluster URI.
Hi, just to clarify that currently only the initial INFO message is plain text, once the connection has been upgraded to TLS the following INFO messages use the TLS transport.
@wallyqs Alright, thanks for the clarification, duly noted!
Also, same context, but what happens when the subsequent INFO
messages have a tls_required: false
. Should we downgrade to pure TCP? Or that just should not ever happen?
yes that wouldn't happen, only when tls_required
is announced then the server will expect the upgrade.
About the SocketAddr / Url business, I'll check it out, but the convention in NATS is to give something like nats://myserver.my.tld:port, or simply xxx.xxx.xxx.xxx:port. It doesn't seem weird either. It's just that with TLS, you have to give a full URL with hostname or the client cert resolution will just fail since despite being able to resolve the IP from itself, it'll just fail at trying to extract the hostname from the given cluster URI.
It's not weird, but it doesn't work in the current setup. All I'm saying is that it needs to be improved as part of the TLS suppport, because currently it fails if we enable TLS and pass a cluster address like 1.2.3.4:4444
(addr resolves, but URL cannot be parsed) or nats://1.2.3.4:4444
(cannot parse SocketAddr
or resolve the DNS)
I agree that how it works currently is definitely not foolproof and relies on the user giving proper params for each tls/tcp mode. tls = url form / tcp = url/ip form
That will change in the PR draft I created :)
I have opened a PR for this!