
v0.2.3 tcp handshake failed.


2022-10-17T11:28:29.849344723+08:00 WARN  tcp handshake failed. peer: [::ffff:127.0.0.1]:42420, header too short, expecting 59 bytes, but found 3 bytes
2022-10-17T11:28:33.127358801+08:00 WARN  tcp handshake failed. peer: [::ffff:127.0.0.1]:42422, header too short, expecting 59 bytes, but found 3 bytes
2022-10-17T11:28:33.173111163+08:00 WARN  tcp handshake failed. peer: [::ffff:127.0.0.1]:42426, header too short, expecting 59 bytes, but found 3 bytes
2022-10-17T11:28:33.699706451+08:00 WARN  tcp handshake failed. peer: [::ffff:127.0.0.1]:42428, header too short, expecting 59 bytes, but found 3 bytes
2022-10-17T11:28:34.698409624+08:00 WARN  tcp handshake failed. peer: [::ffff:127.0.0.1]:48830, header too short, expecting 59 bytes, but found 3 bytes
2022-10-17T11:28:34.703100254+08:00 WARN  tcp handshake failed. peer: [::ffff:127.0.0.1]:48836, header too short, expecting 59 bytes, but found 3 bytes

The new version works fine, but it shows lots of these handshake failed warnings on the shadowsocks-rust server side with the 2022-blake3-aes-256-gcm encryption method. Is this normal?

ghost avatar Oct 17 '22 03:10 ghost

If it works, I think you can just ignore the warning.

After a glance at shadowsocks-rust, it seems the warning (or error) comes from here. I believe this is a problem in the shadowsocks-rust implementation. A single poll_read without a loop cannot guarantee the buffer is fully filled. To fix it, it could save the current filled length in DecryptedReader and call poll_read in a loop until the buffer is fully filled.
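
The fix described here could look roughly like the following sketch (written in async style rather than poll_read, assuming a tokio-like reader; read_full is a hypothetical helper, essentially what tokio's AsyncReadExt::read_exact already provides):

use std::io;
use tokio::io::{AsyncRead, AsyncReadExt};

// Keep reading until `buf` is completely filled, instead of assuming a single
// read() call returns the whole header.
async fn read_full<R: AsyncRead + Unpin>(reader: &mut R, buf: &mut [u8]) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        let n = reader.read(&mut buf[filled..]).await?;
        if n == 0 {
            // Peer closed before the full header arrived.
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "header too short"));
        }
        filled += n;
    }
    Ok(())
}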

ihciah avatar Oct 17 '22 03:10 ihciah

So, it's not shadow-tls' problem. Closed.

ghost avatar Oct 17 '22 04:10 ghost

And as I mentioned in my blog article, trojan-go misuses read too (code link). In most cases it works, but in fact you cannot assume that.

ihciah avatar Oct 17 '22 04:10 ihciah

I believe this is a problem in the shadowsocks-rust implementation.

It's not a bug. This behavior is mandated by the Shadowsocks 2022 spec: https://github.com/Shadowsocks-NET/shadowsocks-specs/blob/main/2022-1-shadowsocks-2022-edition.md#313-detection-prevention

In most cases it works, but in fact you cannot assume that.

Yes you can. I'm not aware of a single TCP stack that does not behave like this.

database64128 avatar Oct 17 '22 04:10 database64128

It's not a bug. This behavior is mandated by the Shadowsocks 2022 spec: https://github.com/Shadowsocks-NET/shadowsocks-specs/blob/main/2022-1-shadowsocks-2022-edition.md#313-detection-prevention

This is absolutely fine if the underlying transport is frame-based, or if we are still using TCP but with a userspace protocol stack that gives us the ability to control the details we want.

But with the TCP stack implemented by the kernel, I think it's an implementation that sacrifices correctness. It indeed solves the problem, but it may make a normal request fail. With retries on the client side, I think it is acceptable (works, but not good).

Yes you can. I'm not aware of a single TCP stack that does not behave like this.

The first problem here is that this approach relies on implementation details rather than interface convention. Even if it works, it's still not good practice. Second, I don't agree that all TCP stacks will definitely pack the data of a single write syscall into a single TCP segment. I don't know how you came to that conclusion, but I think it's hard to investigate all TCP stack implementations and keep that knowledge up to date. For example, Windows is not open source and may change its implementation details. Third, even on Linux, I suspect it does not behave like this if NO_DELAY is not set: after a short write, the kernel keeps the data for a very short while, waiting for more data so it can send more efficiently. If you do another write then, the second piece of data may be split across two TCP segments. Fourth, the entire data transfer channel must satisfy your constraints rather than being treated as normal TCP, so in a sense you have created a new TCP protocol with tighter behavioral constraints. And middle services that work at Layer 4 cannot guarantee that (they do not even know about it).
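
(For reference, NO_DELAY here refers to the TCP_NODELAY socket option. A minimal sketch of enabling it on a tokio TcpStream, illustrative only and not shadow-tls code:

use tokio::net::TcpStream;

// With Nagle's algorithm left enabled (the default), the kernel may briefly hold
// back a short write and coalesce it with later data; TCP_NODELAY sends each
// write out immediately instead.
async fn connect_no_delay(addr: &str) -> std::io::Result<TcpStream> {
    let stream = TcpStream::connect(addr).await?;
    stream.set_nodelay(true)?;
    Ok(stream)
}

)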

ihciah avatar Oct 17 '22 05:10 ihciah

I used sudo sysctl -w net.ipv4.tcp_notsent_lowat=59 to temporarily fix the warning.

ghost avatar Oct 17 '22 05:10 ghost

The first problem here is that this approach relies on implementation details rather than interface convention. Even if it works, it's still not good practice.

We've had extensive discussions on this, and we can't think of a better approach than this. The goal is to mimic a typical TCP server without exposing how many bytes we need to read, and reject misbehaving clients.

If you look at existing implementations of legacy Shadowsocks, many of them (e.g. Clash, ShadowRocket) enable NO_DELAY and write out the salt as a separate write call. This results in a TCP segment being sent with only the salt (16/32 bytes) as payload. This requirement was added to reject such misbehaving clients.
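
For illustration, the difference between the two client behaviors might look like this (a hedged sketch with hypothetical names, not shadowsocks-rust's actual API):

use tokio::io::{AsyncWrite, AsyncWriteExt};

// Hypothetical client-side sketch of the 2022 single-write requirement.
// `salt` and `sealed_header` (11-byte fixed header + 16-byte AEAD tag) are assumed inputs.
async fn send_first_chunk<W: AsyncWrite + Unpin>(
    stream: &mut W,
    salt: &[u8],
    sealed_header: &[u8],
) -> std::io::Result<()> {
    // Rejected behavior: two separate writes, so the salt can go out as its own segment.
    //   stream.write_all(salt).await?;
    //   stream.write_all(sealed_header).await?;
    // Required behavior: one buffer, one write call.
    let mut buf = Vec::with_capacity(salt.len() + sealed_header.len());
    buf.extend_from_slice(salt);
    buf.extend_from_slice(sealed_header);
    stream.write_all(&buf).await
}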

Second, I don't agree that all TCP stacks will definitely pack the data of a single write syscall into a single TCP segment. I don't know how you came to that conclusion, but I think it's hard to investigate all TCP stack implementations and keep that knowledge up to date. For example, Windows is not open source and may change its implementation details. Third, even on Linux, I suspect it does not behave like this if NO_DELAY is not set: after a short write, the kernel keeps the data for a very short while, waiting for more data so it can send more efficiently. If you do another write then, the second piece of data may be split across two TCP segments.

I certainly didn't come to that conclusion, and I'm aware of how NO_DELAY works. The spec simply requires that the first 16/32 + 11 + 16 bytes (salt + fixed-length header) must be read in a single read call. As long as the client writes it in a single write call (which is also mandated by the spec), the payload is well within a typical MSS, and will be delivered in a single segment.
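
For context, those numbers line up with the log above (illustrative constants, not taken from either codebase): with 2022-blake3-aes-256-gcm the salt is 32 bytes, so the minimum first read is:

// 32-byte salt + 11-byte fixed-length header + 16-byte AEAD tag = 59 bytes,
// which is the "expecting 59 bytes" in the warning messages.
const SALT_LEN: usize = 32;
const FIXED_HEADER_LEN: usize = 11;
const TAG_LEN: usize = 16;
const MIN_FIRST_READ: usize = SALT_LEN + FIXED_HEADER_LEN + TAG_LEN; // 59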

Fourth, the entire data transfer channel must satisfy your constraints rather than being treated as normal TCP, so in a sense you have created a new TCP protocol with tighter behavioral constraints. And middle services that work at Layer 4 cannot guarantee that (they do not even know about it).

Well, if the "middle service" was designed to work with such protocols, then maybe they have to provide such guarantees.

database64128 avatar Oct 17 '22 06:10 database64128

Ok I see.

After a short write, the kernel keeps the data for a very short while, waiting for more data so it can send more efficiently. If you do another write then, the second piece of data may be split across two TCP segments.

I thought there might be a problem like this, but since there is no data before the handshake data, this problem does not exist. I agree that a TCP stack is normally implemented the way you assume here.

And middle L4 proxies may not meet the Shadowsocks 2022 protocol's additional requirements.

So here's the conclusion for this issue: due to the Shadowsocks 2022 protocol design, not all TCP proxies work with it, and unfortunately the shadow-tls v0.2.3 implementation does not meet its requirements.

Supporting it may be considered later.

ihciah avatar Oct 17 '22 07:10 ihciah

It's not just shadowsocks-rust. My shadowsocks-go and @nekohasekai's sing-box both impose the same requirement. sing-box also has support for ShadowTLS v1 and v2. I'm curious how sing-box solves the issue. Not sure if it's related: SagerNet/sing-box#149

database64128 avatar Oct 17 '22 07:10 database64128

In theory it is possible to make it work. In my implementation, I only make sure that reads from and writes to the shadow-tls server do not carry a data-length characteristic, to defend against active detection.

ihciah avatar Oct 17 '22 07:10 ihciah