shadowsocks-org [SIP] Shadowsocks v2

This issue is to discuss the changes we want in the next major revision of Shadowsocks protocol. Right now I've done some preliminary research based on the SOCKS6 RFC draft and I have a prototype security layer that provides forward secrecy (except for early data) and 1-RTT latency (or 0-RTT if used with TCP Fast Open).

So here're the things I have in mind (no particular order of importance, and most are optional):

1. v2 protocol roughly based on SOCKS6 (which is still a moving target).
2. New security layer with PFS and 0/1-RTT (w/o TFO). (related issue https://github.com/shadowsocks/shadowsocks-org/issues/54)
3. Basic auth so we can officially support single-port multi-users without hacks.
4. Native solution for DNS. (related issue https://github.com/shadowsocks/shadowsocks-org/issues/156)
5. Better-defined semantics of proxy and VPN regarding errors and ICMP packets (related issue https://github.com/shadowsocks/shadowsocks-org/issues/144)
6. Multiplexing over single TCP connection (similar to HTTP/2) to reduce latency when TFO is not possible.

Please feel free to discuss the changes.

Feb 20 '20 17:02 riobard

It would be good to add links to the related issues in this repo.

Feb 20 '20 17:02 Mygod

Can a v2 server also provide v1 service?

Multiplexing over single TCP connection

HTTP/3 multiplexes over a single UDP "connection" so that TCP flow control cannot stall all the sub-streams sharing a single TCP connection. SCTP is an option too; it's designed for multiplexing, but it's too rare...

Just to mention it here: the first proposal for Shadowsocks (in clowwindy's original post) already brought up public key encryption. ("If anyone else is interested in joining, perhaps this could be taken further and made to use public key encryption.")

Feb 20 '20 18:02 ghost

@studentmain For co-existence of v2 and v1: this is up to the implementation to decide, as long as a downgrade attack is not possible.

Multiplexing over TCP (and HTTP/2 in particular) is a compromise because of UDP throttling. If HTTP/3 becomes popular and works well enough in practice we could switch to UDP as well. Meanwhile I have to deal with TCP and broken middleboxes killing TFO packets.

I've almost finished the public key encryption part (without RTT penalty).

Feb 20 '20 18:02 riobard

Then the problem becomes how the client chooses the correct protocol. A change in the URL format may be required.

HTTP/3 is a modified QUIC, so I think it's at least good enough in Google's data centers. But I'm not sure it's also good enough across the Pacific Ocean. v2ray has QUIC support; maybe we can take a look at it. https://blog.apnic.net/2018/05/15/how-much-of-the-internet-is-using-quic/ https://w3techs.com/technologies/details/ce-quic

We should use a quantum-safe cipher for public key encryption. https://github.com/open-quantum-safe/liboqs/tree/master#supported-algorithms

Feb 20 '20 19:02 ghost

An ss2:// scheme should work fine.
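
As a rough illustration of picking the protocol by URL scheme, here is a minimal sketch. The `SCHEME_TO_VERSION` table and the `protocol_version` helper are hypothetical names, and `ss2` is only the scheme name floated in this thread, not a finalized URL format.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from URL scheme to protocol version. "ss2" is only the
# scheme name suggested in this discussion; nothing here is a finalized format.
SCHEME_TO_VERSION = {"ss": 1, "ss2": 2}


def protocol_version(url: str) -> int:
    """Return the protocol version implied by the URL scheme."""
    scheme = urlsplit(url).scheme.lower()
    try:
        return SCHEME_TO_VERSION[scheme]
    except KeyError:
        raise ValueError(f"unsupported scheme: {scheme!r}") from None
```

With this, a client could call `protocol_version("ss2://example.com:8388")` before deciding which handshake to speak, and reject anything it does not recognize.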

I need to see hard evidence that UDP works without significant throttling before investing time on it.

Quantum security is beyond the scope. Public key encryption is mostly to support multi-users without too many security downsides.

Feb 20 '20 19:02 riobard

Regarding the proposal, this is definitely too much. I prefer a minimalistic approach and offloading features to plugins whenever possible. In fact, we could even make the default AEAD encryption a (default) plugin and always run in plain (I guess that reduces v2 to simply socks6 over *, but KISS).

Detailed comments:

Regarding 2: I still stand by the point that public-key crypto/key-exchange is expensive and unnecessary and I am strongly against it. Furthermore, you can offload it to plugins like TLS.
Regarding 3: We can simply employ basic auth from socks5/6 for this.
Regarding 6: I disdain the idea of multiplexing. I can only see it being useful in cases like server push in HTTP/2.
Regarding post-quantumness: Again, just let the plugins do their job.

Feb 20 '20 19:02 Mygod

@Mygod At a minimum I'd like ss2-server to work like a regular SOCKS6 server with some special behaviors regarding authentication so as not to leak its existence. But right now there are no other SOCKS6 clients to test with.

It's 2020 and public key crypto is easy and efficient with modern primitives. I've considered using just TLS but there are several major blocking issues: only TLS 1.3 technically supports 0/1-RTT mode, but many implementations (like the one in Go's stdlib) do not support 0-RTT at all, and there's no plan to add it any time soon. Additionally, TLS brings in a host of other issues regarding certificate management and domain verification that I don't want to force on people. And the complexity of TLS is… well just read the RFC and judge it yourself. The new security layer aims to be very simple, efficient, and secure. If you do not care about any of those nice things, you can always run SOCKS-over-TLS (many existing clients support it) and there's no need to bother with Shadowsocks at all.

I'm still considering multiplexing. It has significant benefits in the Shadowsocks use case, namely 1) reliable 0-RTT connection establishment even when TFO does not work, 2) better utilization of network bandwidth due to TCP congestion control, and 3) it cuts the number of connections/open files in half on the server side. However it does come with obvious drawbacks as well, like complexity and head-of-line blocking, both of which cannot be avoided. I'm experimenting with an HTTP/2 CONNECT proxy now, and it does work better than I expected. But it's difficult to integrate and provide reasonable proxy semantics (mostly communicating errors between the local and remote side).
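
To make the trade-off concrete, here is a minimal sketch of what multiplexing several logical streams over one byte stream could look like. The frame layout (4-byte stream ID plus 2-byte length) and the names `HEADER`, `encode_frame` and `decode_frames` are assumptions for illustration only; this is neither HTTP/2 framing nor anything specified for Shadowsocks.

```python
import struct

# Hypothetical frame header: 4-byte stream ID, 2-byte payload length (big-endian).
HEADER = struct.Struct("!IH")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with its stream ID and length."""
    return HEADER.pack(stream_id, len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split received bytes into (stream_id, payload) frames.

    Returns the complete frames plus any trailing bytes that belong to a
    frame which has not fully arrived yet.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        stream_id, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # incomplete frame: wait for more bytes
        frames.append((stream_id, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


# Two logical streams interleaved on one connection.
wire = (
    encode_frame(1, b"GET /a")
    + encode_frame(3, b"GET /b")
    + encode_frame(1, b" HTTP/1.1")
)
frames, rest = decode_frames(wire)
```

Because every frame travels in the same ordered byte stream, a frame for stream 3 cannot be decoded until the bytes in front of it have arrived, which is the head-of-line blocking mentioned above.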

Feb 21 '20 05:02 riobard

Authentication without leaking existence can be achieved simply via socks6 over blank. It seems like socks6 draft specifies that the client can send payload immediately after authentication header so I don't see why this is an issue.

0-RTT cannot work. TLS 1.3 does 0-RTT by using a session ID. You need a handshake to establish a key exchange/session ID. You are not going to want to encrypt each packet using public-key encryption.

TLS has the added advantages for traffic hiding that a new protocol cannot provide. The advantage of TLS is that it makes traffic indistinguishable from other TLS traffic, say HTTPS, except for inspecting packet length distribution (see sssniff, etc), which I believe is somewhat reliable at best.

Also, the reason I oppose you building protocols from public-key crypto directly is exactly the reason I opposed OTA. We are living in a sad world where security is not as composable as you would like it to be, and you are not going to get the world's best security experts to audit your protocol (despite how complicated TLS is, it is of popular interest and gets audited by everyone).

Multiplexing is useless, except for connection reuse, which should be implemented by plugins. I agree connection reuse could be useful, but again this should be implemented by a plugin where the mimicked traffic does use such a feature, say HTTP/2. We should make hiding traffic our priority instead of performance (especially when it's as minor as the number of RTTs), etc. The plain protocol should not have long idle connections.

Feb 21 '20 05:02 Mygod

In conclusion, we should just do socks6 over [blank]. @madeye What do you think?

Feb 21 '20 05:02 Mygod

@Mygod Unfortunately you are wrong on many levels…

Multi-user authentication won't work securely without public key crypto, see #54
0-RTT works fine, except for early data (TLS 1.3 also shares this caveat), and client can choose how much early data to send. The trick is to pre-share server public key. We need this for server authentication anyway so no extra problem either. Also see issue #54 (I need to update it with forward secrecy tho).
Sure, I completely understand the benefit of TLS. But for obfuscation we all agree it should be done by plugin so there's no disagreement. We just need to provide a default when people don't want to bother with TLS. Current default is insufficient.
The new security layer is basically vastly simplified TLS 1.3 so I'm confident. You might not agree and it's perfectly fine to use TLS instead (and accept your chosen TLS lib's limitations).
Fewer RTTs are important for user experience. Long idle connections are the norm. Just check how many connections your phone keeps to various clouds. And it's strange for a client to keep dozens of TCP connections to a single server in an increasingly HTTP/2 world.
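
The 0-RTT point above (pre-shared server public key, forward secrecy for everything except early data) can be sketched as a key schedule. This is an illustration under stated assumptions, not the prototype's actual design: it uses a toy finite-field Diffie-Hellman built from the standard library so it runs anywhere, and the names `keypair`, `shared` and `kdf` are hypothetical. A real implementation would use a vetted primitive such as X25519 plus an AEAD.

```python
import hashlib
import secrets

# Toy Diffie-Hellman group for illustration only. NOT secure and NOT the
# primitives of the prototype discussed here; it only shows the message flow.
P = 2**255 - 19
G = 2


def keypair():
    """Return (private, public) for the toy group."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def shared(private: int, peer_public: int) -> bytes:
    """Diffie-Hellman shared secret as 32 bytes."""
    return pow(peer_public, private, P).to_bytes(32, "big")


def kdf(label: bytes, *parts: bytes) -> bytes:
    """Toy key derivation: hash a label together with the shared secrets."""
    return hashlib.sha256(label + b"".join(parts)).digest()


# Server long-term key; the public half is pre-shared with clients.
server_static_priv, server_static_pub = keypair()

# Client, first flight: ephemeral public key plus early data, no round trip.
client_eph_priv, client_eph_pub = keypair()
client_early_key = kdf(b"early", shared(client_eph_priv, server_static_pub))

# Server: recovers the same early key, then replies with its own ephemeral key.
server_early_key = kdf(b"early", shared(server_static_priv, client_eph_pub))
server_eph_priv, server_eph_pub = keypair()
server_session_key = kdf(
    b"session",
    shared(server_static_priv, client_eph_pub),
    shared(server_eph_priv, client_eph_pub),
)

# Client, after one round trip: mixes in the ephemeral-ephemeral secret.
client_session_key = kdf(
    b"session",
    shared(client_eph_priv, server_static_pub),
    shared(client_eph_priv, server_eph_pub),
)
```

The early key depends on the server's long-term private key, so early data loses its confidentiality if that key leaks later. The session key additionally depends on two ephemeral keys that can be discarded after the handshake, which is where the forward secrecy comes from.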

Feb 21 '20 06:02 riobard

I am not going to argue with your opinions so just some technical comments.

"securely" depends on your security model. I mentioned in https://github.com/shadowsocks/shadowsocks-org/issues/54#issuecomment-589523707 that your proposal actually does not achieve what you want (in particular I constructed an attacker in your model). However, I would argue that TLS does it pretty decently.
You can make a plugin to do what you describe but I do not feel comfortable making an ad-hoc protocol a default choice.

My opinion: TLS isn't too hard to set up actually.

Feb 21 '20 06:02 Mygod

We can discuss the technical issues separately in #54.

I'm not against TLS. It's just that the TLS in Go stdlib does not provide what I want (0-RTT) and I don't want to bother with certificates and domain verification. Like I said before, you can always run SOCKS-over-TLS so there's no disagreement here.

Feb 21 '20 07:02 riobard

Recently, I've been thinking about a side-channel key exchange approach.

For example, do a WireGuard-like key exchange (https://www.wireguard.com/protocol/) in a side channel (a standard 443 port, a random port, or even a different host server), then communicate using the current shadowsocks protocol.

Feb 21 '20 08:02 madeye

@madeye The benefit is?

I think some of the commercial operators offer HTTPS-based subscriptions to do similar things. But I don't fully understand the reasoning behind that.

Feb 21 '20 09:02 riobard

@riobard So the ss server itself can know which user will connect before any packet is received. That will make active probing useless.

Feb 21 '20 10:02 ghost

@studentmain Could you please explain in a bit more detail in what scenario this would make it immune to probing?

Feb 21 '20 10:02 riobard

The client connects to the side channel and finishes the handshake there. The server then gets the client's IP address (maybe along with the IV the user will use) before the client connects to it. An attacker can't pass the side-channel handshake, so when a packet from the attacker comes in, the server has no information about it and can reset the connection or do whatever it likes.

Feb 21 '20 10:02 ghost

If it's IP-based firewalling, it seems very fragile given the mass deployment of Carrier-Grade NAT (CGNAT), under which you cannot guarantee that the client's public IP when connecting to the authorization server is the same one it has when connecting to the relay server.

So the safest bet is for the client to get some kind of auth token from the authorization server and use that token to connect to the relay server. At this point I'm confused as to how it will be different than sending just a PSK?

Feb 21 '20 11:02 riobard

So the safest bet is for the client to get some kind of auth token from the authorization server and use that token to connect to the relay server.

Yes, a token obtained over a safe side channel is OK.

Feb 21 '20 11:02 ghost

How is it different from the current approach with respect to active probing? I still don't understand the advantage of the split approach.

Feb 21 '20 11:02 riobard

In the split approach, the server can detect probes more easily and more accurately. It works similarly to port knocking for SSH.

Feb 21 '20 11:02 ghost

So in the normal setup, we have to detect replay attacks on the server in a blacklist style (a previously used nonce will be rejected). But in the split approach, at least on the relay server, I assume it's more like a whitelist style (only authorized tokens will be accepted).

Then we're moving the attack surface from the relay server to the auth server. Also, because the auth server and relay server are now separate, we have to consider the additional synchronization issue (the client gets an auth token from the auth server, but the auth server has not yet delivered that token to the relay server when the client connects to the relay server).

It does not look too promising either. Or am I missing something here?
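
The difference between the two styles described above can be sketched as follows. Both classes and their method names are hypothetical, and treating tokens as single-use is an assumption that the thread does not spell out.

```python
class NonceDenyList:
    """Normal setup: reject only what has been seen before.

    Assumes the request has already passed authentication (for example the
    AEAD check); that step is omitted here.
    """

    def __init__(self):
        self._seen = set()

    def accept(self, nonce: bytes) -> bool:
        if nonce in self._seen:
            return False  # replayed nonce
        self._seen.add(nonce)
        return True


class TokenAllowList:
    """Split setup: accept only tokens registered beforehand by the auth server."""

    def __init__(self):
        self._valid = set()

    def register(self, token: bytes) -> None:
        self._valid.add(token)

    def accept(self, token: bytes) -> bool:
        if token in self._valid:
            self._valid.remove(token)  # assumption: each token is single-use
            return True
        return False
```

The deny list grows with every connection it has ever accepted, while the allow list only holds tokens that are outstanding, but it needs the extra registration step that creates the synchronization issue.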

Feb 21 '20 11:02 riobard

The auth server can hide behind a normal website (that's why it uses 443) and is used less frequently than the relay server. So I think its attack surface is much smaller: you need to find it among many TLS websites first.

additional synchronization issue

That's the problem that needs to be resolved. My solution is to send keys to the client only after receiving the relay server's confirmation. That will introduce more RTTs for the first packet. The auth server can send a few dozen keys to the client for use in other connections, so it only affects the first connection.
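
A minimal in-process sketch of that ordering, assuming hypothetical `Relay` and `AuthServer` classes and a batch of 32 single-use tokens (the batch size, the token length and the single-use rule are all assumptions). In a real deployment `register` would be a network call between the two servers.

```python
import secrets


class Relay:
    """Relay server: accepts only tokens it has been told about."""

    def __init__(self):
        self.valid_tokens = set()

    def register(self, tokens) -> bool:
        self.valid_tokens.update(tokens)
        return True  # confirmation sent back to the auth server

    def accept(self, token: bytes) -> bool:
        if token in self.valid_tokens:
            self.valid_tokens.discard(token)  # assumption: single-use tokens
            return True
        return False


class AuthServer:
    """Auth server: hands tokens to the client only after the relay confirms."""

    def __init__(self, relay: Relay, batch_size: int = 32):
        self.relay = relay
        self.batch_size = batch_size

    def issue(self):
        tokens = [secrets.token_bytes(16) for _ in range(self.batch_size)]
        if not self.relay.register(tokens):
            raise RuntimeError("relay did not confirm the tokens")
        return tokens  # only now sent to the client


relay = Relay()
auth = AuthServer(relay)
client_tokens = auth.issue()
```

Only the first `issue` call sits on the critical path of a connection; later connections can draw from the tokens the client already holds.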

Feb 21 '20 12:02 ghost

I see. My concern is that the split approach introduces many moving parts (it's officially a distributed system now) and the benefits are not very clear-cut.

Feb 21 '20 12:02 riobard

This is too complicated a solution for a problem that TLS can solve.

EDIT: Also you are making it easier to fingerprint the server.

Feb 21 '20 14:02 Mygod

So there are three solutions here, right?

Modified SOCKSv6 + new security layer
Extended SOCKSv6 + TLS
Side channel handshake

Feb 22 '20 05:02 ghost

Modified SOCKSv5 + TLS (via simple-obfs and v2ray-plugin) has already been tested for a long time.

As for the side-channel handshake, since it doesn't require redesigning the packet format, we can test it on the current code base.

Feb 22 '20 06:02 ghost

More likely modified/simplified SOCKS6 + interchangeable security layer (custom/tls/plugin)?

The only issue is that SOCKS6 is still a draft and it's not clear if it will be widely adopted.

Feb 22 '20 06:02 riobard

The only problem with TLS is that it needs a domain name.

Feb 22 '20 14:02 ghost

@studentmain And certificates and renewal handling (acme most likely), which is a hassle for many.

Feb 22 '20 15:02 riobard