
proxy -> telegram connection multiplexing

seriyps opened this issue 6 years ago · 6 comments

It looks like it's possible to multiplex many client -> proxy connections into a single proxy -> telegram server connection, based on the RPC connection ID:

https://github.com/alexbers/mtprotoproxy/blob/ed227da7c3afe67c876cd922661eedd0ec87da78/mtprotoproxy.py#L330
https://github.com/alexbers/mtprotoproxy/blob/ed227da7c3afe67c876cd922661eedd0ec87da78/mtprotoproxy.py#L358

You just have to maintain a mapping between client -> proxy and proxy -> telegram sockets and forward packets as whole units. You will also probably need some kind of proxy -> telegram connection pool (a separate pool for each dc_id).
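To illustrate the idea, here is a minimal asyncio-style sketch of the bookkeeping, assuming packets already carry an RPC connection id; the UpstreamPool class and its method names are hypothetical, not part of mtprotoproxy:

```python
# Hypothetical sketch of multiplexing bookkeeping, not actual mtprotoproxy code.
# Assumes each packet carries an RPC connection id identifying the client.
import asyncio
from collections import defaultdict


class UpstreamPool:
    """Keeps one (or a few) proxy -> telegram connections per datacenter."""

    def __init__(self):
        self._upstreams = {}               # dc_id -> (reader, writer)
        self._clients = defaultdict(dict)  # dc_id -> {conn_id: client_writer}

    async def get_upstream(self, dc_id, host, port):
        # Reuse an existing connection to this DC, or open a new one.
        if dc_id not in self._upstreams:
            self._upstreams[dc_id] = await asyncio.open_connection(host, port)
        return self._upstreams[dc_id]

    def register_client(self, dc_id, conn_id, client_writer):
        self._clients[dc_id][conn_id] = client_writer

    def route_downstream(self, dc_id, conn_id, packet):
        # A packet coming back from Telegram is demultiplexed to the right
        # client by its RPC connection id.
        writer = self._clients[dc_id].get(conn_id)
        if writer is not None:
            writer.write(packet)
```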

seriyps · Jun 10 '18 22:06

Why do you need it? Does the multiplexing scheme give any advantages?

alexbers · Jun 12 '18 10:06

I personally don't need this, I have my own implementation now =) But talking about advantages: it's resource consumption / utilization plus latency.

Telegram TCP connections are generally not very active: most of the time is spent idle, waiting for events to be sent or received. So you can have a lot of connections with relatively little network traffic. For example, on one of my nodes I have ~17k client connections and just 6 MiB/s of traffic, yet each socket occupies some memory in kernel and userspace buffers and structs (on this node more than 1 GB of RAM). With the current scheme the number of open sockets is 2x the number of client connections. In a multiplexed scheme we could have 1x + Const, where Const is roughly the number of datacenter IPs, so this would lower memory consumption a lot at the price of code complexity (rough numbers below the screenshot).

Another reason is connection setup time. Each time a TG client connects to our proxy, it has to wait for us to establish a connection to the Telegram server. That takes some time (not that much, but still infinitely worse than zero compared to a connection that is already established). FYI, here is a heatmap of how long it takes to connect to a Telegram server from my proxy nodes:

[screenshot: heatmap of proxy -> Telegram connection setup times]
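To make the socket-count argument concrete, a back-of-envelope calculation with the numbers above; the ~30 KiB per-socket figure is an assumed average, not a measurement:

```python
# Rough numbers for the node mentioned above; per-socket memory is an
# assumed average (~30 KiB of kernel + userspace buffers), not a measurement.
clients = 17_000
per_socket_kib = 30

current_sockets = 2 * clients        # one client socket + one telegram socket each
multiplexed_sockets = clients + 10   # client sockets + a small per-DC upstream pool

print(current_sockets * per_socket_kib // 1024, "MiB today")          # ~996 MiB
print(multiplexed_sockets * per_socket_kib // 1024, "MiB multiplexed")  # ~498 MiB
```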

seriyps · Jun 12 '18 11:06

I had another idea for how to get lower latency and memory consumption. Right now, all proxy servers with advertising, including the official one, read a full message from one side of the connection and then forward it to the other side. The messages can be up to 1 MByte in size, and the proxy keeps each one in memory until it is fully read, so in the worst case we get high memory consumption (1 MByte per client plus socket buffers) and high latency.

The idea is to send a message as early as possible. For example, when we have received the first 10 KBytes of a 1 MByte message, we can reformat and forward those bytes to the other connection and wait for more data.
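A minimal asyncio sketch of this kind of chunked forwarding, assuming the message length is already known from the transport header; forward_message and reframe are hypothetical placeholders, not mtprotoproxy's real API:

```python
import asyncio

CHUNK = 10 * 1024  # forward in 10 KB pieces instead of buffering the whole message


async def forward_message(reader, writer, msg_len, reframe):
    """Stream one message of msg_len bytes from reader to writer in chunks.

    `reframe` stands in for whatever per-connection re-encryption / reframing
    the proxy performs on each piece before sending it on.
    """
    remaining = msg_len
    while remaining > 0:
        chunk = await reader.read(min(CHUNK, remaining))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        writer.write(reframe(chunk))
        await writer.drain()  # back-pressure: don't outrun the other side
        remaining -= len(chunk)
```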

In the near future I am planning to stabilize the current program, release the first version, and write about it on some site like habr.com. After that I plan to work on memory consumption and latency.

For now I have set up test proxies on DigitalOcean droplets and am trying to advertise them through Telegram channels. I have only about 1250 users connected, and the CPU and memory consumption are very low:

[screenshots: CPU and memory usage graphs]

alexbers · Jun 12 '18 12:06

The idea is to send a message as early as possible.

Yep, that's how it's done in the official proxy, as far as I understand: https://github.com/TelegramMessenger/MTProxy/blob/master/mtproto/mtproto-proxy.c#L1814, and it also protects against a potential DoS attack.

But FYI, I have monitoring of Telegram protocol packet sizes, and the packets are relatively small in the majority of cases (UP is upstream == Telegram client, Down is downstream == Telegram server). The left chart is what the Telegram client sends to the server, and the right one is what the TG server sends to the client:

[screenshot: upstream and downstream packet-size distributions]

seriyps · Jun 12 '18 12:06

Nice graphs! I think I will forward small messages (< 4096 bytes) directly and big messages in parts. Also, for big messages it is possible to implement speed throttling at the TCP level.
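A small illustrative sketch of that size threshold, building on the chunked-forwarding sketch above; the 4096-byte cutoff comes from the comment, while forward_message and reframe are the same hypothetical helpers:

```python
SMALL_MSG_LIMIT = 4096  # forward messages below this size in one piece


async def forward(reader, writer, msg_len, reframe):
    if msg_len < SMALL_MSG_LIMIT:
        # Small message: read it whole and forward it directly.
        data = await reader.readexactly(msg_len)
        writer.write(reframe(data))
        await writer.drain()
    else:
        # Big message: stream it chunk by chunk (see the earlier sketch),
        # which also gives a natural place to throttle large transfers.
        await forward_message(reader, writer, msg_len, reframe)
```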

I tested the official proxy and managed to make it occupy 1 MB per client. The interesting thing is that when I tried to test with about 300 clients, the official proxy always crashed:

[screenshot: crash output]

I sent the authors of the official proxy a program that, given a proxy's address, predictably crashes it; I hope they will fix it soon.

alexbers · Jun 12 '18 19:06

The messages can be up to 1 MByte in size.

Isn't it 64 MB? https://github.com/alexbers/mtprotoproxy/blob/master/mtprotoproxy.py#L304

3 bytes = 24 bits, and the length field is multiplied by 4, so: 2**24 * 4 / 1024 / 1024 = 64 MB.
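For reference, a tiny sketch of how a 3-byte length field counted in 4-byte units works out to that cap; this mirrors my understanding of the abridged transport, not the exact code at the linked line:

```python
# The big-message form of the abridged transport encodes the length as
# 3 little-endian bytes holding length // 4, so the cap works out to:
max_len_field = 2**24 - 1            # largest value 3 bytes can hold
max_msg_bytes = max_len_field * 4    # length field is in 4-byte units
print(max_msg_bytes / 1024 / 1024)   # ~64.0 MB


def decode_big_length(header: bytes) -> int:
    """Decode the 3-byte length field (in 4-byte units) into a byte count."""
    return int.from_bytes(header[:3], "little") * 4
```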

seriyps · Jun 13 '18 08:06