odyssey icon indicating copy to clipboard operation
odyssey copied to clipboard

Limiting max number of backend connections

Open ruimarinho opened this issue 5 years ago • 12 comments

Since https://github.com/yandex/odyssey/commit/8496cde80aa292d0365c73315e2091469807c23e, I thought it would be possible to define the maximum number of backend connections using server_max_routing.

However, I can set pool_size = 2 and server_max_routing = 1 and I can still execute two concurrent queries. This is either a bug or counter-intuitive.

Could you please clarify what would be the best way to limit max number of connections per database?

ruimarinho avatar Jan 28 '20 02:01 ruimarinho

Hi @ruimarinho ! server_max_rounting is symmetrical to client_max_routing - both define how many connections are in a process of negotiation. If you want to limit the number of server connection set pool_size = 1, and there will be only one server connection.

x4m avatar Jan 28 '20 05:01 x4m

What does that mean exactly? The number of in-flight connections to the backend?

If you could give me an example of a use-case I would appreciate.

ruimarinho avatar Jan 28 '20 11:01 ruimarinho

Just jumping in to say, I would appreciate a better definition of many of the configuration parameters as well. The documentation isn't very clear on what each value does and is sometimes very difficult to parse as English.

Particularly:

  • workers
  • server_max_routing
  • client_max_routing
  • pool_size
  • pool_discard
  • pool_cancel

atrauzzi avatar Jun 09 '20 18:06 atrauzzi

Each worker is effectively one thread. Number of threads is a number of CPU parallelism. For pgbouncer this number is always 1, unless you used cascade of port-sharing pgbouncers.

server_max_routing - is a number of servers being opened concurrently. If you have 100 clients which want to start a transaction and your server_max_routing = 3, you will be opening 3 server connections at any given moment. Meanwhile opening connection can be around 100ms, and if your query takes 1ms, probably you will just open 6 connections: you will start opening 3 connections, they will be opened in 100ms, you will start opening 3 more, while first 3 connections will perform queries of all clients.

client_max_routing - is a number of clients doing concurrent TLS handshake, doing auth and routing to specific server pool. This setting exists because TLS handshake is consuming a lot of CPU. If you have 30000 incoming connections and start TLS handshake of all of them - probably noone will connect before timing out. This setting exists to allow connection for as much simultaneous connections as possible.

pool_size - is a maximum number of active connections in transaction. If we reach this limit, client wanting to open transaction will wail for pool_ttl for a connection to be released by other transactions. If this does not happen - client will get an error.

pool_discard - send DISCARD ALL before giving server connection from one client to another.

pool_cancel - if we lost connection to client - then we send CANCEL request to server.

@ruimarinho sorry, I lost a conversation in January :(

x4m avatar Jun 10 '20 05:06 x4m

Hey @x4m is there any way to specify a minimum pool size so that all connections are established at start, rather than waiting for clients to cause them to be made over time?

atrauzzi avatar Jun 15 '20 20:06 atrauzzi

@atrauzzi nope, but you can create such a setting. Just copy-pase everything from pool_ttl and modify exiration in od_router_expire_cb(). I'll happily review a PR.

x4m avatar Jun 16 '20 05:06 x4m

@x4m in your example, it seems like you're saying that even if server_max_routing = 3, odyssey may still open 6 backend connections to the storage because that setting controls CPU parallelism only (i.e. server_max_routing defaults to number of workers).

I get that, but ultimately what you're suggesting is that the maximum number of connections to the backend is really the sum of all pool_size connections being used (or rather all connections in use by each pool, but let's assume 100% usage).

If I understood this correctly, then it becomes quite difficult to ensure a global odyssey backend limit when using a large number of users all being routed via the default_db.default_user route, because each user may consume up to pool_size N times.

pgbouncer has a max_db_connections setting precisely for this:

Do not allow more than this many server connections per database (regardless of user). This considers the PgBouncer database that the client has connected to, not the PostgreSQL database of the outgoing connection.

Without this, dynamically scaling odyssey to N machines requires re-calculating all pool_size values across different machines and even that isn't trivial when using a default rule.

What do you recommend in this scenario?

ruimarinho avatar Apr 24 '21 17:04 ruimarinho

@x4m any feedback here would be greatly appreciated.

ruimarinho avatar May 19 '21 00:05 ruimarinho

Uh, sorry for the delay. I think we could implement pool_size not only for a route (database+user), but for a database and a global limit too. Does this suits your case?

x4m avatar May 19 '21 09:05 x4m

Yes. There are two problems that would solve:

  1. There isn't an unbounded limit from odyssey to the backend when using lots of different users. Let's say you accept 20 different users. The current solution I've found is using a very small default pool_size, so that 20 * pool_size doesn't consume all resources, and create specific rules for more resource-consuming users.
  2. Make load balancing more predictable so that scaling more odyssey instances allows dynamically rebalancing the overall pool by reducing the global pool_size limit by the number of instances.

On server_max_routing:

server_max_routing - is a number of servers being opened concurrently. If you have 100 clients which want to start a transaction and your server_max_routing = 3, you will be opening 3 server connections at any given moment.

In the example above (100 clients with pool_size of 10), how many backend connections should we expect on postgres?

ruimarinho avatar May 19 '21 19:05 ruimarinho

@x4m have you had a chance to review this issue? I can't stress enough how important this feature is to avoid overwhelming backend connections :)

ruimarinho avatar Jan 28 '22 17:01 ruimarinho

@reshke how do you limit max number of storage/backend connections in total used by the pooler? Otherwise you have unconstrained backend resource usage while adding new users, right?

ruimarinho avatar Feb 21 '22 22:02 ruimarinho