mobc completely stops serving connections.
Hey,
we use mobc as part of Prisma and we are getting into a situation where mobc completely stops serving any connections. I create an HTTP server using hyper and create a mobc pool via Quaint.
A repo with the reproduction can be found here https://github.com/garrensmith/mobc-error-example
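Roughly, the server in that repo has this shape. This is only a simplified sketch: `DummyManager` below stands in for Quaint's Postgres manager, and the handler just holds a pooled connection briefly before responding.

```rust
use std::convert::Infallible;
use std::net::SocketAddr;
use std::time::Duration;

use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Request, Response, Server};
use mobc::{async_trait, Manager, Pool};

// Illustrative manager; the actual repro uses Quaint's Postgres manager.
struct DummyManager;

#[async_trait]
impl Manager for DummyManager {
    type Connection = ();
    type Error = std::io::Error;

    async fn connect(&self) -> Result<Self::Connection, Self::Error> {
        Ok(())
    }

    async fn check(&self, conn: Self::Connection) -> Result<Self::Connection, Self::Error> {
        Ok(conn)
    }
}

#[tokio::main]
async fn main() {
    // Small pool, many concurrent clients -- the same shape as the repro.
    let pool = Pool::builder().max_open(10).build(DummyManager);

    let make_svc = make_service_fn(move |_conn| {
        let pool = pool.clone();
        async move {
            Ok::<_, Infallible>(service_fn(move |_req: Request<Body>| {
                let pool = pool.clone();
                async move {
                    // Every request waits for a pooled connection. If the client
                    // aborts, this future is dropped while it is still waiting.
                    match pool.get().await {
                        Ok(_conn) => {
                            tokio::time::sleep(Duration::from_millis(5)).await;
                            Ok::<_, Infallible>(Response::new(Body::from("ok")))
                        }
                        Err(_) => Ok(Response::new(Body::from("pool checkout failed"))),
                    }
                }
            }))
        }
    });

    let addr: SocketAddr = ([127, 0, 0, 1], 4000).into();
    Server::bind(&addr).serve(make_svc).await.unwrap();
}
```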
I then run Apache Benchmark against it like this:
ab -v 4 -c 200 -t 120 http://127.0.0.1:4000/
Once Apache Benchmark has stopped, the number of connections in Postgres drops either to far fewer than the number I've configured the pool to open, or to zero entirely. If I log State from Mobc it will report that it has 10 active connections, which is incorrect.
However, if I start Apache Benchmark and run it again, it will either run a lot slower with fewer connections, or not run at all because it cannot acquire a new connection from mobc.
I've tried a few things in the code but I cannot see why this is happening. I even tried https://github.com/importcjj/mobc/pull/60 but that didn't fix it.
Any help would be really appreciated.
Hi @importcjj, have you had a chance to look at this issue? Any ideas or suggestions I can look at?
I've been diving into this a bit more and I now understand why Mobc can reach a point of dropping connections and deadlocking.
The issue is happening over here https://github.com/importcjj/mobc/blob/master/src/lib.rs#L664
First, some context: in our situation we can have a lot of concurrent requests (over 1,000) for a connection from the connection pool, while the pool itself only has a small number of connections, for example 10. Each waiting request has a oneshot channel created for it, with the Sender added to a queue over here https://github.com/importcjj/mobc/blob/master/src/lib.rs#L542
What can then happen is that all of those waiting requests are destroyed; this happens when the connection requests come from web requests that have been aborted. The conn_requests queue now holds over 1,000 channel Senders whose Receivers have been dropped.
Now, when an active connection is returned to the pool, what is supposed to happen is that mobc goes through the list of Senders and tries to send the connection to a waiting Receiver. If a Receiver has been cancelled or dropped, the send hands the connection back and the next Sender is tried, until one with a live Receiver is found. This is the code I mentioned earlier: https://github.com/importcjj/mobc/blob/master/src/lib.rs#L664
However, when a large number of Receivers have been dropped, this doesn't seem to work and the connection is accidentally dropped. internals.num_open is not decremented at this point, so Mobc thinks it still has active connections when in fact it does not, and therefore it neither creates new connections nor has any to hand to new connection requests.
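To make the failure mode concrete, here is a much-simplified sketch of the pattern. This is not mobc's actual code; Conn, Internals and put_back are illustrative stand-ins for the real internals.

```rust
use std::collections::VecDeque;
use tokio::sync::oneshot;

struct Conn; // stands in for a live database connection

// Simplified stand-in for the pool internals: a queue of waiters and a
// counter of open connections.
struct Internals {
    conn_requests: VecDeque<oneshot::Sender<Conn>>,
    num_open: u64, // not decremented on the path below
}

// Roughly the shape of the hand-off when a connection comes back to the pool:
// pop waiters and try to send the connection to one of them.
fn put_back(internals: &mut Internals, mut conn: Conn) {
    while let Some(waiter) = internals.conn_requests.pop_front() {
        match waiter.send(conn) {
            // A live Receiver got the connection; we are done.
            Ok(()) => return,
            // The Receiver was dropped (the web request was aborted).
            // `send` hands the value back, so the next waiter is tried.
            Err(returned) => conn = returned,
        }
    }
    // If every queued Receiver is gone and the connection is not pushed back
    // onto a free list here, it is simply dropped -- while num_open still
    // counts it, so the pool believes it owns connections it can never
    // hand out again.
    drop(conn);
}
```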
I have an idea to solve this, but it would involve replacing the channels with a Semaphore. This would be similar to how deadpool works: https://github.com/bikeshedder/deadpool/ I've tested it and it works with Mobc, but it would be quite a large change.
The reason the move to a Semaphore would be better is that when a connection is returned to the pool there is no chance of it being dropped; it is always added to the list of free_conns. The oneshot channels are replaced by waiting for a permit on the semaphore, so if a request is cancelled, another request can grab the connection and there is no chance of it being accidentally lost.
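A minimal sketch of that idea, assuming a pre-filled pool (the real change would also need to cover lazy connection creation and the max_open accounting):

```rust
use std::sync::Mutex;
use tokio::sync::Semaphore;

struct Conn; // stands in for a live database connection

// Sketch of the semaphore-based design (similar in spirit to deadpool):
// permits gate access to the pool, and returned connections always land in
// `free_conns` instead of being pushed into a waiter's channel.
struct SemaphorePool {
    permits: Semaphore,
    free_conns: Mutex<Vec<Conn>>,
}

impl SemaphorePool {
    fn new(conns: Vec<Conn>) -> Self {
        // Invariant: number of available permits == number of free connections.
        SemaphorePool {
            permits: Semaphore::new(conns.len()),
            free_conns: Mutex::new(conns),
        }
    }

    async fn get(&self) -> Conn {
        // If the caller is cancelled while waiting here, no connection has
        // been handed to it yet, so nothing can be lost.
        let permit = self.permits.acquire().await.expect("semaphore closed");
        permit.forget(); // the permit is given back in `put`
        self.free_conns
            .lock()
            .unwrap()
            .pop()
            .expect("a permit guarantees a free connection")
    }

    fn put(&self, conn: Conn) {
        // The connection always goes back onto the free list; only then is a
        // permit released so exactly one waiter can claim it.
        self.free_conns.lock().unwrap().push(conn);
        self.permits.add_permits(1);
    }
}
```

Because a permit is only released after the connection is back on the free list, a cancelled waiter simply never consumes a permit and no connection can leak.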
To conclude, in case someone else comes across this: we are hosting a forked Mobc with fixes for this issue over here https://github.com/prisma/mobc
Hello,
We are experiencing exactly the same issue with the Prisma connection pool.
The application is a backend API developed with NestJS.
Could someone please explain how to implement this fix with mobc?
Thanks.
@w8ze-devel can you open a ticket on the Prisma repository to track this?
The latest 0.8.1 release fixes this.