Consider releasing connections bound to a feed
Since the v4 protocol, the main reason to force a feed to be the only query on a connection is gone. It should be relatively safe[1] to run a query and a feed on the same connection
We could add an option to release the connection bound to a feed.
[1] benchmark needed
I'm having significant problems due to "too many connections" and changefeeds with my production application (https://cloud.sagemath.com). I would thus be willing to test anything along these lines that gets done, or maybe even write something. My current plan has been to either modify rethinkdbdash to re-use connections (for multiple queries), or to not use rethinkdbdash, or to change my application to reduce the number of changefeeds I use. Anyways, I'm willing to stress test things.
@williamstein -- I would have a few questions for you:
- Do you hit a limit of the number of maximum files open? What the issues you have with too many connections?
- How often do you open a feed? How often do you close them? (or did you actually close them)?
So I can think of two ways to release the connections:
- Just release the connection when we get a feed. It's pretty easy to do, we just need to not release twice the connection for a given feed - but that's easy.
- We can release the connection and put at the bottom of the stack. This would be slightly better I think in the case where you have you open a feed, get the feed, open a new feed, get the feed etc. The first solution in this case would clog the connection with too many feeds. The problem with that solution is that we should replace the double queue with a linked list+hash to be able to efficiently timeout/unshifrt/pop connection.
One more solution for your problem @williamstein maybe would be to provide another pool for feeds where connection will be reused? Or keep X connections per pool reserved for feeds?
Actually a simple linked list should do the trick.
I finished replacing the circular buffer with a linked list.
Tests are passing with and without the new argument.
Install (branch linkedList)
npm install https://github.com/neumino/rethinkdbdash#ff01120c2ab588c95239d992b5b05235764f50d6
Use
var r = require('rethinkdbdash')({releaseFeed: true});
Under the hood
The connections bound to a feed will be immediately released and unshifted. Meaning that in case you open only feeds, your feeds are round-robined over buffer connections. If you have a more complicated workload, it depends :)
It's not released yet, I want to check a few things first (typically edge cases and stuff). But @williamstein (or anyone else) if you want to give it a go, feel free to.
It seems like an old issue. Is it resolved already (or planned to be)? I'm encountering a similar issue (might be my own leaky code that does not close some cursors but still)
Hum, I created a branch a long time ago and it was working, but there wasn't a strong need for this, and I think I eventually did not merge it (it's still in the linkedlist branch though)
Hi,
Thanks for the response. I've found the leak in my code causing the issue but I still have about 5 change-feeds per connected user (notifications from various tables) and that puts a limit on the number of concurrent users that can connect (I could just listen to a changefeed on the whole table instead of just the the keys between the ones relevant to the connected users but it seems a waste). In addition there is no exception/timeout when there is no connection available, it just silently stuck forever...
Does the linkedlist branch work with the latest rethinkdb? what features are missing?
One other thing, when I'm looking at the r.getPoolMaster().getAvailableLength() and it's constantly growing. Does it mean that it doesn't reuse idle connections and constantly creates new ones?
I've rebased the linkedList branch on top of the latest master in #374, and all tests (apart from two backtrace formatting changes) are passing.