rethinkdbdash icon indicating copy to clipboard operation
rethinkdbdash copied to clipboard

Consider releasing connections bound to a feed

Open neumino opened this issue 10 years ago • 8 comments

Since the v4 protocol, the main reason to force a feed to be the only query on a connection is gone. It should be relatively safe[1] to run a query and a feed on the same connection

We could add an option to release the connection bound to a feed.

[1] benchmark needed

neumino avatar Aug 17 '15 05:08 neumino

I'm having significant problems due to "too many connections" and changefeeds with my production application (https://cloud.sagemath.com). I would thus be willing to test anything along these lines that gets done, or maybe even write something. My current plan has been to either modify rethinkdbdash to re-use connections (for multiple queries), or to not use rethinkdbdash, or to change my application to reduce the number of changefeeds I use. Anyways, I'm willing to stress test things.

williamstein avatar Aug 18 '15 17:08 williamstein

@williamstein -- I would have a few questions for you:

  • Do you hit a limit of the number of maximum files open? What the issues you have with too many connections?
  • How often do you open a feed? How often do you close them? (or did you actually close them)?

So I can think of two ways to release the connections:

  • Just release the connection when we get a feed. It's pretty easy to do, we just need to not release twice the connection for a given feed - but that's easy.
  • We can release the connection and put at the bottom of the stack. This would be slightly better I think in the case where you have you open a feed, get the feed, open a new feed, get the feed etc. The first solution in this case would clog the connection with too many feeds. The problem with that solution is that we should replace the double queue with a linked list+hash to be able to efficiently timeout/unshifrt/pop connection.

One more solution for your problem @williamstein maybe would be to provide another pool for feeds where connection will be reused? Or keep X connections per pool reserved for feeds?

neumino avatar Aug 19 '15 04:08 neumino

Actually a simple linked list should do the trick.

neumino avatar Aug 19 '15 04:08 neumino

I finished replacing the circular buffer with a linked list.

Tests are passing with and without the new argument.

Install (branch linkedList)

npm install https://github.com/neumino/rethinkdbdash#ff01120c2ab588c95239d992b5b05235764f50d6

Use

var r = require('rethinkdbdash')({releaseFeed: true});

Under the hood The connections bound to a feed will be immediately released and unshifted. Meaning that in case you open only feeds, your feeds are round-robined over buffer connections. If you have a more complicated workload, it depends :)

It's not released yet, I want to check a few things first (typically edge cases and stuff). But @williamstein (or anyone else) if you want to give it a go, feel free to.

neumino avatar Aug 20 '15 03:08 neumino

It seems like an old issue. Is it resolved already (or planned to be)? I'm encountering a similar issue (might be my own leaky code that does not close some cursors but still)

ronzeidman avatar Jan 23 '17 13:01 ronzeidman

Hum, I created a branch a long time ago and it was working, but there wasn't a strong need for this, and I think I eventually did not merge it (it's still in the linkedlist branch though)

neumino avatar Jan 30 '17 01:01 neumino

Hi, Thanks for the response. I've found the leak in my code causing the issue but I still have about 5 change-feeds per connected user (notifications from various tables) and that puts a limit on the number of concurrent users that can connect (I could just listen to a changefeed on the whole table instead of just the the keys between the ones relevant to the connected users but it seems a waste). In addition there is no exception/timeout when there is no connection available, it just silently stuck forever... Does the linkedlist branch work with the latest rethinkdb? what features are missing? One other thing, when I'm looking at the r.getPoolMaster().getAvailableLength() and it's constantly growing. Does it mean that it doesn't reuse idle connections and constantly creates new ones?

ronzeidman avatar Jan 30 '17 07:01 ronzeidman

I've rebased the linkedList branch on top of the latest master in #374, and all tests (apart from two backtrace formatting changes) are passing.

mxstbr avatar Mar 12 '18 14:03 mxstbr