Starting members not actually limited to batch size of init_count
This is not a bug per se, just something I wanted to bring to your attention -- on the off chance it's unintentional or has potential to cause anybody any grief.
Pooler grows the member pool by batches of `init_count`, but that is not strictly enforced. I don't necessarily think it should be enforced, but some interesting behavior might arise in certain cases because it's not.
What I mean by "not strictly enforced" is here: https://github.com/seth/pooler/blob/master/src/pooler.erl#L560
Here's a sample case to illustrate the point...all members are currently in use, but there is room for growth:

```
init_count       = 10
max_count        = 100
free_pids        = []
in_use_count     = 10
starting_members = []
```
Somebody tries to take a member. That results in `add_members_async(10, ...)` starting up another batch of 10, and it returns `error_no_members` to the caller. All good so far, and now: `starting_members = [...10 elements...]`
Immediately thereafter (prior to any of the new batch being fully started/added/accepted), another caller tries to take a member. This gets evaluated:
`NumToAdd = max(min(InitCount - NonStaleStartingMemberCount, NumCanAdd), 1)`

...which equates to `max(min(10 - 10, 80), 1)` (here `NumCanAdd` is `100 - 10 - 10 = 80`), so `NumToAdd = 1`. That results in `add_members_async(1, ...)` starting up another single member, and it returns `error_no_members` to the caller.
Now we have 11 members being started. This may not be a problem per se, but if (a) it takes long enough for new members to start, and (b) consumers are trying to take a member at a fast enough rate, it can result in a number of starting members that grows quite large. The worst case scenario (I believe) is that you can have up to `max_count - init_count` members starting at once.
(a) and (b) together are the perfect storm that can potentially overload things.
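To make that concrete, here's a minimal sketch (my own illustration, not pooler's actual code) that replays the expression above over a series of take attempts, all arriving while the pool is exhausted and nothing has finished starting. It assumes no starting member goes stale, so `NonStaleStartingMemberCount` is just the length of `starting_members`:

```erlang
%% Minimal sketch, not pooler's actual code: simulate the NumToAdd
%% expression over successive take attempts while the pool is
%% exhausted and no pending member has finished starting yet.
-module(starting_growth_sketch).
-export([simulate/3]).

%% InitCount/MaxCount mirror the pool config; Takes is the number of
%% take attempts. Returns the final count of "starting" members.
simulate(InitCount, MaxCount, Takes) ->
    loop(InitCount, MaxCount, Takes, InitCount, 0).

loop(_InitCount, _MaxCount, 0, _InUse, Starting) ->
    Starting;
loop(InitCount, MaxCount, Takes, InUse, Starting) ->
    NumCanAdd = MaxCount - (InUse + Starting),
    case NumCanAdd =< 0 of
        true ->
            Starting;  %% at max_count; nothing more can be started
        false ->
            %% Same shape as the expression quoted above, assuming no
            %% stale starting members.
            NumToAdd = max(min(InitCount - Starting, NumCanAdd), 1),
            loop(InitCount, MaxCount, Takes - 1, InUse, Starting + NumToAdd)
    end.
```

With the sample config above, `starting_growth_sketch:simulate(10, 100, 5)` returns 14 (one batch of 10, then +1 per take), and given enough takes it climbs all the way to 90, i.e. `max_count - init_count`.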
This may be a non-issue, but I wanted to bring it to your attention. If this ever bites anybody, a fix would be simple...just placing a configurable hard limit on it, i.e. `max_starting_count`. It could default to something like `max(init_count, 1)` out of the box, and users could raise/lower it as they see fit.
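In rough terms, the capped calculation could look something like this (a hypothetical sketch of the idea, not a patch against the actual code; note that the floor has to drop from 1 to 0, or the cap would never actually bite):

```erlang
%% Hypothetical sketch of the proposed cap: max_starting_count is not
%% an existing pooler option, and this is not a patch against pooler.erl.
-module(capped_growth_sketch).
-export([num_to_add/4]).

%% StartingCount: current number of non-stale starting members.
%% MaxStarting: the proposed max_starting_count setting.
num_to_add(InitCount, StartingCount, NumCanAdd, MaxStarting) ->
    Headroom = MaxStarting - StartingCount,
    %% Floor is 0 rather than 1: once the cap is reached we start
    %% nothing, and the caller just gets error_no_members as today.
    max(min(InitCount - StartingCount, min(NumCanAdd, Headroom)), 0).
```

With the earlier scenario (`InitCount = 10`, `StartingCount = 10`, `NumCanAdd = 80`) and `MaxStarting = 10`, this returns 0 instead of 1, so the second caller no longer triggers an extra start.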
What do you think?
Hi there,
I've only had a chance to read your report quickly, but it looks to me like you've found a bug and something that could be a problem. Perhaps it would be worth adding the config you suggest.
@oferrigni thoughts?
@seth @oferrigni FWIW, I've been seeing this "member leakage" behavior in our production environment. We have a setup where `init_count` is 1000, and we can see our pool grow like:
- 1000
- 2001
- 3003
- 4008
- 5017
The overflow is the number of consumers trying to get a member while `pooler` is off starting up a new batch of members.
If you're not opposed to my suggestion above, I'll submit a PR...