
Connection pool configuration contains port and ignores supplied port from host

Open cporter71 opened this issue 10 years ago • 9 comments

It is not clear whether this is a misunderstanding on my end or intended behavior, but I see that your connection pool configuration includes a host port. That port is then used to overwrite the port on a supplied host:

From ConnectionPoolImpl.java:

```java
public boolean addHost(Host host, boolean refreshLoadBalancer) {
    host.setPort(cpConfiguration.getPort());
    // ...
```

In our deployed environment we would have N Redis instances on a single host. Following the documentation, we would then set up N nodes (dynomite + redis) on that host, which means N distinct ports.

Please let me know what I am missing.
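To make the overwrite concrete, here is a minimal reproduction sketch. These are simplified stand-in types, not the actual dyno classes:

```java
// Stand-in for the pool configuration holding a single port.
final class ConnectionPoolConfiguration {
    private final int port;
    ConnectionPoolConfiguration(int port) { this.port = port; }
    int getPort() { return port; }
}

// Stand-in for the Host object with a mutable port.
class Host {
    private final String hostname;
    private int port;
    Host(String hostname, int port) { this.hostname = hostname; this.port = port; }
    void setPort(int port) { this.port = port; }
    int getPort() { return port; }
    String getHostname() { return hostname; }
}

final class ConnectionPoolImpl {
    private final ConnectionPoolConfiguration cpConfiguration;
    ConnectionPoolImpl(ConnectionPoolConfiguration cpConfiguration) {
        this.cpConfiguration = cpConfiguration;
    }
    boolean addHost(Host host) {
        // The port supplied with the host is silently replaced by the
        // single port configured on the pool.
        host.setPort(cpConfiguration.getPort());
        return true;
    }
}
```

With N services on one host, every supplied port collapses to the one configured port.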

cporter71 avatar Nov 12 '14 00:11 cporter71

The problem is most notable in Host's hashCode/equals, where port is not considered. This makes uses of Set&lt;Host&gt; elsewhere in the code leak hosts. Since Host has a java.net.InetSocketAddress member, perhaps leverage its hashCode/equals.
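A minimal sketch of that suggestion, assuming a simplified Host (illustrative names, not the actual dyno class): identity delegates to the InetSocketAddress member, so port participates in hashCode/equals and Set&lt;Host&gt; can distinguish two services on the same hostname.

```java
import java.net.InetSocketAddress;

// Hypothetical simplified Host whose identity is its socket address.
final class Host {
    private final InetSocketAddress socketAddress;

    Host(String hostname, int port) {
        // createUnresolved avoids a DNS lookup; equals/hashCode then
        // compare hostname and port.
        this.socketAddress = InetSocketAddress.createUnresolved(hostname, port);
    }

    int getPort() { return socketAddress.getPort(); }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Host)) return false;
        return socketAddress.equals(((Host) o).socketAddress);
    }

    @Override
    public int hashCode() { return socketAddress.hashCode(); }
}
```

With this, adding the same hostname on two different ports to a set yields two entries instead of one clobbering the other.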

cporter71 avatar Nov 30 '14 18:11 cporter71

I faced a similar problem while testing failover, where Dyno should get a connection to another node host I supplied. However, the node host's port is always overridden by the ConnectionPoolConfiguration default port, making Dyno connect to the wrong host. I would appreciate it if the intention behind this could be explained.

ckiatng avatar Dec 10 '14 06:12 ckiatng

@ckiatng -- If interested, it was a fairly straightforward thing to remedy. This commit addressed the issue and a few other items I encountered during initial HA testing: https://github.com/cporter71/dyno/commit/e479872736886a9625d18d00c183d7e4fc9f8220

cporter71 avatar Dec 12 '14 01:12 cporter71

Hey folks, apologies for not getting to this earlier. Yeah, there isn't a very good reason for having the port in two different places. If I recall correctly, port was baked into the Host object from the beginning, and then I realized that port should actually be a configuration option for the connection pool, just like other properties such as maxConnsPerHost. In that sense it is consistent to have it in ConnectionPoolConfiguration.

ConnectionPoolConfiguration should be the single source of truth when it comes to determining what port to connect to when talking to Dynomite. You should not have to keep your host supplier in sync with the connection pool config object. That is why I had that hacky line of code that copies the config port onto the host object.

I'm going to see how much code breaks when I remove the port from the Host object. I will then send in my PR and merge it, if all tests look good after that.

And yes, as for using Host objects in sets: if you have multiple Dynomite services on the same host (something we never do at Netflix), then I believe you should use multiple Dyno connection pools to talk to the multiple services. That way there shouldn't be any object leaks, since each pool is dedicated.
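The one-pool-per-service idea can be sketched like this. ServicePool and PoolRegistry are placeholder types for illustration, not the actual dyno connection pool API:

```java
import java.util.HashMap;
import java.util.Map;

// Placeholder for a dedicated connection pool bound to one service.
final class ServicePool {
    final String host;
    final int port;
    ServicePool(String host, int port) { this.host = host; this.port = port; }
}

// One dedicated pool per Dynomite service port on the same physical
// host, so hosts from different services never mix inside one pool's
// Set<Host>.
final class PoolRegistry {
    private final Map<Integer, ServicePool> poolsByPort = new HashMap<>();

    ServicePool poolFor(String host, int port) {
        return poolsByPort.computeIfAbsent(port, p -> new ServicePool(host, p));
    }
}
```

Each pool then has a single unambiguous port, which is the invariant the ConnectionPoolConfiguration design assumes.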

I'll start by deleting the port field from the Host object. Since Host is a top-level object, this may break code for anyone creating their own host supplier.

Will keep you folks posted on this.

opuneet avatar Dec 15 '14 20:12 opuneet

Thanks opuneet -

In terms of modeling, I would argue that you do NOT want the port in the connection pool configuration, but rather with the host. Would you add the host name to the connection pool configuration? No, since then you would have to create N (one per host) ConnectionPoolConfiguration objects. Adding the port to ConnectionPoolConfiguration forces the same problem on any use case where a single host runs more than one redis/dynomite instance.

From a modeling perspective, "port" relates to the host, not to the runtime behavior of connection management. Taking the port from the Host object works for the Netflix use case while also providing flexibility for use cases outside the 1-to-1 node-to-host model, and for that reason it feels like the right direction.

Cheers.

cporter71 avatar Jan 12 '15 19:01 cporter71

I've recently run into this as well. I've mostly gotten around it by subclassing the Host object to ignore the port overwrite done by the connection pool.
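A minimal sketch of that workaround, with a simplified stand-in Host rather than the real dyno class: the subclass turns setPort() into a no-op, so the pool's overwrite leaves the supplied port intact.

```java
// Simplified stand-in for the dyno Host class.
class Host {
    private int port;
    Host(int port) { this.port = port; }
    public void setPort(int port) { this.port = port; }
    public int getPort() { return port; }
}

// Subclass used as the workaround: ignore the pool's port overwrite.
class FixedPortHost extends Host {
    FixedPortHost(int port) { super(port); }

    @Override
    public void setPort(int port) {
        // Deliberately a no-op: keep the port the host was created with.
    }
}
```

The supplied FixedPortHost keeps its original port even after the pool calls setPort() on it.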

For me, instead of a bunch of smaller AWS instances, we use some rather large boxes (128 GB / 48 cores). Redis and Dynomite seem to work better with many instances per box, and it's easier to configure this with different ports than to beg for more IPs on the LAN.

So I'd most definitely vote to keep the port configuration at the host level. The dynomite server instance doesn't seem to have any issues with this.

alosix avatar Aug 13 '16 15:08 alosix

@alosix What are the advantages you have seen with running multiple Dynomite/Redis in the same bigger instance? If the instance goes down, you need to warm up two token ranges.

ipapapa avatar Aug 16 '16 17:08 ipapapa

@ipapapa For me it's about using the hardware at hand more efficiently for processing higher loads, rather than for availability. This is being used for some lambda counting in conjunction with Twitter's Summingbird, so the data it creates is ephemeral and used for about two hours until the map-reduce side dumps the finalized counts into Cassandra for longer-term storage.

alosix avatar Aug 16 '16 17:08 alosix

@alosix Sounds like a very interesting use case. Thank you for the information.

ipapapa avatar Aug 16 '16 19:08 ipapapa