
A Pool<T> for expensive objects

Open gissuebot opened this issue 11 years ago • 38 comments

Original issue created by kohanyi.robert on 2011-08-10 at 06:03 AM


I have a use case where I pool Sockets (SocketChannels) and ByteBuffers, and I've managed to hack up a simple Pool / BlockingPool interface for myself, with an implementation which uses Suppliers to supply objects for a pool on demand. I'd like to propose adding something similar to Guava.

Basically the interfaces look like this: http://pastebin.com/QJ6v7MxD

When poll() / take() is called the Pool would use a Supplier to create a new object for the pool. A Pool could be initialized like this:

Pools.<O>create(int capacity, Supplier<O> supplier);

supplier here could create a new object every time its get() method is called (in my case it creates a new unconnected SocketChannel).
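A minimal sketch of the idea described above, not the pastebin implementation: a bounded pool that creates objects lazily through a Supplier. The class name SimplePool and the methods poll(), take() and offer() are assumptions made for illustration, and java.util.function.Supplier stands in for Guava's Supplier so the example has no dependencies.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch: a bounded pool that creates objects on demand. */
final class SimplePool<T> {
  private final BlockingQueue<T> idle;      // objects currently available
  private final Supplier<T> supplier;       // creates new objects on demand
  private final int capacity;               // upper bound on objects ever created
  private final AtomicInteger created = new AtomicInteger();

  SimplePool(int capacity, Supplier<T> supplier) {
    this.capacity = capacity;
    this.supplier = supplier;
    this.idle = new ArrayBlockingQueue<>(capacity);
  }

  /** Returns an idle object, or a new one if under capacity, or null. */
  T poll() {
    T object = idle.poll();
    return (object != null) ? object : tryCreate();
  }

  /** Like poll(), but waits for a returned object once capacity is reached. */
  T take() throws InterruptedException {
    T object = poll();
    return (object != null) ? object : idle.take();
  }

  /** Hands an object back to the pool; false if the idle queue is full. */
  boolean offer(T object) {
    return idle.offer(object);
  }

  private T tryCreate() {
    while (true) {
      int n = created.get();
      if (n >= capacity) {
        return null;
      }
      if (created.compareAndSet(n, n + 1)) {
        try {
          return supplier.get();
        } catch (RuntimeException e) {
          created.decrementAndGet(); // creation failed, free the slot again
          throw e;
        }
      }
    }
  }
}
```

With a capacity of 2, the first two poll() calls each invoke the supplier, a third returns null, and an object handed back through offer() is reused by the next poll().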

As I searched for pooling on SO the only thing I learned was that nobody likes to use commons-pool, so I thought maybe Guava would extend its rich set of features to cover this, as everybody seems to like Guava. So, is this idea any good? (Btw. here: http://pastebin.com/fwYtKNg2 is my implementation, not tested extensively.)

gissuebot avatar Oct 31 '14 17:10 gissuebot

Original comment posted by [email protected] on 2011-08-29 at 06:31 PM


I believe pooling to be a bona fide Hard Problem. But we have some initial thoughts about it which we may get around to posting.

The API should be the easier part. There are two basic approaches,

passive (note, simpler in JDK 7):

  Pool<Expensive> pool = ...
  Lease<Expensive> lease = pool.leaseObject();
  try {
    Expensive o = lease.leasedObject();
    doSomethingWith(o);
    ...
  } finally {
    lease.close();
  }

or active:

  Pool<Expensive> pool = ...
  pool.nameThisMethod(
      new Receiver<Expensive>() {
        public void receive(Expensive o) {
          doSomethingWith(o);
        }
      });

And yes, it'd need something like a Supplier to generate instances as needed. There are endless possibilities for how something like this might need to be configured, though.
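The two styles can coexist on one pool. The sketch below is only an illustration under assumptions: Lease, leaseObject(), leasedObject() and Receiver follow the snippets above, while LeasingPool, withLeased() (standing in for nameThisMethod) and idleCount() are hypothetical names. Lease extends AutoCloseable here, which requires JDK 7 or later, and the code also uses Java 8 features.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Callback used by the active style. */
interface Receiver<T> {
  void receive(T object);
}

/** Handle used by the passive style; close() returns the object to the pool. */
interface Lease<T> extends AutoCloseable {
  T leasedObject();

  @Override
  void close(); // narrowed: throws no checked exception
}

/** Hypothetical sketch of an unbounded pool offering both styles. */
final class LeasingPool<T> {
  private final Deque<T> idle = new ArrayDeque<>();
  private final Supplier<T> supplier;

  LeasingPool(Supplier<T> supplier) {
    this.supplier = supplier;
  }

  /** Passive style: the caller owns the lease and must close it. */
  synchronized Lease<T> leaseObject() {
    final T object = idle.isEmpty() ? supplier.get() : idle.pop();
    return new Lease<T>() {
      private boolean closed;

      @Override
      public T leasedObject() {
        return object;
      }

      @Override
      public void close() {
        synchronized (LeasingPool.this) {
          if (!closed) { // closing twice must not return the object twice
            closed = true;
            idle.push(object);
          }
        }
      }
    };
  }

  /** Active style: the pool manages the lease around the callback. */
  void withLeased(Receiver<T> receiver) {
    try (Lease<T> lease = leaseObject()) {
      receiver.receive(lease.leasedObject());
    }
  }

  synchronized int idleCount() {
    return idle.size();
  }
}
```

In this sketch the active method is a thin wrapper over the passive one, so supporting both costs one extra method and one extra interface.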


Status: Accepted Labels: Type-Enhancement

gissuebot avatar Oct 31 '14 19:10 gissuebot

Original comment posted by [email protected] on 2011-12-10 at 04:14 PM


(No comment entered for this change.)


Labels: Package-Concurrent

gissuebot avatar Oct 31 '14 20:10 gissuebot

Original comment posted by [email protected] on 2012-03-20 at 02:38 AM


Any plans or progress for the timeline on this feature?

I am desperately in need of this in Guava, as I am currently using commons-pool, which performs really badly.

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by [email protected] on 2012-03-20 at 02:27 PM


We have no plans to work on this at this time. Sorry. :-(

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by [email protected] on 2012-03-30 at 03:45 PM


Here's a question: there's an occasional use case that is similar to a Pool in some ways -- you want pre-created instances on hand ready to hand out, if you dip below a certain # in stock you want to create a new batch of them, etc. -- but the difference is that users only take() the objects, they don't lease() and then return them.

At a glance, that's not a "pool" at all, but something like a Supply<T> or Stockpile<T>. But if the only difference between them is whether you call take() or lease(), it doesn't necessarily seem worth forcing them into separate utilities. Any ideas?

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by kohanyi.robert on 2012-03-30 at 04:02 PM


I think that what you've described is just a special kind of pool, which could throw when a client returns a leased object.

Pools that "replenish" their resources if their number drops below a limit are special pools too, which wouldn't throw if clients try to return taken objects. Instead they would reuse the returned resources (or destroy them when they're full).

So, if the actual question was "Should there be separate Pool and Stockpile interfaces?" then my opinion is that a single Pool interface could suffice.

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by wasserman.louis on 2012-04-10 at 10:40 PM


I am interested in pursuing this project for srs bsns.

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by [email protected] on 2012-05-11 at 07:45 AM


I just encountered a use case for this with Selenium. I want to have a pool of Selenium drivers to take from. I'd say +1 for the ACTIVE approach; I'm going to be implementing something akin to that internally.

Edit/Re-post: I derp'd and +1'd the wrong approach.

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by [email protected] on 2012-05-11 at 02:28 PM


Some information about what I ran into when implementing this:

Pools should have a configurable object acquisition policy, be it thread blocking or simply spawning a new object when one becomes available or throwing an exception and perhaps more that I haven't thought of yet.

Pools should handle GC gracefully. In our implementation, while an object is leased the pool doesn't hold a reference to it. Should it be GC'd, we have a disposer strategy, akin to a removal listener from the caching API, which handles any odd cases.

My Pool interface right now only has:

  void lease(Receiver<T>)
  void empty()
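A configurable acquisition policy could be sketched as follows. This is an assumption-laden illustration, not the commenter's implementation: the enum WhenExhausted, its three constants, and the class PolicyPool are hypothetical, and java.util.function.Consumer stands in for the Receiver type. The GC handling mentioned above is not covered.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical policies for what to do when no idle object is available. */
enum WhenExhausted { BLOCK, GROW, FAIL }

/** Hypothetical sketch: an active-style pool with a pluggable exhaustion policy. */
final class PolicyPool<T> {
  private final BlockingQueue<T> idle = new LinkedBlockingQueue<>();
  private final Supplier<T> supplier;
  private final WhenExhausted policy;

  PolicyPool(int initialSize, Supplier<T> supplier, WhenExhausted policy) {
    this.supplier = supplier;
    this.policy = policy;
    for (int i = 0; i < initialSize; i++) {
      idle.add(supplier.get());
    }
  }

  /** Runs the receiver with a pooled object and always returns the object. */
  void lease(Consumer<T> receiver) {
    T object = acquire();
    try {
      receiver.accept(object);
    } finally {
      idle.add(object);
    }
  }

  private T acquire() {
    T object = idle.poll();
    if (object != null) {
      return object;
    }
    switch (policy) {
      case BLOCK: // wait until another thread returns an object
        try {
          return idle.take();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting", e);
        }
      case GROW: // create an extra object on the spot
        return supplier.get();
      default: // FAIL
        throw new NoSuchElementException("pool exhausted");
    }
  }

  int idleCount() {
    return idle.size();
  }
}
```

With GROW the pool never shrinks in this sketch, because every object created on exhaustion is added to the idle queue afterwards; a real pool would likely cap or expire them.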

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by ogregoire on 2012-05-11 at 03:30 PM


Unlike Emily, I prefer the passive approach, as it is closer to what we handle every day. Try-with-resources has become quite popular within our team; some even chase down the old try blocks to replace them with the new form. Encapsulation of bracket squares makes the code less readable and leads to the creation of unnecessary temporary objects and the management of final variables, which is quite noisy. The passive approach, on the contrary, uses a known code structure that is quite explicit and understandable to everyone, though the structure itself needs to be learned by the user of the API.

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by [email protected] on 2012-05-14 at 10:30 AM


I think that if the structure of active vs. passive needs to be learnt then there are more fundamental issues to address.

The active approach means that you have to add a lot of boilerplate to your code; if this is something you access regularly, the code can quickly become cluttered.

The management of final variables isn't hard and is only a concern for anonymous classes; it's entirely possible to do this without them.

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by [email protected] on 2012-05-30 at 07:43 PM


(No comment entered for this change.)


Labels: -Type-Enhancement, Type-Addition

gissuebot avatar Oct 31 '14 23:10 gissuebot

Original comment posted by ogregoire on 2012-07-11 at 01:28 PM


I'd like to withdraw my comment of May 11. We've been experimenting with embedded for the last month and we find the active method not as bad as we did. We still prefer the passive way, but we don't see any objections to the active one anymore.

gissuebot avatar Nov 01 '14 00:11 gissuebot

Original comment posted by ogregoire on 2013-01-23 at 10:24 AM


After using this kind of pattern for the last few months, we consider that the active way is useful when the code doesn't throw any checked exception and the passive way is good when the code does actually throw checked exception.

So what we do with our pool implementation is offer both ways. Some might say it's redundant, but it leaves programmers free to choose the best option when they are faced with various cases. All that for "only" one more method and one more interface.

gissuebot avatar Nov 01 '14 01:11 gissuebot

Original comment posted by [email protected] on 2013-04-28 at 03:49 PM


I am a big fan of the Guava libraries. Since no open source object pool implementation out there is to my liking, I have been working on implementing an object pool.

It is still in an experimental phase. It supports two APIs:

Template.doOnPooledObject(new ObjectPool.Handler<PooledObject, SomeException>() {
  @Override
  public void handle(PooledObject object) throws SomeException {
    object.doStuff();
  }
}, pool, IOException.class); // will throw an IOException

or a more "classic" approach:

PooledObject object = pool.borrowObject();
try {
  object.doStuff();
  pool.returnObject(object, null);
} catch (Exception e) {
  pool.returnObject(object, e);
  throw e;
}

A key feature of my API is to allow the pool user to provide exception feedback, so that the pool can retire defective objects.

If you want to take a look, maybe you will find some worthwhile ideas:
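The exception-feedback idea might look roughly like this. It is a sketch under assumptions and not the spf4j code: borrowObject() and returnObject(object, exception) mirror the calls in the snippet above, while FeedbackPool and retiredCount() are hypothetical, and a real pool would also close or dispose of the retired object.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical sketch: a pool that retires objects reported as defective. */
final class FeedbackPool<T> {
  private final Deque<T> idle = new ArrayDeque<>();
  private final Supplier<T> supplier;
  private int retired; // number of objects dropped because of a failure

  FeedbackPool(Supplier<T> supplier) {
    this.supplier = supplier;
  }

  synchronized T borrowObject() {
    return idle.isEmpty() ? supplier.get() : idle.pop();
  }

  /**
   * Returns an object to the pool. A null failure means the object is
   * healthy and can be reused; a non-null failure retires it instead.
   */
  synchronized void returnObject(T object, Exception failure) {
    if (failure == null) {
      idle.push(object);
    } else {
      retired++; // assumption: any reported exception marks the object as bad
    }
  }

  synchronized int retiredCount() {
    return retired;
  }
}
```

Treating every reported exception as fatal is the simplest choice; a pool could instead inspect the exception type and only retire on, say, I/O failures.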

http://code.google.com/p/spf4j/source

implementation is in org.spf4j.pool package.

let me know what you think.

cheers.

gissuebot avatar Nov 01 '14 01:11 gissuebot

Original comment posted by Ben.Manes on 2013-07-09 at 09:15 AM


I guess I'll throw my hat into the ring...

I was asked by the Cassandra folks if I could implement a class mixing CLHM (predecessor to Guava's Cache), an object pool, and a multimap. The use-case is for maintaining a bound (size, ttl, or idle) on the total number of SSTable random access file readers, with the ability to pool multiple readers per SSTable. As this could impact latencies, a goal was to make it highly concurrent.

The interface is classic and not very interesting.

Internally the resources are denormalized into a cache of synthetic keys to resources. A weak value cache of key to transfer queue acts as a view layer to manage the available resources of that category type. A transfer queue is used to provide a fast exchange between producers (release) and consumers (borrow), as elimination helps alleviate contention. The resource's synthetic key retains a hard reference to its queue, allowing unused queues to be aggressively garbage collected by weak references.

Each resource operates within a simple state machine: IDLE, IN_FLIGHT, RETIRED, and DEAD. The idle and in-flight states are self-explanatory, indicating only whether the resource is in the transfer queue. The retired state is entered when the cache evicts a resource that is currently being used, thereby requiring release() to transition it to the dead state. The lifecycle listeners allow the resource to be reset, closed, etc. as needed.

The time-to-idle is a bit naive, as I didn't want to complicate it early on. A secondary cache is used so that the idle time is counted as the time the resource is not in-flight. This could be optimized by using the lock amortization technique directly rather than banging against the hash table's locks. When the idle cache evicts, it transitions the resource to the retired state and invalidates it in the primary cache.

This was written over the July 4th holiday for a specific use case, so I am sure there's more that could be fleshed out. That also means that while it has unit tests, it has not been benchmarked.
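The state machine can be sketched with compare-and-set transitions. This is an illustration only, not the multiway-pool code: ResourceHandle and its method names are hypothetical, and the direct IDLE to DEAD transition on eviction of an unused resource is an assumption, since the comment only spells out the in-flight case.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of the per-resource lifecycle described above. */
final class ResourceHandle {
  enum State { IDLE, IN_FLIGHT, RETIRED, DEAD }

  private final AtomicReference<State> state = new AtomicReference<>(State.IDLE);

  /** Borrow: IDLE -> IN_FLIGHT. Returns false if the resource is not idle. */
  boolean tryBorrow() {
    return state.compareAndSet(State.IDLE, State.IN_FLIGHT);
  }

  /**
   * Eviction by the cache. An in-flight resource is only marked RETIRED,
   * because its borrower still holds it; an idle one goes straight to DEAD
   * (assumption, see lead-in).
   */
  void evict() {
    while (true) {
      State s = state.get();
      if (s == State.RETIRED || s == State.DEAD) {
        return; // already evicted
      }
      State next = (s == State.IN_FLIGHT) ? State.RETIRED : State.DEAD;
      if (state.compareAndSet(s, next)) {
        return;
      }
    }
  }

  /**
   * Release: IN_FLIGHT -> IDLE, or RETIRED -> DEAD if the resource was
   * evicted while in use. Returns true if the resource may be reused.
   */
  boolean release() {
    while (true) {
      State s = state.get();
      if (s == State.IN_FLIGHT) {
        if (state.compareAndSet(State.IN_FLIGHT, State.IDLE)) {
          return true;
        }
      } else if (s == State.RETIRED) {
        if (state.compareAndSet(State.RETIRED, State.DEAD)) {
          return false; // the caller should now close the resource
        }
      } else {
        throw new IllegalStateException("resource is not borrowed: " + s);
      }
    }
  }

  State state() {
    return state.get();
  }
}
```

A lifecycle listener would typically be invoked where release() returns false or evict() moves an idle resource to DEAD, so the underlying resource can be closed.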

https://github.com/ben-manes/multiway-pool

Cheers, Ben

gissuebot avatar Nov 01 '14 01:11 gissuebot

Original comment posted by Ben.Manes on 2013-08-10 at 06:19 AM


Switched to an elimination backoff stack. This is probably the best structure to design a pool around.

EBS - 28M/s
LTQ - 15M/s
CLQ - 16.5M/s
ABQ - 13.5M/s
LBQ - 9M/s

gissuebot avatar Nov 01 '14 01:11 gissuebot

Original comment posted by [email protected] on 2013-09-24 at 09:17 PM


Hi Ben, a while ago I wrote an implementation, global_pool <--> thread_local_pool, which in my opinion should perform better than EBS in high-load cases.

The implementation is at:

http://code.google.com/p/spf4j/source/browse/#svn%2Ftrunk%2Fspf4j-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fspf4j%2Fpool

I have a simple benchmark against apache commons pool (which is not hard to beat):

http://code.google.com/p/spf4j/source/browse/trunk/spf4j-core/src/test/java/org/spf4j/pool/impl/ObjectPoolVsApache.java

Could you run your test against this implementation to see how it performs against EBS?

gissuebot avatar Nov 01 '14 01:11 gissuebot

Original comment posted by Ben.Manes on 2013-09-25 at 02:00 AM


Probably what you implemented, and something I thought of after posting the above, is a global list of handles that a thread retains a thread-local reference to one of. That way a thread will likely claim and release the same resource without contending with another thread. The slight complexity is stealing idle resources when necessary. If that is what you implemented, I agree it should be fundamentally faster than an EBS.

gissuebot avatar Nov 01 '14 01:11 gissuebot

Original comment posted by [email protected] on 2013-09-26 at 11:29 PM


Yup, that is pretty much it. This implementation will "bias" the pooled objects to threads and will not steal an object from other threads if another object can be created or one is available in the "global" bag. Objects can also be unbiased from a thread by a "maintenance" thread.
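A much reduced sketch of thread biasing, assuming one thread-local slot per thread in front of a shared bag. It is not the spf4j implementation: BiasedPool, borrow() and release() are hypothetical names, there is no upper bound on created objects, and both stealing from other threads and the maintenance thread are left out.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

/** Hypothetical sketch: each thread keeps one object, the rest are shared. */
final class BiasedPool<T> {
  private final ConcurrentLinkedQueue<T> global = new ConcurrentLinkedQueue<>();
  private final ThreadLocal<T> local = new ThreadLocal<>();
  private final Supplier<T> supplier;

  BiasedPool(Supplier<T> supplier) {
    this.supplier = supplier;
  }

  /** Prefers the calling thread's own object, then the global bag, then creation. */
  T borrow() {
    T object = local.get();
    if (object != null) {
      local.remove(); // uncontended fast path: no shared state touched
      return object;
    }
    object = global.poll();
    return (object != null) ? object : supplier.get();
  }

  /** Keeps the object biased to this thread if its slot is free. */
  void release(T object) {
    if (local.get() == null) {
      local.set(object);
    } else {
      global.offer(object);
    }
  }
}
```

A thread that repeatedly borrows and releases a single object never touches the shared queue in this sketch, which is the effect the biasing is meant to achieve.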

gissuebot avatar Nov 01 '14 01:11 gissuebot

What about this issue? I recently stumbled upon this implementation by Louis Wasserman. Isn't this implementation a good one? I tested it with load and it performed okay. Some features are missing, naturally, but it's a great start, isn't it?

ogregoire avatar Mar 04 '15 16:03 ogregoire

I was not aware Louis had gotten that much of a start on it!

Note that for it to work most pleasantly, Lease should implement AutoCloseable, meaning the whole thing needs to live in a library that depends on JDK 7, which Guava currently doesn't....

kevinb9n avatar Mar 06 '15 23:03 kevinb9n

To be perfectly honest, I had completely forgotten about the existence of that.

lowasser avatar Mar 06 '15 23:03 lowasser

out of curiosity, given java 6 has been eol'ed for almost 3 years, at what point does guava move on?

jgangemi avatar Mar 06 '15 23:03 jgangemi

Its use on Android which hasn't even fully supported all of the Java 7 APIs will artificially limit it for years.

JakeWharton avatar Mar 06 '15 23:03 JakeWharton

Extended support for Java 6 ends in December 2018, according to Oracle.

ogregoire avatar Mar 06 '15 23:03 ogregoire

what about a 'guava-extended' library that could offer this type of functionality for those using a newer jdk?

jgangemi avatar Mar 06 '15 23:03 jgangemi

and that aside, i'd still like to see this as part of the library even if i have to call close myself.

jgangemi avatar Mar 06 '15 23:03 jgangemi

This discussion belongs elsewhere, but I believe the expected outcome is that a Guava for Java 8 is likely, but the difference between 6 and 7 is small enough that maintaining yet another fork is not obviously a worthwhile tradeoff.

lowasser avatar Mar 06 '15 23:03 lowasser

As far as this actual issue goes, I would expect significantly more work to be required before adding my implementation to Guava, with a specific emphasis on experimenting with replacing existing pool implementations with this interface and seeing what problems arise.

lowasser avatar Mar 07 '15 00:03 lowasser