appengine-java-vm-runtime
appengine-java-vm-runtime copied to clipboard
Thread intensive applications blocking on VmRequestThreadFactory
Some applications are contending on the mutex within VmRequestThreadFactory when creating lots of threads.
@ludoch @meltsufin I have committed an update to VmRequestThreadFactory to the async-support-331 branch.
Looking at the code, there was already protection for creating threads from the default instance, so the issue must be for a request thread factory.
This patch makes the collection a concurrent one, plus I have added some protection to ensure that the state change during interrupt is seen by all calls to newThread before interrupt iteration begins.
This should improve the issue somewhat, however creating >100 threads per request is not going to be a good scalable solution. A thread pool may be needed to further improve performance?
The patch contains some additional reformatting, which hopefully will make it merge nicely with the master branch.
https://github.com/GoogleCloudPlatform/appengine-java-vm-runtime/pull/332
I mentioned the change to Bin, in charge of building the custom memcache image... Stay tune.
On Wed, Nov 9, 2016 at 5:14 PM, Greg Wilkins [email protected] wrote:
#332 https://github.com/GoogleCloudPlatform/appengine-java-vm-runtime/pull/332
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GoogleCloudPlatform/appengine-java-vm-runtime/issues/331#issuecomment-259574903, or mute the thread https://github.com/notifications/unsubscribe-auth/AAE4zV4e54SKLMSD_MMpGAJABcS8JfmKks5q8m_pgaJpZM4KuIMM .
@ludoch Do you think we should merge this change in and let users try it out, or wait for Bin?
I would wait for Bin to produce a new custom image that he can give to the customer. I sent an internal email for this to happen. Thanks
On Nov 10, 2016 6:47 AM, "meltsufin" [email protected] wrote:
@ludoch https://github.com/ludoch Do you think we should merge this change in and let users try it out, or wait for Bin?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GoogleCloudPlatform/appengine-java-vm-runtime/issues/331#issuecomment-259708409, or mute the thread https://github.com/notifications/unsubscribe-auth/AAE4zUV1BKoM8qZ4hAUDLxcjPjuv_Fveks5q8y6CgaJpZM4KuIMM .
@meltsufin @ludoch I would also appreciate some extra eyeballs looking closely over the lock free strategy I've employed in the patch (@sbordet also). My experience with writing lock free code is that it is never as simple as it looks and there are often lurking gotchas.
The latest information on the bottle neck also indicates that this may not be just a straight forward case of 100 threads hammering on the same mutex.
So I think the image should be built and tested, but careful thought is needed before we merge this into a master/async branch. It may be that adding a thread pool could have greater benefit than removing the mutex.
@gregw Agreed. This kind of code is always error prone and should be carefully reviewed. I haven't had a chance to look at it in detail since being away this week. A thread pool sounds like a more effective solution right now anyway.