Enable asynchronous (async) Server-Tentacle communication (Halibut) to increase concurrent Server Tasks
Prerequisites
- [x] I have searched open and closed issues to make sure it isn't already requested
- [x] My team has started working on this issue
- [x] I have written a descriptive issue title
The enhancement
Communicating with many Tentacles at once is prone to locking up Octopus Server. This enhancement aims to fix that by moving communication with Tentacles onto asynchronous code paths.
The Need
Communication with Tentacles is currently synchronous, which limits the number of Tentacles that Octopus Server can communicate with concurrently.
When Octopus Server communicates with many Tentacles at once, each synchronous communication path blocks the thread it is running on for as long as the communication is in progress.
These threads come from the worker thread pool, so the more Tentacles are being communicated with at the same time, the more thread pool threads are blocked.
With enough concurrent communication, all of the CLR's worker pool threads are consumed, leaving none available for other work (for example, handling UI requests or other deployments).
As a result, Octopus Server appears to be hung even though there is no CPU usage. This has been found to cause Deployment and Runbook Run failures.
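The starvation described above can be reproduced in miniature. The following is a minimal sketch in Python rather than Octopus Server or Halibut code: a small fixed-size `ThreadPoolExecutor` stands in for the CLR worker pool, `time.sleep` stands in for a synchronous network call to a Tentacle, and all names (`POOL_SIZE`, `blocking_tentacle_call`, `ui_delay_with_blocking_calls`) are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4        # assumption: a deliberately tiny stand-in for the worker pool
CALL_SECONDS = 0.2   # assumption: how long one simulated Tentacle call blocks


def blocking_tentacle_call(tentacle_id: int) -> int:
    """Simulate a synchronous call: the thread is held but uses no CPU."""
    time.sleep(CALL_SECONDS)
    return tentacle_id


def ui_delay_with_blocking_calls(concurrent_calls: int) -> float:
    """Return how long an unrelated 'UI request' waits for a free pool thread."""
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        for tentacle_id in range(concurrent_calls):
            pool.submit(blocking_tentacle_call, tentacle_id)
        queued_at = time.monotonic()
        # The "UI request" only records the moment it finally gets a thread.
        ui_request = pool.submit(time.monotonic)
        started_at = ui_request.result()
    return started_at - queued_at
```

Calling `ui_delay_with_blocking_calls(POOL_SIZE)` should return roughly `CALL_SECONDS`, because every pool thread is blocked when the UI request is queued, while calling it with a single Tentacle call should return almost immediately. The process is idle the whole time, which matches the "hung with no CPU usage" symptom.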
Solution
The solution is to switch to using asynchronous IO for all communications with Tentacles.
This means that while an operation is waiting on communication with a Tentacle, the thread pool thread it was running on is released and is free to perform other work.
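To illustrate the effect of asynchronous IO, here is a minimal sketch using Python's `asyncio` as an analogy; it is not the Halibut implementation. `asyncio.sleep` stands in for an asynchronous network wait, and the names `async_tentacle_call`, `ui_request` and `ui_delay_with_async_calls` are hypothetical.

```python
import asyncio
import time

CALL_SECONDS = 0.2  # assumption: how long one simulated Tentacle call waits


async def async_tentacle_call(tentacle_id: int) -> int:
    """Simulate an asynchronous call: the thread is released while waiting."""
    await asyncio.sleep(CALL_SECONDS)
    return tentacle_id


async def ui_request() -> float:
    """An unrelated piece of work that records when it actually ran."""
    return time.monotonic()


async def ui_delay_with_async_calls(concurrent_calls: int) -> float:
    """Return how long the 'UI request' waits while Tentacle calls are in flight."""
    calls = [
        asyncio.create_task(async_tentacle_call(tentacle_id))
        for tentacle_id in range(concurrent_calls)
    ]
    await asyncio.sleep(0)  # let every call start and reach its await point
    queued_at = time.monotonic()
    started_at = await asyncio.create_task(ui_request())
    await asyncio.gather(*calls)  # wait for the simulated Tentacle calls
    return started_at - queued_at
```

Running `asyncio.run(ui_delay_with_async_calls(1000))` should return a delay far smaller than `CALL_SECONDS`, even though a thousand simulated calls are pending on a single thread: waiting calls do not hold a thread, so other work is scheduled promptly.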
Impact on Responsiveness
Only a few users have reported problems caused by this issue. For those who need to communicate with a lot of Tentacles at once and need Octopus Server to stay responsive while that communication is happening, this will remove a significant bottleneck.