Make a polling task timeout configurable
An internal task poller currently hardcodes the polling task timeout to 70 seconds. However, it should be set to dynamicconfig.MatchingLongPollExpirationInterval plus some delta to account for the full round trip to matching. dynamicconfig.MatchingLongPollExpirationInterval (default: 60 seconds) is the maximum time the server waits for a new task before returning an empty response.
This change will make the polling task timeout configurable when the poller is created.
Can you share the use case for altering the polling timeout from the server-side default? If you are decreasing it, the client-side maximum should still be acceptable: it is only reached when an RPC hangs or fails, at which point waiting until 70s may be OK. If we do allow customizing this option, we have to do this for all SDKs, not just Go.
One reason to hesitate to make this customizable is that the server and proxies have expectations about this value. If it is improperly configured, it could break things in surprising ways for users, including non-obvious data loss, unless the user controls all endpoints and is very sure what they are doing. Also, the documentation refers to self-hosted dynamic config only, which doesn't make sense for the many cloud users.
Connectivity between Temporal workers and the Temporal cluster goes through a few additional layers. One of them, a gateway, is configured with a timeout lower than the default value of 70s. This results in a 504 Gateway Timeout error when there are no tasks in the queue. I believe deploying workers behind a gateway is a common pattern.
It is a common pattern but people usually set the gateway timeout to a high enough setting to support Temporal's long poll timeout. I believe the server-side setting is mostly available for internal tests. How low are you setting this value? Let me confer with the team to see if we want client-side long poll timeout to be customizable in all of our SDKs.
After discussion with the team, we do not want user-customizable client-side long polling timeout at this time. That it's customizable on the server is more of a testing feature. If we did support this as a full-featured capability, we likely wouldn't take this route but instead have the server inform the SDK of its setting. But this setting isn't really meant to be altered.
We recommend that proxies/gateways have at least a 70s timeout for calls (80s is better, just in case, so that the client's 70s is respected before the proxy's timeout is hit). This will enable them to work with all SDKs, past and present, and with CLIs.
Thanks @cretz for the clarification and for raising this concern with the team. I completely agree that we should prioritize platform consistency across SDKs and between self-hosted and cloud users. I'll discuss alternative solutions with the team internally. I'll close this PR as well.