nodejs-pubsub
nodejs-pubsub copied to clipboard
Unhandled promises in Topic.flush and PubSub.close prevent proper shutdown
Both methods Topic.flush and PubSub.close indicate that it is ok to call them with no parameter and await their completion. Unfortunately when passing no callback, the function internals don't properly await and instead have unhandled promises which can in certain cases lead to message not being sent correctly.
This could be a reason to the recently observed increase in Total timeout of API google.pubsub.v1.Publisher exceeded 60000 milliseconds before any response was received. errors on Google Cloud Functions and Cloud Run because those runtime instances assumingly switch to idle before all messages have been send.
Take a look at Topic.flush:
https://github.com/googleapis/nodejs-pubsub/blob/38fba8bb98636d21c1d2005be4352fb8190e1813/src/topic.ts#L193-L199
The call to this.publisher.flush(callback!); is not awaited nor returned so there is no way for the caller of Topic.flush to await the internal Promise completion without passing a callback.
Similar issue exists in PubSub.close.
After a little more inspection of the Topic.flush code path, I think the problem with unhandled promises is actually more wide-spread. Take a look at the following line of code from Publisher.flush:
https://github.com/googleapis/nodejs-pubsub/blob/38fba8bb98636d21c1d2005be4352fb8190e1813/src/publisher/index.ts#L110-L124
In line 113 the internal Queue.publish method is called to publish the remaining messages. The callback from line 110 isn't passed down into Queue.publish. Instead Promise.all is used to wait for the queue to publish. But the Queue.publish method itself is void; it is not returning a Promise that can be awaited.
https://github.com/googleapis/nodejs-pubsub/blob/38fba8bb98636d21c1d2005be4352fb8190e1813/src/publisher/message-queues.ts#L162-L173
With this, it seems currently impossible to reliably await the complete publication of all message with this library, no?
@cdupuis Thanks for the report. That does look wrong at first glance, but I wonder if this code was relying on promisify(), and I missed updating it or something.
I'm moving this to our internal backlog to look at along with other unhandled promise and graceful shutdown issues.
Re-opening this upon request from TSEs.
This got some changes from the exactly once delivery work, and it looks like the current unit tests should cover it, so I think this is finished. Please comment if you're still having the issue.