nodejs-pubsub icon indicating copy to clipboard operation
nodejs-pubsub copied to clipboard

Unhandled promises in Topic.flush and PubSub.close prevent proper shutdown

Open cdupuis opened this issue 3 years ago • 4 comments
trafficstars

Both methods Topic.flush and PubSub.close indicate that it is ok to call them with no parameter and await their completion. Unfortunately when passing no callback, the function internals don't properly await and instead have unhandled promises which can in certain cases lead to message not being sent correctly.

This could be a reason to the recently observed increase in Total timeout of API google.pubsub.v1.Publisher exceeded 60000 milliseconds before any response was received. errors on Google Cloud Functions and Cloud Run because those runtime instances assumingly switch to idle before all messages have been send.

Take a look at Topic.flush:

https://github.com/googleapis/nodejs-pubsub/blob/38fba8bb98636d21c1d2005be4352fb8190e1813/src/topic.ts#L193-L199

The call to this.publisher.flush(callback!); is not awaited nor returned so there is no way for the caller of Topic.flush to await the internal Promise completion without passing a callback.

Similar issue exists in PubSub.close.

cdupuis avatar Jan 15 '22 12:01 cdupuis

After a little more inspection of the Topic.flush code path, I think the problem with unhandled promises is actually more wide-spread. Take a look at the following line of code from Publisher.flush:

https://github.com/googleapis/nodejs-pubsub/blob/38fba8bb98636d21c1d2005be4352fb8190e1813/src/publisher/index.ts#L110-L124

In line 113 the internal Queue.publish method is called to publish the remaining messages. The callback from line 110 isn't passed down into Queue.publish. Instead Promise.all is used to wait for the queue to publish. But the Queue.publish method itself is void; it is not returning a Promise that can be awaited.

https://github.com/googleapis/nodejs-pubsub/blob/38fba8bb98636d21c1d2005be4352fb8190e1813/src/publisher/message-queues.ts#L162-L173

With this, it seems currently impossible to reliably await the complete publication of all message with this library, no?

cdupuis avatar Jan 15 '22 18:01 cdupuis

@cdupuis Thanks for the report. That does look wrong at first glance, but I wonder if this code was relying on promisify(), and I missed updating it or something.

feywind avatar Jan 17 '22 22:01 feywind

I'm moving this to our internal backlog to look at along with other unhandled promise and graceful shutdown issues.

feywind avatar Mar 24 '22 19:03 feywind

Re-opening this upon request from TSEs.

feywind avatar Mar 28 '22 14:03 feywind

This got some changes from the exactly once delivery work, and it looks like the current unit tests should cover it, so I think this is finished. Please comment if you're still having the issue.

feywind avatar Feb 08 '23 20:02 feywind