[Filebeat] gcp pubsub input stopped with error: stream terminated by RST_STREAM with error code: INTERNAL_ERROR

Open kwinstonix opened this issue 1 year ago • 3 comments

  • Version: 8.3.2
  • Operating System: Linux
  • Steps to Reproduce:

We run Filebeat with the gcp-pubsub input. After some days, Filebeat stopped pulling messages. Here is the error log:

"log.origin":{"file.name":"gcppubsub/input.go","file.line":143},"message":"rpc error: code = Internal desc = stream terminated by RST_STREAM with error code: INTERNAL_ERROR","service.name":"filebeat"
"log.origin":{"file.name":"gcppubsub/input.go","file.line":144},"message":"Pub/Sub input worker has stopped."

It looks like there is no retry mechanism for unexpected errors: once the worker hits one, it stops and is never restarted. We should make message pulling more resilient.

https://github.com/elastic/beats/blob/45f722f492dcf1d13698c6cf618b339b1d4907be/x-pack/filebeat/input/gcppubsub/input.go#L132-L148

kwinstonix avatar Jul 29 '22 08:07 kwinstonix

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

elasticmachine avatar Aug 03 '22 08:08 elasticmachine

The GCP Pub/Sub library is supposed to retry on its own. 😞 So it must be treating this as a non-retryable error. We considered adding our own retry in the past (https://github.com/elastic/beats/issues/29352), and now it looks like we need it.

andrewkroh avatar Aug 03 '22 15:08 andrewkroh

Yes, that is a non-retryable error, and the GCP SDK can't handle it for now. If we can restart the input worker so that Filebeat keeps pulling messages, that would improve Filebeat's stability.

kwinstonix avatar Aug 04 '22 02:08 kwinstonix