Add test for elasticsearch re-connection after network error & allow graceful shutdown

Open belimawr opened this issue 5 months ago • 16 comments

Proposed commit message

This commit reworks eslegclient.Connection to accept a context in its Connect method. The caller can now cancel any in-flight requests made by the connection by cancelling that context.

The libbeat outputs.Connectable interface (used by outputs.NetworkClient) had to be updated to accept the context, which required refactoring in most of the outputs to also accept a context on connect.

The worker in libbeat/publisher/pipeline/client_worker.go now uses a context for its cancellation instead of a channel. This context is also used when creating a connection to Elasticsearch.
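The pattern of replacing a done channel with a context can be sketched roughly as follows. `runWorker`, `runDemo`, and the string batches are hypothetical names for this example and do not mirror the real client_worker.go; the point is only that one context both stops the loop and is handed to whatever the worker calls.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// runWorker consumes batches until ctx is cancelled or the channel closes.
// The same ctx is passed to publish, so downstream calls can be
// cancelled together with the worker.
func runWorker(ctx context.Context, batches <-chan string, publish func(context.Context, string)) {
	for {
		select {
		case <-ctx.Done():
			return
		case b, ok := <-batches:
			if !ok {
				return
			}
			publish(ctx, b)
		}
	}
}

// runDemo sends one batch, cancels the worker, waits for it to exit,
// and returns how many batches were published.
func runDemo() int {
	ctx, cancel := context.WithCancel(context.Background())
	batches := make(chan string) // unbuffered: a send returns once received
	published := 0

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		runWorker(ctx, batches, func(context.Context, string) { published++ })
	}()

	batches <- "batch-1"
	cancel()  // replaces closing a dedicated done channel
	wg.Wait() // the worker has exited, so reading published is safe
	return published
}

func main() {
	fmt.Println("published batches:", runDemo())
}
```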

An integration test is added to ensure the ES output can always recover from network errors.

Checklist

[x] My code follows the style guidelines of this project
[x] I have commented my code, particularly in hard-to-understand areas
~~[ ] I have made corresponding changes to the documentation~~
~~[ ] I have made corresponding change to the default configuration files~~
[x] I have added tests that prove my fix is effective or that my feature works
[x] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

It's a bug fix; there is no disruptive user impact.

~~## Author's Checklist~~

How to test this PR locally

1. Build Filebeat.
2. Get it sending data to ES.
3. Disconnect from the network, stop ES, or do anything else that prevents Filebeat from reaching ES.
4. Wait for network error logs.
5. Restart ES or reconnect to the network.
6. Filebeat should recover and start sending data again.

Related issues

https://github.com/elastic/beats/issues/40705

~~## Use cases~~ ~~## Screenshots~~ ~~## Logs~~

Sep 12 '24 17:09 belimawr
