logstash-output-opensearch

[QUESTION] Clarification on Data Processing Order During Outages

Open nw-engineer opened this issue 5 months ago • 2 comments

Hello,

I hope this message finds you well. I am currently utilizing Logstash 8.4.3 with the OpenSearch output plugin for my data pipeline and have some questions regarding its behavior during network outages or service disruptions.

I understand that in the event of a connectivity issue to OpenSearch, the plugin is designed to retry indefinitely. While I recognize the importance of ensuring data delivery, I am concerned about the potential implications of this behavior during prolonged outages. Specifically, I am interested in understanding how Logstash prioritizes data processing and retry attempts under such circumstances.

Could you please clarify if, during a prolonged outage where Logstash cannot connect to OpenSearch, Logstash prioritizes retrying the buffered or queued data over processing new incoming data? In other words, does Logstash attempt to clear the backlog of retries before it starts processing newly received data once the connection is re-established?

This information is crucial for planning our data pipeline resilience and for understanding how Logstash would handle scenarios where the target index becomes unavailable (e.g., moved to UltraWarm in AWS Elasticsearch Service), which would further complicate the retry logic.

Thank you for your time and assistance in addressing this query. Your insights will be incredibly valuable for our ongoing and future implementations.

Best regards

nw-engineer, Feb 27 '24 10:02