data-prepper
[BUG] Empty DLQ Objects and DLQ objects with data even though data is loaded correctly
Describe the bug
- A pipeline with `dynamodb` as the source and an `OpenSearch Serverless` sink is creating empty dlqObjects: `{"dlqObjects":[]}`
- Non-empty dlqObjects are created even though the data is loaded into OpenSearch. Seeing messages like these:
"status":0,"message":"Number of retries reached the limit of max retries (configured value 10)
To Reproduce
Steps to reproduce the behavior:
- Define a pipeline with a dynamodb table as the source (ideally with at least 10M records)
- Define an OpenSearch serverless sink
- Define S3 bucket and prefix for dlq
- Run pipeline
- DLQ S3 bucket will have several empty S3 objects that are 17.0 bytes in size (`{"dlqObjects":[]}`)
- Some DLQ S3 objects have data, but those items are loaded in OpenSearch
Expected behavior
- No DLQ objects are created if the data has been loaded successfully.
- If data load is not successful and a DLQ S3 object is created, then `dlqObjects` should be populated with relevant data.
- If data is ingested in OpenSearch, a DLQ object with the id should not be created.
Additional context
- max_retries is set to 10
- Pipeline has min 1 OCU and max 20 OCU
- dynamodb table has ~100M records
- OpenSearch Serverless sink
@amitkirdatt, we are releasing Data Prepper 2.8.0 today with a fix that may resolve this. See #4301.