
Address Scale Items for lambda plugin

Open srikanthjg opened this issue 1 year ago • 0 comments

Description

Following are the changes made:

Address Scale issues:

  1. Added support for the Lambda async client in the lambda processor and sink.
    1.1. Lambda is called asynchronously at the batch level, i.e. multiple batches can be sent to Lambda at the same time. We wait for the futures only after the entire set of records received by the processor has been sent.
    1.2. Metrics and buffers are handled per batch, based on the processing of each future.
  2. Made the SDK timeout a user-configurable parameter.
  3. Added a codec for the request to and the response from Lambda. NOTE: the JSON codec is currently the default for both input and output, and the Lambda response codec always assumes the response is a JSON array.
  4. Removed payload_model. SINGLE_EVENT is no longer supported; only BATCH-based calls are supported, by default.
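The dispatch pattern in item 1 can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: `AsyncBatchSketch`, `partition`, `invokeAll`, and the `invoker` function are hypothetical names, and the invoker stands in for whatever async Lambda call the plugin actually makes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch: send every batch without blocking, then wait once.
class AsyncBatchSketch {

    // Split the records handed to the processor into batches of at most batchSize.
    static <T> List<List<T>> partition(List<T> records, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            int end = Math.min(i + batchSize, records.size());
            batches.add(new ArrayList<>(records.subList(i, end)));
        }
        return batches;
    }

    // invoker is an assumed stand-in for the async Lambda call (one future per batch).
    static <T, R> List<R> invokeAll(List<T> records, int batchSize,
                                    Function<List<T>, CompletableFuture<R>> invoker) {
        List<CompletableFuture<R>> futures = new ArrayList<>();
        for (List<T> batch : partition(records, batchSize)) {
            futures.add(invoker.apply(batch)); // dispatch only, no waiting here
        }
        // Wait only after all batches have been sent.
        // join() throws CompletionException if any batch failed.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

        List<R> results = new ArrayList<>();
        for (CompletableFuture<R> future : futures) {
            results.add(future.join()); // already complete; per-batch handling goes here
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> records = List.of("a", "b", "c", "d", "e");
        List<Integer> sizes = invokeAll(records, 2,
                batch -> CompletableFuture.supplyAsync(() -> batch.size()));
        System.out.println(sizes); // prints [2, 2, 1]
    }
}
```

In this sketch a single failed batch makes the final wait throw; per-batch metrics and buffer handling as described in 1.2 would instead inspect each future individually.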

Acknowledgements:

  5. Address acknowledgements for processor and sink. For processors:
    5.1. The user configures a batch of N events (N can be <= the pipeline batch size), i.e. a request to Lambda contains N events in a batch. Lambda can return N responses or M responses (N != M). When N responses are returned as a JSON array, we reuse the original records: the old event data is cleared and repopulated with the response from Lambda, so the acknowledgement set does not need to change.
    5.2. When M responses are returned (N != M), we create new events and add them to the original acknowledgement set. The older events are also retained in the ack set but will be released by core later.
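The two processor cases can be illustrated with a small sketch. `SimpleEvent` and `SimpleAckSet` are hypothetical stand-ins, not Data Prepper's event or acknowledgement set types; the sketch only shows the reuse-versus-create decision.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the N == M and N != M response handling.
class AckMergeSketch {

    // Stand-in for a pipeline event: a mutable map of fields.
    static class SimpleEvent {
        final Map<String, String> data = new HashMap<>();
        SimpleEvent(Map<String, String> initial) { data.putAll(initial); }
    }

    // Stand-in for an acknowledgement set: tracks events by identity.
    static class SimpleAckSet {
        final Set<SimpleEvent> tracked =
                Collections.newSetFromMap(new IdentityHashMap<>());
        void add(SimpleEvent event) { tracked.add(event); }
    }

    static List<SimpleEvent> merge(List<SimpleEvent> originals,
                                   List<Map<String, String>> responses,
                                   SimpleAckSet ackSet) {
        List<SimpleEvent> out = new ArrayList<>();
        if (responses.size() == originals.size()) {
            // Same count: reuse the original events, so the ack set is untouched.
            for (int i = 0; i < originals.size(); i++) {
                SimpleEvent event = originals.get(i);
                event.data.clear();
                event.data.putAll(responses.get(i));
                out.add(event);
            }
        } else {
            // Different count: create new events and add them to the same ack set.
            // The originals stay tracked until they are released elsewhere.
            for (Map<String, String> response : responses) {
                SimpleEvent event = new SimpleEvent(response);
                ackSet.add(event);
                out.add(event);
            }
        }
        return out;
    }
}
```

Reusing the original event objects in the equal-count case is what keeps the acknowledgement bookkeeping unchanged; only the mismatched case grows the set.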

Handling Failure:

  6. When processing events fails, the events in the processor are tagged and forwarded. The processor will NOT drop events on failure.
  7. The Lambda sink sends events to the DLQ on failure and acknowledges as true. If a DLQ is not set up, it sends a negative acknowledgement.
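A rough sketch of the two failure paths, under stated assumptions: `TaggedEvent`, the tag value passed in, and the use of a plain list as the DLQ are all hypothetical and only illustrate the decision logic described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of processor and sink failure handling.
class FailureHandlingSketch {

    // Stand-in for an event that can carry tags.
    static class TaggedEvent {
        final String payload;
        final List<String> tags = new ArrayList<>();
        TaggedEvent(String payload) { this.payload = payload; }
    }

    // Processor path: tag every event in the failed batch and forward all of them.
    static List<TaggedEvent> onProcessorFailure(List<TaggedEvent> batch, String failureTag) {
        for (TaggedEvent event : batch) {
            event.tags.add(failureTag);
        }
        return batch; // nothing is dropped
    }

    // Sink path: returns the acknowledgement result for the failed batch.
    // dlq == null models "no DLQ configured".
    static boolean onSinkFailure(List<TaggedEvent> batch, List<TaggedEvent> dlq) {
        if (dlq != null) {
            dlq.addAll(batch);
            return true;  // written to the DLQ, so acknowledge positively
        }
        return false;     // no DLQ, so send a negative acknowledgement
    }
}
```

For example, calling `onSinkFailure(batch, null)` yields `false`, which corresponds to the negative acknowledgement when no DLQ is configured.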

Refactor:

  8. Refactored the AWS Lambda plugin to have a class for methods common to the processor and sink.

Issues Resolved

Resolves #5031

Check List

  • [x] New functionality includes testing.
  • [ ] New functionality has a documentation issue. Please link to it in this PR.
    • [ ] New functionality has javadoc added
  • [x] Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

srikanthjg avatar Oct 08 '24 06:10 srikanthjg