kafka-connect-elasticsearch Add option to use auto-generated IDs on indexing

Open gjw13 opened this issue 2 years ago • 3 comments

Problem

Setting the document ID when indexing does provide exactly-once delivery, but it puts more load on Elasticsearch and is not necessary for every use case.
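To make the trade-off concrete, here is a minimal sketch that uses a plain Python dict as a stand-in for an Elasticsearch index (no real client is involved). It assumes explicit IDs are built from topic, partition, and offset, as the connector does when record keys are ignored. The helper names and the sample record are hypothetical.

```python
import uuid

def explicit_id(topic, partition, offset):
    # Assumption: deterministic ID derived from the record's coordinates.
    return f"{topic}+{partition}+{offset}"

def index_with_explicit_id(index, record):
    # Re-indexing the same record overwrites the same document.
    doc_id = explicit_id(record["topic"], record["partition"], record["offset"])
    index[doc_id] = record["value"]
    return doc_id

def index_with_auto_id(index, record):
    # Stand-in for a server-generated ID: every request creates a new document.
    doc_id = uuid.uuid4().hex
    index[doc_id] = record["value"]
    return doc_id

# Hypothetical record, delivered twice to simulate a retry after a failure.
record = {"topic": "orders", "partition": 0, "offset": 42, "value": {"total": 10}}

explicit_index = {}
auto_index = {}
for _ in range(2):
    index_with_explicit_id(explicit_index, record)
    index_with_auto_id(auto_index, record)

print(len(explicit_index))  # one document: the retry was deduplicated
print(len(auto_index))      # two documents: the retry created a duplicate
```

With explicit IDs the redelivered record collapses into a single document, at the cost of Elasticsearch having to look up the ID on every write. With auto-generated IDs that lookup can be skipped, but a redelivery may leave a duplicate, so the option suits pipelines that can tolerate at-least-once semantics.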

PRs have been opened for this issue before (https://github.com/confluentinc/kafka-connect-elasticsearch/pull/393 and https://github.com/confluentinc/kafka-connect-elasticsearch/pull/510). This PR is largely an update of the more recent of the two, which fell out of date and accumulated many merge conflicts that needed resolving.

Addresses https://github.com/confluentinc/kafka-connect-elasticsearch/issues/139 and https://github.com/confluentinc/kafka-connect-elasticsearch/issues/97

Solution

Add a new option to use auto-generated document IDs on index requests. The new option (use.autogenerated.ids) defaults to false and is only applicable when write.method is set to INSERT.
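As a rough illustration of how the option might be set and interpreted, the sketch below builds a connector configuration as a Python dict and computes whether auto-generated IDs would take effect. The topic name and connection URL are placeholders, and the helper function is hypothetical. The description does not say whether the connector rejects or silently ignores the option under other write methods, so the sketch simply treats it as inactive in that case.

```python
def effective_use_autogenerated_ids(config):
    """Return True only when the option is enabled AND write.method is INSERT.

    Hypothetical helper: mirrors the rule stated in this PR, not the
    connector's actual validation code.
    """
    use_auto = str(config.get("use.autogenerated.ids", "false")).lower() == "true"
    write_method = str(config.get("write.method", "INSERT")).upper()
    return use_auto and write_method == "INSERT"

# Placeholder connector configuration; topic and URL are illustrative only.
connector_config = {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "topics": "orders",
    "connection.url": "http://localhost:9200",
    "write.method": "INSERT",
    "use.autogenerated.ids": "true",
}

print(effective_use_autogenerated_ids(connector_config))

# The same flag has no effect once the write method is UPSERT.
upsert_config = dict(connector_config, **{"write.method": "UPSERT"})
print(effective_use_autogenerated_ids(upsert_config))
```

Restricting the option to INSERT is consistent with how upserts work: an upsert has to address an existing document by ID, so a server-generated ID would have nothing to match.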

Note that the large diff to the convertRecord method in the DataConverter class comes from extracting a chunk of that code into a separate method. Adding one more statement pushed the method's cyclomatic complexity over the limit, which made the checkstyle plugin throw errors.

Does this solution apply anywhere else?

[ ] yes
[x] no

If yes, where?

Test Strategy

Testing done:

[x] Unit tests
[ ] Integration tests
[ ] System tests
[x] Manual tests

As with https://github.com/confluentinc/kafka-connect-elasticsearch/pull/510, we are running live connectors that use this feature.

Release Plan

Mar 15 '23 15:03 gjw13

I attempted to sign the CLA, but the URL doesn't resolve.

Mar 15 '23 15:03 gjw13

All committers have signed the CLA.

Sep 11 '23 09:09 cla-assistant[bot]
