kafka-connect-elasticsearch

Ignore 'document_parsing_exception'

Open: hullarb opened this issue 1 year ago • 1 comment

Hi All,

We are running connector version 14.0.7 with Elasticsearch 8.10 and data streams, and we have configured the connector to ignore malformed documents. Unfortunately, when Elasticsearch cannot index a document and returns a document_parsing_exception, the connector task fails. Could you add this error to the ignored errors as well? It is also thrown because of a malformed document.

Part of our connector config, for reference:

behavior.on.malformed.documents: IGNORE
behavior.on.null.values: IGNORE
errors.tolerance: all
errors.deadletterqueue.topic.name: failed-ingestion

And a sample Elasticsearch error message that I got after trying to ingest the document manually:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "parsing_exception",
        "reason" : "Failed to parse object: expecting token of type [START_OBJECT] but found [VALUE_STRING]",
        "line" : 1,
        "col" : 234
      }
    ],
    "type" : "document_parsing_exception",
    "reason" : "[1:234] failed to parse field [properties] of type [flattened] in document with id   REDACTED'",
    "caused_by" : {
      "type" : "parsing_exception",
      "reason" : "Failed to parse object: expecting token of type [START_OBJECT] but found [VALUE_STRING]",
      "line" : 1,
      "col" : 234
    }
  },
  "status" : 400
}
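To illustrate the request, here is a minimal Python sketch of the kind of check being asked for: decide whether an error response like the one above should count as a malformed-document error. The function name `is_malformed_document_error` and the set `MALFORMED_DOC_ERROR_TYPES` are hypothetical and are not taken from the connector's source. The sketch also assumes the single-document response shape shown above, not a bulk response.

```python
import json

# Assumption: an illustrative set of Elasticsearch error types to treat as
# "malformed document". This is NOT copied from the connector's source.
MALFORMED_DOC_ERROR_TYPES = {
    "mapper_parsing_exception",
    "document_parsing_exception",
}


def is_malformed_document_error(response_body: str) -> bool:
    """Return True if the top-level error type is in the illustrative set."""
    error = json.loads(response_body).get("error")
    # Some responses may carry a plain string here; treat those as "unknown".
    if not isinstance(error, dict):
        return False
    return error.get("type") in MALFORMED_DOC_ERROR_TYPES


# Trimmed version of the error response shown above.
sample = json.dumps({
    "error": {
        "root_cause": [{"type": "parsing_exception"}],
        "type": "document_parsing_exception",
    },
    "status": 400,
})
print(is_malformed_document_error(sample))
```

Only the top-level `type` is checked here, because in the sample the root cause is the more generic `parsing_exception`; whether the connector should also inspect `root_cause` or `caused_by` is an open design choice.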

Thanks, Bela

hullarb, Dec 20 '23 09:12

I've opened a PR with a possible fix for this: https://github.com/confluentinc/kafka-connect-elasticsearch/pull/748

hullarb, Feb 07 '24 19:02