gcs-connector-for-apache-kafka

What should be the config for output to be in parquet format?

Open • Rstar1998 opened this issue 3 years ago • 0 comments

I have the following config file for the GCS sink connector, and I want events to be stored in Parquet format, but it is not working. Is anything more needed for Parquet conversion?

{
  "name": "GCS_CONN_REG",
  "config": {
    "connector.class": "io.aiven.kafka.connect.gcs.GcsSinkConnector",
    "gcs.bucket.name": "name",
    "file.name.prefix": "test/",
    "format.output.type": "parquet",
    "name": "GCS_CONN_REG",
    "value.converter.schemas.enable": "false",
    "format.output.fields": "key,value,offset,timestamp,headers",
    "gcs.credentials.json": "",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "topics.regex": "abc.*",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "file.compression.type": "gzip",
    "file.name.template": "{{topic}}/{{timestamp:unit=yyyy}}/{{timestamp:unit=MM}}/{{timestamp:unit=dd}}/{{partition}}-{{start_offset}}.parquet",
    "errors.tolerance": "all",
    "consumer.override.auto.offset.reset": "latest",
    "errors.log.enable": "true",
    "errors.deadletterqueue.topic.name": "gcs_external_connector",
    "errors.log.include.messages": "true"
  }
}
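To narrow down what "not working" means here, the connector's task status can be read from the Kafka Connect REST API (`GET /connectors/{name}/status`), which includes a stack trace for any failed task. Below is a minimal sketch, not a definitive tool. It assumes the Connect worker's REST endpoint is reachable at the hypothetical `http://localhost:8083`, and the helper names are illustrative only.

```python
import json
import urllib.request

# Assumption: address of the Kafka Connect worker's REST endpoint.
CONNECT_URL = "http://localhost:8083"


def status_url(base_url, connector_name):
    """Build the URL of the Kafka Connect status endpoint for one connector."""
    return f"{base_url.rstrip('/')}/connectors/{connector_name}/status"


def failed_task_traces(status):
    """Return {task_id: trace} for every task reported as FAILED."""
    return {
        task["id"]: task.get("trace", "")
        for task in status.get("tasks", [])
        if task.get("state") == "FAILED"
    }


def fetch_status(base_url, connector_name):
    """Fetch and decode the status document (performs a network call)."""
    with urllib.request.urlopen(status_url(base_url, connector_name)) as resp:
        return json.load(resp)


# Example usage against a running worker:
#   status = fetch_status(CONNECT_URL, "GCS_CONN_REG")
#   for task_id, trace in failed_task_traces(status).items():
#       print(f"task {task_id} failed:\n{trace}")
```

If a task is reported as `FAILED`, its `trace` field should show the exception raised by the connector, which is usually more specific than "not working".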

Rstar1998 · Nov 01 '22 11:11