
java.lang.IllegalStateException when starting S3SinkTask

Open Lucas3oo opened this issue 2 years ago • 5 comments

Issue Guidelines

Please review these questions before submitting any issue.

What version of the Stream Reactor are you reporting this issue for?

3.0.1-2.5.0

Are you running the correct version of Kafka/Confluent for the Stream reactor release?

Yes, 2.13-2.8.0 (at least I think so), a vanilla local installation.

Do you have a supported version of the data source/sink, i.e. Cassandra 3.0.9?

kafka-connect-aws-s3-3.0.1-2.5.0-all.jar

Have you read the docs?

yes

What is the expected behaviour?

The S3 connector should start.

I tested with the AWS CLI (e.g. aws s3 ls glassbeam-exporter) using the same access key and secret key, and I could see files in the bucket.
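Worth noting: aws s3 ls only exercises the s3:ListBucket permission, while the sink also needs to write. A quick way to check the write and read path with the same credentials (the object key below is just a throwaway placeholder):

```sh
# Confirm which IAM identity the credentials resolve to
aws sts get-caller-identity

# Exercise s3:PutObject and s3:GetObject with a throwaway object
echo test > /tmp/perm-check.txt
aws s3api put-object --bucket glassbeam-exporter --key testresults/perm-check.txt --body /tmp/perm-check.txt
aws s3api get-object --bucket glassbeam-exporter --key testresults/perm-check.txt /tmp/perm-check-out.txt
```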

What was observed?

[2022-04-14 16:53:50,863] ERROR WorkerSinkTask{id=c360-s3-connector -0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:184)
java.lang.IllegalStateException: java.lang.ExceptionInInitializerError
    at io.lenses.streamreactor.connect.aws.s3.sink.ThrowableEither.toThrowable(ThrowableEither.scala:34)
    at io.lenses.streamreactor.connect.aws.s3.sink.S3SinkTask.start(S3SinkTask.scala:65)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:308)

What is your Connect cluster configuration (connect-avro-distributed.properties)?

What is your connector properties configuration (my-connector.properties)?

[2022-04-14 16:53:50,511] INFO S3ConfigDefBuilder values:
    connect.s3.aws.access.key = [hidden]
    connect.s3.aws.auth.mode = Credentials
    connect.s3.aws.region = eu-north-1
    connect.s3.aws.secret.key = [hidden]
    connect.s3.custom.endpoint =
    connect.s3.disable.flush.count = false
    connect.s3.error.policy = THROW
    connect.s3.http.max.retries = 5
    connect.s3.http.retry.interval = 50
    connect.s3.kcql = insert into glassbeam-exporter:testresults select * from outbox.event.TestResultTelemetricV1 STOREAS JSON WITH_FLUSH_COUNT = 5000
    connect.s3.local.tmp.directory =
    connect.s3.max.retries = 20
    connect.s3.retry.interval = 60000
    connect.s3.vhost.bucket = false
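For reference, a standalone connector properties file matching the values logged above would look roughly like this. This is a sketch reconstructed from the log, not the actual file from the report; the connector.class and tasks.max values in particular are assumptions:

```properties
name=c360-s3-connector
# Assumed class name for the Lenses S3 sink plugin; verify against the installed jar
connector.class=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector
tasks.max=1
topics=outbox.event.TestResultTelemetricV1
connect.s3.aws.auth.mode=Credentials
connect.s3.aws.access.key=<redacted>
connect.s3.aws.secret.key=<redacted>
connect.s3.aws.region=eu-north-1
connect.s3.kcql=insert into glassbeam-exporter:testresults select * from outbox.event.TestResultTelemetricV1 STOREAS JSON WITH_FLUSH_COUNT = 5000
```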

Please provide full log files (redact any sensitive information)

[2022-04-14 16:53:50,349] WARN The configuration 'metrics.context.connect.kafka.cluster.id' was supplied but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig:380)
[2022-04-14 16:53:50,350] INFO Kafka version: 2.8.0 (org.apache.kafka.common.utils.AppInfoParser:119)
[2022-04-14 16:53:50,350] INFO Kafka commitId: ebb1d6e21cc92130 (org.apache.kafka.common.utils.AppInfoParser:120)
[2022-04-14 16:53:50,350] INFO Kafka startTimeMs: 1649948030350 (org.apache.kafka.common.utils.AppInfoParser:121)
[2022-04-14 16:53:50,356] INFO Created connector c360-s3-connector (org.apache.kafka.connect.cli.ConnectStandalone:109)
[2022-04-14 16:53:50,357] INFO [Consumer clientId=connector-consumer-c360-s3-connector -0, groupId=connect-c360-s3-connector] Subscribed to topic(s): outbox.event.TestResultTelemetricV1 (org.apache.kafka.clients.consumer.KafkaConsumer:965)
[2022-04-14 16:53:50,404] INFO START: Executing ConfigDef processor ....
[2022-04-14 16:53:50,511] INFO S3ConfigDefBuilder values:
    connect.s3.aws.access.key = [hidden]
    connect.s3.aws.auth.mode = Credentials
    connect.s3.aws.region = eu-north-1
    connect.s3.aws.secret.key = [hidden]
    connect.s3.custom.endpoint =
    connect.s3.disable.flush.count = false
    connect.s3.error.policy = THROW
    connect.s3.http.max.retries = 5
    connect.s3.http.retry.interval = 50
    connect.s3.kcql = insert into glassbeam-exporter:testresults select * from outbox.event.TestResultTelemetricV1 STOREAS JSON WITH_FLUSH_COUNT = 5000
    connect.s3.local.tmp.directory =
    connect.s3.max.retries = 20
    connect.s3.retry.interval = 60000
    connect.s3.vhost.bucket = false
 (io.lenses.streamreactor.connect.aws.s3.config.S3ConfigDefBuilder:372)
[2022-04-14 16:53:50,863] ERROR WorkerSinkTask{id=c360-s3-connector -0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:184)
java.lang.IllegalStateException: java.lang.ExceptionInInitializerError
    at io.lenses.streamreactor.connect.aws.s3.sink.ThrowableEither.toThrowable(ThrowableEither.scala:34)
    at io.lenses.streamreactor.connect.aws.s3.sink.S3SinkTask.start(S3SinkTask.scala:65)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:308)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:182)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:231)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)

Lucas3oo avatar Apr 14 '22 15:04 Lucas3oo

I did try with kafka_2.12-2.8.1 too, same issue. I also tested kafka_2.12-2.5.0, same issue.

Lucas3oo avatar Apr 14 '22 15:04 Lucas3oo

This is the AWS policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::glassbeam-exporter",
        "arn:aws:s3:::glassbeam-exporter/*"
      ]
    }
  ]
}

Lucas3oo avatar Apr 14 '22 15:04 Lucas3oo

Are you sure this is the full log file? I would have expected to see a cause which would help diagnose this.

You really need the build for 2.8, which is currently in the final stages of testing.

You can get it here:

https://drive.google.com/drive/folders/1egDmcFv3DQToUQnbUGEqZrgBKcFmQNKO

Let me know if this helps

davidsloan avatar Apr 27 '22 13:04 davidsloan

Hi, it is not the full log, since that would be a lot of log statements. I was hoping the exception message would be more informative.

But this issue might be related to the fact that the AWS client it uses calls a REST endpoint to find out which region the bucket is in. That call might fail, and the root exception is hidden.
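That region lookup can be reproduced outside the connector using the same credentials. If the attached policy does not allow it, the call below fails with an AccessDenied error, which would be consistent with a hidden root exception (a diagnostic sketch, not the connector's own code path):

```sh
# GetBucketLocation is the region-lookup call suspected above
aws s3api get-bucket-location --bucket glassbeam-exporter
```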

It would be great if I could read somewhere which permissions the connector needs. Comparing with the permissions the Confluent S3 connector needs, I also suspect that "s3:PutObject", "s3:GetObject" and "s3:ListBucket" might not be enough.

Lucas3oo avatar May 01 '22 08:05 Lucas3oo

> Hi, it is not the full log, since that would be a lot of log statements. I was hoping the exception message would be more informative.

Without the log information I cannot help you establish the cause.

> But this issue might be related to the fact that the AWS client it uses calls a REST endpoint to find out which region the bucket is in. That call might fail, and the root exception is hidden.

Yes, we have a ticket open for this one. It seems the library we are using is making extra, unexpected calls to getBucketLocation. Please see my latest comment there for a fix: https://github.com/lensesio/stream-reactor/issues/826
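Given those getBucketLocation calls, the policy posted earlier in this thread would also need to allow that action on the bucket. A sketch of the extended policy follows; the complete permission set the connector needs is not confirmed in this thread:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::glassbeam-exporter",
        "arn:aws:s3:::glassbeam-exporter/*"
      ]
    }
  ]
}
```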

> It would be great if I could read somewhere which permissions the connector needs. Comparing with the permissions the Confluent S3 connector needs, I also suspect that "s3:PutObject", "s3:GetObject" and "s3:ListBucket" might not be enough.

I will aim to add this to the docs at the next update.

davidsloan avatar May 03 '22 07:05 davidsloan