
[Bug] [Module Name] org.apache.hadoop.fs.s3a.S3AStorageStatistics cannot be cast to org.apache.hadoop.fs.s3a.S3AStorageStatistics

Open keith002 opened this issue 1 year ago • 6 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

What happened

I am hitting an error while synchronizing from a MySQL source to S3 using the configuration below.

I also added the jars as the documentation requests: "To use this connector you need put hadoop-aws-3.1.4.jar and aws-java-sdk-bundle-1.11.271.jar in ${SEATUNNEL_HOME}/lib dir."

But when the job executes, it fails with the class-cast error shown below, as if the class cannot be resolved correctly.
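A `ClassCastException` where the source and target class names are identical usually means the same class was loaded by two different classloaders, typically because the same jar sits in more than one of SeaTunnel's jar directories. A minimal sketch for spotting duplicate jar file names (the `$SEATUNNEL_HOME` default and directory layout are assumptions about a typical install):

```shell
# Print jar basenames that appear more than once under the given directory
# tree. Duplicate copies of hadoop-aws-*.jar / aws-java-sdk-*.jar across
# lib/, plugins/ and connectors/ are prime suspects for the
# "X cannot be cast to X" error.
find_dup_jars() {
  find "$1" -name '*.jar' 2>/dev/null | sed 's|.*/||' | sort | uniq -d
}

find_dup_jars "${SEATUNNEL_HOME:-/opt/seatunnel}"
```

Any jar listed twice should be kept in only one location, so that a single classloader defines its classes.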

SeaTunnel Version

seatunnel 2.3.3

SeaTunnel Config

```
env {
  # You can set flink configuration here
  execution.parallelism = 2
  job.mode = "BATCH"
}

source {
  Jdbc {
    result_table_name = "table"
    url = "url"
    driver = "com.mysql.cj.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "root"
    password = "123456"
    query = "SELECT * FROM test_table"
  }
}

transform {
}

sink {
  S3File {
    bucket = "s3a://cs"
    tmp_path = "path"
    path = "/test"
    fs.s3a.endpoint = "http://endpoint.cn"
    fs.s3a.aws.credentials.provider = "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
    access_key = "xxxxxxxx"
    secret_key = "xxxxxxxx"
    file_format_type = "orc"
  }
}
```

Running Command

Run via the main method of org.apache.seatunnel.example.engine.SeaTunnelEngineExample.

Error Exception

Caused by: java.lang.ClassCastException: org.apache.hadoop.fs.s3a.S3AStorageStatistics cannot be cast to org.apache.hadoop.fs.s3a.S3AStorageStatistics
	at org.apache.hadoop.fs.s3a.S3AFileSystem.createStorageStatistics(S3AFileSystem.java:358) ~[hadoop-aws-3.1.4.jar:?]
	at org.apache.hadoop.fs.s3a.S3AFileSystem.<init>(S3AFileSystem.java:191) ~[hadoop-aws-3.1.4.jar:?]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_181]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_181]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_181]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_181]
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3302) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:227) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:463) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) ~[hadoop-common-3.1.4.jar:?]
	at org.apache.seatunnel.shade.connector.file.org.apache.orc.OrcFile.createWriter(OrcFile.java:857) ~[connector-file-s3-2.3.3.jar:2.3.3]
	at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.getOrCreateWriter(OrcWriteStrategy.java:126) ~[connector-file-s3-2.3.3.jar:2.3.3]
	at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.write(OrcWriteStrategy.java:74) ~[connector-file-s3-2.3.3.jar:2.3.3]
	at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:126) ~[connector-file-s3-2.3.3.jar:2.3.3]
	at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:43) ~[connector-file-s3-2.3.3.jar:2.3.3]
	at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:227) ~[classes/:?]
	... 16 more
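For reference, this exception pattern (a class that "cannot be cast" to a class with the exact same name) points at two copies of `S3AStorageStatistics` being defined by different classloaders, not at a missing jar. A hedged sketch for finding which deployed jars actually bundle that class (the scanned directories are assumptions about a typical layout, and `unzip` must be on the PATH):

```shell
# jar_contains JAR ENTRY: succeed if the jar's file listing includes ENTRY.
jar_contains() {
  unzip -l "$1" 2>/dev/null | grep -q "$2"
}

# Scan typical SeaTunnel jar directories for the conflicting class; if more
# than one jar is printed, each classloader defines its own copy of the
# class and the cast between them fails.
base="${SEATUNNEL_HOME:-/opt/seatunnel}"
for jar in "$base"/lib/*.jar "$base"/plugins/*.jar "$base"/connectors/*.jar; do
  [ -f "$jar" ] || continue
  if jar_contains "$jar" 'org/apache/hadoop/fs/s3a/S3AStorageStatistics.class'; then
    echo "$jar"
  fi
done
```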

Zeta or Flink or Spark Version

Zeta 2.3.3

Java or Scala Version

Java 1.8.0_181

Screenshots


Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

keith002 avatar Mar 13 '25 09:03 keith002

Hi, a local run (e.g. from the IDE) can hit dependency conflicts; try running in local mode (`-e local`).

corgy-w avatar Mar 13 '25 16:03 corgy-w

Hello, sorry for the late reply. I ran `/seatunnel.sh --config /data/seatunnelTest/doris-s3.conf -e local` and it still returns the error on execution. Currently, I have deployed these jar packages in the plugins directory.

keith002 avatar Mar 14 '25 09:03 keith002

Try deploying it locally and then submitting the task to start it.

corgy-w avatar Mar 15 '25 01:03 corgy-w

After deploying locally, I submit the job through the REST interface hazelcast/rest/maps/submit-job with this body:

```
{
  "env": { "job.mode": "batch" },
  "source": [
    {
      "plugin_name": "Jdbc",
      "result_table_name": "lh_score_grade",
      "url": "jdbc:mysql://127.0.0.1:3306/lh_financial_data_test?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=UTC&nuLlCatalogMeansCurrent=true",
      "driver": "com.mysql.cj.jdbc.Driver",
      "connection_check_timeout_sec": 100,
      "user": "test",
      "password": "123456",
      "query": "SELECT customer_code,data_point,type,code,value,op_time FROM lh_customer_financial_data limit 10"
    }
  ],
  "transform": [],
  "sink": [
    {
      "plugin_name": "S3File",
      "source_table_name": ["lh_score_grade"],
      "bucket": "s3a://cs",
      "tmp_path": "/data/test",
      "path": "/test",
      "fs.s3a.endpoint": "http://endpoint.cn",
      "fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
      "access_key": "xxxxxxxx",
      "secret_key": "xxxxxxxx",
      "file_format_type": "orc"
    }
  ]
}
```

The job is accepted and this is the response:

```
{
  "jobId": "1",
  "jobName": "synctest",
  "jobStatus": "RUNNING",
  "envOptions": { "job.mode": "batch" },
  "createTime": "2025-03-17 14:09:39",
  "jobDag": {
    "vertices": [
      { "id": 1, "name": "Source[0]-Jdbc(id=1)", "parallelism": 1 },
      { "id": 3, "name": "Sink[0]-S3File-MultiTableSink(id=3)", "parallelism": 1 }
    ],
    "edges": [
      { "inputVertex": "Source[0]-Jdbc", "targetVertex": "Sink[0]-S3File-MultiTableSink" }
    ]
  },
  "pluginJarsUrls": [
    { "jarPath": "file:/opt/seatunnel/plugins/transforms-overwrite-2.3.5-2.12.15.jar" },
    { "jarPath": "file:/opt/seatunnel/plugins/mysql-connector-java-8.0.28.jar" },
    { "jarPath": "file:/opt/seatunnel/plugins/hadoop-aws-3.1.4.jar" },
    { "jarPath": "file:/opt/seatunnel/connectors/connector-file-s3-2.3.5.jar" },
    { "jarPath": "file:/opt/seatunnel/plugins/mssql-jdbc-9.2.1.jre8.jar" },
    { "jarPath": "file:/opt/seatunnel/plugins/aws-java-sdk-bundle-1.12.692.jar" },
    { "jarPath": "file:/opt/seatunnel/connectors/connector-jdbc-2.3.5.jar" }
  ],
  "isStartWithSavePoint": true,
  "metrics": { "SourceReceivedCount": "10", "SinkWriteCount": "10" }
}
```

But the job stays stuck in the RUNNING status.

keith002 avatar Mar 17 '25 06:03 keith002
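When a batch job seems stuck in RUNNING, polling the Zeta REST API for the job status can show whether it ever transitions to FINISHED or FAILED. A sketch under stated assumptions: the host, the default port 5801, and the 2.3.x `job-info` endpoint path are all assumptions to verify against your deployment; the parsing helper below is my own naive illustration, not a SeaTunnel tool.

```shell
# Extract the "jobStatus" field from a job-info JSON response with a naive
# sed expression (the field name matches the response pasted above; a real
# script would use jq instead).
job_status() {
  printf '%s' "$1" | sed -n 's/.*"jobStatus"[": ]*"\([A-Za-z_]*\)".*/\1/p'
}

# Hypothetical poll loop (host, port 5801 and endpoint path are assumptions):
# while :; do
#   s=$(job_status "$(curl -s http://localhost:5801/hazelcast/rest/maps/job-info/1)")
#   echo "status: $s"
#   [ "$s" = "RUNNING" ] || break
#   sleep 5
# done
```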

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Apr 17 '25 00:04 github-actions[bot]

Hello, I encountered the error on the version mentioned above. However, after upgrading to a newer version I was able to synchronize successfully; the previous version did not work.

keith002 avatar Apr 24 '25 06:04 keith002

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Jul 18 '25 00:07 github-actions[bot]

This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.

github-actions[bot] avatar Jul 25 '25 00:07 github-actions[bot]