
[Bug] [Paimon] mysql2paimon org.xerial.snappy.SnappyNative.maxCompressedLength error


Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

When I use the Zeta engine to sync MySQL to Apache Paimon, a java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I error occurs. My Paimon version is 0.9.

SeaTunnel Version

2.3.7

SeaTunnel Config

env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  Jdbc {
    url = "jdbc:mysql://xxx:3306/test"
    driver = "com.mysql.cj.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "root"
    password = "xxx"
    query = "select name,age from test.paimon_test"
    partition_column = "name"
    split.size = 10000
  }
}

sink {
  Paimon {
    catalog_name = "paimon_test"
    warehouse = "hdfs://xxx:8020/warehouse/tablespace/external/hive/"
    database = "yxf"
    table = "paimon_test3"
    paimon.table.write-props = {
      #bucket = 4
      #bucket-key="create_time"
      snapshot.num-retained.min = 3
      snapshot.num-retained.max = 10
      file.format = "orc"
      deletion-vectors.enabled = "true"
    }
  }
}

Running Command

./bin/seatunnel.sh -c ./config/paimon/mysql2paimon.conf

Error Exception

Caused by: java.lang.RuntimeException: Exception occurs when preparing snapshot #1 (path hdfs://xxx:8020/warehouse/tablespace/external/hive/yxf.db/paimon_test3/snapshot/snapshot-1) by user 11dde3ef-bee7-4766-acff-89a9c8b7b3c0 with hash 9223372036854775807 and kind APPEND. Clean up.
        at org.apache.paimon.operation.FileStoreCommitImpl.tryCommitOnce(FileStoreCommitImpl.java:829)
        at org.apache.paimon.operation.FileStoreCommitImpl.tryCommit(FileStoreCommitImpl.java:605)
        at org.apache.paimon.operation.FileStoreCommitImpl.commit(FileStoreCommitImpl.java:248)
        at org.apache.paimon.table.sink.TableCommitImpl.commitMultiple(TableCommitImpl.java:192)
        at org.apache.paimon.table.sink.TableCommitImpl.commit(TableCommitImpl.java:186)
        at org.apache.paimon.table.sink.TableCommitImpl.commit(TableCommitImpl.java:165)
        at org.apache.paimon.table.sink.TableCommitImpl.commit(TableCommitImpl.java:160)
        at org.apache.seatunnel.connectors.seatunnel.paimon.sink.PaimonSinkWriter.<init>(PaimonSinkWriter.java:125)
        ... 20 more
Caused by: java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
        at org.xerial.snappy.SnappyNative.maxCompressedLength(Native Method)
        at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:320)
        at org.apache.paimon.shade.org.apache.avro.file.SnappyCodec.compress(SnappyCodec.java:55)
        at org.apache.paimon.shade.org.apache.avro.file.DataFileStream$DataBlock.compressUsing(DataFileStream.java:398)
        at org.apache.paimon.shade.org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:407)
        at org.apache.paimon.shade.org.apache.avro.file.DataFileWriter.sync(DataFileWriter.java:428)
        at org.apache.paimon.shade.org.apache.avro.file.DataFileWriter.flush(DataFileWriter.java:437)
        at org.apache.paimon.format.avro.AvroBulkWriter.flush(AvroBulkWriter.java:45)
        at org.apache.paimon.format.avro.AvroFileFormat$RowAvroWriterFactory$1.flush(AvroFileFormat.java:126)
        at org.apache.paimon.io.SingleFileWriter.close(SingleFileWriter.java:144)
        at org.apache.paimon.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:108)
        at org.apache.paimon.io.RollingFileWriter.close(RollingFileWriter.java:145)
        at org.apache.paimon.manifest.ManifestFile.write(ManifestFile.java:94)
        at org.apache.paimon.operation.FileStoreCommitImpl.tryCommitOnce(FileStoreCommitImpl.java:766)
        ... 27 more

        ... 11 more

        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:203)
        ... 2 more

Zeta or Flink or Spark Version

Zeta

Java or Scala Version

java 1.8.0_351

Screenshots

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

yxfff · Oct 10 '24 08:10

Did you upgrade Paimon to 0.9 in the Paimon sink? At the moment it seems to depend on 0.7. This exception usually means that a jar on the classpath bundles a different version of snappy, as in this PR: https://github.com/apache/seatunnel/pull/6449. The sketch below shows one way to check which snappy jars are in play.
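A quick way to confirm a conflict like this is to list every snappy jar the job can load. This is a minimal sketch, assuming a default SeaTunnel layout under $SEATUNNEL_HOME and a hadoop CLI on the PATH; adjust the paths to your installation.

# Hedged sketch: paths assume a default SeaTunnel install.
# List the snappy jars shipped with SeaTunnel itself and its connector plugins:
find "$SEATUNNEL_HOME/lib" "$SEATUNNEL_HOME/connectors" -name '*snappy*.jar'

# The Hadoop classpath can contribute another copy; check it as well:
hadoop classpath | tr ':' '\n' | grep -i snappy

If two different versions show up, whichever loads first can leave the org.xerial.snappy classes without matching native bindings, which surfaces as exactly this UnsatisfiedLinkError.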


dailai · Oct 11 '24 00:10

assign to me

hawk9821 · Oct 11 '24 00:10

OK, thanks.

yxfff · Oct 11 '24 01:10

Please refer to https://github.com/apache/hadoop/pull/3385 (see the note below) @hawk9821 @dailai @yxfff
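For background (a hedged note, not a claim about that PR's exact contents): shading and relocation are common triggers for this error, because JNI resolves a native method by the fully qualified class name. If the org.xerial.snappy classes get relocated, or a shaded jar drops the bundled org/xerial/snappy/native/ libraries, the first native call fails with UnsatisfiedLinkError. One way to inspect a shaded jar (the jar name below is only an example; substitute the actual hadoop/uber jar from your lib directory):

# Hedged sketch: the jar name is an example, not a guaranteed artifact name.
unzip -l lib/seatunnel-hadoop3-*-uber.jar | grep -i snappy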

Hisoka-X · Oct 11 '24 02:10

Per https://github.com/apache/seatunnel/blob/dev/docs/en/connector-v2/sink/Paimon.md#changelog, the SeaTunnel Paimon sink only supports the none and input changelog-producer modes, but deletion-vectors.enabled=true requires the lookup mode. After I upgraded the Paimon version, the issue did not reappear with the config below:

env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  Jdbc {
    url = "jdbc:mysql://xxx.xxx.xxx.xxx:3306?rewriteBatchedStatements=true"
    driver = "com.mysql.cj.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "xxx"
    password = "xxxxx"
    table_path = "source.user"
    query = "select * from source.user"
    partition_column = "id"
    split.size = 10000
  }
}


sink {
  Paimon {
    warehouse = "hdfs:///tmp/paimon"
    database = "sink"
    table = "user3"
    paimon.table.write-props = {
      #bucket = 1
      #bucket-key="create_time"
      snapshot.num-retained.min = 3
      snapshot.num-retained.max = 10
      file.format = "orc"
    }
    paimon.hadoop.conf = {
      fs.defaultFS = "hdfs://nameservice1"
      dfs.nameservices = "nameservice1"
      dfs.ha.namenodes.nameservice1 = "nn1,nn2"
      dfs.namenode.rpc-address.nameservice1.nn1 = "dp06:8020"
      dfs.namenode.rpc-address.nameservice1.nn2 = "dp07:8020"
      dfs.client.failover.proxy.provider.nameservice1 = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
      dfs.client.use.datanode.hostname = "true"
    }
  }
}


hawk9821 · Oct 11 '24 05:10

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.

github-actions[bot] · Nov 11 '24 00:11

This issue has been closed because it has not received a response for too long. You can reopen it if you encounter similar problems in the future.

github-actions[bot] · Nov 18 '24 00:11