paimon
[Bug] s3://paas-flink-prod/.../bucket-0/data-5da975ee-318e-4ba4-b3f7-ad112dae5247-0.parquet is not a Parquet file. Expected magic number at tail, but found [21, 0, 21, -32]
Search before asking
- [x] I searched in the issues and found nothing similar.
Paimon version
paimon-flink-1.20-1.0.1.jar paimon-s3-1.0.1.jar paimon-flink-action-1.0.1.jar
Compute Engine
flink-1.20.0
Minimal reproduce step
Use MySQL CDC to sync a table to a Paimon table stored on S3. The job cannot complete a checkpoint, and the TaskManager reports:
Caused by: java.lang.RuntimeException: s3://paas-flink-prod/flink-paimon/wh/chen.db/department/bucket-0/data-65dbb220-7017-468d-affb-1de9dd6e4105-0.parquet is not a Parquet file. Expected magic number at tail, but found [21, 0, 21, -32]
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:162) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:243) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.format.parquet.ParquetUtil.getParquetReader(ParquetUtil.java:85) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.format.parquet.ParquetUtil.extractColumnStats(ParquetUtil.java:52) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.format.parquet.ParquetSimpleStatsExtractor.extractWithFileInfo(ParquetSimpleStatsExtractor.java:78) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.format.parquet.ParquetSimpleStatsExtractor.extract(ParquetSimpleStatsExtractor.java:71) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.io.StatsCollectingSingleFileWriter.fieldStats(StatsCollectingSingleFileWriter.java:105) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.io.KeyValueDataFileWriter.result(KeyValueDataFileWriter.java:169) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.io.KeyValueDataFileWriter.result(KeyValueDataFileWriter.java:58) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:135) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.io.RollingFileWriter.close(RollingFileWriter.java:167) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.mergetree.MergeTreeWriter.flushWriteBuffer(MergeTreeWriter.java:235) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
at org.apache.paimon.mergetree.MergeTreeWriter.prepareCommit(MergeTreeWriter.java:264) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
I downloaded this Parquet file and checked it; the file itself is OK.
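For anyone who wants to repeat that check on a downloaded copy, here is a minimal sketch. It relies only on the Parquet format rule that a file begins and ends with the 4-byte magic `PAR1`. The function names are hypothetical helpers, not part of Paimon or Parquet. `signed_bytes` renders bytes as Java prints a `byte[]`, so the tail of a local file can be compared directly with the `[21, 0, 21, -32]` in the exception.

```python
import os

PARQUET_MAGIC = b"PAR1"  # 4-byte magic defined by the Parquet file format


def read_tail(path, n=4):
    """Return the last n bytes of a local file (fewer if the file is shorter)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(size - n, 0))
        return f.read(n)


def has_parquet_magic(path):
    """True if the file starts and ends with the Parquet magic number."""
    with open(path, "rb") as f:
        head = f.read(4)
    return head == PARQUET_MAGIC and read_tail(path) == PARQUET_MAGIC


def signed_bytes(data):
    """Render bytes as Java prints a byte[] (signed values, -128..127)."""
    return [b - 256 if b > 127 else b for b in data]
```

If `has_parquet_magic` returns true for the downloaded copy while the writer still fails, the object that ended up on S3 is intact, and the bytes the reader saw at write time differ from the final object. This check alone does not show why.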
cdc params:
local:///opt/flink/usrlib/paimon-flink-action-1.0.1.jar
mysql_sync_table
--warehouse s3://paas-flink-prod/flink-paimon/wh
--database chen
--table department
--mysql_conf hostname=rm-xxx.mysql.rds.aliyuncs.com
--mysql_conf username=**
--mysql_conf password='**'
--mysql_conf database-name='xxx'
--mysql_conf table-name='department'
What doesn't meet your expectations?
S3 cannot be used as the backend storage for the Paimon warehouse; HDFS works fine.
Anything else?
No response
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!