[HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Open TengHuo opened this issue 1 year ago • 2 comments

Change Logs

  1. Remove the marker-deletion code in CompactionPlanOperator, which could leave corrupted parquet files behind if compaction tasks were cancelled
  2. Fix HUDI-4108 in a different way: ignore the marker file if it already exists at creation time
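
To make item 2 concrete, here is a minimal sketch of "create the marker, but tolerate one that already exists". It uses plain `java.nio.file` on a local filesystem rather than Hudi's marker classes; the class name `MarkerSketch`, the method `createMarker`, the `ignoreExisting` flag, and the marker file name are all hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; this is NOT Hudi's marker implementation.
final class MarkerSketch {

    /**
     * Creates an empty marker file.
     *
     * @return true if this call created the marker, false if it already
     *         existed and ignoreExisting is true
     * @throws FileAlreadyExistsException if the marker exists and
     *         ignoreExisting is false
     */
    static boolean createMarker(Path marker, boolean ignoreExisting) throws IOException {
        Path parent = marker.getParent();
        if (parent != null) {
            Files.createDirectories(parent); // no-op if the directory exists
        }
        try {
            // Existence check and creation happen as one atomic operation.
            Files.createFile(marker);
            return true;
        } catch (FileAlreadyExistsException e) {
            if (ignoreExisting) {
                return false; // tolerate the duplicate marker
            }
            throw e; // strict mode: surface the duplicate to the caller
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("markers");
        Path marker = dir.resolve("example-file.marker"); // hypothetical name

        System.out.println(createMarker(marker, true)); // prints "true"
        System.out.println(createMarker(marker, true)); // prints "false"
        try {
            createMarker(marker, false);
        } catch (FileAlreadyExistsException e) {
            System.out.println("duplicate marker rejected");
        }
    }
}
```

With `ignoreExisting` set to `false`, the sketch simply lets the underlying `FileAlreadyExistsException` propagate, which is the strict alternative to silently accepting a duplicate.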

More background detail in https://issues.apache.org/jira/browse/HUDI-4880

Impact

No API changes; this is a minor bug fix.

Risk level: none

Contributor's checklist

  • [x] Read through contributor's guide
  • [x] Change Logs and Impact were stated clearly
  • [x] Adequate tests were added if applicable
  • [ ] CI passed

TengHuo Sep 21 '22 10:09

@hudi-bot run azure

TengHuo Sep 22 '22 06:09

The CI pipeline failed because of a "Connection refused" error; let me re-run it.

TengHuo Sep 26 '22 02:09

@hudi-bot run azure

TengHuo Sep 26 '22 02:09

@TengHuo please rebase master; there were some flaky test fixes

xushiyan Oct 31 '22 06:10

@xushiyan

Sure, no problem. I just rebased it onto the latest master.

TengHuo Oct 31 '22 06:10

I just reverted the code that ignored the duplicate-marker error. The code now throws an error if a duplicate marker file already exists.

TengHuo Nov 01 '22 08:11

Something went wrong in the Maven build; it is not related to this PR.

Error:  Failed to execute goal on project hudi-utilities_2.12: Could not resolve dependencies for project org.apache.hudi:hudi-utilities_2.12:jar:0.13.0-SNAPSHOT: Failed to collect dependencies at io.confluent:kafka-avro-serializer:jar:5.3.4: Failed to read artifact descriptor for io.confluent:kafka-avro-serializer:jar:5.3.4: Could not transfer artifact io.confluent:kafka-avro-serializer:pom:5.3.4 from/to confluent (https://packages.confluent.io/maven/): transfer failed for https://packages.confluent.io/maven/io/confluent/kafka-avro-serializer/5.3.4/kafka-avro-serializer-5.3.4.pom: Connection reset -> [Help 1]

TengHuo Nov 02 '22 08:11

CI report:

  • 861db5109feea40129392a38d17c10f84397d258 UNKNOWN
  • d3d5a30845177e6a0fe981e2fee5b6600556da76 Azure: FAILURE

Bot commands

@hudi-bot supports the following commands:

  • @hudi-bot run azure: re-run the last Azure build

hudi-bot Nov 02 '22 15:11