iceberg icon indicating copy to clipboard operation
iceberg copied to clipboard

Spark 3.5: Add max allowed failed commits to RewriteDataFiles when partial progress is enabled

Open manuzhang opened this issue 1 year ago • 1 comments

When partial progress is enabled, rewrite_data_files Spark app can succeed without any "progress" (commits). This PR adds an option partial-progress.max-failed-commits and a RuntimeException will be thrown if the number of failed commits exceed it. The default value is partital-progress.max-commits to be consistent with current behavior.

manuzhang avatar Feb 01 '24 11:02 manuzhang

@amogh-jahagirdar This is based on our discussion in #9400, but I'd like to go one step further. Throwing exception and fail can work with alert system easier than logging an error. WDYT?

manuzhang avatar Feb 04 '24 04:02 manuzhang

Sorry for the delay. I didn't forget.

aokolnychyi avatar Apr 12 '24 01:04 aokolnychyi

Looks like there are some related test failures:

TestRewriteDataFilesAction > testParallelPartialProgressWithMaxFailedCommits() FAILED
    org.opentest4j.AssertionFailedError: [We shouldn't have changed the data] 

aokolnychyi avatar Apr 12 '24 21:04 aokolnychyi

I think we should continue to use assertEquals method (which is our custom assert method). It has proper value equality for Spark.

aokolnychyi avatar Apr 12 '24 21:04 aokolnychyi

Thanks, @manuzhang! Thanks for reviewing, @nastra @RussellSpitzer!

aokolnychyi avatar Apr 15 '24 21:04 aokolnychyi