paimon [FLINK-31338] support infer parallelism for flink table store

Mar 06 '23 08:03 zhangjun0x01

I ran the e2e test org.apache.flink.table.store.tests.FileStoreBatchE2eTest from the master branch on my machine, and it fails with the same error. I think the test may be unstable.


[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 931.583 s <<< FAILURE! - in org.apache.flink.table.store.tests.FileStoreBatchE2eTest
[ERROR] testWithoutPk  Time elapsed: 930.599 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: 
Result is still unexpected after 60 retries.
Expected: {20211111, 08, Alice, Food, 90=1, 20211110, 08, Alice, Drink, 20=1, 20211111, 08, Alice, Drink, 100=1, 20211110, 08, Bob, Food, 30=1, 20211110, 09, Alice, Food, 50=1, 20211111, 09, Alice, Food, 130=1, 20211111, 08, Bob, Food, 110=1, 20211111, 09, Bob, Food, 150=1, 20211111, 09, Bob, Drink, 160=1, 20211111, 09, Alice, Drink, 140=1, 20211110, 09, Alice, Drink, 60=1, 20211110, 08, Alice, Food, 10=1, 20211110, 09, Bob, Drink, 80=1, 20211110, 08, Bob, Drink, 40=1, 20211111, 08, Bob, Drink, 120=1, 20211110, 09, Bob, Food, 70=1}
Actual: {}
        at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
        at org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
        at org.apache.flink.table.store.tests.E2eTestBase.checkResult(E2eTestBase.java:261)
        at org.apache.flink.table.store.tests.FileStoreBatchE2eTest.testWithoutPk(FileStoreBatchE2eTest.java:112)

Mar 07 '23 06:03 zhangjun0x01

@zhangjun0x01 Can we consider using the bucket number as the default parallelism for streaming, and using parallelism inference only for the batch source?

Mar 08 '23 03:03 JingsongLi

I guess the tests fail because the inferred parallelism is too high, which results in scheduling failures.

Should we disable parallelism inference by default?

Mar 08 '23 03:03 zhangjun0x01

@zhangjun0x01 Can we consider using the bucket number as the default parallelism for streaming, and using parallelism inference only for the batch source?

ok

Mar 08 '23 03:03 zhangjun0x01

I have run into the same E2E test problem before. I've pushed some commits to try to solve it. You can rebase onto master.

Mar 08 '23 06:03 yuzelin

I have run into the same E2E test problem before. I've pushed some commits to try to solve it. You can rebase onto master.

Yeah, I resubmitted.

Mar 09 '23 01:03 zhangjun0x01

@zhangjun0x01 Can we consider using the bucket number as the default parallelism for streaming, and using parallelism inference only for the batch source?

I updated it and disabled scan.infer-parallelism by default. The test passes now, so the failure may really have been caused by too much parallelism.
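
For reference, one possible reading of the policy discussed above, as a minimal sketch. The class, method, and parameter names are hypothetical and are not taken from the PR; the upper bound on the inferred value is also an assumption, added only because excessive parallelism was the suspected cause of the scheduling failures.

```java
public final class SourceParallelismSketch {

    /**
     * Hypothetical helper, not the PR's actual code.
     *
     * @param streaming    true for a streaming read, false for a batch read
     * @param inferEnabled value of scan.infer-parallelism (disabled by default)
     * @param bucketCount  number of buckets of the table
     * @param splitCount   number of splits planned for a batch read
     * @param maxInferred  assumed upper bound for the inferred parallelism
     * @param envDefault   parallelism used when inference is disabled
     */
    public static int resolveParallelism(
            boolean streaming,
            boolean inferEnabled,
            int bucketCount,
            int splitCount,
            int maxInferred,
            int envDefault) {
        if (!inferEnabled) {
            // Inference is off: keep whatever the environment is configured with.
            return envDefault;
        }
        if (streaming) {
            // Streaming: use the bucket number as the parallelism.
            return Math.max(1, bucketCount);
        }
        // Batch: infer from the split count, capped to avoid over-scheduling.
        return Math.max(1, Math.min(splitCount, maxInferred));
    }

    public static void main(String[] args) {
        System.out.println(resolveParallelism(false, false, 4, 500, 64, 8)); // 8
        System.out.println(resolveParallelism(true, true, 4, 500, 64, 8));   // 4
        System.out.println(resolveParallelism(false, true, 4, 500, 64, 8));  // 64
    }
}
```

With inference disabled, the e2e job keeps its configured parallelism, which matches the observation that the test passes once scan.infer-parallelism is off by default.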

Mar 09 '23 01:03 zhangjun0x01

@zhangjun0x01, could you resolve the conflicts in the files listed above?

Mar 22 '23 10:03 SteNicholas

You can rebase and update.

Mar 28 '23 05:03 JingsongLi

Apr 01 '23 13:04 zhangjun0x01