[Bug] compact_database StreamingCompactorSource Busy(max) 100%

Open skdfeitian opened this issue 9 months ago • 1 comments

Search before asking

  • [x] I searched in the issues and found nothing similar.

Paimon version

1.0

Compute Engine

Flink 1.17.1

Minimal reproduce step

    public class CompactDatabaseTestSourceBusy extends CompactActionITCaseBase {

        public static void main(String[] args) throws Exception {
            CompactDatabaseAction action;
            action =
                    createAction(
                            CompactDatabaseAction.class,
                            "compact_database",
                            "--warehouse",
                            "hdfs:///user/paimon/warehouse_dw",
                            "--mode",
                            "combined",
                            "--including_databases",
                            "dap_dev_test",
                            "--table_conf",
                            "snapshot.num-retained.min=3",
                            "--table_conf",
                            "snapshot.time-retained=5m",
                            "--table_conf",
                            "full-compaction.delta-commits=5",
                            "--table_conf",
                            CoreOptions.CONTINUOUS_DISCOVERY_INTERVAL.key() + "=120s");

            // Local environment with the web UI and flame graphs enabled
            Configuration conf = new Configuration();
            conf.setString(RestOptions.BIND_PORT, "8081-8089");
            conf.setBoolean("rest.flamegraph.enabled", true);
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);
            env.enableCheckpointing(30000);
            action.withStreamExecutionEnvironment(env).build();
            env.executeAsync();
        }
    }

Image

What doesn't meet your expectations?

Why do the following two operators consistently show "Busy(max): 100%"?

  • Source: Combine-MultiBucketTables--StreamingCompactorSource
  • Source: Combined-UnawareBucketTables-StreamingCompactorSource

This warehouse contains only 8 databases, and the dap_dev_test database contains only 2 tables, so compacting these tables should not consume this much CPU. From reading the source code, I also confirmed that the Thread.sleep(monitorInterval) call in the following two methods works as expected:

org.apache.paimon.flink.source.operator.CombinedAwareStreamingSource.Reader#pollNext

org.apache.paimon.flink.source.operator.CombinedUnawareStreamingSource.Reader#pollNext

This indicates that the program enters these functions and sleeps for 120 seconds as intended. Therefore, it is unclear what operations are causing the two operators to remain in the "Busy(max): 100%" state.
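One possible explanation, offered as a guess rather than a confirmed diagnosis: if the runtime derives busy time as whatever part of each second is not recorded as idle or back-pressured, then a thread that blocks in Thread.sleep inside pollNext would still be counted as busy, even though it uses almost no CPU. The sketch below only models that accounting. BusyTimeSketch, busyMsPerSecond and simulateSecond are hypothetical names for illustration; they are not Flink or Paimon APIs, and the formula is an assumption that has not been verified against Flink 1.17.1.

```java
// Hypothetical sketch, not Flink or Paimon source code. It models busy time as
// the part of each second that is neither recorded as idle nor as back-pressured.
final class BusyTimeSketch {

    /** Busy ms per second, derived as the remainder of idle and back-pressured time. */
    static double busyMsPerSecond(double idleMs, double backPressuredMs) {
        double notBusy = Math.min(idleMs + backPressuredMs, 1000.0);
        return 1000.0 - notBusy;
    }

    /**
     * Simulates one second of a source task. Time the reader spends blocked
     * inside its own poll call (for example in Thread.sleep) is assumed to be
     * invisible to the runtime, so only time spent waiting outside the poll
     * call is recorded as idle.
     */
    static double simulateSecond(long sleepInsidePollMs, long waitOutsidePollMs) {
        if (sleepInsidePollMs < 0
                || waitOutsidePollMs < 0
                || sleepInsidePollMs + waitOutsidePollMs > 1000) {
            throw new IllegalArgumentException("durations must fit into one second");
        }
        // sleepInsidePollMs is deliberately never added to the idle counter
        long recordedIdleMs = waitOutsidePollMs;
        return busyMsPerSecond(recordedIdleMs, 0);
    }

    public static void main(String[] args) {
        // Reader sleeps for the whole second inside its poll call
        System.out.println(simulateSecond(1000, 0)); // prints 1000.0
        // Reader returns immediately and the runtime waits outside the poll call
        System.out.println(simulateSecond(0, 1000)); // prints 0.0
    }
}
```

Under this assumption, "Busy(max): 100%" would mean "never reported idle" rather than "CPU saturated". The flame graph enabled in the reproduce step should help tell a sleeping thread apart from a thread that is actually computing.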

Anything else?

No response

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

skdfeitian, Apr 14 '25 09:04

A simple INSERT INTO job also shows this problem:

Image

Image

skdfeitian, Apr 24 '25 02:04