[Bug] compact_database: StreamingCompactorSource operators constantly show Busy(max) 100%
Search before asking
- [x] I searched in the issues and found nothing similar.
Paimon version
1.0
Compute Engine
Flink 1.17.1
Minimal reproduce step
public class CompactDatabaseTestSourceBusy extends CompactActionITCaseBase {

    public static void main(String[] args) throws Exception {
        // Combined-mode database compaction over a single database.
        CompactDatabaseAction action =
                createAction(
                        CompactDatabaseAction.class,
                        "compact_database",
                        "--warehouse", "hdfs:///user/paimon/warehouse_dw",
                        "--mode", "combined",
                        "--including_databases", "dap_dev_test",
                        "--table_conf", "snapshot.num-retained.min=3",
                        "--table_conf", "snapshot.time-retained=5m",
                        "--table_conf", "full-compaction.delta-commits=5",
                        "--table_conf", CoreOptions.CONTINUOUS_DISCOVERY_INTERVAL.key() + "=120s");

        // Local environment with the web UI and flame graphs enabled.
        Configuration conf = new Configuration();
        conf.setString(RestOptions.BIND_PORT, "8081-8089");
        conf.setBoolean("rest.flamegraph.enabled", true);
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);
        env.enableCheckpointing(30000);

        action.withStreamExecutionEnvironment(env).build();
        env.executeAsync();
    }
}
What doesn't meet your expectations?
My question is why the following two operators constantly show "Busy(max): 100%":
Source: Combine-MultiBucketTables--StreamingCompactorSource
Source: Combined-UnawareBucketTables-StreamingCompactorSource
This warehouse contains only 8 databases, and the dap_dev_test database contains only 2 tables, so scanning for tables to compact should not consume this much CPU. Moreover, from analyzing the source code, I found that the Thread.sleep(monitorInterval) call in the following two methods works as expected:
org.apache.paimon.flink.source.operator.CombinedAwareStreamingSource.Reader#pollNext
org.apache.paimon.flink.source.operator.CombinedUnawareStreamingSource.Reader#pollNext
This indicates that the program enters these methods and sleeps for 120 seconds as intended. It is therefore unclear which operations keep the two operators in the "Busy(max): 100%" state.
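One way to narrow this down is to look at the JVM state of the source threads while the job runs. The sketch below uses only the standard ThreadMXBean API; the class name SourceThreadStates and the helper statesOf are hypothetical names introduced here, and it assumes that the task thread names contain the operator name (for example "StreamingCompactorSource"), which should be verified against an actual thread dump. If the source threads report TIMED_WAITING while the UI shows "Busy(max): 100%", that would suggest the busy figure reflects how the time is accounted for rather than CPU usage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

public class SourceThreadStates {

    /**
     * Returns thread name -> JVM thread state for every live thread in this JVM
     * whose name contains the given fragment. Threads sharing a name overwrite
     * each other in the result map.
     */
    public static Map<String, Thread.State> statesOf(String nameFragment) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Thread.State> result = new LinkedHashMap<>();
        // No monitor or synchronizer details are needed, only name and state.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadName().contains(nameFragment)) {
                result.put(info.getThreadName(), info.getThreadState());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a source task thread that sleeps between polls.
        // The thread name is an assumption made for this demo only.
        Thread demo = new Thread(() -> {
            try {
                Thread.sleep(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "Source: demo-StreamingCompactorSource");
        demo.setDaemon(true);
        demo.start();

        Thread.sleep(200); // give the demo thread time to enter sleep()
        statesOf("StreamingCompactorSource")
                .forEach((name, state) -> System.out.println(name + " -> " + state));
        demo.interrupt();
    }
}
```

Because the reproduction runs in a local environment inside the same JVM, statesOf("StreamingCompactorSource") could be called a few times after env.executeAsync() to sample the real source threads. On a remote cluster, a thread dump taken with jstack or from the TaskManager page of the web UI gives the same information.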
Anything else?
No response
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
A simple INSERT INTO job also shows the same problem: