[HUDI-4606] Fix Flink timeline-based markers being invalid when EmbeddedTimelineServerReuse is used
Change Logs
When EmbeddedTimelineServerReuse is enabled and EmbeddedTimelineServer is disabled, TimelineBasedMarkers falls back to DirectWriteMarkers. Flink enables EmbeddedTimelineServerReuse and disables EmbeddedTimelineServer by default, so TimelineBasedMarkers cannot work with Flink.
Impact
none
**Risk level: medium**
Contributor's checklist
- [x] Read through contributor's guide
- [x] Change Logs and Impact were stated clearly
- [x] Adequate tests were added if applicable
- [ ] CI passed
CI report:
- 405334c622e8bd597f789ce1e1bf1b7d8e791327 Azure: FAILURE
Bot commands
@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
This is the code that builds the writeConfig of the writeClient. It lives in class org.apache.hudi.util.StreamerUtil, and enableEmbeddedTimelineService is false:

    public static HoodieWriteConfig getHoodieClientConfig(
        Configuration conf,
        boolean enableEmbeddedTimelineService,
        boolean loadFsViewStorageConfig) {
      ...
          .withEmbeddedTimelineServerEnabled(enableEmbeddedTimelineService)
          .withEmbeddedTimelineServerReuseEnabled(true) // make write client embedded timeline service singleton
      ...
    }

    public static HoodieWriteConfig getHoodieClientConfig(Configuration conf, boolean loadFsViewStorageConfig) {
      return getHoodieClientConfig(conf, false, loadFsViewStorageConfig);
    }

AbstractStreamWriteFunction uses this function to initialize its writeConfig.
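To make the flag combination concrete, here is a minimal, self-contained sketch of the behavior described in this PR. It is a simplified model, not Hudi's actual code: `MarkerFallbackSketch`, `WriteConfigSketch`, `MarkerType`, and `effectiveMarkerType` are hypothetical names, and the fallback condition is an assumption based on the description above.

```java
// Simplified model of the reported behavior; every name here is hypothetical.
final class MarkerFallbackSketch {

    enum MarkerType { DIRECT, TIMELINE_SERVER_BASED }

    /** Holds only the two flags that getHoodieClientConfig sets. */
    static final class WriteConfigSketch {
        final boolean embeddedTimelineServerEnabled;
        final boolean embeddedTimelineServerReuseEnabled;

        WriteConfigSketch(boolean enabled, boolean reuseEnabled) {
            this.embeddedTimelineServerEnabled = enabled;
            this.embeddedTimelineServerReuseEnabled = reuseEnabled;
        }
    }

    /** Mirrors the two-argument overload: server disabled, reuse enabled. */
    static WriteConfigSketch flinkDefaultConfig() {
        return new WriteConfigSketch(false, true);
    }

    /**
     * Assumed pre-fix rule: timeline-server-based markers are only kept
     * when the embedded timeline server flag is on; the reuse flag is ignored.
     */
    static MarkerType effectiveMarkerType(MarkerType requested, WriteConfigSketch config) {
        if (requested == MarkerType.TIMELINE_SERVER_BASED
                && !config.embeddedTimelineServerEnabled) {
            return MarkerType.DIRECT; // fall back to direct write markers
        }
        return requested;
    }

    public static void main(String[] args) {
        MarkerType effective =
            effectiveMarkerType(MarkerType.TIMELINE_SERVER_BASED, flinkDefaultConfig());
        System.out.println(effective); // prints "DIRECT"
    }
}
```

With the Flink defaults modeled here, a request for timeline-server-based markers always ends up as direct markers, which is the symptom this PR reports.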
Even though the fix may work for Flink, it is not general. The only source of truth for deciding whether requests are remote is the storage type; it has no relationship with whether we reuse the timeline server.
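A rough sketch of what a storage-type-based check could look like, for illustration only. `RemoteRequestSketch`, `StorageType`, and `usesRemoteRequests` are hypothetical names, and which storage types count as remote is an assumption here, not taken from the Hudi code base.

```java
// Illustrative only: decide "remote or not" from the storage type alone.
final class RemoteRequestSketch {

    /** Hypothetical stand-in for the file system view storage type. */
    enum StorageType { MEMORY, SPILLABLE_DISK, REMOTE_ONLY, REMOTE_FIRST }

    /**
     * Assumption: only the REMOTE_* types go through the timeline server.
     * The timeline server reuse flag is deliberately not an input.
     */
    static boolean usesRemoteRequests(StorageType storageType) {
        return storageType == StorageType.REMOTE_ONLY
            || storageType == StorageType.REMOTE_FIRST;
    }

    public static void main(String[] args) {
        for (StorageType type : StorageType.values()) {
            System.out.println(type + " -> remote=" + usesRemoteRequests(type));
        }
    }
}
```

The point of the sketch is the signature: the decision takes the storage type and nothing else, so it behaves the same for every engine regardless of how the timeline server flags are set.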
OK, I understand. Do I need to submit a new commit for this?
Yes. We can open a new PR or, better, reuse this PR and do a force-push.