Add an option for the log file block size

Open ymZhao1001 opened this issue 3 years ago • 1 comments

Change Logs

On each Hudi log append, HDFS used space grows by the block length (512 MB), not by the actual data length. Consider a scenario where many writers append concurrently to a large number of files (bucket files), but each append writes only 10 bytes. DFS used still grows by the block length (512 MB), which causes the DataNode to report insufficient disk space on data write. Even though this behavior comes from HDFS, we should also have an option to modify the block size configuration. It helps reduce the rate at which the reported disk usage (du) increases.
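To make the scenario above concrete, here is a rough back-of-the-envelope sketch. It assumes, as the description states, that each open log file holds back one full block while it is being written. The writer count of 1000 and the 8 MiB alternative block size are hypothetical values chosen for illustration, not values taken from this change.

```python
MIB = 1024 * 1024


def reserved_bytes(open_files: int, block_size: int) -> int:
    """Space held back if every open file reserves one full block (assumption)."""
    return open_files * block_size


def actual_bytes(open_files: int, payload: int) -> int:
    """Bytes of real data written when each file receives one small append."""
    return open_files * payload


open_files = 1000          # hypothetical number of concurrently appended bucket files
default_block = 512 * MIB  # block length mentioned in the description
smaller_block = 8 * MIB    # hypothetical smaller block size set through the new option
payload = 10               # bytes written per append, as in the description

print("reserved with 512 MiB blocks (MiB):", reserved_bytes(open_files, default_block) // MIB)
print("reserved with 8 MiB blocks (MiB):", reserved_bytes(open_files, smaller_block) // MIB)
print("actual data written (bytes):", actual_bytes(open_files, payload))
```

Under these assumptions the space held back scales with the block size rather than with the payload, which is why a configurable block size slows the growth of DFS used.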

Impact

The rate at which DFS used space grows can be controlled.

Risk level: none | low | medium | high

none

Contributor's checklist

  • [x] Read through contributor's guide
  • [x] Change Logs and Impact were stated clearly
  • [x] Adequate tests were added if applicable
  • [x] CI passed

ymZhao1001, Aug 11 '22 07:08

The CI failure is caused by the integration tests (IT) in the hudi-flink module.

ymZhao1001, Aug 12 '22 04:08

@ymZhao1001 Could you follow the process here by filing and claiming a Jira ticket?

yihua, Sep 06 '22 21:09

> @ymZhao1001 Could you follow the process here by filing and claiming a Jira ticket?

Done: https://issues.apache.org/jira/projects/HUDI/issues/HUDI-4794

ymZhao1001, Sep 07 '22 03:09

CI report:

  • 9e8e5113a5dd1419282a3b0aa17b796b74b7f886 Azure: FAILURE

Bot commands

@hudi-bot supports the following commands:

  • @hudi-bot run azure: re-run the last Azure build

hudi-bot, Sep 07 '22 10:09