hudi
[SUPPORT] Does StreamWriteFunction support exactly-once in Flink?
Describe the problem you faced
Flink 1.14.3 + Hudi 0.12.1. When I use org.apache.hudi.sink.StreamWriteFunction in a Flink streaming job with jobmanager.execution.failover-strategy: region set, will data be lost? It looks like this function has no state to restore.
Environment Description
- Hudi version : 0.12.1
- Hadoop version : 3.1.1
- Storage (HDFS/S3/GCS..) : HDFS
- Running on Docker? (yes/no) : no
The checkpoint triggers the commit to the Hudi table.
E.g. a Flink streaming job like kafka_source -> window -> bucket_write: when the bucket_write operator fails, its buffered data is lost. The checkpoint fails the first time, but after bucket_write is restored with an empty buffer, the next checkpoint succeeds.
The write task holds the write statuses in its state, and these are resubmitted to the driver for committing to Hudi.
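To make the reply above concrete, here is a minimal toy model in plain Java. It is not Hudi or Flink code: the classes Coordinator and WriteTask, the methods snapshot and restore, and the "ckp-N:record" status strings are all hypothetical names invented for illustration. It only sketches the idea that the buffer is flushed at a checkpoint, the resulting write statuses are kept in state, and a restored task resubmits them so the commit can still happen.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT Hudi code) of a write task that keeps write statuses in checkpointed state. */
class CheckpointCommitSketch {

    /** Hypothetical stand-in for the "driver" that commits to the table. */
    static class Coordinator {
        final List<String> committed = new ArrayList<>();

        /** Commit is idempotent here: resubmitting the same status does not duplicate it. */
        void commit(List<String> writeStatuses) {
            for (String status : writeStatuses) {
                if (!committed.contains(status)) {
                    committed.add(status);
                }
            }
        }
    }

    /** Hypothetical stand-in for a write task. */
    static class WriteTask {
        final List<String> buffer = new ArrayList<>();       // in memory only, lost on failure
        List<String> stateWriteStatuses = new ArrayList<>();  // assumed to be part of the checkpoint

        void process(String record) {
            buffer.add(record);
        }

        /** At a checkpoint: flush the buffer and keep the write statuses in state. */
        List<String> snapshot(long checkpointId) {
            List<String> statuses = new ArrayList<>();
            for (String record : buffer) {
                statuses.add("ckp-" + checkpointId + ":" + record);
            }
            buffer.clear();
            stateWriteStatuses = statuses;
            return new ArrayList<>(statuses);
        }

        /** After a failure: rebuild the task from state and resubmit the write statuses. */
        static WriteTask restore(List<String> snapshot, Coordinator coordinator) {
            WriteTask task = new WriteTask();
            task.stateWriteStatuses = new ArrayList<>(snapshot);
            coordinator.commit(task.stateWriteStatuses);
            return task;
        }
    }

    /** Runs one checkpoint, then a failure before the commit, then a restore. */
    static List<String> run() {
        Coordinator coordinator = new Coordinator();
        WriteTask task = new WriteTask();
        task.process("a");
        task.process("b");
        List<String> snapshot = task.snapshot(1L);

        // Buffered after the checkpoint, so it is not in state when the task fails.
        task.process("c");

        // The task fails before the coordinator committed checkpoint 1.
        WriteTask.restore(snapshot, coordinator);
        return coordinator.committed;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [ckp-1:a, ckp-1:b]
    }
}
```

In this sketch, record "c" is lost from the buffer and does not reach the commit. In a real Flink job, records received after the last successful checkpoint are normally re-read from a replayable source such as Kafka once the job restores, which is outside what this toy model covers.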
@seekforshell Do you need any other help here? Feel free to close this if all your doubts are resolved. Thanks.