Danny Chan comments

Results: 408 comments of Danny Chan

[SUPPORT] Inconsistent Checkpoint Size in Flink Applications with MoR

This is expected: even if incremental checkpointing is enabled, Flink triggers a full checkpoint every N delta checkpoints. It is actually not related to compaction.

[SUPPORT] Inconsistent Checkpoint Size in Flink Applications with MoR

The incremental/full checkpointing is managed by Flink. In Hudi, we do have an option 'index.ttl' to control the liveness of the index items, but it is not suggested because that...

[SUPPORT] Inconsistent Checkpoint Size in Flink Applications with MoR

It's the mapping from hoodie record key to location; a location is composed of a partition path and a file group id.
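To make the shape of that mapping concrete, here is a minimal sketch. The names `RecordLocation` and `KeyToLocationIndex` are hypothetical and are not Hudi classes; a plain in-memory dict stands in for wherever the index is actually stored.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RecordLocation:
    """Hypothetical illustration: where a record lives."""
    partition_path: str
    file_group_id: str


class KeyToLocationIndex:
    """Hypothetical index: record key -> location (a dict stands in for the real store)."""

    def __init__(self) -> None:
        self._entries: Dict[str, RecordLocation] = {}

    def lookup(self, record_key: str) -> Optional[RecordLocation]:
        # A hit means the incoming record is an update to an existing file group.
        return self._entries.get(record_key)

    def assign(self, record_key: str, location: RecordLocation) -> None:
        # Remember the location chosen for a newly inserted key.
        self._entries[record_key] = location


index = KeyToLocationIndex()
index.assign("uuid-1", RecordLocation("2024/01/01", "fg-001"))
known = index.lookup("uuid-1")     # existing key: routed to its file group
unknown = index.lookup("uuid-2")   # unseen key: treated as an insert
```

Every distinct record key adds one entry, which is why an index of this kind grows with the number of keys.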

[SUPPORT] Inconsistent Checkpoint Size in Flink Applications with MoR

You can choose the bucket index. Note that the bucket index does not support updates across multiple partitions, and the bucket number cannot scale well if it is not using consistent hashing.

[SUPPORT] Inconsistent Checkpoint Size in Flink Applications with MoR

We have support for a consistent hashing index, which can scale the bucket number automatically.
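As a generic illustration of why a fixed bucket number scales poorly while consistent hashing does not, the sketch below compares plain modulo bucketing with a textbook hash ring. This is not Hudi's implementation; `stable_hash`, `modulo_bucket`, `HashRing`, the bucket names, and the virtual-node count are all assumptions made for the example.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    # Deterministic hash so results do not vary between runs.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def modulo_bucket(record_key: str, num_buckets: int) -> int:
    # Simple bucketing: the bucket depends directly on the bucket count.
    return stable_hash(record_key) % num_buckets


class HashRing:
    """Textbook consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, bucket_ids, vnodes: int = 64) -> None:
        self._vnodes = vnodes
        self._points = []  # sorted list of (hash, bucket_id)
        for bucket_id in bucket_ids:
            self.add_bucket(bucket_id)

    def add_bucket(self, bucket_id: str) -> None:
        for i in range(self._vnodes):
            point = (stable_hash(f"{bucket_id}#{i}"), bucket_id)
            bisect.insort(self._points, point)

    def bucket_for(self, record_key: str) -> str:
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._points, (stable_hash(record_key), ""))
        return self._points[idx % len(self._points)][1]


keys = [f"key-{i}" for i in range(1000)]

# Modulo bucketing: grow from 4 to 5 buckets.
moved_mod = sum(1 for k in keys if modulo_bucket(k, 4) != modulo_bucket(k, 5))

# Hash ring: grow from 4 to 5 buckets.
ring = HashRing(["b0", "b1", "b2", "b3"])
before_ring = {k: ring.bucket_for(k) for k in keys}
ring.add_bucket("b4")
after_ring = {k: ring.bucket_for(k) for k in keys}
moved_ring = sum(1 for k in keys if before_ring[k] != after_ring[k])

print(f"modulo: {moved_mod} of {len(keys)} keys changed bucket")
print(f"ring:   {moved_ring} of {len(keys)} keys changed bucket")
```

With modulo bucketing, changing the bucket count reassigns most keys, so existing data would have to be rewritten. With the ring, only the keys taken over by the new bucket move, which is what makes resizing practical.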

[SUPPORT] Flink Incremental read task use 'payload.class' configure does not work

It should work if the payloads are not merged by the writer; otherwise the writer just takes the onus of merging.

[SUPPORT] How to skip some partitions in a table when readStreaming in Spark at the init stage

Did you try adding a filter condition on the partition fields?

[SUPPORT] How to skip some partitions in a table when readStreaming in Spark at the init stage

> but I want a config that can tell source that only reads the partition that in my configs so I do not need to use filter

That does not...

[SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC?

Currently, you should stop the streaming job, execute the alter table command with Spark, and then restart the job.

[SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC?

There is no automatic schema evolution for the streaming writer now. The limitation comes from the Flink engine: the Flink Table API already assumes a constant schema for all the records there, so for...