Danny Chan
> But for target MOR, in fact, it doesn't really check the existence of records in the target table while deciding which action to execute: match or not match. I guess...

Hmm, then we might need to debug in detail which stage is missing for MOR. Is it the missed location tag you mentioned above, and does it bring in a large cost...

One way is to check the `HoodieOperation` of the record, but first of all, it needs to be set up correctly, and it looks like the issue is only related...
> If we somehow set up the HoodieOperation of the record correctly (really check the existence of records), we slow down performance significantly (especially for BUCKET index). If not -...
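The trade-off discussed above can be sketched as follows. This is an illustrative simplification only, not the real Hudi classes: the enum and router below are hypothetical stand-ins showing how a per-record operation flag could decide the matched/not-matched branch without probing the target table for existence.

```java
// Hypothetical, simplified stand-in for routing on Hudi's HoodieOperation.
// (The real enum also carries UPDATE_BEFORE and other states.)
enum Operation { INSERT, UPDATE_AFTER, DELETE }

final class MergeRouter {
    // Returns true if the record should take the "matched" path.
    static boolean isMatched(Operation op) {
        // UPDATE_AFTER and DELETE imply the key already exists in the target;
        // INSERT implies it does not. Deciding from the flag avoids an
        // existence lookup, which is the performance concern raised above
        // (especially for the BUCKET index) -- but it only works if the
        // operation is set up correctly in the first place.
        return op == Operation.UPDATE_AFTER || op == Operation.DELETE;
    }
}
```

If the operation flag is unreliable, the router above silently misroutes records, which is why setting it up correctly is the precondition mentioned earlier.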
> hi guys, any updates for this issue now?

Yes, we are working on it.

@the-other-tim-brown I think the `emitDeletes` support for the HoodieRecord iterator brings in more overhead than I expected; can we drop it in this PR? The delete keys fetching should be...
> Streaming write to the MDT is only used for Spark and I don't think there are plans to use it for other engines.

That's true; for Flink and Java,...

> Added separate configs for glue and datahub to set database/table name in sync client.

@vineethNaroju can you explain why we need a new option key for the db/table name...
> it always gets created with `hoodie.table.name`

@vinishjail97 I have no access to the link. I see there are already some options like `hoodie.gcp.bigquery.sync.table_name` in the `BigQuerySyncConfig` on master: https://github.com/apache/hudi/blob/f1faabe2f577d7f33fdb0194a490e7c18b22546c/hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/BigQuerySyncConfig.java#L83

> Yes, we want to have similar config for glue and datahub catalog.

That's okay, but can we add similar inference logic just in the config option so that we only...
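The inference idea suggested above could look roughly like this. This is a minimal sketch, not the real Hudi `ConfigProperty` API: `InferredOption` and the key `x.sync.table_name` are hypothetical names, and the only real key assumed from the thread is `hoodie.table.name` as the fallback source.

```java
import java.util.Optional;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical sketch of "inference in the config option": a sync table-name
// option that, when not set explicitly, is inferred from hoodie.table.name,
// so the catalog sync does not need a separate mandatory key.
final class InferredOption {
    private final String key;
    private final Function<Properties, Optional<String>> inferFunc;

    InferredOption(String key, Function<Properties, Optional<String>> inferFunc) {
        this.key = key;
        this.inferFunc = inferFunc;
    }

    // Explicit value wins; otherwise fall back to the inference function.
    String resolve(Properties props) {
        String explicit = props.getProperty(key);
        if (explicit != null) {
            return explicit;
        }
        return inferFunc.apply(props).orElseThrow(
            () -> new IllegalStateException("No value for " + key));
    }
}
```

With this shape, a glue or datahub sync config would only need one inference function pointing at `hoodie.table.name`, rather than a brand-new user-facing key that duplicates it.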