Yujiang Zhong comments

Results: 51 comments of Yujiang Zhong

Avoid conflicts between rewrite datafiles and flink CDC writes

> This solution is similar to my earlier PRs (https://github.com/apache/iceberg/pull/4748 and https://github.com/apache/iceberg/pull/4703). That approach was deemed dangerous, so I switched to another one (https://github.com/apache/iceberg/pull/5760). I think this is different...

Parquet: Implement column index filter and update row read path to support page skipping

@rdblue @Fokko Could you help review this?

Parquet: Implement column index filter and update row read path to support page skipping

@rdblue The description has been updated. I hope I made it clear; please let me know if anything is unclear.

Parquet: Implement column index filter and update row read path to support page skipping

Hi @hengqujushi @sunchao, I rebased this and fixed the revapi failure. I think this should be ready for review now. Though I feel this PR is kind of big, maybe...

Parquet: Implement column index filter and update row read path to support page skipping

Hi @iflytek-hmwang5, we're waiting for people in the community who are interested in it to review it.

Parquet: Implement column index filter and update row read path to support page skipping

> @zhongyujiang Hi, what can I do to speed up the process? Looking forward to this feature very much. Um, I'm not sure. I think it depends on the community's priorities,...

[Feature] Paimon Spark 2025 Roadmap

@Zouxxyy Thank you for raising this; these optimizations are all highly anticipated! > [feat] Integrate Spark's DataFrame V2 API. If no one has worked on this, I would like to...

[Feature] Paimon Spark 2025 Roadmap

> especially dynamic bucket mode in your implementation Yeah, I haven't found an easy way to support this yet. In fact, I've only implemented V2 write for the fixed bucket...

[Feature] Paimon Spark 2025 Roadmap

> > [perf] Optimize table writing, including automatic repartitioning, rebalancing data, and so on. > > I found that for a Paimon bucket table, the writers may not be evenly distributed...

[Feature] Introduce mod bucket generator to paimon

@Aitozi @JingsongLi IMO, while MOD mode can solve the issue here, its applicable scenarios are limited. I think Iceberg's [truncate partitioning](https://iceberg.apache.org/spec/#truncate-transform-details) might be a better solution. It can also solve...
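
To make the comparison in the last comment concrete, here is a minimal Python sketch of the two ideas. `truncate_int` and `truncate_str` follow the truncate transform as described in the linked Iceberg spec (round an integer down to a multiple of the width; keep the first L characters of a string). `mod_bucket` is only an assumption about what a MOD-style generator would compute, not Paimon's actual implementation, and all three function names are hypothetical.

```python
def mod_bucket(key: int, num_buckets: int) -> int:
    """Assumed MOD-style bucketing: bucket id is the key modulo the bucket count."""
    return key % num_buckets


def truncate_int(value: int, width: int) -> int:
    """Iceberg-style truncate for integers: round down to a multiple of width.

    Python's % already returns a non-negative remainder for a positive width,
    so negative values round toward negative infinity, as the spec requires.
    """
    return value - (value % width)


def truncate_str(value: str, length: int) -> str:
    """Iceberg-style truncate for strings: keep at most `length` characters."""
    return value[:length]


if __name__ == "__main__":
    # Neighbouring keys scatter across buckets under MOD...
    print([mod_bucket(k, 4) for k in (100, 101, 102, 103)])
    # ...but stay together in one range under truncate.
    print([truncate_int(k, 10) for k in (100, 101, 102, 103)])
```

The design difference is that MOD spreads adjacent keys over different buckets, while truncate groups each contiguous range of keys under one partition value.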