KnightChess

70 comments by KnightChess

Sorry for the late reply. @jonvex I will close this PR. Thank you for your work on it.

@bk-mz yes, MOR does not support the Parquet native bloom filter. Because log files are merged on read, the native bloom filter does not reflect the latest data and is not accurate; only `cow` or...

@bk-mz yes, `set hoodie.datasource.query.type = read_optimized`
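
A minimal sketch of how that setting might be used in a Spark SQL session, assuming a Hudi MOR table named `hudi_mor_table` with a column `id` (both names are hypothetical, for illustration only). A read-optimized query reads only the base Parquet files and skips the log files.

```sql
-- Switch the session to read-optimized queries (base files only, no log merge).
SET hoodie.datasource.query.type=read_optimized;

-- Hypothetical table and column; a point lookup like this is where a
-- Parquet bloom filter on `id` could help skip row groups.
SELECT * FROM hudi_mor_table WHERE id = 'some-key';

-- Switch back to the snapshot view, which merges log files on read.
SET hoodie.datasource.query.type=snapshot;
```

Note that read-optimized results can lag behind the snapshot view until the log files are compacted into base files.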

@bk-mz the operating system's cache may also have an impact. Can you provide detailed metrics from the Spark UI?

@bk-mz you can see the scan RDD's `number of output rows` in the SQL tab of the Spark UI.

![image](https://github.com/apache/hudi/assets/20125927/2dd2b745-96b2-464d-8541-1119197bed48)

We can only analyze the scan RDD. The total query time includes time spent in many other stages, so I think the result is normal.

@bk-mz yes, according to the metrics, it is working.

A variety of factors can lead to differences in query time, such as IO, CPU, disk load...; in Spark, parallelism, the time to scale up executors...; in Hudi, snapshot...