KnightChess
Sorry for the late reply. @jonvex I will close this PR. Thank you for your work on it.
@bk-mz yes, MOR does not support the Parquet native bloom filter. Log files are merged on read, so the native bloom filter in the base file does not reflect the latest data and is not accurate, only `cow` or...
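To illustrate why the base file's bloom filter goes stale, here is a toy model in plain Python. It is only a sketch: the dictionaries, the `city` column, and the function names are hypothetical, and a `set` stands in for the real Parquet bloom filter (so it has no false positives).

```python
# Toy model of one merge-on-read file group (all names are hypothetical).
base_rows = {1: "paris", 2: "berlin"}  # record key -> city, stored in the base file
log_rows = {2: "rome"}                 # later update, still sitting in a log file

# Stand-in for the native bloom filter: built once, when the base file is written.
base_city_bloom = set(base_rows.values())


def snapshot_rows():
    """Merge log records over base records, as a snapshot read would."""
    return {**base_rows, **log_rows}


def snapshot_query(city):
    """Correct snapshot result: filter after merging base and log."""
    return sorted(k for k, c in snapshot_rows().items() if c == city)


def snapshot_query_with_base_bloom(city):
    """Snapshot read that trusts the stale base-file filter to skip files."""
    if city not in base_city_bloom:
        return []  # file group skipped, the update in the log is never seen
    return snapshot_query(city)


def read_optimized_query(city):
    """Read-optimized result: only the base file is read, so the filter matches it."""
    return sorted(k for k, c in base_rows.items() if c == city)


print(snapshot_query("rome"))                  # row 2 was updated to "rome"
print(snapshot_query_with_base_bloom("rome"))  # filter says "rome" is absent
print(read_optimized_query("rome"))            # base file alone has no "rome"
```

The filter only describes the base file, so it agrees with a read-optimized read but can wrongly skip rows that a snapshot read should return.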
@bk-mz yes, `set hoodie.datasource.query.type = read_optimized`
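If you read through the DataFrame API instead of SQL, the same setting can be passed as a reader option. A minimal sketch, assuming an existing SparkSession with the Hudi bundle available; `load_read_optimized` and `base_path` are hypothetical names, not part of Hudi.

```python
# Key and value are the ones from the SQL `set` statement above.
READ_OPTIMIZED_OPTS = {"hoodie.datasource.query.type": "read_optimized"}


def load_read_optimized(spark, base_path):
    """Load a Hudi table in read-optimized mode.

    `spark` is assumed to be an existing SparkSession with Hudi on the
    classpath; `base_path` is a placeholder for the table location.
    """
    return spark.read.format("hudi").options(**READ_OPTIMIZED_OPTS).load(base_path)
```

Usage would look like `df = load_read_optimized(spark, "/path/to/table")`, where the path is a placeholder.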
@bk-mz the operating system's cache may also have an impact. Can you provide detailed metrics from the Spark UI?
@bk-mz you can see the scan RDD's `number of output rows` in the SQL tab of the Spark UI.
@bk-mz can you see the time cost at this point?

We can only analyse the scan RDD. A query spends time in many different places, so I think the result is normal.
@bk-mz yes, according to the metrics, it is working.
Many factors can lead to differences in query time, like IO, CPU, disk load... in Spark, like parallelism, executor scale-up time..., in Hudi, snapshot...