Results 365 comments of QP Hou

I also think there is room to reduce the memory usage by 10x. I would drop all of stats, stats_parsed, partition_values and parsed_partition_values in memory. Both stats and partition values...

Thanks @mgill25, let us know if you need any help :)

Good call on the tombstone, @dispanser. I think we can optimize it further by removing all other fields and keeping only path and deleted_timestamp. I don't think we need access for...
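
To make the slimmed-down tombstone idea concrete, here is a minimal sketch. It assumes a hypothetical `FullTombstone` with a few extra fields and a hypothetical `SlimTombstone` that keeps only `path` and `deleted_timestamp`; neither type is the actual delta-rs definition, and `expired_paths` is an illustrative helper, not an existing API.

```rust
use std::collections::HashMap;

/// Hypothetical "full" tombstone; the extra fields stand in for whatever
/// the real remove action carries beyond path and timestamp.
#[derive(Debug, Clone)]
pub struct FullTombstone {
    pub path: String,
    pub deleted_timestamp: i64,
    pub data_change: bool,
    pub partition_values: HashMap<String, Option<String>>,
    pub size: Option<i64>,
}

/// Hypothetical slim tombstone that keeps only the two fields named above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SlimTombstone {
    pub path: String,
    pub deleted_timestamp: i64,
}

impl From<FullTombstone> for SlimTombstone {
    /// Drops every field except path and deleted_timestamp.
    fn from(full: FullTombstone) -> Self {
        SlimTombstone {
            path: full.path,
            deleted_timestamp: full.deleted_timestamp,
        }
    }
}

/// Returns the paths of tombstones deleted strictly before `cutoff`
/// (same unit as `deleted_timestamp`). Only the two retained fields are needed.
pub fn expired_paths(tombstones: &[SlimTombstone], cutoff: i64) -> Vec<&str> {
    tombstones
        .iter()
        .filter(|t| t.deleted_timestamp < cutoff)
        .map(|t| t.path.as_str())
        .collect()
}
```

A retention check like `expired_paths` only reads the path and the timestamp, which is why the other fields look safe to drop in this sketch.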

@mgill25 please feel free to send a draft PR directly if you prefer, we can comment and iterate there.

Building on @dispanser's idea, I think we could even extend it to all writers that don't handle checkpoints. By moving checkpoints to a dedicated writer/lambda function, we could even...

Yep, it should be easy to prevent side effects from this optimization by checking for the flag in both the checkpoint and vacuum code paths.
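
A rough sketch of what such a guard could look like. Everything here is an assumption for illustration: `tombstones_loaded` is a hypothetical flag on a hypothetical `TableState`, and `create_checkpoint` and `vacuum` are stand-ins rather than the real delta-rs functions.

```rust
/// Hypothetical error returned when an operation needs tombstones
/// but the table was loaded without them.
#[derive(Debug, PartialEq)]
pub enum TableError {
    TombstonesNotLoaded(&'static str),
}

/// Hypothetical in-memory state; `tombstones_loaded` is the flag being checked.
pub struct TableState {
    pub tombstones_loaded: bool,
    pub tombstones: Vec<String>,
}

/// Shared guard used by every code path that depends on tombstones.
fn ensure_tombstones(state: &TableState, op: &'static str) -> Result<(), TableError> {
    if state.tombstones_loaded {
        Ok(())
    } else {
        Err(TableError::TombstonesNotLoaded(op))
    }
}

/// Stand-in for checkpoint creation: returns how many tombstones would be written.
pub fn create_checkpoint(state: &TableState) -> Result<usize, TableError> {
    ensure_tombstones(state, "checkpoint")?;
    Ok(state.tombstones.len())
}

/// Stand-in for vacuum: returns the tombstone paths it would consider.
pub fn vacuum(state: &TableState) -> Result<Vec<String>, TableError> {
    ensure_tombstones(state, "vacuum")?;
    Ok(state.tombstones.clone())
}
```

Failing early in both paths means a table loaded without tombstones can never silently write an incomplete checkpoint or skip files during vacuum.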

How about adding a new config value to https://github.com/delta-io/delta-rs/blob/main/rust/src/delta_config.rs? Then we can pass in the full table config as an optional argument. We will have to either introduce breaking API...

https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/logical_plan/builder.rs is a good example of how the builder pattern works in the real world.
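
For readers who have not used the pattern, here is a minimal, self-contained sketch of a consuming builder in Rust. `TableOptions`, `TableOptionsBuilder`, and their fields are hypothetical names invented for this example; they are not taken from delta-rs or datafusion.

```rust
/// Hypothetical options struct produced by the builder.
#[derive(Debug, Clone, PartialEq)]
pub struct TableOptions {
    pub table_uri: String,
    pub version: Option<i64>,
    pub require_tombstones: bool,
}

/// Hypothetical builder: required values go in `new`, optional ones in `with_*` methods.
pub struct TableOptionsBuilder {
    table_uri: String,
    version: Option<i64>,
    require_tombstones: bool,
}

impl TableOptionsBuilder {
    /// Starts with defaults: latest version, tombstones required.
    pub fn new(table_uri: impl Into<String>) -> Self {
        TableOptionsBuilder {
            table_uri: table_uri.into(),
            version: None,
            require_tombstones: true,
        }
    }

    /// Pins a specific table version.
    pub fn with_version(mut self, version: i64) -> Self {
        self.version = Some(version);
        self
    }

    /// Opts out of loading tombstones.
    pub fn without_tombstones(mut self) -> Self {
        self.require_tombstones = false;
        self
    }

    /// Validates the collected settings and produces the final options.
    pub fn build(self) -> Result<TableOptions, String> {
        if self.table_uri.is_empty() {
            return Err("table_uri must not be empty".to_string());
        }
        Ok(TableOptions {
            table_uri: self.table_uri,
            version: self.version,
            require_tombstones: self.require_tombstones,
        })
    }
}
```

New optional settings can be added as extra `with_*` methods without changing existing call sites, for example `TableOptionsBuilder::new("./my_table").with_version(3).build()`.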

There are two problems here. The first is that datafusion does not do automatic type casting for your queries, which we really should. So please file an upstream ticket for...

Microsecond is the right type to use for the Delta table schema, but we need to update our pyarrow and datafusion integrations to work with Parquet files that are written with...