Felipe Pessoto comments

Results: 81 comments of Felipe Pessoto

[Feature Request] Optimize common case: SELECT COUNT(*) FROM Table

@zsxwing, you assigned @vkorukanti. Does that mean you plan to implement it?

[Feature Request] Optimize common case: SELECT COUNT(*) FROM Table

@vkorukanti do you have any example of code where the query plan is replaced by an optimized version? I think it would be a good starting point.

[Feature Request] Optimize common case: SELECT COUNT(*) FROM Table

Hi @vkorukanti, I'm doing some experiments and I have two different approaches (very high level only; I'm not sure whether they are feasible). I'd like to hear your...

[Feature Request] Optimize common case: SELECT COUNT(*) FROM Table

I started working on option #1 and have a PoC working.

Feature Request - Auto Analyze Table

@scottsand-db do you have any updates on this? Is it expected for the next release? Thanks

Feature Request - Auto Analyze Table

I need to test it. In my experiments with Parquet and Delta, ANALYZE TABLE made the queries ~40% faster than both Parquet (without ANALYZE TABLE) and Delta.

Feature Request - Auto Analyze Table

BTW, do you mean Delta 1.2? I don't see these changes in the 1.1 changelog.

Feature Request - Auto Analyze Table

@scottsand-db in my test with 1.2 it didn't improve performance. Looking at the query plans, they are the same as in 1.1, except for PreparedDeltaFileIndex instead of TahoeLogFileIndex. Stats are expected to...

Feature Request - Auto Analyze Table

**UPDATE**: I found the stats (min, max, null count) in the Delta log, but I'm not sure why they are not being used during the query. Yes, I regenerated it. Do you know...

Feature Request - Auto Analyze Table

Some more questions, please: 1. Would it be correct to say that Delta stats are file-wise while ANALYZE stats are table-wise? 2. ANALYZE is a Spark feature, while Delta stats are part...
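
For readers following the stats discussion above, here is a minimal sketch of how the per-file stats (min, max, null count, record count) stored in Delta log `add` actions could be aggregated into table-level answers such as a row count. It is an illustration under assumptions, not the PoC or Delta's implementation: the function names and the sample log lines are hypothetical, and it ignores checkpoints, deletion vectors, and partition values.

```python
import json


def live_file_stats(log_lines):
    """Replay add/remove actions in order; return {path: stats dict or None}."""
    files = {}
    for line in log_lines:
        action = json.loads(line)
        if "add" in action:
            add = action["add"]
            raw = add.get("stats")  # "stats" is itself a JSON-encoded string
            files[add["path"]] = json.loads(raw) if raw else None
        elif "remove" in action:
            files.pop(action["remove"]["path"], None)
    return files


def count_from_stats(files):
    """Sum numRecords over live files; None if any file lacks stats."""
    total = 0
    for stats in files.values():
        if stats is None or "numRecords" not in stats:
            return None  # cannot answer from metadata alone
        total += stats["numRecords"]
    return total


def column_range(files, column):
    """Combine per-file min/max into a table-wide (min, max) for one column."""
    mins = [s["minValues"][column] for s in files.values()
            if s and column in s.get("minValues", {})]
    maxs = [s["maxValues"][column] for s in files.values()
            if s and column in s.get("maxValues", {})]
    return (min(mins), max(maxs)) if mins and maxs else None


def _add(path, n, lo, hi):
    # Hypothetical helper that builds one sample "add" log line.
    stats = {"numRecords": n, "minValues": {"id": lo},
             "maxValues": {"id": hi}, "nullCount": {"id": 0}}
    return json.dumps({"add": {"path": path, "stats": json.dumps(stats)}})


# Illustrative log lines only; not taken from a real table.
SAMPLE_LOG_LINES = [
    _add("a.parquet", 3, 1, 3),
    _add("b.parquet", 2, 10, 11),
    json.dumps({"remove": {"path": "a.parquet"}}),
    _add("c.parquet", 4, 4, 7),
]

files = live_file_stats(SAMPLE_LOG_LINES)
print(count_from_stats(files))    # row count derived from metadata only
print(column_range(files, "id"))  # table-wide min/max for column "id"
```

The sketch returns `None` whenever a live file has no stats, since in that case a metadata-only answer would be wrong and the data files would have to be scanned instead.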