Add `auto_explain` mode
Is your feature request related to a problem or challenge?
It would be useful to collect plans from queries automatically, without needing to explicitly add EXPLAIN ANALYZE. This would ensure that the original application would not need to change, as the queries would still return data as normal, while the plans would be printed to stdout/stderr.
Some existing systems already support this, such as `auto_explain` in PostgreSQL and `.eqp` in the SQLite CLI.
Describe the solution you'd like
A config option, such as `datafusion.execution.auto_explain`, that would enable this feature.
I already created a small proof of concept, and this feature would be relatively easy to implement. We just need to wrap the execution plan in an `AnalyzeExec`. This `AnalyzeExec` would also need a flag to indicate when it is in `auto_explain` mode, in which case it would print the plan and return the input batches unchanged.
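To illustrate the pass-through idea, here is a minimal sketch using only the Rust standard library. It is not DataFusion's `AnalyzeExec` and not the proof of concept mentioned above: `AutoExplain`, `Metrics`, and `Batch` are hypothetical names, and a plain iterator stands in for a record batch stream. The wrapper forwards every batch untouched, records simple metrics, and prints a one-line report to stderr once the input is exhausted.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a record batch: just a vector of rows.
type Batch = Vec<i64>;

/// Metrics gathered while the wrapped input is consumed (assumed shape).
#[derive(Debug, Default)]
struct Metrics {
    batches: usize,
    rows: usize,
    elapsed: Duration,
}

/// Pass-through wrapper: yields the input batches unchanged and reports
/// the collected metrics once the input is exhausted.
struct AutoExplain<I: Iterator<Item = Batch>> {
    input: I,
    plan: String,
    metrics: Metrics,
    reported: bool,
}

impl<I: Iterator<Item = Batch>> AutoExplain<I> {
    fn new(plan: &str, input: I) -> Self {
        Self {
            input,
            plan: plan.to_string(),
            metrics: Metrics::default(),
            reported: false,
        }
    }

    /// Deterministic part of the report (timing is printed separately).
    fn report(&self) -> String {
        format!(
            "{}: batches={}, rows={}",
            self.plan, self.metrics.batches, self.metrics.rows
        )
    }
}

impl<I: Iterator<Item = Batch>> Iterator for AutoExplain<I> {
    type Item = Batch;

    fn next(&mut self) -> Option<Batch> {
        // Time only the work done by the wrapped input.
        let start = Instant::now();
        let item = self.input.next();
        self.metrics.elapsed += start.elapsed();

        match &item {
            Some(batch) => {
                self.metrics.batches += 1;
                self.metrics.rows += batch.len();
            }
            // Print the report exactly once, when the input ends.
            None if !self.reported => {
                self.reported = true;
                eprintln!("{}, elapsed={:?}", self.report(), self.metrics.elapsed);
            }
            None => {}
        }
        item
    }
}
```

A caller would construct `AutoExplain::new("ScanExec", batches.into_iter())` and consume it like any other iterator. Printing to stderr rather than stdout keeps the report separate from the query output, so the application sees the same data as before.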
I would be happy to create a PR with this.
Describe alternatives you've considered
Adding `EXPLAIN ANALYZE` manually, but this requires changing the application.
Additional context
No response
@2010YOUY01 and @NGA-TRAN have also been working in this area -- they may have some ideas here
This could also potentially be a datafusion-cli only feature...
The idea sounds good to me. Thanks @nuno-faria
Hi @NGA-TRAN @2010YOUY01, I am getting started with DataFusion, and this issue seems exciting. Can I help you in any way?
@carpecodeum thanks for the offer. Right now I have an implementation ready, which I need to clean up and add some tests to before creating a PR. I will try to get that done soon. So for this issue there isn't much to do right now, but I will tag you once the PR is ready so you can help with the review if you want.