
Datafusion 48 Clickbench Q6 and Q0 regression

Open robert3005 opened this issue 5 months ago • 4 comments

Describe the bug

When upgrading to DataFusion 48, our continuous benchmarking infra detected a 25x regression in Q6 and a 10x regression in Q0 of ClickBench. These queries were previously answered entirely from stats, which might no longer be the case if the slowdown is this significant.

To Reproduce

No response

Expected behavior

No response

Additional context

No response

robert3005 avatar Jun 18 '25 14:06 robert3005

I think this behavior is expected, it's explicitly called out in #16080.

AdamGS avatar Jun 18 '25 18:06 AdamGS

There's also an issue (#16158) to change the default config, I'll take it.

AdamGS avatar Jun 18 '25 18:06 AdamGS

One thing that's not clear to me is whether statistics will be fetched when a query could be answered from statistics, or whether they won't be and the full query will run.

robert3005 avatar Jun 18 '25 18:06 robert3005

For ListingTable as it is now, the only thing taken into account is the config value.

AdamGS avatar Jun 18 '25 18:06 AdamGS

https://github.com/apache/datafusion/pull/16447 is merged, so closing this.

blaginin avatar Jun 19 '25 18:06 blaginin

To be clear, I think the PR that changed this was:

  • https://github.com/apache/datafusion/pull/16080

alamb avatar Jun 20 '25 19:06 alamb