Add hooks to `SchemaAdapter` to add custom column generators
Closes #15220
A lot of the work of this PR is meant to resolve https://github.com/apache/datafusion/issues/15220#issuecomment-2727534085. I think I'll move that into a standalone PR.
I've moved the complex bit over to https://github.com/apache/datafusion/pull/15263. I'll let that settle first then resume work here.
Noting that in https://github.com/apache/datafusion/pull/15263#discussion_r1997816085 I realized it might be good to have a way to report stats for generated columns before they are generated (is it all nulls? is it a constant?) so they can be used for stats pruning.
Now that https://github.com/apache/datafusion/pull/15263 is merged I'll come back here and:
- Resolve conflicts.
- Add an API for the SchemaAdapter to declare the stats for potentially generated columns if they are known ahead of time.
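To make the second item concrete, here is a rough, std-only sketch of what such a declaration could look like. Everything in it (`GeneratedColumnStats`, `DeclareGeneratedStats`, `DefaultingAdapter`, `can_prune_eq`) is a hypothetical name made up for illustration, not an existing DataFusion API, and values are restricted to `i64` to keep it short.

```rust
// Hypothetical sketch: none of these names exist in DataFusion.

/// What an adapter might know about a column before generating it.
#[derive(Debug, Clone, PartialEq)]
pub enum GeneratedColumnStats {
    /// Every value will be NULL (e.g. the column is missing from the file).
    AllNull,
    /// Every value will be the same non-null constant.
    Constant(i64),
    /// Nothing is known ahead of time.
    Unknown,
}

/// Hypothetical hook an adapter could implement to report stats up front.
pub trait DeclareGeneratedStats {
    fn generated_column_stats(&self, column: &str) -> GeneratedColumnStats;
}

/// Toy adapter: a column missing from the file is filled with a configured
/// default if there is one, and with NULL otherwise.
pub struct DefaultingAdapter {
    pub file_columns: Vec<String>,
    pub defaults: Vec<(String, i64)>,
}

impl DeclareGeneratedStats for DefaultingAdapter {
    fn generated_column_stats(&self, column: &str) -> GeneratedColumnStats {
        if self.file_columns.iter().any(|c| c == column) {
            // The column is read from the file, so nothing is generated here
            // and the file's own statistics would apply instead.
            return GeneratedColumnStats::Unknown;
        }
        match self.defaults.iter().find(|(name, _)| name == column) {
            Some((_, value)) => GeneratedColumnStats::Constant(*value),
            None => GeneratedColumnStats::AllNull,
        }
    }
}

/// Can a file be skipped for the predicate `column = literal`?
pub fn can_prune_eq(stats: &GeneratedColumnStats, literal: i64) -> bool {
    match stats {
        // `NULL = literal` is never true, so no row can match.
        GeneratedColumnStats::AllNull => true,
        // A constant column matches either every row or none.
        GeneratedColumnStats::Constant(value) => *value != literal,
        // Without stats we have to keep the file.
        GeneratedColumnStats::Unknown => false,
    }
}
```

With something along these lines, stats pruning could ask the adapter about a missing column and skip a file for `b = 8` when `b` is known to be the constant `7`, without ever materializing the column.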
Marking as ready for review. The main TODO is an API for transmitting statistics information for generated columns before they get generated, but that can even be a followup PR.
Looking at how filter pushdown interacts with partition columns, I think this PR could improve that.
Currently the partition values get bound when the FileStream is created, which is after the predicate pushdown is applied. Filters that depend on both the partition values and the data are evaluated via a FilterExec.
This means that partition values are not available in predicate pushdown, so that filtering instead happens upstream in a FilterExec.
I feel like this change could help with that... but some details are missing: we somehow need to pipe the partition values into the FileSource so that it can in turn pass in the info to generate the partition columns on the fly if needed. Or something like that...
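As a very rough illustration of the idea, here is a std-only sketch of binding a file's partition values into a predicate before it is pushed down. The `Expr` type and `bind_partition_values` are toy stand-ins invented for this comment (not DataFusion's expression types or any existing function), and the sketch only handles `i64` literals and ignores NULL semantics.

```rust
use std::collections::HashMap;

/// Toy expression tree (hypothetical, not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    Bool(bool),
    Eq(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Replace partition columns with this file's values and fold whatever
/// becomes constant, leaving a predicate over data columns only.
pub fn bind_partition_values(expr: &Expr, parts: &HashMap<String, i64>) -> Expr {
    match expr {
        Expr::Column(name) => match parts.get(name) {
            // Partition column: its value is fixed for the whole file.
            Some(value) => Expr::Literal(*value),
            // Data column: leave it for the file reader to evaluate.
            None => expr.clone(),
        },
        Expr::Literal(_) | Expr::Bool(_) => expr.clone(),
        Expr::Eq(l, r) => {
            match (bind_partition_values(l, parts), bind_partition_values(r, parts)) {
                (Expr::Literal(a), Expr::Literal(b)) => Expr::Bool(a == b),
                (l, r) => Expr::Eq(Box::new(l), Box::new(r)),
            }
        }
        Expr::Gt(l, r) => {
            match (bind_partition_values(l, parts), bind_partition_values(r, parts)) {
                (Expr::Literal(a), Expr::Literal(b)) => Expr::Bool(a > b),
                (l, r) => Expr::Gt(Box::new(l), Box::new(r)),
            }
        }
        Expr::And(l, r) => {
            match (bind_partition_values(l, parts), bind_partition_values(r, parts)) {
                // One side is false: the whole file can be skipped.
                (Expr::Bool(false), _) | (_, Expr::Bool(false)) => Expr::Bool(false),
                // One side is true: only the other side still matters.
                (Expr::Bool(true), other) | (other, Expr::Bool(true)) => other,
                (l, r) => Expr::And(Box::new(l), Box::new(r)),
            }
        }
    }
}
```

For a predicate like `year = 2024 AND value > 5`, a file in the `year=2024` partition would be left with just `value > 5` to push down, and a file in `year=2023` would fold to `false` and could be skipped entirely. How the partition values actually reach the FileSource is the missing piece described above.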
Marking as draft until I have time to work on this.
In https://github.com/apache/datafusion/issues/16800 I'm proposing that we replace SchemaAdapter, so I don't plan to work on this PR anymore.