Added Apache Arrow provider
Added an Apache Arrow provider in Airflow and implemented a basic AdbcHook. The AdbcHook extends the DbApiHook, so it can be reused across all SQL-related operators. It could also be used to test integration with Apache DataFusion.
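To illustrate why building on the DB-API hook makes the hook reusable, here is a rough, self-contained sketch rather than the code in this PR. `MiniDbApiHook` and `SketchAdbcHook` are hypothetical names, and the standard-library `sqlite3` module stands in for an ADBC driver's DB-API 2.0 connection so the snippet runs without extra dependencies.

```python
import sqlite3
from abc import ABC, abstractmethod
from contextlib import closing


class MiniDbApiHook(ABC):
    """Hypothetical, stripped-down stand-in for a DB-API based hook."""

    @abstractmethod
    def get_conn(self):
        """Return a DB-API 2.0 connection."""

    def get_records(self, sql, parameters=None):
        # Generic logic written once against the DB-API 2.0 interface;
        # any subclass that supplies get_conn() inherits it unchanged.
        with closing(self.get_conn()) as conn, closing(conn.cursor()) as cur:
            cur.execute(sql, parameters or ())
            return cur.fetchall()


class SketchAdbcHook(MiniDbApiHook):
    """Hypothetical subclass: only the connection factory is backend-specific."""

    def __init__(self, uri=":memory:"):
        self.uri = uri

    def get_conn(self):
        # Assumption: a real hook would return an ADBC DB-API connection here;
        # sqlite3 is used only as a runnable placeholder.
        return sqlite3.connect(self.uri)


hook = SketchAdbcHook()
rows = hook.get_records("SELECT ? + ? AS total", (1, 2))
```

The point of the sketch is that only `get_conn()` differs per backend, so anything written against the base hook's methods works with the new hook as well.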
@zeroshade
Just for anyone looking here - this is a draft for discussion between me, @dabla and @zeroshade - we will still need to start a DISCUSSION thread for the new provider - and we think Arrow and ADBC are a good addition. But we first have to discuss the approach :)
As @potiuk mentioned, I believe this needs a devlist conversation first. Looking forward to it
Yep. This one is mostly to gather learnings, get feedback from @zeroshade and see how we can turn it into a "convincing" devlist proposal - by showing some use cases and a small POC of the implementation and what it allows :).
We'll experiment a bit with it and gather our thoughts and see what can come out of it.
@potiuk @zeroshade This provider could become useful in combination with Apache DataFusion. Since DataFusion also uses Apache Arrow underneath, the provider could help when accessing different backends that have Arrow support.
Agreed. I think once we get through the "bugfixing 3.1" period we will finalize and move forward with the providers in general.