Add `job_id` parameter to `BigQueryGetDataOperator`
solves: #39127
This PR adds the `job_id` parameter to `BigQueryGetDataOperator` so that it can fetch data from the results of selection queries executed by `BigQueryInsertJobOperator` (or any other querying interface). The new parameter is mutually exclusive with `table_id` and its related parameters (`dataset_id` and `use_legacy_sql`).
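To illustrate the mutual-exclusivity rule described above, here is a minimal standalone sketch. The function name `validate_source_args`, its return values, and the error messages are hypothetical and are not taken from the operator's code; the actual check in the PR may differ. `use_legacy_sql` is left out because the sketch only needs arguments that default to `None`.

```python
from typing import Optional


def validate_source_args(
    job_id: Optional[str] = None,
    table_id: Optional[str] = None,
    dataset_id: Optional[str] = None,
) -> str:
    """Return "job" or "table" depending on which data source was given.

    Hypothetical helper: it only mirrors the rule stated in the PR
    description, not the operator's real implementation.
    """
    # job_id must not be combined with the table-based parameters.
    if job_id and (table_id or dataset_id):
        raise ValueError("'job_id' is mutually exclusive with 'table_id' and 'dataset_id'")
    # Exactly one of the two sources has to be provided.
    if not job_id and not table_id:
        raise ValueError("either 'job_id' or 'table_id' must be specified")
    return "job" if job_id else "table"
```

With these assumptions, `validate_source_args(job_id="abc")` selects the job-based path, while passing both `job_id` and `table_id` raises a `ValueError`.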
After merging this PR, the original issue of fetching results from complex queries (for example, queries with `ORDER BY` clauses) will be solved by running two operators sequentially:
- Running the complex selection query with `BigQueryInsertJobOperator`
- Running `BigQueryGetDataOperator` while providing `job_id` from the previous step (templated): `job_id = "{{ task_instance.xcom_pull(task_ids='insert_job_op', key='return_value') }}"`
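The two steps above could be wired together roughly as follows. This is a sketch that assumes `BigQueryGetDataOperator` accepts `job_id` as proposed in this PR; the DAG id, the `get_data_op` task id, the table name, the query, and `max_results` are placeholders chosen for illustration. The Airflow imports sit inside `build_dag()` only so that the constants can be inspected without Airflow installed.

```python
from datetime import datetime

# Task id of the query step; the template below must reference the same value.
INSERT_TASK_ID = "insert_job_op"

# Template from the PR description: pulls the job id that the insert-job
# task pushed to XCom as its return value.
JOB_ID_TEMPLATE = (
    "{{ task_instance.xcom_pull(task_ids='" + INSERT_TASK_ID + "', key='return_value') }}"
)

# Placeholder query with an ORDER BY clause (project, dataset and table are made up).
QUERY = "SELECT name, total FROM `my-project.my_dataset.my_table` ORDER BY total DESC"


def build_dag():
    """Build the two-task DAG; requires the Google provider with this PR included."""
    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import (
        BigQueryGetDataOperator,
        BigQueryInsertJobOperator,
    )

    with DAG(
        dag_id="bigquery_get_data_by_job_id",  # placeholder name
        start_date=datetime(2024, 1, 1),
        schedule=None,
        catchup=False,
    ) as dag:
        # Step 1: run the complex selection query.
        insert_job = BigQueryInsertJobOperator(
            task_id=INSERT_TASK_ID,
            configuration={"query": {"query": QUERY, "useLegacySql": False}},
        )
        # Step 2: fetch rows from that job's results instead of from a table.
        get_data = BigQueryGetDataOperator(
            task_id="get_data_op",
            job_id=JOB_ID_TEMPLATE,
            max_results=10,
        )
        insert_job >> get_data
    return dag
```

In a real DAG file, a module-level `dag = build_dag()` would be needed for the scheduler to discover the DAG.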
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in newsfragments.
cc: @Lee-W, @eladkal
I'm thinking of merging this in the next few days. Please let me know if anyone wants to take a deeper look 🙂