
Add `job_id` parameter to `BigQueryGetDataOperator`

Open • shahar1 opened this issue 9 months ago • 1 comment

solves: #39127

This PR adds the `job_id` parameter to `BigQueryGetDataOperator` so that it can fetch data from the results of selection queries executed by `BigQueryInsertJobOperator` (or any other query interface). The new parameter is mutually exclusive with `table_id` and its related parameters (`dataset_id` and `use_legacy_sql`).
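The mutual-exclusivity rule can be pictured as a small validation step. The sketch below is illustrative only: `validate_source` is a hypothetical helper, not code from this PR, and the requirement that at least one of `job_id` or `table_id` be supplied is an assumption rather than something stated above.

```python
from __future__ import annotations


def validate_source(job_id: str | None = None, table_id: str | None = None) -> str:
    """Return which source the rows would be read from: "job" or "table".

    Hypothetical helper for illustration; not part of the Airflow provider.
    """
    if job_id and table_id:
        # The PR description states these parameters are mutually exclusive.
        raise ValueError("'job_id' and 'table_id' are mutually exclusive")
    if not job_id and not table_id:
        # Assumption: at least one source must be given.
        raise ValueError("one of 'job_id' or 'table_id' must be provided")
    return "job" if job_id else "table"
```

Passing both identifiers raises an error, while passing exactly one selects the corresponding source.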

After merging this PR, the original issue of fetching results from complex queries (for example, queries with `ORDER BY` clauses) can be solved by running two operators sequentially:

  • Running the complex selection query with `BigQueryInsertJobOperator`
  • Running `BigQueryGetDataOperator` while providing `job_id` from the previous step (templated): `job_id = "{{ task_instance.xcom_pull(task_ids='insert_job_op', key='return_value') }}"`
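
The two steps above can be sketched as the keyword arguments each operator would receive. This is a minimal sketch under stated assumptions, not a complete DAG: the task id `get_data_op`, the example query, the table name, and the `max_results` value are placeholders, and the arguments are built as plain dictionaries so the snippet runs without Airflow installed.

```python
# "insert_job_op" matches the task id used in the template from the description.
INSERT_TASK_ID = "insert_job_op"

# Step 1: arguments for BigQueryInsertJobOperator, which runs the complex query.
insert_job_kwargs = {
    "task_id": INSERT_TASK_ID,
    "configuration": {
        "query": {
            # Placeholder query; the dataset and table names are assumptions.
            "query": "SELECT name, score FROM `my_dataset.my_table` ORDER BY score DESC",
            "useLegacySql": False,
        }
    },
}

# Step 2: arguments for BigQueryGetDataOperator, which reads that job's results.
# The job id is pulled from the first task's XCom return value at render time.
get_data_kwargs = {
    "task_id": "get_data_op",  # hypothetical task id
    "job_id": (
        "{{ task_instance.xcom_pull("
        f"task_ids='{INSERT_TASK_ID}', key='return_value'"
        ") }}"
    ),
    "max_results": 10,  # placeholder value
}
```

In a DAG file these dictionaries would be unpacked into the operators, for example `BigQueryInsertJobOperator(**insert_job_kwargs) >> BigQueryGetDataOperator(**get_data_kwargs)`, with both classes imported from `airflow.providers.google.cloud.operators.bigquery`. Note that no `table_id` or `dataset_id` is passed in the second step, since `job_id` replaces them.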


shahar1 avatar Apr 29 '24 15:04 shahar1

cc: @Lee-W, @eladkal

shahar1 avatar Apr 29 '24 20:04 shahar1

I'm thinking of merging this in the next few days. Please let me know if anyone wants to take a deeper look 🙂

Lee-W avatar May 03 '24 08:05 Lee-W