airflow icon indicating copy to clipboard operation
airflow copied to clipboard

Fix deferrable mode for BeamRunJavaPipelineOperator

Open MaksYermak opened this issue 9 months ago • 1 comments

In this PR I have prepared a fix for an error in deferrable mode for BeamRunJavaPipelineOperator.

This error happens on a distributed system when the user has trigger and worker on different machines. BeamRunJavaPipelineOperator needs a local jar file for starting a Job. Users can specify a path to jar file which is located in GCS bucket and then the operator will download this file to the local system. In the current deferrable mode implementation operator downloads jar file before going to the deferrable mode. It means that on a distributed system the file is downloaded on the worker machine not on the trigger machine. And then when the operator tries to execute a Job on the trigger machine Airflow throws an error that the executable jar file does not exist.


^ Add meaningful description above Read the Pull Request Guidelines for more information. In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed. In case of a new dependency, check compliance with the ASF 3rd Party License Policy. In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

MaksYermak avatar May 02 '24 11:05 MaksYermak

Hi @eladkal @potiuk ! Can you please check the changes here? Thanks!

VladaZakharova avatar May 09 '24 09:05 VladaZakharova

@potiuk @Lee-W I have updated the code.

MaksYermak avatar May 14 '24 11:05 MaksYermak