tfx
Ability to name dataflow jobs
Is there a way, or a plan, to support custom names for the job IDs of Dataflow jobs? Right now, the components spin up something like this:
It's very hard to distinguish between pipeline runs, and it's also hard to collaborate and tell your own runs apart from those of other people in the company.
My suggestion would be to at least support adding the component name to the job ID by default. That way one can see at a glance whether a job came from StatisticsGen, SchemaGen, etc. Thanks!
Thanks, I will look into adding this functionality.
I took the liberty of adding a minimally invasive attempt at mitigating this for both the orchestrator and the executor pipelines.
This is a very old issue, but I am leaving a comment for other people.
A Dataflow job can be named with the job_name parameter (https://cloud.google.com/dataflow/docs/guides/specifying-exec-params#python_4).
Example:
    pipeline.Pipeline(
        ...
        beam_pipeline_args=[
            ...
            "--runner=DataflowRunner",
            "--job_name=job-name-for-dataflow"
        ],
    )
This won't work, as probably not many people would want all of their Dataflow jobs to have the same name.
You might have StatisticsGen, SchemaGen, Transform, Evaluator, and others running Dataflow jobs in the background, and you don't want to name them all the same.
The only remedy I see is to allow per-component configuration of the underlying Dataflow job, through custom_configs or something like that.
This is not only an issue for the name. It also applies to the job config (number of workers, machine types), labels, and so on, and should be solved for those as well, IMO.
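To make the per-component idea concrete, here is a rough sketch of how per-component Beam arguments could be derived from one shared list. The helpers dataflow_job_name and beam_args_for are hypothetical names, not part of TFX or Beam, and the name sanitizing assumes Dataflow wants lowercase letters, digits, and hyphens, starting with a letter. The --max_num_workers flag in the usage note is only an illustration of a per-component override.

```python
import re


def dataflow_job_name(*parts):
    """Join parts into a lowercase, hyphen-separated job name (hypothetical helper)."""
    raw = "-".join(str(p) for p in parts).lower()
    # Collapse every run of characters outside [a-z0-9] into a single hyphen.
    name = re.sub(r"[^a-z0-9]+", "-", raw).strip("-")
    # Assumption: the name has to start with a letter, so prefix it if it does not.
    if not name or not name[0].isalpha():
        name = "job-" + name
    return name.rstrip("-")


def beam_args_for(component_id, run_id, base_args, overrides=None):
    """Return a copy of base_args with a component-specific --job_name.

    overrides maps flag names (without the leading dashes) to values and
    replaces any matching flag already present in base_args.
    """
    # Drop any shared job name so every component gets its own.
    args = [a for a in base_args if not a.startswith("--job_name=")]
    for flag, value in (overrides or {}).items():
        args = [a for a in args if not a.startswith(f"--{flag}=")]
        args.append(f"--{flag}={value}")
    args.append("--job_name=" + dataflow_job_name(component_id, run_id))
    return args
```

Calling beam_args_for("StatisticsGen", "2023_05_01", shared_args) would yield a list ending in --job_name=statisticsgen-2023-05-01, and passing overrides={"max_num_workers": 10} for one component changes only that component's list. How the list is attached to each component depends on the TFX version. Where Beam-based components expose with_beam_pipeline_args, that would presumably be the place to pass it, but please verify that against your release.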
@htahir1,
Are you still looking for a resolution? We are planning on prioritising issues based on community interest. Please let us know if this issue still persists with the latest TFX 1.13 release so that we can work on fixing it. Thank you for your contributions.