Allow duplicate task names

Open lpellegr opened this issue 9 years ago • 4 comments

It is currently not possible to give several tasks the same name. This should be supported so that a name can be reused for the same concept (e.g. a task that is used multiple times to wait 15 minutes at different places in a workflow). Task IDs should be used as the unique identifier.

lpellegr, Jul 20 '15 15:07

While looking at the TaskID implementation, I noticed that TaskIdImpl#getIterationIndex and TaskIdImpl#getReplicationIndex use the task name to retrieve the index. If some time is spent on task names, I think we should understand why dedicated fields are not used (is it only to save a few bytes in the database?) and maybe rework it.

lpellegr, Sep 02 '15 09:09

Because nearly all methods (killTask, getTaskResults) of the Scheduler API use Task Names as keys.

It makes sense: the user doesn't know task IDs (they are not returned when a job is submitted, and constructs such as loops dynamically create new tasks and thus new task IDs). The user only knows the names they gave to tasks when creating the job.

fviale, Sep 02 '15 12:09

> Because nearly all methods (killTask, getTaskResults) of the Scheduler API use Task Names as keys.

Despite your comment, it's not clear to me why, in principle, task names should be used in the API instead of task IDs. Once a task is created, its ID exists and could be used for all other operations. An ID is simpler to memorize and to write, it avoids issues with special characters, and it avoids maintaining two different concepts for tracking task uniqueness, while still allowing users to reuse the same name for the same concept.

As a concrete example, my name is 'pellegrino'. I am not the only person in France with this name. That's why I have a unique ID on my ID card. When I need to perform a legal action, I have to provide a copy of my ID card, not only my name.

In other words, I think a user manipulates an object (a task), and it should not be the user's responsibility to ensure the task's uniqueness in the world where it exists (the job instance).

> It makes sense, the user doesn't know taskids (they are not returned when a job is submitted, and constructs such as loops will dynamically create new tasks and thus, taskids). The user only knows the names he gave to tasks when creating the job.

I understand your point that constructs such as loops dynamically create new tasks and task IDs, but the same applies to task names: they are auto-generated too, and users cannot know them in advance unless they know the rule applied to create the new name.

lpellegr, Sep 02 '15 15:09

The only requirement is that a task name be unique within a given job. You can reuse the same task name across different jobs if you want to.

The Task Names that are auto-generated for loops or replication always follow the same pattern and can thus easily be used from the user's point of view.

Here, for example, I have a replication inside a loop. All tasks inside the loop are named task_name#index_of_loop, and the replicated tasks are named task_name*index_of_replication.

For a replicated task inside a loop, this gives for example UpdateEngines#1*4, where #1 stands for the first loop iteration and *4 stands for the 4th replication.

The task ID of this task is 16510018.

Why 18? It is not practical for the user to do this calculation, and the ID a task will get is not 100% predictable (for example, faulty tasks can short-circuit part of an execution chain, there can be if branching, etc.).
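To make the naming pattern above concrete, here is a minimal sketch of how the loop and replication indexes could be read back from a generated task name. The class `TaskNameIndexes` and its method names are hypothetical and are not part of the Scheduler API or the actual TaskIdImpl code. The sketch also assumes that the user-chosen base name contains neither `#` nor `*`, and that a missing suffix means index 0.

```java
// Hypothetical helper, not the real TaskIdImpl implementation.
final class TaskNameIndexes {

    // Separators taken from the pattern described in this thread.
    private static final char ITERATION_SEPARATOR = '#';
    private static final char REPLICATION_SEPARATOR = '*';

    private TaskNameIndexes() {
    }

    /** Returns the part of the name before any '#' or '*' suffix. */
    static String baseName(String taskName) {
        int end = taskName.length();
        int iterationPos = taskName.indexOf(ITERATION_SEPARATOR);
        int replicationPos = taskName.indexOf(REPLICATION_SEPARATOR);
        if (iterationPos != -1) {
            end = Math.min(end, iterationPos);
        }
        if (replicationPos != -1) {
            end = Math.min(end, replicationPos);
        }
        return taskName.substring(0, end);
    }

    /** Returns the loop index after '#', or 0 if there is none (assumption). */
    static int iterationIndex(String taskName) {
        return indexAfter(taskName, ITERATION_SEPARATOR);
    }

    /** Returns the replication index after '*', or 0 if there is none (assumption). */
    static int replicationIndex(String taskName) {
        return indexAfter(taskName, REPLICATION_SEPARATOR);
    }

    // Reads the run of digits that directly follows the first occurrence of the separator.
    private static int indexAfter(String taskName, char separator) {
        int start = taskName.indexOf(separator);
        if (start == -1) {
            return 0;
        }
        int end = start + 1;
        while (end < taskName.length() && Character.isDigit(taskName.charAt(end))) {
            end++;
        }
        if (end == start + 1) {
            return 0; // separator not followed by digits
        }
        return Integer.parseInt(taskName.substring(start + 1, end));
    }

    public static void main(String[] args) {
        String name = "UpdateEngines#1*4";
        System.out.println(baseName(name) + " loop=" + iterationIndex(name)
                + " replication=" + replicationIndex(name));
    }
}
```

With the example name from this comment, `UpdateEngines#1*4`, the sketch yields base name `UpdateEngines`, loop index 1 and replication index 4. The name is predictable from the workflow structure in a way the numeric task ID is not.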

fviale, Sep 02 '15 16:09