DAGs have loops in 2018 traces
Hi. Thank you for sharing the cluster data. It is a great help!
I've been looking through the 2018 traces, and intend to use them for simulation. However, I've come across DAGs that have loops in them.
For instance, look at job 'j_1053726' in 'batch_task.csv'. I'll highlight 2 tasks:
M5_14_17_28_30_40_42_52_54_64_66_76_78_88_90_100_1,19,j_1053726,1,Terminated,349710,349750,50,0.3
R1_5,1,j_1053726,1,Terminated,349710,349754,50,0.2
Task 1 depends on Task 5, and Task 5 depends on Task 1. I'm guessing the task name was cut off (the trailing 1 in the first task's name was probably 102). Is there a way to correct this, other than ignoring such DAGs?
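For reference, here is a minimal sketch of how such loops can be detected. It assumes the task name is a type prefix followed by the task's own number and then its underscore-separated dependencies (which is how I read the two rows above). The helper names `parse_task_name`, `build_graph` and `find_cycle` are my own, and non-numeric tokens are simply skipped.

```python
import re


def parse_task_name(name):
    """Split e.g. 'M5_14_17' into (5, [14, 17]).

    Returns None when the name does not look like <letters><id>[_dep...].
    Non-numeric dependency tokens are skipped (an assumption of this sketch).
    """
    head, *rest = name.split("_")
    match = re.fullmatch(r"[A-Za-z]+(\d+)", head)
    if match is None:
        return None
    deps = [int(tok) for tok in rest if tok.isdigit()]
    return int(match.group(1)), deps


def build_graph(task_names):
    """Map each task id to the list of task ids it depends on."""
    graph = {}
    for name in task_names:
        parsed = parse_task_name(name)
        if parsed is not None:
            task_id, deps = parsed
            graph[task_id] = deps
    return graph


def find_cycle(graph):
    """Depth-first search; returns one cycle as a list of ids, or None."""
    state = {}  # 1 = on the current path, 2 = fully explored
    path = []

    def visit(node):
        state[node] = 1
        path.append(node)
        for dep in graph.get(node, ()):
            if state.get(dep) == 1:
                # dep is already on the current path, so we closed a loop
                return path[path.index(dep):] + [dep]
            if dep not in state:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = 2
        return None

    for node in list(graph):
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


# The two task names from job j_1053726 quoted above
names = [
    "M5_14_17_28_30_40_42_52_54_64_66_76_78_88_90_100_1",
    "R1_5",
]
cycle = find_cycle(build_graph(names))
print(cycle)
```

Running this per job at least makes it possible to count how many DAGs are affected before deciding whether to drop them.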
Also, some tasks depend on 'Stgx' where x is a number. A few examples:
J10_7_9_Stg4,69,j_4160894,1,Terminated,348729,349024,100,0.39
M7_Stg3,9,j_3424927,1,Terminated,409967,410068,100,0.39
R2_1_Stg8,39,j_642358,1,Running,675174,675174,,
What does this mean? No task names start with 'Stgx'; they appear only as dependencies.
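To get a sense of how common this is, a rough sketch like the one below can count the non-numeric dependency tokens. It assumes the task name is the first CSV column, as in the rows above; `unresolved_tokens` is a hypothetical helper name, and the sample is just the three rows quoted in this post. On the full file it would also flag any other task names that contain underscores but do not follow the dependency pattern.

```python
import csv
import io
from collections import Counter


def unresolved_tokens(rows):
    """Count dependency tokens in the task-name column that are not plain numbers."""
    counts = Counter()
    for row in rows:
        if not row:
            continue
        # Everything after the first '_' is treated as a dependency token
        for tok in row[0].split("_")[1:]:
            if not tok.isdigit():
                counts[tok] += 1
    return counts


sample = """\
J10_7_9_Stg4,69,j_4160894,1,Terminated,348729,349024,100,0.39
M7_Stg3,9,j_3424927,1,Terminated,409967,410068,100,0.39
R2_1_Stg8,39,j_642358,1,Running,675174,675174,,
"""

counts = unresolved_tokens(csv.reader(io.StringIO(sample)))
print(counts)
```

For the real trace, the same function can be fed `csv.reader(open("batch_task.csv", newline=""))` instead of the in-memory sample.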