sos icon indicating copy to clipboard operation
sos copied to clipboard

Remove active option of tasks?

Open BoPeng opened this issue 5 years ago • 8 comments

There is an active option of task. The usage is generally as demontrated in the example

!echo "hello" > a.txt
task: active=not path('a.txt').exists()
sh:
    echo "Task executed"

or

task: active=-1
    echo "Only the last substep"

However, both usages can be replaced by the stop_if action

!echo "hello" > a.txt

stop_if(path('a.txt').exists())
task:
sh:
    echo "Task executed"

with an important difference.

For example,

input: for_each={'i': range(5)}
output: f'test_{i}.txt'

task: active=1
sh: expand=True
 touch {_output}

will generate the following

$ sos run test
INFO: Running default:
INFO: a647808578f06138 started
INFO: a647808578f06138 completed
INFO: output:   test_0.txt, test_1.txt, ... (5 items)
ERROR: [default]: Output target test_0.txt does not exist after the completion of step default

because the step expects output test_0.txt, test_1.txt etc but only test_1.txt will be generated by the task.

However, the following works:

input: for_each={'i': range(5)}
output: f'test_{i}.txt'

stop_if(_index != 1)
task: 
sh: expand=True
 touch {_output}

because action stop_if will actually allow sos to discard the entire substep and associated _output.

There are a few solutions to improve the situation:

  1. Remove active and use stop_if to avoid duplicated feature.
  2. Move active to input so that the option can be processed before _output is generated.
  3. Enhance active to discard _output as well. However,

Similar to the task option, we also have an active option for actions, which was designed for something like

input: multiple_input, group_by=1
sh:
do_something
report: active=-1
  generate a report for the entire step

So in this case, allowing active=-1 to discard _output of all other substeps does not make sense.

In summary, active is currently a questionable option that is error prone, has function overlapping with other mechanisms, and should be removed before it can be implemented better.

BoPeng avatar Sep 23 '18 02:09 BoPeng

Good move! But I did not realize we can do it for input? I have tried your example:

[1]
input: for_each={'i': range(5)}, active=1
output: f'test_{i}.txt'
sh: expand=True
 touch {_output}

Got this using sos-0.16.11, though

ERROR: [default_1]: Failed to process input statement for_each={'i': range(5)}, active=1
: Unrecognized input option active

gaow avatar Sep 23 '18 10:09 gaow

Sorry, I mixed what I planned and what is actually implemented. I will update the ticket.

BoPeng avatar Sep 23 '18 13:09 BoPeng

Okay am trying to dig up related ticket:

https://github.com/vatlab/SoS/issues/206

seems like we were once attempted that. I'm trying to find the ticket that motivated this feature in the first place and try to see if it was something we needed but cannot be implemented easily using existing features ... do you have an impression?

gaow avatar Sep 23 '18 14:09 gaow

active was introduced to allow the writing of

if _index == 1:
  sh(script)

as

sh: active=1
  script

because the former does not allow the use of script format. It was later extended to task to allow controlling entire tasks.

My major concern is

input: for_each={'i': range(5)}
output: f'test_{i}.txt'

task: active=1
sh: expand=True
 touch {_output}

because it causes a problem that is not easy to fix, and I am not particularly happy with either of the solutions.

BoPeng avatar Sep 23 '18 20:09 BoPeng

Ahh sure, I actually use this a lot:

if _index == 1:
  sh(script)

(yes, without script it is not very convenient. That's why I often wrap it with sos_run).

So is this usercase the only reason active now exists? If so, we might instead try to think of how to make this case easier, then we get rid of active

gaow avatar Sep 23 '18 22:09 gaow

My point was that task is always the last part of the substep, and there is no syntax to wrap it in if or for. Therefore,

task: active =1

can be achieved by

stop_if(_index!=1)
task:

and the latter is safer because it removes _output of _index==1 from step_output.

Basically, task is usually the main part of the substep that produces useful _output, Cancelling it with active= without removing _output from step_output makes this option more harmful than useful.

BoPeng avatar Oct 30 '18 20:10 BoPeng

Okay, but I cannot do:

stop_if(_index!=1)
R: ...

right? I'd have no hesitation to remove active if i can do this. But I guess we want to remove it anyways because as you pointed out it is more harmful than useful, given all things considered.

gaow avatar Oct 30 '18 20:10 gaow

Nothing stops you from doing

stop_if(_index!=1)
R: ...

but in this case you might not want the side effect of removing _output from step_output.

BoPeng avatar Oct 30 '18 21:10 BoPeng