pypiper

Checking for pipeline command requirements to fail early

Open nsheff opened this issue 7 years ago • 5 comments

This has come up repeatedly in several projects; this issue aggregates those thoughts in one place. It would be nice to have a way for a pipeline to do a gut-check and make sure all the commands it requires are at least executable. It could then fail or warn early if something is amiss.

This belongs in pypiper, I think.

These are related issues:

https://github.com/databio/pepatac/issues/68

https://github.com/pepkit/peppy/issues/221

https://github.com/databio/pararead/pull/29

https://github.com/pepkit/looper/issues/53

https://github.com/pepkit/peppy/issues/23

nsheff avatar Nov 01 '18 20:11 nsheff

Hey @jpsmith5 anything to add here? :smile:

nsheff avatar Mar 22 '19 19:03 nsheff

Whoa it's like the nexus of the universe

vreuter avatar Mar 22 '19 19:03 vreuter

A couple of challenges I've dealt with so far that are worth keeping track of are listed here. I'm also not sure how much of this can be addressed within ngstk.check_command (which itself uses the system's command utility).

  • example 1: a required tool is a jar file, so the command has to be rewritten as java -jar <jar_file> before it can be checked with command
  • example 2: a tool installed in Python site-packages needed $PYTHONPATH to be set explicitly before command could identify it as callable, e.g. MACS2
  • example 3: an environment variable points to the tool to be called, but if the variable is never set, command still returns 0 even though the tool is NOT callable, e.g. ${PICARD}

The current workaround is to check whether the expected command contains a '.jar' string and, if so, prepend java -jar.

Check for the presence of '$' in the command; if found, assume it is uncallable, report it, and fail.
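For reference, here is a rough Python sketch of the jar and '$' handling described above. It is only an illustration: classify_command is a hypothetical name, not something in pypiper or ngstk, and it uses shutil.which from the standard library rather than the system's command. It also expands environment variables first and only treats a leftover '$' as uncallable, which is a slight variation on the workaround.

```python
import os
import shutil


def classify_command(cmd):
    """Return (is_callable, reason) for one command string.

    Hypothetical helper for illustration; not part of pypiper or ngstk.
    """
    # expandvars leaves references to unset variables unchanged, so a
    # leftover '$' means something like ${PICARD} was never set.
    expanded = os.path.expandvars(cmd)
    if "$" in expanded:
        return False, "unset environment variable"

    if expanded.endswith(".jar"):
        # A jar is run as `java -jar <jar_file>`, so both java and the
        # jar file itself have to exist.
        if shutil.which("java") is None:
            return False, "java not found"
        if not os.path.isfile(expanded):
            return False, "jar file not found"
        return True, "jar"

    # Plain executable: look it up on PATH (or at the given path).
    if shutil.which(expanded) is None:
        return False, "not found on PATH"
    return True, "executable"
```

This does not cover the $PYTHONPATH case from example 2; that one depends on the environment the check runs in rather than on the command string.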

jpsmith5 avatar Mar 22 '19 21:03 jpsmith5

https://github.com/pepkit/looper/issues/195

vreuter avatar Jun 21 '19 01:06 vreuter

the 'is_command_callable' function in the latest ubiquerg should basically be able to solve this...

we might want to just re-expose it via pypiper to keep things simple for the pipeline author. maybe wrap it to simplify even further so it can take a list of tools, handle jarfiles, or something
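a minimal sketch of what such a list-taking wrapper might look like. the names find_missing_tools and require_tools are made up for illustration, and the default check uses shutil.which as a stand-in; the is_callable argument is where ubiquerg's is_command_callable could be passed in, assuming it takes a command string and returns a boolean.

```python
import shutil


def find_missing_tools(tools, is_callable=None):
    """Return the tools that fail the callability check, in input order.

    Hypothetical wrapper for illustration; not an existing pypiper function.
    """
    if is_callable is None:
        # Stand-in check using the standard library only.
        def is_callable(tool):
            return shutil.which(tool) is not None
    return [tool for tool in tools if not is_callable(tool)]


def require_tools(tools, is_callable=None):
    """Raise before any pipeline work starts if a required tool is missing."""
    missing = find_missing_tools(tools, is_callable)
    if missing:
        raise RuntimeError("Uncallable tools: " + ", ".join(missing))
```

a pipeline author would then call require_tools(["samtools", "bowtie2"]) once at the top of the pipeline, so a missing tool is reported before any sample is processed.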

nsheff avatar Jun 21 '19 01:06 nsheff