add ability for toil-specific options to be prefixed with --toil-
It can be confusing which option in a program belongs to Toil and which to the application itself. If there was the option to add --toil- to the beginning of the toil options, this would be much less confusing.
For instance, ifiddes' CAT has both --workDir, which belongs to Toil, and --work-dir, which CAT uses for intermediate files.
┆Issue is synchronized with this Jira Story ┆Issue Number: TOIL-621
Can you describe the use-case in more detail? I'm not sure that this is the best route, but I'd still like a path forward to making this clearer.
The use case is when you have a program that embeds Toil.
It is not clear which options actually apply to Toil, and hence for which ones one should read the Toil docs.
The --workDir vs. --work-dir clash might not be that common.
--toilWorkDir would be obvious.
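To make the request concrete, here is a minimal sketch of how an embedding program might prefix library options today using only argparse. `PrefixedOptions` and `build_parser` are hypothetical names, the two options registered through the adapter are stand-ins for what Toil would add, and whether `Job.Runner.addToilOptions` can be pointed at such an adapter has not been verified.

```python
import argparse


class PrefixedOptions:
    """Hypothetical adapter: forwards add_argument() to a parser or argument
    group, renaming "--name" to "--toil-name" but keeping the original dest."""

    def __init__(self, target, prefix="toil-"):
        self._target = target
        self._prefix = prefix

    def add_argument(self, *flags, **kwargs):
        long_flags = [f for f in flags if f.startswith("--")]
        if not long_flags:
            # Positionals and short flags are passed through untouched.
            return self._target.add_argument(*flags, **kwargs)
        # argparse derives dest from the first long flag; pin it so the
        # rename does not change the attribute name on the namespace.
        kwargs.setdefault("dest", long_flags[0][2:].replace("-", "_"))
        renamed = [
            "--" + self._prefix + f[2:] if f.startswith("--") else f
            for f in flags
        ]
        return self._target.add_argument(*renamed, **kwargs)


def build_parser():
    parser = argparse.ArgumentParser(description="Program that embeds Toil.")
    # The application's own option.
    parser.add_argument("--work-dir", help="Directory for intermediate files.")
    group = parser.add_argument_group("toil options")
    toil_opts = PrefixedOptions(group)
    # Stand-ins for options the embedded library would register.
    toil_opts.add_argument("--workDir", help="Directory for Toil temp files.")
    toil_opts.add_argument("--logLevel", default="INFO")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--work-dir", "/tmp/cat", "--toil-workDir", "/tmp/toil"]
    )
    print(args.work_dir, args.workDir, args.logLevel)
```

Because the dest is pinned, code that reads `args.workDir` keeps working; only the spelling on the command line changes.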
Generally I'm used to a separator between options. For example, if you run an argparse program that calls another program, you'd use `--` as the separator, as with, say, the cwltest runner.
So the command would be something like `cwltest --logLevel=INFO --runner=toil -- aws:us-west-2:somejobstore --logLevel=DEBUG`
Is CAT trying to do something like this?
import argparse

parser = argparse.ArgumentParser(description='Runs with toil.')
parser.add_argument('primary_file', help='A file.')
parser.add_argument('secondary_file', help='A secondary file.')
parser.add_argument("--whatever", type=str, required=False, default=None)
# extra_args_for_toil is an array containing all of the unknown arguments not
# specified by the parser in this main. All of these will be passed down later
# to toil directly as toil args.
CAT_args, extra_args_for_toil = parser.parse_known_args()
Maybe CAT could just use one --TOIL-RUN-ARGS='' to make it clearer?
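As a rough sketch of that single-option idea, assuming the embedding program owns the parser: the quoted string is split with `shlex.split` and the resulting list is what would be handed to Toil. The function name `split_args` and the dest `toil_run_args` are made up for illustration.

```python
import argparse
import shlex


def split_args(argv):
    """Parse the application's options and return (app_args, toil_argv)."""
    parser = argparse.ArgumentParser(description="Run CAT.")
    parser.add_argument("primary_file", help="A file.")
    parser.add_argument(
        "--TOIL-RUN-ARGS",
        dest="toil_run_args",  # hypothetical attribute name
        default="",
        help="Quoted string of options passed through to Toil.",
    )
    args = parser.parse_args(argv)
    # shlex.split honours shell-style quoting inside the string.
    toil_argv = shlex.split(args.toil_run_args)
    return args, toil_argv


if __name__ == "__main__":
    app_args, toil_argv = split_args(
        ["in.txt", "--TOIL-RUN-ARGS=--workDir /tmp/toil --logLevel=DEBUG"]
    )
    print(app_args.primary_file, toil_argv)
```

Using the `--TOIL-RUN-ARGS=...` form with an equals sign avoids any ambiguity about a value that itself starts with dashes.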
CAT is just an example of where the confusion arises. I have had this with other software too (although no one uses my software except me, which sidesteps the confusion problem).
This is within a single program, not a program calling a program. The generic use case is a program that calls
Job.Runner.addToilOptions(parser).
The cwltool example doesn't really apply here, and I find it a confusing command line.
This isn't even a medium-priority request; it is something to think about if the command line and config are ever revisited.
Hmmm... we can discuss it in a future meeting. Could CAT do something like the following?
import argparse

parser = argparse.ArgumentParser(description='Run CAT.')
parser.add_argument('--toilWorkDir', help='A CAT option.')
args = parser.parse_args()
# toil_config stands in for whatever Toil config object CAT builds.
toil_config.workDir = args.toilWorkDir
CAT seems to incorporate a lot of options from multiple sources and it's confusing to me too.
Anyway, it's enough to bring up at the next meeting and discuss. Good luck CAT wrangling. ;D
Yeah, CAT has got luigi in there too.
I wouldn't want to spend time on this just for this use case. However, the way the Toil parameters are done seems kind of ad hoc, with some things being available only as environment variables.
A good approach is to have every config setting available in three places: command line, environment, and a config file.
Although environment is mostly for desperation; it often leads to "it works for me".
So the general priority hierarchy would be:
1st: command line
2nd: environment variables
3rd: config file
4th: defaults
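A minimal sketch of that lookup order, assuming each source can be reduced to a dict. The setting names, the `EXAMPLE_` environment prefix, and `resolve_settings` are all hypothetical and not Toil's actual config code.

```python
import argparse
from collections import ChainMap

# Hypothetical settings and their built-in defaults.
DEFAULTS = {"workDir": "/tmp", "logLevel": "INFO"}


def resolve_settings(argv, environ, config_file_values):
    """Merge sources so that CLI beats environment beats config file beats defaults."""
    parser = argparse.ArgumentParser()
    for name in DEFAULTS:
        # default=None lets us tell "not given" apart from a real value.
        parser.add_argument("--" + name, default=None)
    parsed = vars(parser.parse_args(argv))
    cli = {k: v for k, v in parsed.items() if v is not None}

    # Hypothetical naming scheme: workDir -> EXAMPLE_WORKDIR.
    env = {}
    for name in DEFAULTS:
        key = "EXAMPLE_" + name.upper()
        if key in environ:
            env[name] = environ[key]

    # ChainMap returns the value from the first mapping that has the key.
    return dict(ChainMap(cli, env, config_file_values, DEFAULTS))


if __name__ == "__main__":
    settings = resolve_settings(
        ["--logLevel", "DEBUG"],
        {"EXAMPLE_LOGLEVEL": "WARNING", "EXAMPLE_WORKDIR": "/scratch"},
        {"workDir": "/data", "logLevel": "ERROR"},
    )
    print(settings["logLevel"], settings["workDir"])  # DEBUG /scratch
```

Keeping the argparse defaults at `None` is what makes the layering work: a default baked into the parser would otherwise always win over the environment and the config file.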
Oh, I agree with the general priority structure and believe that would be a good issue to work on.
This would be easier to do now that we have laid the plumbing for config files.