ducttape
Remove dot parameters in favor of calling submitters/versioners with arguments
This is currently a discussion item.
The proposal is to change the current syntax:
task t :: N=5 .submitter=sge .walltime="3:00:00" .vmem=6g {}
to something more like:
task t : submitter(walltime="3:00:00", vmem=6g) :: N=5 {}
Notice that the submitter has moved into the "package" position. This would then allow us to make packages take arguments as well. For instance, it would then be more natural to request packages compiled in several different ways (e.g. SRILM compact vs. not).
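To make the before/after mapping concrete, here is a rough Python sketch of a header migration. It is not part of ducttape; `migrate_header` is a hypothetical helper, and it assumes that parameter values contain no whitespace and that the value of `.submitter` (here `sge`) is the name that ends up in the package position.

```python
def migrate_header(header):
    """Rewrite a dot-parameter task header into the proposed call-style form.

    Hypothetical helper for illustration only. Assumes values contain no
    whitespace and that '.submitter=NAME' names the submitter to call.
    """
    name_part, sep, rest = header.partition("::")
    if not sep:
        return header  # no parameter section, nothing to migrate
    params_part, brace, tail = rest.partition("{")

    dot_params = {}  # parameters written as .key=value
    plain = []       # ordinary parameters such as N=5
    for tok in params_part.split():
        if tok.startswith("."):
            key, _, value = tok[1:].partition("=")
            dot_params[key] = value
        else:
            plain.append(tok)

    submitter = dot_params.pop("submitter", None)
    if submitter is None:
        return header  # no submitter given, leave the header alone

    args = ", ".join("%s=%s" % kv for kv in dot_params.items())
    out = "%s : %s(%s)" % (name_part.strip(), submitter, args)
    if plain:
        out += " :: " + " ".join(plain)
    if brace:
        out += " " + brace + tail
    return out
```

Under those assumptions, the old-style header above would come out as `task t : sge(walltime="3:00:00", vmem=6g) :: N=5 {}`.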
Do you want me to remove dot parameter support from the parser?
Only if we also add support for these Parameterized Dependencies (which would replace packages).
These parameterized dependencies would also make default arguments for submitters/packages a bit more intuitive, and would enable things like requesting specific version refs of packages. I'm very interested in the latter, since it would let me generate a Ducttape Roll workflow that reproduces the results of some set of experiments, down to the code version used. For example, in the user's workflow, we might rely on the default parameter "ref=HEAD":
task t : cdec {}
But then in the automatically generated reproduction version:
task t : cdec(ref="abcdef12345") {}
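A rough Python sketch of how the reproduction version could be generated from the user's workflow. This is not existing ducttape code: `pin_ref` and the `resolved_refs` mapping (package name to the ref that was actually used) are hypothetical, and the regex only understands the simple `task NAME : PKG` header form shown here.

```python
import re

# Hypothetical pattern for a header of the form "task NAME : PKG" with an
# optional argument list after the package name.
TASK_HEADER = re.compile(r'^(task\s+\w+\s*:\s*)(\w+)(\([^)]*\))?(.*)$')

def pin_ref(line, resolved_refs):
    """Return the header with an explicit ref= added for known packages.

    resolved_refs maps package name -> ref recorded for the original run
    (an assumed input; how it is collected is out of scope here).
    """
    m = TASK_HEADER.match(line)
    if m is None:
        return line  # not a task header we understand
    prefix, pkg, args, rest = m.groups()
    if pkg not in resolved_refs:
        return line  # nothing recorded for this package
    if args and re.search(r'\bref\s*=', args):
        return line  # the user already pinned a ref explicitly
    inner = args[1:-1].strip() if args else ''
    pinned = 'ref="%s"' % resolved_refs[pkg]
    new_args = pinned if not inner else pinned + ', ' + inner
    return '%s%s(%s)%s' % (prefix, pkg, new_args, rest)
```

Headers that already carry an explicit `ref=` are left alone, so a ref the user chose on purpose is never overwritten.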
Discussion with Lane, Greg, and Austin: We need to set off submitters and versioners in their own sections. Here's a proposed metacharacter for each, using thick arrows:
Submitters (evokes "sending" the task to the submitter):
task t => torque(vmem=32g, walltime="1:00:00") {}
Versioners (evokes "pulling" the version from some SCM source):
package pbzip2 <= disk(path=/home/jhclark/software/pbzip2-1.1.6) {}
As noted at https://github.com/jhclark/ducttape/issues/74#issuecomment-6210771, we want packages to use the HEAD version by default. This implies:
- Versioners have default parameters
- Tasks (automatically?) pass their parameters to their child versioner... somehow
Those 2 issues need to be sorted out as they haven't been defined in the language yet.
The proposed changes would effectively make packages into functions and so we change the package syntax to something people will intuitively identify as a function:
package lunchpy(opts="") from git(repo="git://github.com/mjdenkowski/lunchpy.git", ref=HEAD) {
  python -m compileall "$opts" .
}
Functions can define default arguments as shown above, or they can require the options by not specifying a default. We also introduce the "from" keyword, which is unique to packages.
Packages can still be used with their defaults as before:
task lunchtime_with_git_ref <- lunchpy(ref="d4053f5") {
  $lunchpy/lunch.py Indian Mexican Italian
}
We can override the default parameters of a package when calling it as a function:
task lunchtime_with_compile_opt <- lunchpy(opts="-l") {
  $lunchpy/lunch.py Indian Mexican Italian
}
And, perhaps the most useful modification is the ability to override versioner options at the same time:
task lunchtime_with_compile_opt <- lunchpy(ref="d4053f5", opts="-l") {
  $lunchpy/lunch.py Indian Mexican Italian
}
This ability to override versioner parameters makes packages a bit like "functions with inheritance" so that the parameters of the versioner also become directly accessible parameters of the package. (Note: This means that their parameters must not conflict! -- we will check for this at compile time.)
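As a sanity check on these semantics, here is a minimal Python sketch of the parameter handling. It is only an illustration of the rules described above, not the ducttape implementation; `merge_declared`, `resolve_call`, and the `REQUIRED` marker are hypothetical names.

```python
REQUIRED = object()  # marker for a parameter declared without a default

def merge_declared(package_params, versioner_params):
    """Combine the parameters a package and its versioner declare.

    Raises ValueError if both declare the same name, mirroring the
    compile-time conflict check mentioned above.
    """
    clash = sorted(set(package_params) & set(versioner_params))
    if clash:
        raise ValueError("declared by both package and versioner: "
                         + ", ".join(clash))
    merged = dict(versioner_params)
    merged.update(package_params)
    return merged

def resolve_call(declared, call_args):
    """Apply call-site arguments on top of the declared defaults."""
    unknown = sorted(set(call_args) - set(declared))
    if unknown:
        raise ValueError("unknown parameters: " + ", ".join(unknown))
    resolved = dict(declared)
    resolved.update(call_args)
    missing = sorted(k for k, v in resolved.items() if v is REQUIRED)
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    return resolved

# The lunchpy package from the example: opts belongs to the package,
# repo and ref belong to the git versioner.
lunchpy = merge_declared(
    {"opts": ""},
    {"repo": "git://github.com/mjdenkowski/lunchpy.git", "ref": "HEAD"},
)
```

With this, `resolve_call(lunchpy, {"ref": "d4053f5", "opts": "-l"})` overrides a versioner parameter and a package parameter in one call, while `resolve_call(lunchpy, {})` falls back to the declared defaults.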
REJECTED FUNCTION SYNTAX (for posterity):
# Notice that inputs, packages, etc. are all lumped together in first arg list.
# All outputs are lumped together in second arg list (similar to currying in functional programming)
func filter(cdec, in)(out) {
  cat < $in > $out
}
This is desirable since we no longer force the user to declare what kind each variable is (input, package, etc.) when defining a function; that decision is instead deferred until the function is called:
task use_func calls filter <= cdec(ref=HEAD) < in=x.txt > out
Is this issue ready to be implemented? If not, what is the roadblock?