cylc-flow
Support the dot character in task names and/or parameters
Describe exactly what you would like to see in an upcoming release
It would be nice to be able to have the dot character (.) in task names and/or parameters.
My use case is to use Cylc to build and run multiple versions of a model. E.g. I want to build multiple versions X.Y of a piece of software. The natural thing to do would be to use parameters on the version strings of the piece of software, so I can have tasks called build_software_x.y, build_software_x.z and so on, and then simply feed the version string from the parameters to the build scripts.
This is currently not possible because the dot character is used as the delimiter between the task name and the cycle point in the task ID.
If the logic can assume that cycle points never have the dot (.) character, then it should be possible to safely split a Task ID into the task name and cycle point components without issues. However, there may be other subtle ambiguities elsewhere.
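The splitting idea above can be sketched as follows. This is a minimal illustration, not Cylc's actual parser: `split_task_id` is a hypothetical helper, and it assumes the ID always has the form NAME.CYCLE with a cycle point that never contains a dot.

```python
from typing import Tuple


def split_task_id(task_id: str) -> Tuple[str, str]:
    """Split a 'NAME.CYCLE' task ID at the LAST dot.

    Assumption (not verified against Cylc): the cycle point never
    contains a dot, so every earlier dot belongs to the task name.
    """
    name, sep, point = task_id.rpartition(".")
    if not sep or not name or not point:
        raise ValueError(f"not a NAME.CYCLE task ID: {task_id!r}")
    return name, point


# The task name keeps its own dot; only the last dot is the delimiter.
print(split_task_id("build_software_1.2.20200101T0000Z"))
```

One ambiguity this sketch exposes: where the cycle point is optional, a bare task name such as build_software_1.2 would be split into the name build_software_1 and the cycle point 2, so the last-dot rule is only safe when the cycle point is always present.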
Additional context
As discussed in this Discourse thread: https://cylc.discourse.group/t/support-the-dot-character-in-task-names-and-or-parameters/217
Pull requests welcome!
I think the task/cycle delimiter is something we will need to look at soonish as we currently support a strange myriad of approaches.
- CLI
  - [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
  - [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
  - TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
  - FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
- GraphQL/protobuf
  - workflow[|cycle[|(family|task)[|job]]]
(personal preference: the GraphQL/protobuf approach is the most universal and arranges the components in the correct hierarchical order like ISO8601)
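To make the hierarchical form concrete, here is a rough sketch of parsing an ID of the shape workflow[|cycle[|task[|job]]]. It is illustrative only and is not taken from the Cylc GraphQL/protobuf code; `parse_id` and `FIELDS` are hypothetical names, and the delimiter is a parameter because the thread is still debating which character to use.

```python
from typing import Dict, Optional

# Components in hierarchical order, outermost first (hypothetical names).
FIELDS = ("workflow", "cycle", "task", "job")


def parse_id(identifier: str, delimiter: str = "|") -> Dict[str, Optional[str]]:
    """Parse 'workflow[|cycle[|task[|job]]]' into a dict.

    Trailing components that are absent are returned as None.
    """
    parts = identifier.split(delimiter)
    if not parts[0] or len(parts) > len(FIELDS):
        raise ValueError(f"cannot parse ID: {identifier!r}")
    padded = parts + [None] * (len(FIELDS) - len(parts))
    return dict(zip(FIELDS, padded))


# A dot inside the task name is harmless because the dot is not the delimiter.
print(parse_id("my_flow|20200101T0000Z|build_software_1.2|01"))
```

Because each component sits in a fixed position, a dot in the task name causes no ambiguity here; the cost is that the delimiter itself must never appear inside a component.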
Sadly we are baked into the current task.cycle approach, it will be a big hit on users to change.
Yeah that's the main problem.
But I also agree that our new GraphQL/protobuf approach is cleanest, and perhaps we can think about how to do it without enraging users :angry:
My only concern with | is how it plays out in the shell world WRT the CLI, i.e.
(flow) sutherlander@cortex-vbox:~$ echo hello|goodbye
goodbye: command not found
so we'd always need quotes now (which I always do anyway):
(flow) sutherlander@cortex-vbox:~$ echo hello\|goodbye
hello|goodbye
(flow) sutherlander@cortex-vbox:~$ echo "hello|goodbye"
hello|goodbye
Of course we can change | to anything else (as it's centralised), but the UI will depend on it soon, so if this is not desirable perhaps we can choose another character (or combination of characters).
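The quoting concern can be checked for any candidate delimiter. The sketch below assumes that Python's standard `shlex.quote` is a fair proxy for "needs quoting in a POSIX shell" (it is deliberately conservative); `needs_quoting` is a hypothetical helper, not part of Cylc.

```python
import shlex


def needs_quoting(text: str) -> bool:
    """Return True if shlex.quote would have to wrap `text` in quotes."""
    return shlex.quote(text) != text


# Compare the delimiters discussed in this thread.
for candidate in ("hello|goodbye", "1/foo", "foo.1"):
    verdict = "needs quoting" if needs_quoting(candidate) else "is shell-safe"
    print(f"{candidate!r} {verdict}")
```

Under this check the pipe requires quoting, while the slash and the dot do not, which matches the shell session above.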
Yep, pipe character in the shell.
I don't think users would appreciate having to quote strings that don't contain spaces.
This is one of those problems that seems like it should be easier than it is!
If we can assume that the cycle point does not have . in it, then it should be possible to support the dot character in task names.
Other considerations
I remember choosing the cycle-point/task-name/submit-number syntax because the slash / does not require quoting in the CLI and also maps conveniently to the relative directory hierarchy under the log output directory.
However, the slash / cannot be used now because we also have hierarchical suite names that may contain / characters?
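The directory mapping mentioned above can be sketched like this. It is only an illustration: `job_log_dir` is a hypothetical helper, and the default root "log" is a placeholder, not a statement about Cylc's actual run directory layout.

```python
from pathlib import PurePosixPath


def job_log_dir(job_id: str, log_root: str = "log") -> PurePosixPath:
    """Map 'cycle-point/task-name/submit-number' to a relative path.

    `log_root` is a placeholder for the log output directory.
    """
    # Unpacking raises ValueError unless there are exactly three components.
    cycle, task, submit = job_id.split("/")
    return PurePosixPath(log_root) / cycle / task / submit


print(job_log_dir("20200101T0000Z/build_software_1.2/01"))
```

The mapping is one-to-one only while no component contains a slash, which is the conflict with hierarchical suite names raised above.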
In case no one noticed, there is yet another syntax in the suite.rc scheduling graph: task-name[cycle-point-ish], as in my-task<param>[-P1D] => ... We can, in theory, have a syntax in graph expressions similar to the Task ID syntax, like my-task<param>.-P1D, or the slash syntax -P1D/my-task<param>.
(Unfortunately, the pipe | character cannot be used here either because it is already the OR operator.)
(We also have a syntax, which I can no longer remember, to specify inter-suite dependency in the scheduling graph.)
It would be nice if there is some rethink of all these. (But please don't let me lead you to this distraction for now while you are concentrating on other things!)
@matthewrmshin - We/I weren't advocating for | as a task name character but as an ID delimiter. However, we need a universal delimiter that works with owner|suite|cycle|task_name|submit_num and in the shell/CLI without quoting (which | doesn't, as you've mentioned before). / was ruled out because of its use in nested suite run dirs.
@dwsutherland My comment above on the use of the slash / and pipe | characters as delimiters was meant to be an observation. We have tried our best in a quest to find the perfect delimiter character so we can express all the relevant information in a single path-like syntax. I remember well the debates and the decisions. It is only unfortunate that we seem to have ended up with a lot of inconsistency over the years. (No doubt including a lot of my bad!)
But hopefully someone will eventually have the time to reorganise and rethink.
Hmm... Maybe migrating to a two-character delimiter for CLI/IDs (internally and externally) is something to look into; then we can probably jump through all the shell, datetime, and suite/task name hoops.
Otherwise restrictions may have to stay...
Closing as superseded by #3592 (universal delimiter).
#3592 was closed as completed, however dots still aren't supported in task parameters. Can we reopen this issue?
Hi @hippalectryon-0 - it was a while ago, but at first glance closing this Issue might have been an error on my part.
#3592 got rid of the . in the task ID (Cylc 7's task.cycle became Cylc 8's cycle/task) as part of the wider-scoped universal ID. Technically that should make supporting the dot in task names easier, although we still have back-compat support to deal with.
I'll reopen this and see what others have to say about it.
The dot character is currently reserved for future use; IMO we should not permit its use in task names. However, the following characters can be used: _, -, +, % and @.
Maybe we should provide in the docs an official workaround example for users (like me) who have parameters with dots in their name (which is very common for, e.g. climate model names/versions).
For now I've resorted to replacing "." by "@dot", and using sed in the parameterized scripts to substitute the dot back, but it's not great.
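For anyone wanting the same workaround without sed, here is a rough Python equivalent. It is a sketch of the "@dot" substitution described above, not an official Cylc recipe; `DOT_TOKEN`, `encode_param` and `decode_param` are hypothetical names.

```python
DOT_TOKEN = "@dot"  # stand-in for "." in parameter values (assumed convention)


def encode_param(value: str) -> str:
    """Replace dots so the value uses only characters allowed in task names."""
    return value.replace(".", DOT_TOKEN)


def decode_param(value: str) -> str:
    """Restore the original dots, e.g. inside a task script."""
    return value.replace(DOT_TOKEN, ".")


versions = ["1.2", "1.3"]
encoded = [encode_param(v) for v in versions]
print(encoded)
print([decode_param(e) for e in encoded])
```

The round trip is only lossless if no original value already contains the literal text "@dot".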