
cmd/cue cmd: unable to depend on output of another task

Open from-nibly opened this issue 3 years ago • 7 comments

I'm trying to write a command that writes the contents of kubernetes resources to disk. I'm sure there is a better way to do this, but I am also doing this to learn so I'd like to know how to accomplish this specific task. I was reading through some of the source code, and I came across $after. I'm not sure if I'm using it right though.

In this series, I'm trying to create a list of tasks that are executed one after the other to make the parent directory, then the child, then write the file. This should happen recursively, and there should only be one task for each directory.

I added $after as a prop to the tasks that depend on other tasks, but it doesn't seem to be doing anything. I get an error stating that the directory doesn't exist when trying to write the file.

Is this the correct way to make a dependency tree with tasks? Are my Mkdir tasks correct?

command: dump: {
	for nsName, ns in namespace {
		"mkdir-\(nsName)": file.Mkdir & {
			path: "./out/\(nsName)"
		}
		for kindName, kind in ns {
			"mkdir-\(nsName)-\(kindName)": file.Mkdir & {
				path:   "./out/\(nsName)/\(kindName)"
				$after: dump["mkdir-\(nsName)"].stdout
			}

			for resourceName, resource in kind {
				"write-\(nsName)-\(kindName)-\(resourceName)": file.Create & {
					filename: "./out/\(nsName)/\(kindName)/\(resourceName).yaml"
					contents: yaml.Marshal(resource)
					$after:   dump["mkdir-\(nsName)-\(kindName)"].stdout
				}
			}

		}
	}
}

from-nibly avatar Mar 04 '22 15:03 from-nibly

The problem here is related to the $after, but you are close.

To create a dependency, you have to refer to an incomplete field 👋 👋

Now, these mkdir and write tasks don't have something like this naturally. They only have input fields for the task. To create the dependency, add an extra field like done: _ to the task which should run first, and then replace ].stdout with ].done in your $after fields.
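A minimal sketch of that suggestion, reduced to two tasks. The field name done is arbitrary and not part of tool/file, the paths and contents are placeholders, and it assumes a cue version in which file.Mkdir exists. It has not been checked against a specific release:

```cue
package p

import (
	"encoding/yaml"
	"tool/file"
)

command: dump: {
	mkdir: file.Mkdir & {
		path: "./out"
		// Hypothetical extra field: left incomplete on purpose so that
		// other tasks can reference it to create a dependency.
		done: _
	}
	write: file.Create & {
		filename: "./out/example.yaml"
		contents: yaml.Marshal({a: 1})
		// Refers to the incomplete field rather than a concrete one.
		$after: mkdir.done
	}
}
```

The idea is only that the reference points at something that is not yet concrete when the flow is analysed.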

Not sure if this is a bug or a nuance of tasks which have no output fields they will fill.

verdverm avatar Mar 04 '22 16:03 verdverm

Ahh ok awesome! I think another issue I ran into was that my version is out of date or something, as Mkdir isn't even a field on the file tool.

from-nibly avatar Mar 04 '22 17:03 from-nibly

As a side note, I think it would probably be a good idea to have a canonical "done" field that can be used for all tasks. It makes sense that a lot of tasks just want to run in serial even if the other task has no output. But I do love that cue lang can be kind of "hacked" like this to get stuff done even without first class support for such a feature.

from-nibly avatar Mar 04 '22 17:03 from-nibly

> Not sure if this is a bug or a nuance of tasks which have no output fields they will fill.

This is because the flow of custom commands is configured with IgnoreConcrete: true (https://github.com/cue-lang/cue/blob/master/cmd/cue/cmd/custom.go#L129)

My understanding is that if you reference a concrete field from another task there will be no dependency between the two in the flow.
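To illustrate the distinction, here is a small sketch with hypothetical task names first and second. It uses the common pattern in which one task consumes the captured stdout of another; stdout is not concrete until first has run, so the reference creates a dependency:

```cue
package p

import (
	"tool/cli"
	"tool/exec"
)

command: demo: {
	first: exec.Run & {
		cmd: ["echo", "hello"]
		// Declared as a type only; the value is filled in once the task runs.
		stdout: string
	}
	second: cli.Print & {
		// Incomplete until first finishes, so second waits for first.
		text: first.stdout
	}
}
```

By contrast, if second referred only to a field of first that is already concrete, such as first.cmd, then per the understanding above no dependency would be recorded between the two tasks.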

IMO this isn't what one would expect in the general case and not practical in simple cases like this. Also not the first time I've seen people raising questions/issues because of this.

Maybe @mpvl can shed some light on why flow is configured like that for custom cue commands?

eonpatapon avatar Mar 05 '22 14:03 eonpatapon

@eonpatapon the main benefit of this approach is to improve the parallelism. But I agree it generally can't hurt to consider it as a dependency.

We were considering fixing a whole bunch of these things when doing a redesign for cue run/cuerun, but in this case it seems like it may be worth introducing this already.

mpvl avatar Mar 21 '22 12:03 mpvl

See also #1593, which seems to strongly indicate we should always consider concrete values for the determination of dependencies of tasks. However, the current algorithm does seem to not ignore cases where concrete values are simply the result of a default.

mpvl avatar Mar 23 '22 12:03 mpvl

Drive-by comment after arriving here via https://github.com/cue-lang/cue/issues/1593. I've added a bullet to https://github.com/cue-lang/cue/issues/1325 to capture the fact that we should probably revisit $after and introduce a means whereby one task can depend on the "result" of another. I say "result" in quotes, because that could simply be another task completing (which we currently achieve via $after), or it could also include some notion of "exit code" for that task.

Coming back to the description in this issue. I can't actually reproduce the problem as described. Either that or I've misunderstood.

The classic way to use $after is simply to refer to the task itself, rather than a field in the task. That works:

package p

import (
	"tool/exec"
)

namespace: n1: k1: r1: true
namespace: n1: k1: r2: true
namespace: n1: k2: r1: true
namespace: n1: k2: r2: true
namespace: n2: k1: r1: true
namespace: n2: k1: r2: true
namespace: n2: k2: r1: true
namespace: n2: k2: r2: true

command: printer: {
	for nsName, ns in namespace {
		"first-\(nsName)": exec.Run & {
			cmd: ["bash", "-c", #"echo $(date "+%s") ./out/\#(nsName) && sleep 2"#]
		}
		for kindName, kind in ns {
			"second-\(nsName)-\(kindName)": exec.Run & {
				text: "./out/\(nsName)/\(kindName)"
				cmd: ["bash", "-c", #"echo $(date "+%s") ./out/\#(nsName)/\#(kindName) && sleep 2"#]
				$after: printer["first-\(nsName)"]
			}

			for resourceName, resource in kind {
				"third-\(nsName)-\(kindName)-\(resourceName)": exec.Run & {
					text: "./out/\(nsName)/\(kindName)/\(resourceName).yaml"
					cmd: ["bash", "-c", #"echo $(date "+%s") ./out/\#(nsName)/\#(kindName)/\#(resourceName)"#]
					$after: printer["second-\(nsName)-\(kindName)"]
				}
			}

		}
	}
}

It gives an output similar to:

1659252141 ./out/n1
1659252141 ./out/n2
1659252143 ./out/n1/k1
1659252143 ./out/n1/k2
1659252143 ./out/n2/k1
1659252143 ./out/n2/k2
1659252145 ./out/n1/k2/r1
1659252145 ./out/n1/k2/r2
1659252145 ./out/n2/k1/r2
1659252145 ./out/n2/k1/r1
1659252145 ./out/n2/k2/r1
1659252145 ./out/n2/k2/r2
1659252145 ./out/n1/k1/r2
1659252145 ./out/n1/k1/r1

If I instead refer to a non-existent field in the task via $after, it still works:

package p

import (
	"tool/exec"
)

namespace: n1: k1: r1: true
namespace: n1: k1: r2: true
namespace: n1: k2: r1: true
namespace: n1: k2: r2: true
namespace: n2: k1: r1: true
namespace: n2: k1: r2: true
namespace: n2: k2: r1: true
namespace: n2: k2: r2: true

command: printer: {
	for nsName, ns in namespace {
		"first-\(nsName)": exec.Run & {
			cmd: ["bash", "-c", #"echo $(date "+%s") ./out/\#(nsName) && sleep 2"#]
		}
		for kindName, kind in ns {
			"second-\(nsName)-\(kindName)": exec.Run & {
				text: "./out/\(nsName)/\(kindName)"
				cmd: ["bash", "-c", #"echo $(date "+%s") ./out/\#(nsName)/\#(kindName) && sleep 2"#]
				$after: printer["first-\(nsName)"].blah
			}

			for resourceName, resource in kind {
				"third-\(nsName)-\(kindName)-\(resourceName)": exec.Run & {
					text: "./out/\(nsName)/\(kindName)/\(resourceName).yaml"
					cmd: ["bash", "-c", #"echo $(date "+%s") ./out/\#(nsName)/\#(kindName)/\#(resourceName)"#]
					$after: printer["second-\(nsName)-\(kindName)"].blah
				}
			}

		}
	}
}

So I'm not clear there is an issue here.
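For reference, the same pattern applied to the dump command from the original description might look like the sketch below. The namespace data is a hypothetical stand-in, and createParents: true is an addition so that ./out does not need to exist beforehand. It assumes a cue version that provides file.Mkdir:

```cue
package p

import (
	"encoding/yaml"
	"tool/file"
)

// Hypothetical stand-in for the real Kubernetes resources.
namespace: n1: k1: r1: {kind: "Example"}

command: dump: {
	for nsName, ns in namespace {
		"mkdir-\(nsName)": file.Mkdir & {
			path:          "./out/\(nsName)"
			createParents: true // also creates ./out if it is missing
		}
		for kindName, kind in ns {
			"mkdir-\(nsName)-\(kindName)": file.Mkdir & {
				path: "./out/\(nsName)/\(kindName)"
				// Refer to the task itself, not to one of its fields.
				$after: dump["mkdir-\(nsName)"]
			}
			for resourceName, resource in kind {
				"write-\(nsName)-\(kindName)-\(resourceName)": file.Create & {
					filename: "./out/\(nsName)/\(kindName)/\(resourceName).yaml"
					contents: yaml.Marshal(resource)
					$after:   dump["mkdir-\(nsName)-\(kindName)"]
				}
			}
		}
	}
}
```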

> which seems to strongly indicate we should always consider concrete values for the determination of dependencies of tasks

@mpvl do we need to do this though? The "edge" of a dependent task or a dependent task's field moving from incomplete to complete is a nice simple rule to my mind. With list and struct values, which have a default value, isn't the solution, in light of revisions to #822, to declare the output fields using ?:? For example, tool/file.Glob would be declared as:

Glob: {
    $id: "tool/file.Glob"
    glob!: string
    files?: [...string]
}

(ignoring for one second changes in the way tasks are declared). Indeed all tasks would be declared using either ?: or !: for input/output fields.

Indeed given the concept of a concrete "result" I mention at the beginning of this comment, this potentially allows us to use that same mechanism for $after or whatever might replace it. i.e. such a field would remain incomplete until the task completes.

myitcv avatar Jul 31 '22 08:07 myitcv