cue icon indicating copy to clipboard operation
cue copied to clipboard

eval: embedded disjunction causing significant runtime increase

Open cueckoo opened this issue 4 years ago • 2 comments

Originally opened by @verdverm in https://github.com/cuelang/cue/issues/758

What version of CUE are you using (cue version)?

v0.3.0-beta.4

What did you do?

exec cue eval chaining.cue
-- chaining.cue --
// Usage
A: #steps: []

B: #steps: [
        #Fetch & { ref: "alpine" },
        #Exec & {
                args: ["sh", "-c", "echo hello world"]
                mount: "/tmp/A": from: A
        }
]

C: #steps: [
        #Fetch & { ref: "alpine" },
        #Exec & {
                args: ["sh", "-c", "echo hello world"]
                mount: "/tmp/B": from: B
        }
]

D: #steps: [
        #Fetch & { ref: "alpine" },
        #Exec & {
                args: ["sh", "-c", "echo hello world"]
                mount: "/tmp/B": from: C
        }
]

// Schema
#Task: {
        // Match structs
        #steps: #Script
        ...
} | {
         // Match embedded scalars
         // string  // (we changed this to the next line)
         bool | int | float | string | bytes
         // XXX adding options here significantly increases runtime
         // bool | int | float | string | bytes | int | int | float | bool | string
        #steps: #Script
}

#Script: [...#Op]

// Note, more op types increases runtimes
#Op: #Fetch | #Exec

#Fetch: {
        ref: string
}

#Exec: {
        args: [...string]
        // Note, more mount types increases runtimes
        mount: [string]: #MountTask | #MountScript
}

#MountTask: {
        from: #Task
}
#MountScript: {
        from: #Script
}

Notes

  • In #Task's second option, we changed string to the current disjunction. This one line change caused the runtime to explode. Each extra option shows significant extra time. What was previously < 0.1s is now trivial to make >1s or even 5s
  • There are two // Note, ... by the #Op and mount: [string]: <...> which also play into this runtime. Removing these disjunctions causes the original issue to be hidden (or maybe they are compounding together to make this happen). The impact here is less noticeable but still required to create this reproducer.

cueckoo avatar Jul 03 '21 10:07 cueckoo

Original reply by @myitcv in https://github.com/cuelang/cue/issues/758#issuecomment-778179311

@verdverm I updated your description FYI so that the repro works.

Reduced this further:

exec cue eval chaining.cue
-- chaining.cue --
A: #Task

B: #Task
B: #steps: #Script & {
	mount: [A]
}

C: #Task
C: #steps: #Script & {
	mount: [B]
}

D: #Task
D: #steps: #Script & {
	mount: [C]
}

#Task: {
	_ | {}
	#steps: #Script
	...
}

#Script: {
	mount: [...#Task]
}

Bisecting post alpha6 broadly points to https://cue-review.googlesource.com/c/cue/+/8282. However, the evaluation time before wasn't exactly "right" either. Going back that far however is tricky because much has changed since then, so really only noting that as a potential pointer to what the issue is. Of course the cause back then might not the same as the cause today, even though the symptoms (slow evaluation) are the same.

Observations

Removing the declarations B: #Task, C: #Task and D: #Task returns things to normal:

exec cue eval chaining.cue
-- chaining.cue --
A: #Task

B: #steps: #Script & {
	mount: [A]
}

C: #steps: #Script & {
	mount: [B]
}

D: #steps: #Script & {
	mount: [C]
}

#Task: {
	_ | {}
	#steps: #Script
	...
}

#Script: {
	mount: [...#Task]
}

But this is clearly not a solution.

Replacing _ | {} with _ also fixes the issue:

exec cue eval chaining.cue
-- chaining.cue --
A: #Task

B: #Task
B: #steps: #Script & {
	mount: [A]
}

C: #Task
C: #steps: #Script & {
	mount: [B]
}

D: #Task
D: #steps: #Script & {
	mount: [C]
}

#Task: {
	_
	#steps: #Script
	...
}

#Script: {
	mount: [...#Task]
}

This is a solution (because it's equivalent) but clearly the bug should be fixed.

cueckoo avatar Jul 03 '21 10:07 cueckoo

Original reply by @mpvl in https://github.com/cuelang/cue/issues/758#issuecomment-780505198

f5ac9f3 already reduces some significant strain of this issue. We plan to do a address disjunction performance in a more structured manner as part of the structure sharing+required field introductions, and the result rework of the the disjunction algorithm that is needed for that.

Will move up the milestone as a result.

cueckoo avatar Jul 03 '21 10:07 cueckoo