eval: better error messages where discriminator fields are present
What version of CUE are you using (cue version)?
$ cue version v0.11.0
Does this issue reproduce with the latest stable release?
Yes.
What did you do?
! exec cue vet x.cue
stdout 'description: field not allowed'
-- x.cue --
data: {
type: "object"
properties: grandparentAuthor: {
type: "ref"
ref: "something"
description: "When parent is a reply to another post, this is the author of that post."
}
} & #LexType
#LexType: #LexArray | #LexObject | #LexRef
#LexObject: {
type!: "object"
description?: string
properties!: [string]: #LexType
}
#LexRef: string | {
type!: "ref"
ref!: string
}
#LexArray: {
type!: "array"
}
What did you expect to see?
A passing test. The error is caused by the fact that #LexRef doesn't allow the field "description" that's present in the data, but the error message doesn't mention anything about that as a possible cause.
What did you see instead?
> ! exec cue vet x.cue
[stderr]
data: 3 errors in empty disjunction:
data: conflicting values string and {type:"object",properties:{grandparentAuthor:{type:"ref",ref:"something",description:"When parent is a reply to another post, this is the author of that post."}}} (mismatched types string and struct):
./x.cue:2:7
./x.cue:19:10
data.type: conflicting values "array" and "object":
./x.cue:3:8
./x.cue:9:5
./x.cue:11:11
./x.cue:25:9
data.type: conflicting values "ref" and "object":
./x.cue:3:8
./x.cue:9:5
./x.cue:11:36
./x.cue:20:9
[exit status 1]
> stdout 'description: field not allowed'
FAIL: /tmp/z.txtar:2: no match for `description: field not allowed` found in stdout
ISTM that a much better job could be done by detecting the type discriminator field and including only errors for the arm of the disjunction with the corresponding field. But even so, the error should at least mention the fact that description isn't allowed by #LexRef.
Note that this is related to https://github.com/cue-lang/cue/issues/2893.
Providing a specific example from the GitHub actions world:
! exec cue vet -c -d '#Workflow' cue.dev/x/[email protected] x.yaml
cmp stderr stderr.golden
-- x.yaml --
name: Go
on:
pull_request:
branches: [ main ]
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Go
uses: actions/setup-go@v4
rubbish: field
with:
go-version: '1.20'
- name: Build
run: go build -v ./...
- name: Test
run: go test -v ./...
-- stderr.golden --
jobs.build: invalid value {"runs-on":"ubuntu-latest",steps:[{uses:"actions/checkout@v4"},{name:"Set up Go",uses:"actions/setup-go@v4",rubbish:"field",with:{"go-version":"1.20"}},{name:"Build",run:"go build -v ./..."},{name:"Test",run:"go test -v ./..."}]} (does not satisfy matchN): 0 matched, expected 1:
../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:38
../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:45
jobs.build."runs-on": field not allowed:
../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:38
./x.yaml:7:14
jobs.build.steps: field not allowed:
../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:38
./x.yaml:9:5
jobs.build.steps.1.rubbish: field not allowed:
../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:549:9
./x.yaml:12:7
jobs is defined as follows:
"jobs"!: struct.MinFields(1) & close({
{[=~"^[_a-zA-Z][a-zA-Z0-9_-]*$"]: matchN(1, [#normalJob, #reusableWorkflowCallJob])}
})
#normalJob is defined as follows:
#normalJob: close({
"runs-on"!: matchN(>=1, [string, [string, ...] & [_, ...] & [...], {
"group"?: string
"labels"?: matchN(1, [string, [...string]])
...
}, #stringContainingExpressionSyntax, #expressionSyntax])
// Optional fields follow...
})
#reusableWorkflowCallJob is defined as follows:
#reusableWorkflowCallJob: close({
"uses"!: =~"^(.+\\/)+(.+)\\.(ya?ml)(@.+)?$"
// Optional fields follow...
})
So the presence of the "runs-on" field is sufficient to know that we are in the #normalJob "leg" of the matchN (which could equally be a disjunction because both options are closed).
And yet the error message includes:
jobs.build."runs-on": field not allowed:
../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:38
./x.yaml:7:14
So the presence of the
"runs-on"field is sufficient to know that we are in the#normalJob"leg" of thematchN(which could equally be a disjunction because both options are closed).
FWIW I don't consider this to be a very "strong" discriminator, because closedness is a poor way to distinguish alternatives in general: using closedness to discriminate means that one misspelled field can eliminate all alternatives, leaving the evaluator in the original position of having no "preferred" alternative to reduce the number of errors. That's why my experimental discrim tool ignores closedness and only considers required fields, at least for now.
For the record, I tried running discrim (as discrim -e 'matchN(1, [#Workflow.#normalJob, #Workflow.#reusableWorkflowCallJob])') and this is the result:
discriminator is imperfect
allOf {
notPresent(runs-on) -> {1}
notPresent(uses) -> {0}
}
To paraphrase the above result:
- it's not a strong discriminator (we cannot always distinguish alternatives)
- if
runs-onis not present, the second alternative (#reusableWorkflowCallJob) is selected - if
usesis not present, the first alternative (#normalJob) is selected - as an implication, we can't discriminate if neither field or both fields are present.
It would be nice to have an example where the discriminator is perfect (for example with a required
type field with known constant values) to illustrate the problem, as that will probably be easier to
do a good job on improving error messages for.
It would be nice to have an example where the discriminator is perfect
It might be nice, I agree, but this is the reality of a popular configuration.
Just noting our offline conversation last week regarding the "imperfect discriminator". My thesis was:
- We have to make progress in this space because currently error messages are not manageable
- Some progress will be "perfect" - i.e. where we have a "perfect" discriminator we will be able to give "perfect" error messages
- Other progress will be more piecemeal and iterative.
- Everything we do in this space should be example-driven
- We might discover that some patterns we see in the wild are "impossible" and end up "pushing back"
- In whatever we do in this space, we will almost certainly end up with a ready-made guide on how to write schemas (that can vary in this way)
Adding the example that I referenced from last week, which comes from Azure pipelines.
Let's try and encode a schema for the strategy field, and call that #jobStrategy:
#jobStrategy: {
matrix!: [string]: [string]: string
maxParallel?: int
} | {
parallel! : string
}
In context, we further imply from a reading of the docs that (and an official example) that the strategy field itself is optional. So, somewhere in the context of the wider schema we would expect to see:
strategy?: #jobStrategy
(noting that, like for GitHub Actions, Microsoft does not distribute an official JSON Schema/schema for Azure workflows, so we are left working with https://github.com/microsoft/azure-pipelines-vscode/blob/main/service-schema.json).
The relevant part of the upstream schema is:
"jobStrategy": {
"anyOf": [
{
"type": "object",
"properties": {
"matrix": {
"description": "Matrix defining the job strategy",
"$ref": "#/definitions/jobMatrix"
},
"maxParallel": {
"description": "Maximum number of jobs running in parallel",
"$ref": "#/definitions/nonEmptyString"
}
},
"additionalProperties": false
},
{
"type": "object",
"properties": {
"parallel": {
"description": "Run the job this many times",
"$ref": "#/definitions/nonEmptyString"
}
},
"additionalProperties": false
}
]
},
We have closedness with additionalProperties: false. But we don't have matrix or parallel marked as required. I wonder/suspect whether this is optionality of the strategy field that uses this schema "leaking". i.e. what someone tried to write was (in CUE terms):
#jobStrategy: {
matrix!: [string]: [string]: string
maxParallel?: int
} | {
parallel! : string
} | {} // intentional!!!!
which would allow:
strategy: {}
So in this case I believe we have a genuine case of there needing to be a fix upstream to at least make matrix and parallel required, and possibly relax the jobStrategy to allow a purely empty struct.
Either way, this is another example of structural discriminator. And whilst perhaps not as "clean" as a discriminated type, or field value, it is very much precise to my mind and we would need to support the following valid states:
strategyis unsetstrategy: {}strategywith amatrixstrategywithparallelset
If strategy were set as follows:
strategy: banana: 5
then I think we would expect and want to see an error message that says something like:
strategy: value (banana: 5) does not choose a branch in the discriminator #jobStrategy
And this leads to another point I think we need to carefully consider.
Part of the problem right now is that error messages contain hugely verbose printings of the "options" in the discriminator. This is not sustainable. Nor is it sustainable to print out the value that is in error in full (in this case we're lucky it's banana: 5).
Which I think means the error message needs to instead be clear on:
- The path of the "value" that was evaluated as "invalid"
- The path of the schema/constraints that were violated
strategy: value (banana: 5) does not choose a branch in the discriminator #jobStrategy
I think it's interesting to consider that if you'd written strategy: parallel: true, then I don't think you'd want that error message, even though from the pure-CUE viewpoint, it's equally true of that example too that no branch has been chosen.
I suspect the thing we're after here is some heuristic/strategy here that involves checking shallower constraints before deeper constraints: a kind of breadth-first search of the possible disjunction paths. And maybe that kind of thinking can guide the approach taken when reporting errors too: we can potentially guide the user towards the shallowest-level fix that might lead to a value configuration. Even better would be offering spelling suggestions based on closest matching field names...