cue icon indicating copy to clipboard operation
cue copied to clipboard

eval: better error messages where discriminator fields are present

Open rogpeppe opened this issue 1 year ago • 1 comments

What version of CUE are you using (cue version)?

$ cue version
v0.11.0

Does this issue reproduce with the latest stable release?

Yes.

What did you do?

! exec cue vet x.cue
stdout 'description: field not allowed'
-- x.cue --

data: {
	type: "object"
	properties: grandparentAuthor: {
		type:        "ref"
		ref:         "something"
		description: "When parent is a reply to another post, this is the author of that post."
	}
} & #LexType

#LexType: #LexArray | #LexObject | #LexRef

#LexObject: {
	type!:        "object"
	description?: string
	properties!: [string]: #LexType
}

#LexRef: string | {
	type!: "ref"
	ref!:  string
}

#LexArray: {
	type!: "array"
}

What did you expect to see?

A passing test. The error is caused by the fact that #LexRef doesn't allow the field "description" that's present in the data, but the error message doesn't mention anything about that as a possible cause.

What did you see instead?

> ! exec cue vet x.cue
[stderr]
data: 3 errors in empty disjunction:
data: conflicting values string and {type:"object",properties:{grandparentAuthor:{type:"ref",ref:"something",description:"When parent is a reply to another post, this is the author of that post."}}} (mismatched types string and struct):
    ./x.cue:2:7
    ./x.cue:19:10
data.type: conflicting values "array" and "object":
    ./x.cue:3:8
    ./x.cue:9:5
    ./x.cue:11:11
    ./x.cue:25:9
data.type: conflicting values "ref" and "object":
    ./x.cue:3:8
    ./x.cue:9:5
    ./x.cue:11:36
    ./x.cue:20:9
[exit status 1]
> stdout 'description: field not allowed'
FAIL: /tmp/z.txtar:2: no match for `description: field not allowed` found in stdout

ISTM that a much better job could be done by detecting the type discriminator field and including only errors for the arm of the disjunction with the corresponding field. But even so, the error should at least mention the fact that description isn't allowed by #LexRef.

rogpeppe avatar Dec 02 '24 19:12 rogpeppe

Note that this is related to https://github.com/cue-lang/cue/issues/2893.

mvdan avatar May 30 '25 17:05 mvdan

Providing a specific example from the GitHub actions world:

! exec cue vet -c -d '#Workflow' cue.dev/x/[email protected] x.yaml
cmp stderr stderr.golden

-- x.yaml --
name: Go
on:
  pull_request:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - name: Set up Go
      uses: actions/setup-go@v4
      rubbish: field
      with:
        go-version: '1.20'
    - name: Build
      run: go build -v ./...
    - name: Test
      run: go test -v ./...
-- stderr.golden --
jobs.build: invalid value {"runs-on":"ubuntu-latest",steps:[{uses:"actions/checkout@v4"},{name:"Set up Go",uses:"actions/setup-go@v4",rubbish:"field",with:{"go-version":"1.20"}},{name:"Build",run:"go build -v ./..."},{name:"Test",run:"go test -v ./..."}]} (does not satisfy matchN): 0 matched, expected 1:
    ../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:38
    ../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:45
jobs.build."runs-on": field not allowed:
    ../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:38
    ./x.yaml:7:14
jobs.build.steps: field not allowed:
    ../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:38
    ./x.yaml:9:5
jobs.build.steps.1.rubbish: field not allowed:
    ../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:549:9
    ./x.yaml:12:7

jobs is defined as follows:

		"jobs"!: struct.MinFields(1) & close({
			{[=~"^[_a-zA-Z][a-zA-Z0-9_-]*$"]: matchN(1, [#normalJob, #reusableWorkflowCallJob])}
		})

#normalJob is defined as follows:

	#normalJob: close({
		"runs-on"!: matchN(>=1, [string, [string, ...] & [_, ...] & [...], {
			"group"?: string
			"labels"?: matchN(1, [string, [...string]])
			...
		}, #stringContainingExpressionSyntax, #expressionSyntax])

		// Optional fields follow...
	})

#reusableWorkflowCallJob is defined as follows:

	#reusableWorkflowCallJob: close({
		"uses"!: =~"^(.+\\/)+(.+)\\.(ya?ml)(@.+)?$"

		// Optional fields follow...
	})

So the presence of the "runs-on" field is sufficient to know that we are in the #normalJob "leg" of the matchN (which could equally be a disjunction because both options are closed).

And yet the error message includes:

jobs.build."runs-on": field not allowed:
    ../../../home/myitcv/.cache/cue/mod/extract/cue.dev/x/[email protected]/Workflow.cue:162:38
    ./x.yaml:7:14

myitcv avatar Sep 08 '25 12:09 myitcv

So the presence of the "runs-on" field is sufficient to know that we are in the #normalJob "leg" of the matchN (which could equally be a disjunction because both options are closed).

FWIW I don't consider this to be a very "strong" discriminator, because closedness is a poor way to distinguish alternatives in general: using closedness to discriminate means that one misspelled field can eliminate all alternatives, leaving the evaluator in the original position of having no "preferred" alternative to reduce the number of errors. That's why my experimental discrim tool ignores closedness and only considers required fields, at least for now.

For the record, I tried running discrim (as discrim -e 'matchN(1, [#Workflow.#normalJob, #Workflow.#reusableWorkflowCallJob])') and this is the result:

discriminator is imperfect
allOf {
	notPresent(runs-on) -> {1}
	notPresent(uses) -> {0}
}

To paraphrase the above result:

  • it's not a strong discriminator (we cannot always distinguish alternatives)
  • if runs-on is not present, the second alternative (#reusableWorkflowCallJob) is selected
  • if uses is not present, the first alternative (#normalJob) is selected
  • as an implication, we can't discriminate if neither field or both fields are present.

It would be nice to have an example where the discriminator is perfect (for example with a required type field with known constant values) to illustrate the problem, as that will probably be easier to do a good job on improving error messages for.

rogpeppe avatar Sep 09 '25 15:09 rogpeppe

It would be nice to have an example where the discriminator is perfect

It might be nice, I agree, but this is the reality of a popular configuration.

myitcv avatar Sep 09 '25 16:09 myitcv

Just noting our offline conversation last week regarding the "imperfect discriminator". My thesis was:

  • We have to make progress in this space because currently error messages are not manageable
  • Some progress will be "perfect" - i.e. where we have a "perfect" discriminator we will be able to give "perfect" error messages
  • Other progress will be more piecemeal and iterative.
  • Everything we do in this space should be example-driven
  • We might discover that some patterns we see in the wild are "impossible" and end up "pushing back"
  • In whatever we do in this space, we will almost certainly end up with a ready-made guide on how to write schemas (that can vary in this way)

Adding the example that I referenced from last week, which comes from Azure pipelines.

Let's try and encode a schema for the strategy field, and call that #jobStrategy:

#jobStrategy: {
  matrix!: [string]: [string]: string
  maxParallel?: int
} | {
  parallel! : string
}

In context, we further imply from a reading of the docs that (and an official example) that the strategy field itself is optional. So, somewhere in the context of the wider schema we would expect to see:

strategy?: #jobStrategy

(noting that, like for GitHub Actions, Microsoft does not distribute an official JSON Schema/schema for Azure workflows, so we are left working with https://github.com/microsoft/azure-pipelines-vscode/blob/main/service-schema.json).

The relevant part of the upstream schema is:

        "jobStrategy": {
            "anyOf": [
                {
                    "type": "object",
                    "properties": {
                        "matrix": {
                            "description": "Matrix defining the job strategy",
                            "$ref": "#/definitions/jobMatrix"
                        },
                        "maxParallel": {
                            "description": "Maximum number of jobs running in parallel",
                            "$ref": "#/definitions/nonEmptyString"
                        }
                    },
                    "additionalProperties": false
                },
                {
                    "type": "object",
                    "properties": {
                        "parallel": {
                            "description": "Run the job this many times",
                            "$ref": "#/definitions/nonEmptyString"
                        }
                    },
                    "additionalProperties": false
                }
            ]
        },

We have closedness with additionalProperties: false. But we don't have matrix or parallel marked as required. I wonder/suspect whether this is optionality of the strategy field that uses this schema "leaking". i.e. what someone tried to write was (in CUE terms):

#jobStrategy: {
  matrix!: [string]: [string]: string
  maxParallel?: int
} | {
  parallel! : string
} | {} // intentional!!!!

which would allow:

strategy: {}

So in this case I believe we have a genuine case of there needing to be a fix upstream to at least make matrix and parallel required, and possibly relax the jobStrategy to allow a purely empty struct.

Either way, this is another example of structural discriminator. And whilst perhaps not as "clean" as a discriminated type, or field value, it is very much precise to my mind and we would need to support the following valid states:

  • strategy is unset
  • strategy: {}
  • strategy with a matrix
  • strategy with parallel set

If strategy were set as follows:

strategy: banana: 5

then I think we would expect and want to see an error message that says something like:

strategy: value (banana: 5) does not choose a branch in the discriminator #jobStrategy

And this leads to another point I think we need to carefully consider.

Part of the problem right now is that error messages contain hugely verbose printings of the "options" in the discriminator. This is not sustainable. Nor is it sustainable to print out the value that is in error in full (in this case we're lucky it's banana: 5).

Which I think means the error message needs to instead be clear on:

  • The path of the "value" that was evaluated as "invalid"
  • The path of the schema/constraints that were violated

myitcv avatar Oct 13 '25 05:10 myitcv

strategy: value (banana: 5) does not choose a branch in the discriminator #jobStrategy

I think it's interesting to consider that if you'd written strategy: parallel: true, then I don't think you'd want that error message, even though from the pure-CUE viewpoint, it's equally true of that example too that no branch has been chosen.

I suspect the thing we're after here is some heuristic/strategy here that involves checking shallower constraints before deeper constraints: a kind of breadth-first search of the possible disjunction paths. And maybe that kind of thinking can guide the approach taken when reporting errors too: we can potentially guide the user towards the shallowest-level fix that might lead to a value configuration. Even better would be offering spelling suggestions based on closest matching field names...

rogpeppe avatar Oct 20 '25 16:10 rogpeppe