cmd/cue: add diff support
Originally opened by @mpvl in https://github.com/cuelang/cue/issues/8
Allow diffing between (snapshots of) previous version and current version.
Original reply by @jugaadi in https://github.com/cuelang/cue/issues/8#issuecomment-528032804
Any updates on this feature?
Original reply by @mpvl in https://github.com/cuelang/cue/issues/8#issuecomment-780522675
It would be good for people to give examples of how they would like to use this functionality in a CLI. There is an internal package for diffing CUE, which could potentially also be exposed as a Go API.
Original reply by @myitcv in https://github.com/cuelang/cue/issues/8#issuecomment-799134595
Commenting simply to add the word semantic into the mix 😄
I've taken (perhaps incorrectly) to referring to cue diff as a semantic diff, distinct from cmp and friends ("plain diff") used in unity.
Original reply by @eonpatapon in https://github.com/cuelang/cue/issues/8#issuecomment-803282650
Would be nice to have a diff API in cue lib. Currently I'm using Value.Decode and https://github.com/r3labs/diff to diff two cue instances in order to detect which part of the configuration has changed (CI jobs are run based on this).
However, this works only if the instances contain only concrete values. So a similar API that can diff cue values directly would be nice (and required in my case to be able to use the flow API).
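To illustrate the kind of result described above, here is a minimal sketch of a structural diff over already-decoded, fully concrete configuration data. It is not the CUE API and not r3labs/diff; the function name changed_paths and the sample data are hypothetical, and it shares the limitation mentioned above of handling concrete values only.

```python
from typing import Any, Iterator


def changed_paths(old: Any, new: Any, path: tuple = ()) -> Iterator[tuple]:
    """Yield the path of every location that differs between two decoded configs.

    Assumes both inputs are plain, concrete data (dicts with string keys,
    lists, scalars), e.g. the result of decoding a configuration to JSON.
    """
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new)):
            if key not in old or key not in new:
                # Field added or removed: report the field itself.
                yield path + (key,)
            else:
                yield from changed_paths(old[key], new[key], path + (key,))
    elif isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        # Same-length lists are compared positionally.
        for index, (a, b) in enumerate(zip(old, new)):
            yield from changed_paths(a, b, path + (index,))
    elif old != new:
        # Differing scalars, type changes, or lists of different length.
        yield path


# Hypothetical data: which parts of a CI configuration changed?
old = {"ci": {"lint": {"image": "golang:1.16"}, "test": {"image": "golang:1.16"}}}
new = {
    "ci": {
        "lint": {"image": "golang:1.16"},
        "test": {"image": "golang:1.17"},
        "deploy": {"image": "alpine"},
    }
}
changes = list(changed_paths(old, new))
for change in changes:
    print(".".join(str(part) for part in change))
```

A CI driver could then map each reported path prefix (here ci.deploy and ci.test) to the jobs that need to run.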
Original reply by @myitcv in https://github.com/cuelang/cue/issues/8#issuecomment-803295207
@eonpatapon take a look at https://pkg.go.dev/cuelang.org/[email protected]/internal/diff. That is internal for now, but as we shape up the API it should be made non-internal.
Original reply by @vikstrous2 in https://github.com/cuelang/cue/issues/8#issuecomment-816729814
A semantic diff would be really interesting. I think with diffing kubernetes files, the most awkward part is the way that the kubernetes API wants to be given a list of yaml objects but the identity of those objects is usually best defined by their name rather than their position in a list. My current idea for diffing them is to write them out into a directory tree where the name of the file is based on the name and type of the object and then using git diff. Then as long as the yaml and all of its fields and lists of named objects are sorted in some stable way, this is good enough for most kubernetes things.
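The idea of identifying objects by name rather than by list position can be sketched without touching the filesystem. The following is only an illustration under assumptions: the identity key layout (namespace/kind/name), the "_cluster" placeholder, and the helper names are hypothetical, and the objects are assumed to be already parsed into plain dictionaries.

```python
def identity(obj):
    """Build a stable key from an object's namespace, kind and name."""
    meta = obj["metadata"]
    return f"{meta.get('namespace', '_cluster')}/{obj['kind']}/{meta['name']}"


def index_by_identity(objects):
    """Turn a positional list of objects into a mapping keyed by identity."""
    indexed = {}
    for obj in objects:
        key = identity(obj)
        if key in indexed:
            raise ValueError(f"duplicate object identity: {key}")
        indexed[key] = obj
    return indexed


def diff_objects(old, new):
    """Report added, removed and changed objects, ignoring list position."""
    a, b = index_by_identity(old), index_by_identity(new)
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


# Hypothetical objects; the second list is reordered and has one extra object.
service = {"kind": "Service", "metadata": {"name": "web", "namespace": "default"}}
deploy_v1 = {
    "kind": "Deployment",
    "metadata": {"name": "web", "namespace": "default"},
    "spec": {"replicas": 2},
}
deploy_v2 = {
    "kind": "Deployment",
    "metadata": {"name": "web", "namespace": "default"},
    "spec": {"replicas": 3},
}
config = {"kind": "ConfigMap", "metadata": {"name": "web-config", "namespace": "default"}}

result = diff_objects([service, deploy_v1], [deploy_v2, service, config])
print(result)
```

The same keys could serve as file names in the directory-tree approach described above, so that git diff compares like with like.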
Original reply by @myitcv in https://github.com/cuelang/cue/issues/8#issuecomment-831097159
Adding something of an experience report here from the world of unity.
unity tests (of cmd/cue) generally follow this rough pattern:
- Ensure that evaluation of a given configuration semantically matches expectations. This is, in effect, a CUE semantic diff
- Verify that output in a specific format (JSON, Yaml, etc) matches expectations. This is, in effect, a semantic diff in the output format
In the case of point one this will look like:
# CUE semantic diff
cue eval -o out.cue X
cue diff out.cue ref.cue
# JSON diff
cue eval -o out.json X
cue diff out.json ref.json
Some questions:
- Is cue eval the right command here? cue eval will become cue - but does that give the intended result, as far as concreteness etc are concerned?
- file.ref will need to be a complete, self-contained configuration. This might well require it to be a txtar archive?
Stepping back a bit further, we should also be able to write a semantic diff for point 2, on the basis that CUE knows about the semantics of these different formats (even to some extent the different versions of, say, Yaml, JSON).
In the case of generating kubernetes configs, we use a _tool.cue file to write out the config to many different yaml files. Those files can be individually diff'd by git diff. A fancier diffing algorithm would probably just do some sorting and normalization before diffing, which would bring it pretty close to a semantic diff. There isn't anything really cue specific needed for that.
If cue eval is used for diffing, it might produce unusual looking output if most users of the cue config usually interact with it through a _tool.cue command.
A diff of cue configs is interesting, but it seems like a very open ended topic. I think my understanding of CUE is not sufficient to have an opinion on how that should work.
FWIW, what we currently do is output the result of cue eval, which we then sort with jq.
That's less than optimal (!) and a proper semantic diff would be much appreciated.
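For readers unfamiliar with this workaround, here is a rough sketch of the "sort, then plain diff" approach using only the Python standard library in place of jq. It assumes the configuration has already been exported as JSON; the function names and the file labels in the diff header are hypothetical. Note that only object keys are sorted, so list order still counts as a difference.

```python
import difflib
import json


def canonical_lines(document):
    """Serialise with sorted keys and fixed indentation, one line per entry."""
    return json.dumps(document, sort_keys=True, indent=2).splitlines()


def normalized_diff(old_json, new_json):
    """Return a unified diff of two JSON documents after key normalisation."""
    old_lines = canonical_lines(json.loads(old_json))
    new_lines = canonical_lines(json.loads(new_json))
    return list(
        difflib.unified_diff(old_lines, new_lines, "ref.json", "out.json", lineterm="")
    )


# Same content, different key order: no diff after normalisation.
reordered = normalized_diff(
    '{"name": "web", "replicas": 2}',
    '{"replicas": 2, "name": "web"}',
)

# A real change still shows up.
changed = normalized_diff(
    '{"name": "web", "replicas": 2}',
    '{"replicas": 3, "name": "web"}',
)
print("\n".join(changed))
```

This removes noise from field ordering but remains a textual diff: it knows nothing about defaults, constraints, or lists whose order is irrelevant.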
@vikstrous2 @PierreR I've used https://github.com/homeport/dyff for creating diffs of outputs from CUE for CloudFormation. Quite nice. Works really well for semantic diffs. Worth a look! And, hopefully we can build something similar for CUE at some point. :)
> And, hopefully we can build something similar for CUE at some point. :)
The query extension offers some nice potential for good semantic diff output.
A sub-feature-request: allow the user to specify if/that certain lists' contents can be treated as being identical based on member contents, not ordering. In other words (I /think/): to mark specific lists as sets, not lists, for diff purposes.
Rationale: field ordering in a struct is always(?) semantically unimportant. Ordering in a list is /sometimes/ unimportant, but it's case-specific. The cue diff feature would be valuable to a wider set of users if there were the ability to mark/configure (at diff time) which lists' contents were, essentially, order-agnostic.
Example: I'm currently using CUE to emit GitHub Actions (GHA) workflow .yml files, and I needed to assert that a specific commit had only reorganised the input CUE, and hadn't changed the output .yml. One list in my workflow file is "the jobs to run". Obviously, order is important there. But there's a job at the /end/ of the workflow, which waits for all other jobs, and reports if all jobs succeeded. This final job's input is a list of job names, but it doesn't care about the order, merely that they're present. Being able to cue diff the input CUE, and not rely on git diff of the output yaml, would have been useful; even more so if I could have taught the diff invocation to ignore ordering changes to the final job's input list.
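A minimal sketch of what "mark specific lists as sets at diff time" could mean, operating on decoded output rather than on CUE itself. Nothing here reflects an existing cue diff flag: the normalize and semantically_equal helpers, the path notation, and the workflow layout are all hypothetical.

```python
import json


def normalize(value, unordered, path=()):
    """Return a copy of value in which lists at the marked paths are sorted.

    unordered is a set of paths (tuples of keys); list elements are addressed
    with "*". Lists at any other path keep their original order.
    """
    if isinstance(value, dict):
        return {k: normalize(v, unordered, path + (k,)) for k, v in value.items()}
    if isinstance(value, list):
        items = [normalize(v, unordered, path + ("*",)) for v in value]
        if path in unordered:
            # Sort by a canonical serialisation so any JSON-like element works.
            items = sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
        return items
    return value


def semantically_equal(a, b, unordered):
    """Compare two documents, ignoring order only in the marked lists."""
    return normalize(a, unordered) == normalize(b, unordered)


# Hypothetical workflow: step order matters, the final job's inputs do not.
before = {
    "jobs": {
        "test": {"steps": ["checkout", "build", "test"]},
        "all-ok": {"needs": ["lint", "test"]},
    }
}
after = {
    "jobs": {
        "all-ok": {"needs": ["test", "lint"]},
        "test": {"steps": ["checkout", "build", "test"]},
    }
}
as_sets = {("jobs", "all-ok", "needs")}

print(semantically_equal(before, after, as_sets))  # order of "needs" ignored
print(semantically_equal(before, after, set()))    # order of "needs" significant
```

Passing the set of order-agnostic paths at comparison time, rather than baking it into the data, matches the request that the choice be configurable per diff invocation.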
> In other words (I /think/): to mark specific lists as sets, not lists, for diff purposes.
Having been kindly pointed towards https://github.com/cue-lang/cue/issues/14 and onwards to https://github.com/cue-lang/cue/issues/165#associative-lists, I can amend my sub-request to be: please could diff be capable of being associative-list aware, with the ability to ignore (if not the default behaviour of ignoring) purely order-based changes in associative lists. TVM!
As part of the recent issue garden, we are focussing on non-feature requests. As such, I'm removing the milestone on this feature request. We will revisit feature requests in a later pass, at which point we will start to milestone and prioritise new features (in addition to those that we are already working on).