encoding/yaml: allow marshaling without indenting sequences

Open herebebeasties opened this issue 1 month ago • 8 comments

I'm trying to get cue to generate more-or-less the same YAML as a large existing Kubernetes YTT templated set-up, so that we can review a transition to the tool more easily.

Although I appreciate I'm going to run into struct ordering issues and the like, the biggest stumbling block for this is indentation of arrays.

Current:

foo:
  - 1
  - 2

Desired:

foo:
- 1
- 2

The go-yaml v3 package used for YAML encoding supports calling enc.CompactSeqIndent(), which cue could do straight after calling enc.SetIndent(2) in internal/encoding/yaml/encode.go.

Is the cue team open to making this configurable? It would mean providing an extra parameter to yaml.Marshal(...) and yaml.MarshalStream(...), or some other way to configure this. If you are amenable I can try to work up a PR for this.

herebebeasties avatar Nov 10 '25 23:11 herebebeasties

Which one of these are you using?

  • The Go API from a Go program, i.e. https://pkg.go.dev/cuelang.org/go/encoding/yaml#Encode
  • The CUE standard library from a CUE file, i.e. https://pkg.go.dev/cuelang.org/go/pkg/encoding/yaml#Marshal
  • The cue tool, i.e. cue export --out yaml

mvdan avatar Nov 11 '25 10:11 mvdan

What's also interesting is that all examples I can see from yaml.org do not indent list elements. I'm not sure why the go-yaml package does that by default.

I can't quite tell if the YAML spec or creators encourage one form over the other. But the YAML 1.2.2 text does say:

The compact notation may be used when the entry is itself a nested block collection. In this case, both the “-” indicator and the following spaces are considered to be part of the indentation of the nested collection.

And literally the third example in the whole spec is as follows:

american:
- Boston Red Sox
- Detroit Tigers
- New York Yankees

So perhaps the solution would be to transition to not indenting these by default in our YAML encoding. We can smooth that transition via e.g. CUE_EXPERIMENT=yamlseqindent, such that it's opt-in for v0.16, opt-out for v0.17, and fully transitioned by v0.18.

I'm not opposed to adding an option either way to control this, but I'm starting to think that our default should flip as well.
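
To make the opt-in and opt-out stages concrete, here is a minimal sketch of how such a flag could be resolved. It assumes the variable holds comma-separated entries of the form name or name=0/1, and it does not reproduce CUE's actual parsing code. The experiment name yamlseqindent is only the one proposed above, and experimentEnabled is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// experimentEnabled reports whether the named experiment is on, given the raw
// value of an environment variable such as CUE_EXPERIMENT and the default for
// the current release. Hypothetical helper: the entry format (comma-separated
// "name" or "name=bool") is an assumption, not CUE's real implementation.
func experimentEnabled(env, name string, defaultOn bool) bool {
	enabled := defaultOn
	for _, entry := range strings.Split(env, ",") {
		key, val, hasVal := strings.Cut(strings.TrimSpace(entry), "=")
		if key != name {
			continue
		}
		if !hasVal {
			enabled = true // a bare name switches the experiment on
			continue
		}
		// Values that do not parse as booleans are ignored.
		if b, err := strconv.ParseBool(val); err == nil {
			enabled = b
		}
	}
	return enabled
}

func main() {
	// Opt-in release: the experiment is off unless requested.
	fmt.Println(experimentEnabled("yamlseqindent", "yamlseqindent", false))
	// Opt-out release: the experiment is on unless explicitly disabled.
	fmt.Println(experimentEnabled("yamlseqindent=0", "yamlseqindent", true))
	fmt.Println(experimentEnabled("", "yamlseqindent", true))
}
```

Under this scheme only the defaultOn argument changes between releases, so users who set the flag explicitly see the same behaviour throughout the transition.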

mvdan avatar Nov 11 '25 10:11 mvdan

Thanks for looking at this. I'm using the CUE standard library from a tool file with yaml.MarshalStream.

Would you like me to make you a PR to add an experiment for this?

herebebeasties avatar Nov 11 '25 10:11 herebebeasties

OK thanks for the info. Do you have any further info in terms of which style should be the default and why? For example, if either is preferred by the YAML spec, or if the k8s people chose one over the other for a particular reason?

Please hold off on sending a PR, as this is more of a design decision we need to make. The code changes would be rather trivial.

mvdan avatar Nov 11 '25 10:11 mvdan

I asked the YAML team about their preference or choice of default via their Matrix chatroom, and to my surprise, they replied! @ingydotnet said:

The YAML core dev team considers it a best practice to not over indent here, but we haven't declared that publicly yet. Most YAML dumpers default that way. PyYAML does and many implementations descended from that one. go-yaml descended from PyYAML but decided to default the other way. I now lead the go-yaml dev team and may decide to change that.

We inherited the default to indent from go-yaml, but that was not a conscious choice of ours. So I'll discuss with the rest of the team to introduce a CUE_EXPERIMENT to transition this default. I would suggest that we do that without adding an option, given that the YAML org does have a preference for the no-indentation form.

mvdan avatar Nov 11 '25 13:11 mvdan

Do you have any further info in terms of which style should be the default and why?

There is no one true style here, and I submit that this is a mistake in YAML's design as it leads to this sort of debate.

I've done some proper research for us. I think the conclusion is that this should be configurable, perhaps leaving the default as-is.

The following communities prefer the zero-indent sequence style:

The following prefer indenting sequences:

Tooling:

  • yamllint has a default of indenting sequences but is configurable
  • Prettier is perhaps overly proud of being opinionated - they had a vote and use indented style, and there are many issues about this. The final conclusion feels pretty obstinate to me.
  • Mike Farah's yq has flip-flopped between the two at the whim of its underlying goccy/go-yaml library's default, which seems to have changed - see this commit for example
  • yamlfmt defaults to indenting, but supports -formatter indentless_arrays=true as a command line param
  • https://www.yaml.info/learn/bestpractices.html (which ingydotnet contributed to) says to not indent

Argument for making this configurable, leaving the default as indented:

  • Outside of K8s, most people seem to favour the indented style. You have to use that to nest sequences anyway, so it feels like the better default to me as it works in all cases.
  • Leaving this alone will avoid large diffs in CUE output between versions for end users.
  • Having it configurable would allow people to generate K8s-style YAML without diffs, which is a very significant use-case for CUE (and why I raised this issue).

An alternative would be some magic that notices you're marshalling a K8s YAML object (they have fields apiVersion and kind) and sets the no-indent style just for those, which would solve probably 90% of the use-case for this in CUE. Pragmatic and zero-config, but ugly, and you'll have users who need it for other use-cases or want indented YAML for their K8s config anyway, even if it's not idiomatic in the k8s community.
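
As a rough illustration of that heuristic, the check could be as small as the sketch below. looksLikeK8sObject is a hypothetical helper, and it assumes the document has already been decoded into a generic Go map. It is not something CUE or go-yaml provides.

```go
package main

import "fmt"

// looksLikeK8sObject reports whether a decoded document has non-empty string
// fields "apiVersion" and "kind" at the top level. Hypothetical helper, shown
// only to illustrate the heuristic described above.
func looksLikeK8sObject(doc map[string]any) bool {
	for _, field := range []string{"apiVersion", "kind"} {
		s, ok := doc[field].(string)
		if !ok || s == "" {
			return false
		}
	}
	return true
}

func main() {
	deployment := map[string]any{"apiVersion": "apps/v1", "kind": "Deployment"}
	other := map[string]any{"foo": []any{1, 2}}
	fmt.Println(looksLikeK8sObject(deployment), looksLikeK8sObject(other))
}
```

Any unrelated document that happens to carry both fields would also match, which is part of why the approach is ugly.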

I would warn against taking an overly Go-centric view of what the best "default" is here, and I think having it configurable (at least from the CUE standard library calls) would provide a lot of utility to people.

Also, note that go-yaml v4 is the new thing from the master yaml org to use going forward. It's at rc4 stage but seems to work fine from CUE (I tried updating the go.mod deps and it passes tests and works on a pretty large CUE codebase with no diffs).

herebebeasties avatar Nov 11 '25 14:11 herebebeasties

Did you have a chance to make a decision on this one? I'm running a custom fork of CUE at the moment to solve this problem, which is going to become annoying. :-/

herebebeasties avatar Nov 21 '25 12:11 herebebeasties

@herebebeasties not yet - this is a trivial change code-wise, we just need to be careful about what change we make for end users and how we transition it. For example, we currently have no mechanism to pass options to yaml.MarshalStream. At this point I think it will happen after the holidays, as we have quite a bit of stuff in flight at the moment.

Thanks for the research on existing recommendations and defaults. It seems like the ecosystem is more evenly split than I hoped. And, even though the YAML maintainers want to move towards a "do not indent" recommendation, this official recommendation is not live yet, and I fear it might be too late at this point to become the only standard.

Hence I think the logical next step is to add a boolean YAML encoding option, and leave the default as-is for now. We can choose to flip the default later if we feel like it, for example if the ecosystem does converge towards recommending no indentation.
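
One possible shape for such an option is sketched below with a toy encoder that only handles maps of integer lists. The names Option, CompactSequences and marshalLists are hypothetical and do not exist in cuelang.org/go. The point is only that the zero value keeps the current indented output.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Option configures the toy encoder below. Hypothetical, for illustration only.
type Option func(*config)

type config struct{ compactSeq bool }

// CompactSequences is a hypothetical boolean option; false keeps indented lists.
func CompactSequences(on bool) Option {
	return func(c *config) { c.compactSeq = on }
}

// marshalLists encodes a map of integer lists as block-style YAML. It is a toy
// stand-in for a real encoder and does not quote or escape keys.
func marshalLists(m map[string][]int, opts ...Option) string {
	var cfg config // zero value: sequences stay indented, as they are today
	for _, opt := range opts {
		opt(&cfg)
	}
	prefix := "  - "
	if cfg.compactSeq {
		prefix = "- "
	}
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output
	var b strings.Builder
	for _, k := range keys {
		if len(m[k]) == 0 {
			fmt.Fprintf(&b, "%s: []\n", k)
			continue
		}
		fmt.Fprintf(&b, "%s:\n", k)
		for _, v := range m[k] {
			fmt.Fprintf(&b, "%s%d\n", prefix, v)
		}
	}
	return b.String()
}

func main() {
	data := map[string][]int{"foo": {1, 2}}
	fmt.Print(marshalLists(data))
	fmt.Print(marshalLists(data, CompactSequences(true)))
}
```

Existing callers that pass no options would see no diff, and flipping the default later would be a one-line change to the zero value.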

so that we can review a transition to the tool more easily.

If you want to get unblocked on this, an alternative might be to use a simple tool to add all the indentation to the existing codebase - a noisy change, but easy to review - and then you can transition to CUE without the whitespace changes.
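
A minimal sketch of such a tool follows, using only the Go standard library. It is line-based rather than a real YAML parser, so it assumes space-indented block style and does not account for flow collections, comments between a key and its list, keys with trailing comments, or block scalars whose text looks like a key followed by a dash. indentSequences is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// indentSequences rewrites compact ("indentless") block sequences so that each
// "- " sits two spaces deeper than its parent key. Everything nested inside a
// rewritten sequence is shifted by the same amount.
func indentSequences(src string) string {
	var open []int   // original dash columns of the compact sequences we are inside
	prevKeyCol := -1 // key column of the previous line if it ended in ":", else -1
	lines := strings.Split(src, "\n")
	for i, line := range lines {
		body := strings.TrimLeft(line, " ")
		if body == "" {
			continue // blank lines pass through unchanged
		}
		col := len(line) - len(body)
		isItem := body == "-" || strings.HasPrefix(body, "- ")
		// Leave every sequence this line no longer belongs to.
		for len(open) > 0 {
			top := open[len(open)-1]
			if col > top || (col == top && isItem) {
				break
			}
			open = open[:len(open)-1]
		}
		// A dash in the same column as the key above it starts a compact sequence.
		if isItem && col == prevKeyCol {
			open = append(open, col)
		}
		lines[i] = strings.Repeat("  ", len(open)) + line
		// Record where this line's key starts, skipping any leading "- " markers.
		prevKeyCol = -1
		if strings.HasSuffix(strings.TrimRight(body, " "), ":") {
			rest, keyCol := body, col
			for strings.HasPrefix(rest, "- ") {
				trimmed := strings.TrimLeft(rest[1:], " ")
				keyCol += len(rest) - len(trimmed)
				rest = trimmed
			}
			prevKeyCol = keyCol
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	fmt.Print(indentSequences("foo:\n- 1\n- 2\n"))
}
```

Input that is already indented passes through unchanged, so the tool can be re-run safely. Round-tripping the result through a real YAML parser and comparing the decoded data would be a sensible check before committing the change.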

mvdan avatar Dec 10 '25 13:12 mvdan