terraform `yamldecode` converts certain field names to bool

Terraform Version

1.7.2

Terraform Configuration Files

Files not needed - running in terraform console instead:

> yamldecode("{on: 1}")
{
  "true" = 1
}
>

Debug Output

N/A

Expected Behavior

> yamldecode("{on: 1}")
{
  "on" = 1  // <-- field name should remain unchanged
}

### Actual Behavior

It appears that in addition to converting *values* to bools, field names are also affected. This results in unexpected behaviour when, for example, parsing a github actions workflow file (which uses `on` to declare workflow exectuion conditions/options). 

### Steps to Reproduce

1. `terraform console`
2.  `yamldecode("{on: 1}")`


### Additional Context

_No response_

### References

_No response_

Feb 02 '24 10:02 deepbrook

Hi @deepbrook, thanks for filing this.

This is an issue with our upstream type library, tracked here: https://github.com/zclconf/go-cty-yaml/issues/11.

It seems like this behaviour is expected in YAML 1.1: https://yaml.org/type/bool.html. You can see other YAML libraries have the same kind of thing filed, for example in PyYAML: https://github.com/yaml/pyyaml/issues/613.

But, the docs do say we support YAML 1.2 so I do think this an incorrect modification being made by the library as long as we are claiming YAML 1.2. Unfortunately we are constrained by the upstream library here, so will have to see how practical a fix in that library is.

Feb 02 '24 11:02 liamcervante

@liamcervante I'm not sure if the compatibility promises outlined in https://github.com/hashicorp/terraform/issues/34301#issuecomment-1828266948 are also applicable in this case as well. Hopefully with provider-defined functions, someone can create YAML 1.2 specific functions without these little quirks.

Feb 02 '24 16:02 bflad

In practice yamldecode supports a similar pragmatic blend of YAML v1.2 and allowances for older versions, because in practice existing YAML in the world tends not to be strictly written for one version, possibly in part because the differences between different YAML versions are widely misunderstood.

I think the only change we can possibly make here is to update the documentation to be clearer about what YAML versions this function actually supports. I think when I wrote this doc I intended to mean that it supports "v1.2 and earlier", which was clumsy on my part because most other formats we've supported haven't made such significant breaking changes over time.

We cannot change the current behavior of yamldecode because, as @bflad suggested, it's protected by the Terraform v1.x compatibility promises. He's also right that someone could potentially make a "YAML v1.2" provider that offers a stricter yamldecode, if desired, once that capability is supported in the forthcoming Terraform v1.8 series.

However, if you want to keep using the built-in yamldecode then it's best to write ambiguous strings in quotes -- either double or single, since YAML supports both -- to remove the ambiguity:

> yamldecode("{'on': 1}")
{
  "on" = 1
}

Feb 02 '24 18:02 apparentlymart

Just for the record, i do not believe this has anything to do with the 1.1 spec, but rather with the linked bug in the city lib.

The spec, afaiu, only states that values are converted to booleans if they match the pattern. Not the keys.

Feb 02 '24 20:02 deepbrook

As far as I understand the YAML 1.1 spec, it's built around three fundamental structural primitives: scalars, sequences, and mappings, with two different syntaxes defined for each (flow style and block style). Collectively, these three concepts are described as nodes. The sequence and mapping node types are aggregates of other nodes.

The rule about sequences like yes, no, etc being interpreted as booleans is a semantic rule that the spec describes as being applied to any scalars, in Scalar Formats.

I can't find any language in the spec that suggests that "mapping key" is a distinct concept from "node" and describes different treatment of them, but the spec is very dense and not easy to read only in portions, and it's been a long time since I read the whole thing from top to bottom, so I'm happy to be corrected if you can cite a part of the spec that explicitly says that scalar nodes in a mapping-key position are supposed to get different parsing treatment than scalar nodes in other contexts.

(There are some special rules about "simple keys" in Block Mappings, but the exceptions there seem to describe only differences in treatment of delimiters adjacent to the scalar and what characters are allowed to appear in the scalar's representation; I don't see any requirement that the scalars in that position should be interpreted into typed values differently once parsing is complete.)

Feb 02 '24 23:02 apparentlymart

Here is a workaround I am using, which only works if a) only YAML 1.1 is required and b) other you are interoperating with more completely support the spec. In my case, this works fine when reading and writing the yaml files using Python's ruamel.yamls rt (round trip) encoder:

locals {
   doc = <<EOT
%YAML 1.1
---
root:
  notaboolvalue: 'no'
  'no': no
EOT
   val1 = yamldecode(trimprefix(local.doc, "%YAML 1.1\n"))["root"]["no"]
   val2 = yamldecode(trimprefix(local.doc, "%YAML 1.1\n"))["root"]["notaboolvalue"]
}

output "notaboolkey" {
   value = local.val1
}

output "notaboolvalue" {
   value = local.val2
}

Feb 21 '24 18:02 asaba-hashi

hey @liamcervante can I take this issue up if it would be suitable for me as a beginner to this project?

Jul 02 '24 17:07 burnerlee

I suggested in an earlier comment that once Terraform v1.8 is released with support for provider-defined functions it would be possible for someone to write a provider that contains a stricter yamldecode that, for example, supports only YAML v1.2 and eschews all of the oddities of prior versions that were removed in v1.2.

Terraform v1.8 was released in the meantime, and so that is now possible. The emerging idiom for provider-defined functions suggests that the provider be named yaml and the function be named decode so that it would be called as provider::yaml::decode rather than the arguably-redundant provider::yaml::yamldecode or provider::yaml::decode_yaml, although of course anyone who writes such a provider can choose whichever naming scheme they like.

The current behavior of yamldecode is protected by the Terraform v1.x compatibility promises and so I don't think we can change it in a way that would cause a different interpretation of input that was previously accepted as valid. Therefore I think the only thing we can do to improve the situation described here is to update the documentation for yamldecode to clarify that it also supports some features from earlier versions of YAML, to fix the current implication that it supports YAML 1.2 features only. Therefore I'm going to relabel this as a documentation bug.

Jul 02 '24 17:07 apparentlymart

terraform terraform copied to clipboard

`yamldecode` converts certain field names to bool

Terraform Version

Terraform Configuration Files

Debug Output

Expected Behavior

terraform
terraform copied to clipboard