Validate doesn't correctly validate yaml files per the spec.
Describe the bug
The yaml spec requires that all keys in a mapping are unique (spec for reference: https://yaml.org/spec/1.2.2/). dasel validate does not catch this issue in a yaml file
To Reproduce Steps to reproduce the behavior:
$> cat <<EOF | tee tmp.yaml
key:
foo: "bar"
key:
foo: "baz"
EOF
$> dasel validate tmp.yaml
pass tmp.yaml
Expected behavior
Validation should fail because of the duplicate key key at the top level.
Desktop (please complete the following information):
- OS: MacOS
- Version: development-v1.27.4-0.20221207223452-2761f5761361
Additional context This would be useful for validating kubernetes yaml files as every now and again somebody unintentionally redeclares a key in a map and kubernetes doesn't error, it just takes the last assignment.
The method of validation dasel uses here is just to attempt to read the YAML and fail if it cannot be parsed, so largely the same approach kubernetes is using.
We use gopkg.in/yaml.v2 to parse YAML, perhaps there's a strict mode or something to catch these cases.
It does indeed have a strict mode it claims will catch duplicates: https://pkg.go.dev/gopkg.in/yaml.v2#UnmarshalStrict.
Happy to make a PR if that's a change you'll accept for dasel validate
Yes of course. It may be good to hide it behind a --strict flag so it doesn't catch other users off-guard
I spent some time looking at this over Christmas. Before I spend more time on it there's a bit of a refactor required and I wanted to run that by you before making the changes.
In the init methods for each of your parsers, e.g. yaml here, you instantiate actual instances of the parser. Then in your New*Parser* methods here you simply return the instance. From there, because we're reading a file, we call FromBytes which only takes the raw input []byte.
So, there are 2 options to enable configurability on the parsers:
- Don't return the instance of the parser, return a factory that can be used to generate an instance given optional configuration.
- Add a
Configureor similarly named method on the parsers that can accept options to configure the instance. - Have
FromBytestake additional options likeToBytesalready does.
In this specific case, the addition of a "strict" mode feels like parser level configuration, not method level, i.e. a parser is configured once then can be used many times with that configuration. That means we're looking at options 1 or 2.
Option 2 is probably slightly less work but it's essentially a half assed factory, i.e. Option 1. Additionally if somebody was using dasel as a library rather than a CLI tool, option 2 makes it easier to introduce bugs in sibling code paths (one changes the configuration that is then used by the other that doesn't expect the config). Enforcing this type of configuration at instantiation, option 1, makes it clear what the state of the parser is.
Let me know what your thoughts are and I can take a crack at the changes.
I agree that option 1 is probably the right way to go here.
- It allows us to instantiate a parser based on some options given to the factory method.
- Dasel would not longer instantiate all parsers regardless of which parser is used.
Thanks for looking into this 🙂