
yamllint allows referencing a non-existent anchor

Open mckaymatt opened this issue 3 years ago • 1 comments

yamllint 1.26.1 (latest) does not flag references to anchors that don't exist.

---
foo: &foo
    keyA: value1

bar:
    <<: *no_anchor_with_this_name
    keyB: value2

If you try to parse this with a library such as PyYAML or ruamel.yaml, it raises an error (ruamel.yaml traceback shown):

  File ".local/lib/python3.8/site-packages/ruamel/yaml/composer.py", line 115, in compose_node
    raise ComposerError(
ruamel.yaml.composer.ComposerError: found undefined alias 'no_anchor_with_this_name'
  in "test.yaml", line 6, column 9

mckaymatt avatar Jul 20 '21 20:07 mckaymatt
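For anyone who wants to reproduce the report with PyYAML rather than ruamel.yaml, here is a minimal sketch. It assumes PyYAML is installed; the helper name `load_or_error` is made up for this example.

```python
import yaml
from yaml.composer import ComposerError

DOC = """\
---
foo: &foo
    keyA: value1

bar:
    <<: *no_anchor_with_this_name
    keyB: value2
"""


def load_or_error(text):
    """Return (data, None) on success, or (None, message) if composing fails."""
    try:
        return yaml.safe_load(text), None
    except ComposerError as exc:
        # exc.problem is the short description, without the file position.
        return None, exc.problem


data, error = load_or_error(DOC)
print(error)  # found undefined alias 'no_anchor_with_this_name'
```

The parser rejects the file at compose time, which is exactly the class of problem the linter currently lets through.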

Hello,

Indeed, yamllint does not currently detect nonexistent anchors or unmatched aliases.

It could be implemented in a new yamllint rule. In theory, it should be possible if aliases aren't used before anchors are declared (and this seems to be the case according to the YAML specification 6.9.2.: "An anchor marks a node for future reference").

adrienverge avatar Jul 26 '21 08:07 adrienverge
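To illustrate the single-pass idea above, here is a rough sketch built on PyYAML's token scanner. It is not yamllint code: the function name `find_undefined_aliases` is hypothetical, and it assumes an alias may only refer to an anchor declared earlier in the same document.

```python
import yaml


def find_undefined_aliases(text):
    """Return (alias, line, column) for every alias with no earlier anchor.

    Lines and columns are 1-based. Known anchors are forgotten at each
    document boundary, on the assumption that an alias cannot refer to an
    anchor from another document.
    """
    known = set()
    problems = []
    resets = (yaml.StreamStartToken, yaml.DocumentStartToken,
              yaml.DocumentEndToken)
    for token in yaml.scan(text, Loader=yaml.SafeLoader):
        if isinstance(token, resets):
            known.clear()
        elif isinstance(token, yaml.AnchorToken):
            known.add(token.value)
        elif isinstance(token, yaml.AliasToken) and token.value not in known:
            mark = token.start_mark  # marks are 0-based
            problems.append((token.value, mark.line + 1, mark.column + 1))
    return problems


EXAMPLE = """\
---
foo: &foo
    keyA: value1

bar:
    <<: *no_anchor_with_this_name
    keyB: value2
"""

print(find_undefined_aliases(EXAMPLE))
```

Because each alias is checked the moment it is seen, a problem can be emitted right away, without holding the whole file in memory.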

I'd also like to check if there are unused anchors in the yaml files. Is this rule being considered for yamllint?

amimas avatar Dec 20 '22 20:12 amimas

Yes, I think this is within the scope of a YAML linter. An option like forbid-unused-anchors could do the job.

In the end, a new rule like this would be great:

rules:
  anchors:
    forbid-unknown-aliases: true      # true by default, because YAML spec forbids it
    forbid-duplicated-anchors: false  # false by default, because YAML spec allows it
    forbid-unused-anchors: false      # false by default, because YAML spec allows it

adrienverge avatar Dec 21 '22 13:12 adrienverge
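As an illustration of what a `forbid-duplicated-anchors` check might look at, here is a small sketch using PyYAML's scanner. The function name `find_duplicated_anchors` is invented for this example, and per-document scoping of anchor names is an assumption of the sketch, not something decided in this thread.

```python
import yaml


def find_duplicated_anchors(text):
    """Return (anchor, line) for each anchor name declared again in a document.

    Lines are 1-based and point at the repeated declaration.
    """
    seen = set()
    duplicates = []
    resets = (yaml.StreamStartToken, yaml.DocumentStartToken,
              yaml.DocumentEndToken)
    for token in yaml.scan(text, Loader=yaml.SafeLoader):
        if isinstance(token, resets):
            seen.clear()  # assumption: anchor names are scoped per document
        elif isinstance(token, yaml.AnchorToken):
            if token.value in seen:
                duplicates.append((token.value, token.start_mark.line + 1))
            seen.add(token.value)
    return duplicates


print(find_duplicated_anchors("a: &x 1\nb: &x 2\nc: *x\n"))
```

Like the unknown-alias case, this can be reported immediately at the offending token, so it fits the "output errors right away" design.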

That's good to hear @adrienverge . I like the example you gave. It gives more flexibility depending on the use case.

Not sure when we can expect those rules. I'm not familiar with the code base in this project and haven't implemented any custom rule. I can try to open a PR if you can give some hints or guidance on how to build those rules.

amimas avatar Dec 21 '22 14:12 amimas

Answering https://github.com/adrienverge/yamllint/pull/420#discussion_r1096388404 from @amimas, about the forbid-unused-anchors option.

> > Maybe we should discuss it in the feature request issue.
>
> Sure :+1:
>
> I understand what you're saying and it makes sense. I was just asking because I wanted to look at the forbid-unused-anchors option. Without checking the entire file, not sure how to verify that an anchor isn't being used. Let me know if you have any suggestions.

I see. As discussed in the original comment, yamllint is designed to output errors right away, which allows:

- having them sorted naturally,
- being performant and avoiding excessive RAM use,
- not losing past errors if the script crashes at some point.

Changing this would need a major refactor.

So, maybe a (not perfect) solution would be to output forbid-unused-anchors errors at the end of the YAML stream (StreamEndToken -- note that there can be multiple YAML documents inside the same file). The error report (anchor "foo" defined but unused) wouldn't point at the line where &foo is defined, but would be positioned at the end of the YAML stream.

adrienverge avatar Feb 08 '23 18:02 adrienverge

Wouldn't it be better to report it at the end of the document instead of stream?

perlpunk avatar Feb 09 '23 13:02 perlpunk

> Wouldn't it be better to report it at the end of the document instead of stream?

Sorry, that's what I meant but I twisted my words :man_facepalming: DocumentEndToken should be the correct token where to output these reports (not StreamEndToken).

adrienverge avatar Feb 09 '23 13:02 adrienverge

> > Wouldn't it be better to report it at the end of the document instead of stream?
>
> Sorry, that's what I meant but I twisted my words 🤦‍♂️ DocumentEndToken should be the correct token where to output these reports (not StreamEndToken).

I think it's probably not that straightforward. I tried implementing forbid-unused-anchors (see #537). I'm new to Python, so I'd appreciate your feedback on that PR.

While working on this, I also learned something new about YAML. While 3 dashes ( --- ) represent the start of a new document, apparently the end of a document is represented with 3 dots (...). See section 2.2 of this doc. I haven't come across any YAML files yet that use the "end of document" notation, but I have seen multiple documents separated by "start of document" (i.e. 3 dashes). So, I think both DocumentEndToken and StreamEndToken should be used to report whether an anchor is unused or not.

amimas avatar Feb 12 '23 20:02 amimas
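A rough sketch of the "report at a document boundary" idea, using PyYAML's scanner. This is not the code from the PR: the name `find_unused_anchors` is hypothetical, and treating DocumentStartToken as a boundary in addition to DocumentEndToken and StreamEndToken is an assumption of this sketch, made so that documents separated only by `---` are handled separately.

```python
import yaml

# Assumption of this sketch: any of these tokens closes the current document.
BOUNDARIES = (yaml.DocumentStartToken, yaml.DocumentEndToken,
              yaml.StreamEndToken)


def find_unused_anchors(text):
    """Return (anchor, report_line) pairs; report_line is 1-based.

    report_line is the line of the boundary token where the problem is
    emitted, not the line where the anchor was declared.
    """
    used = {}  # anchor name -> True once an alias has referred to it
    reports = []
    for token in yaml.scan(text, Loader=yaml.SafeLoader):
        if isinstance(token, yaml.AnchorToken):
            used[token.value] = False
        elif isinstance(token, yaml.AliasToken):
            if token.value in used:
                used[token.value] = True
        elif isinstance(token, BOUNDARIES):
            line = token.start_mark.line + 1
            reports.extend((name, line) for name, hit in used.items()
                           if not hit)
            used.clear()
    return reports


TWO_DOCS = "---\nfoo: &foo 1\n---\nbar: &bar 2\nbaz: *bar\n"
print(find_unused_anchors(TWO_DOCS))  # [('foo', 3)]
```

In the example, `&foo` is declared on line 2 but reported on line 3, where the next `---` closes the first document. That is the positioning trade-off described earlier in the thread.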

That's true, ... is part of the YAML standard (there is even a yamllint rule for it), but isn't commonly used.

> So, I think both DocumentEndToken and StreamEndToken should be used to report whether an anchor is unused or not.

Does this solution reliably indicate the end of a YAML document, even when there are multiple documents in the same file and ... is not present? Maybe there's a PyYAML token that is suited for that: could you check?

adrienverge avatar Feb 25 '23 14:02 adrienverge
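One way to check this is to print which stream and document tokens PyYAML's scanner emits for a multi-document input, with and without `...`. A small sketch, assuming PyYAML is installed; the helper name `boundary_tokens` is made up for this example.

```python
import yaml

BOUNDARY_TYPES = (yaml.StreamStartToken, yaml.DocumentStartToken,
                  yaml.DocumentEndToken, yaml.StreamEndToken)


def boundary_tokens(text):
    """Return the names of the stream/document tokens, in scan order."""
    return [type(token).__name__
            for token in yaml.scan(text, Loader=yaml.SafeLoader)
            if isinstance(token, BOUNDARY_TYPES)]


# Two documents separated only by "---".
print(boundary_tokens("a: 1\n---\nb: 2\n"))
# Same content, with an explicit "..." closing the first document.
print(boundary_tokens("a: 1\n...\n---\nb: 2\n"))
```

As far as I can tell, the scanner emits DocumentEndToken only for an explicit `...`, so the first input yields no DocumentEndToken at all. If that holds, a rule relying on DocumentEndToken and StreamEndToken alone would only notice the end of the first document at the end of the stream, and DocumentStartToken would also need to be treated as a boundary.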