hcl Block's `OpenBraceRange.Start` differs if a proper block precedes it or not

I have two different HCL blocks.

variable bad
{}

variable good {}
variable bad
{}

Although both of the bad blocks look similar, their Block.OpenBraceRange.Start value is quite different. The first one seems to believe it exists at the start of the file (essentially hcl.InitialPos) but the second one seems to have something more sane as a response.

{1 1 0}
{2 1 17}

package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/hclsyntax"
)

func Parse(filename string, bytes []byte) *hclsyntax.Body {
	file, _ := hclsyntax.ParseConfig(bytes, filename, hcl.InitialPos)
	return file.Body.(*hclsyntax.Body)
}

func main() {
	body := Parse("", []byte("variable bad\n{}"))
	fmt.Println(body.Blocks[0].OpenBraceRange.Start)

	body = Parse("", []byte("variable good {}\nvariable bad\n{}"))
	fmt.Println(body.Blocks[1].OpenBraceRange.Start)
}

Nov 06 '24 22:11 rcjsuen

Actually now that I look at it some more it seems like both ranges are wrong. And from debugging I see all the ranges are identical to each other. :P

https://github.com/hashicorp/hcl/blob/d20d07fa73707e58954493bdd600c1fb0a0d4c77/hclsyntax/parser.go#L362-L374

Nov 06 '24 23:11 rcjsuen

When there's no brace at the end of the header, the parser performs recovery and then returns a very stubby fake Block to try to at least give the caller something to analyze to find the block type and labels that were already successfully parsed.

As far as the parser is concerned, this block has no braces at all (the brace is on the next line, but the parser only uses one token of lookahead so it's currently "looking at" the newline) and so it's using the ident range as a placeholder so that at least there's something to return. Since the variable identifier is at the start of the file, it seems like this range is what this code was intended to return.

It could perhaps instead have returned the position of the newline token as a way to say where the brace ought to have been -- tok.Range, as the code is currently written -- but really anything it reports here is going to be some sort of placeholder approximation because the block isn't valid enough to parse properly.

p.recoverAfterBodyItem should hopefully have left the scanner's cursor just after the closing brace, since that recovery logic works by trying to find a balanced number of bracket-type tokens, so CloseBraceRange could also then potentially be populated with p.PrevRange() to make a best effort to mark where the closing brace appears to be. Again that can't guarantee to be 100% reliable because the recovery logic is really just a pile of heuristics, but I think it would at least typically be something more useful than marking a zero-length span at the block's identifier.

Unlike with https://github.com/hashicorp/hcl/issues/702, because this is error recovery behavior rather than ranges reported by a successful parse I don't think it's reasonable to expect it to return something sensible in all cases and so I dunno whether it's actually worth trying to improve this, but above are my ideas on how to improve it if there's a strong enough reason to justify doing it.

Nov 06 '24 23:11 apparentlymart