
[for yq] Decode Stream Tokenisation allows missing "," in JSON

Open mikefarah opened this issue 8 months ago • 2 comments

yq uses the goccy/go-json Decoder tokeniser, and a bug has been raised against yq: invalid JSON documents that are missing a comma between object key-value pairs are parsed without error:

{
  "hello": "value"
  "foo": "bar"
}

The issue is (unless I'm missing something) that the tokeniser does not emit a token for the comma, so when the yq code builds its AST from the token stream, I cannot determine whether a comma was present or missing.

Looking at the goccy tokeniser code, I can see that the "," is skipped over in tokenisation:

See https://github.com/goccy/go-json/blob/f83142d838f231e825c02e0d1c6b0b8ccdeff216/internal/decoder/stream.go#L137

Not sure what the best path forward is...perhaps the streaming code should validate that there is a "," when expected?
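For comparison, here is a minimal sketch of the behaviour being asked for, using the standard library's `encoding/json` rather than goccy/go-json. The stdlib `Decoder.Token()` tracks delimiter state internally and returns a syntax error when a comma is missing. The helper name `collectTokens` is made up for illustration and is not part of either library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// collectTokens (hypothetical helper) drains the stdlib token stream for doc.
// It returns the tokens read so far and the first error other than io.EOF.
func collectTokens(doc string) ([]json.Token, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	var toks []json.Token
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return toks, nil // clean end of input
		}
		if err != nil {
			return toks, err // e.g. a syntax error for a missing comma
		}
		toks = append(toks, tok)
	}
}

func main() {
	// Same document as above, with the comma missing between the two pairs.
	_, err := collectTokens(`{"hello": "value" "foo": "bar"}`)
	fmt.Println("missing comma:", err)

	// The corrected document tokenises without error.
	toks, err := collectTokens(`{"hello": "value", "foo": "bar"}`)
	fmt.Println("valid:", len(toks), "tokens, err =", err)
}
```

The stdlib decoder still does not emit the comma as a token; it only rejects the stream when the comma is absent, which would be enough for the yq use case.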

mikefarah, May 04 '25 00:05

@goccy - any thoughts on this?

mikefarah, Jul 10 '25 04:07

@mikefarah If delimiter validation is performed in the Token() API, it would require maintaining state, which would significantly degrade performance. Therefore, we are currently unsure how best to implement it. Since it’s not something that can be handled easily, we would appreciate it if you could also consider alternative approaches.
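One possible alternative on the caller's side, sketched here as an assumption rather than a recommendation from either project, is to validate the buffered input once before tokenising it. The sketch uses `json.Valid` from the standard library's `encoding/json`; the names `precheck` and `errInvalidJSON` are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errInvalidJSON is a hypothetical sentinel error for this sketch.
var errInvalidJSON = errors.New("input is not valid JSON")

// precheck (hypothetical helper) rejects input that is not a single,
// well-formed JSON value before it is handed to a lenient tokeniser.
func precheck(data []byte) error {
	if !json.Valid(data) {
		return errInvalidJSON
	}
	return nil
}

func main() {
	// Missing comma between the two pairs: rejected by the pre-check.
	fmt.Println(precheck([]byte(`{"hello": "value" "foo": "bar"}`)))

	// Well-formed document: passes, and can then be tokenised as usual.
	fmt.Println(precheck([]byte(`{"hello": "value", "foo": "bar"}`)))
}
```

The trade-off is a second pass over the input and the need to hold the whole document in memory, so this does not suit unbounded streams. `json.Valid` also accepts exactly one JSON value, so input containing several concatenated documents would have to be split first.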

goccy, Jul 10 '25 06:07