proposal: encoding/json: Return the postion for validation failures.
Proposal Details
We use json.Valid() in https://github.com/pingcap/tidb to implement the JSON_VALID() SQL function and would like to return the position on which the validation failed to the user.
https://github.com/pingcap/tidb/blob/0875abde25467897e4e81ae6c18c5e8792ca28a9/pkg/expression/builtin_json.go#L1019
Example usage:
package main
import (
"encoding/json"
"fmt"
"strings"
)
func main() {
inputs := []string{
`{}`,
`{"foo": "bar"}`,
`{"foo: "bar"}`,
`{"foo": "bar", "bar": foo}`,
`{"foo": "bar", "bar", foo}`,
}
for _, input := range inputs {
v, pos := json.ValidPos([]byte(input))
fmt.Printf("'%s': valid: %#v, pos: %d\n", input, v, pos)
if !v {
fmt.Printf(strings.Repeat(" ", int(pos)) + "^\n")
}
}
}
Output:
go run main.go
'{}': valid: true, pos: 2
'{"foo": "bar"}': valid: true, pos: 14
'{"foo: "bar"}': valid: false, pos: 9
^
'{"foo": "bar", "bar": foo}': valid: false, pos: 24
^
'{"foo": "bar", "bar", foo}': valid: false, pos: 21
^
This would make it easier to see why validation failed, especially for larger JSON documents.
Change https://go.dev/cl/585995 mentions this issue: enconding/json: add ValidPos() function
CC @dsnet @mvdan
The v2 json draft proposal includes a SyntacticError.ByteOffset field, which I suspect makes this feature redundant. I think we should defer on changing this until v2 lands.
The v2 json draft proposal includes a
SyntacticError.ByteOffsetfield, which I suspect makes this feature redundant. I think we should defer on changing this until v2 lands.
I've tried to make my code work with the v2 draft, but this doesn't seem to work:
package main
import (
"fmt"
"strings"
"github.com/go-json-experiment/json/jsontext"
)
func main() {
inputs := []string{
`{}`,
`{"foo": "bar"}`,
`{"foo: "bar"}`,
`{"foo": "bar", "bar": foo}`,
`{"foo": "bar", "bar", foo}`,
}
for _, input := range inputs {
val := jsontext.Value([]byte(input))
v := val.IsValid()
pos := 0
fmt.Printf("'%s': valid: %#v, pos: %d\n", input, v, pos)
if !v {
fmt.Printf(strings.Repeat(" ", int(pos)) + "^\n")
}
}
}
$ go run main.go
'{}': valid: true, pos: 0
'{"foo": "bar"}': valid: true, pos: 0
'{"foo: "bar"}': valid: false, pos: 0
^
'{"foo": "bar", "bar": foo}': valid: false, pos: 0
^
'{"foo": "bar", "bar", foo}': valid: false, pos: 0
^
As you can see pos is always set to 0. As this doesn't return a SyntacticError, there is no ByteOffset.
I've noticed ExampleWithUnmarshalers_recordOffsets is dealing with offsets, but it doesn't look to easily be applied to my example.
The v2 json draft looks interesting, but I'm not fully convinced this should make this one redundant. Let's first see if my example can be modified to work with v2, then this can be used as feedback on v2. After that we can see if it makes sense to apply this to v1 or not.
Try this instead: https://go.dev/play/p/ze0FqocVe-8
Try this instead: https://go.dev/play/p/ze0FqocVe-8
Thanks. That's great. This covers my usecase to get the offsets and even adds a useful error message.
I think a ValidPos() in v1 would be useful but I don't think complicating v2 with this is worth it. So let's close this for now and hope that v2 makes it in.