go icon indicating copy to clipboard operation
go copied to clipboard

proposal: encoding/json: Return the postion for validation failures.

Open dveeden opened this issue 1 year ago • 1 comments

Proposal Details

We use json.Valid() in https://github.com/pingcap/tidb to implement the JSON_VALID() SQL function and would like to return the position on which the validation failed to the user.

https://github.com/pingcap/tidb/blob/0875abde25467897e4e81ae6c18c5e8792ca28a9/pkg/expression/builtin_json.go#L1019

Example usage:

package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

func main() {
	inputs := []string{
		`{}`,
		`{"foo": "bar"}`,
		`{"foo: "bar"}`,
		`{"foo": "bar", "bar": foo}`,
		`{"foo": "bar", "bar", foo}`,
	}
	for _, input := range inputs {
		v, pos := json.ValidPos([]byte(input))
		fmt.Printf("'%s': valid: %#v, pos: %d\n", input, v, pos)
		if !v {
			fmt.Printf(strings.Repeat(" ", int(pos)) + "^\n")
		}
	}
}

Output:

go run main.go 
'{}': valid: true, pos: 2
'{"foo": "bar"}': valid: true, pos: 14
'{"foo: "bar"}': valid: false, pos: 9
         ^
'{"foo": "bar", "bar": foo}': valid: false, pos: 24
                        ^
'{"foo": "bar", "bar", foo}': valid: false, pos: 21
                     ^

This would make it easier to see why validation failed, especially for larger JSON documents.

dveeden avatar May 16 '24 10:05 dveeden

Change https://go.dev/cl/585995 mentions this issue: enconding/json: add ValidPos() function

gopherbot avatar May 16 '24 10:05 gopherbot

CC @dsnet @mvdan

ianlancetaylor avatar May 20 '24 04:05 ianlancetaylor

The v2 json draft proposal includes a SyntacticError.ByteOffset field, which I suspect makes this feature redundant. I think we should defer on changing this until v2 lands.

dsnet avatar May 20 '24 22:05 dsnet

The v2 json draft proposal includes a SyntacticError.ByteOffset field, which I suspect makes this feature redundant. I think we should defer on changing this until v2 lands.

I've tried to make my code work with the v2 draft, but this doesn't seem to work:

package main

import (
	"fmt"
	"strings"

	"github.com/go-json-experiment/json/jsontext"
)

func main() {
	inputs := []string{
		`{}`,
		`{"foo": "bar"}`,
		`{"foo: "bar"}`,
		`{"foo": "bar", "bar": foo}`,
		`{"foo": "bar", "bar", foo}`,
	}
	for _, input := range inputs {
		val := jsontext.Value([]byte(input))
		v := val.IsValid()
		pos := 0
		fmt.Printf("'%s': valid: %#v, pos: %d\n", input, v, pos)
		if !v {
			fmt.Printf(strings.Repeat(" ", int(pos)) + "^\n")
		}
	}
}
$ go run main.go 
'{}': valid: true, pos: 0
'{"foo": "bar"}': valid: true, pos: 0
'{"foo: "bar"}': valid: false, pos: 0
^
'{"foo": "bar", "bar": foo}': valid: false, pos: 0
^
'{"foo": "bar", "bar", foo}': valid: false, pos: 0
^

As you can see pos is always set to 0. As this doesn't return a SyntacticError, there is no ByteOffset.

I've noticed ExampleWithUnmarshalers_recordOffsets is dealing with offsets, but it doesn't look to easily be applied to my example.

The v2 json draft looks interesting, but I'm not fully convinced this should make this one redundant. Let's first see if my example can be modified to work with v2, then this can be used as feedback on v2. After that we can see if it makes sense to apply this to v1 or not.

dveeden avatar May 22 '24 18:05 dveeden

Try this instead: https://go.dev/play/p/ze0FqocVe-8

dsnet avatar May 22 '24 18:05 dsnet

Try this instead: https://go.dev/play/p/ze0FqocVe-8

Thanks. That's great. This covers my usecase to get the offsets and even adds a useful error message.

I think a ValidPos() in v1 would be useful but I don't think complicating v2 with this is worth it. So let's close this for now and hope that v2 makes it in.

dveeden avatar May 22 '24 18:05 dveeden