Unexpected behavior of pgtype.JSONBArray and json.RawMessage
I am setting a pgtype.JSONBArray value from a value of type []json.RawMessage, and encountered some unexpected behavior. I am not sure if this is a bug.
This first test fails:
func TestPgxArraySet1(t *testing.T) {
	data := []json.RawMessage{
		json.RawMessage(`{"lol":1}`),
		json.RawMessage(`{"lol":2}`),
	}
	var nas pgtype.JSONBArray
	err := nas.Set(data)
	if err != nil {
		t.Fatal(err)
	}
	assert.Len(t, nas.Elements, len(data))
	if len(nas.Elements) != len(data) {
		t.Errorf("data length: %v, nas length: %v", len(data), len(nas.Elements))
	}
}
What happens is that data is interpreted as an array of single integer values, one integer for each byte in the two JSON values, so nas.Elements has length 18.
But changing one of the JSON values a bit will make the test pass:
func TestPgxArraySet2(t *testing.T) {
	data := []json.RawMessage{
		json.RawMessage(`{"lol":2}`),
		json.RawMessage(`{"lol":{"x":1}}`),
	}
	var nas pgtype.JSONBArray
	err := nas.Set(data)
	if err != nil {
		t.Fatal(err)
	}
	assert.Len(t, nas.Elements, len(data))
	if len(nas.Elements) != len(data) {
		t.Errorf("data length: %v, nas length: %v", len(data), len(nas.Elements))
	}
}
This is with github.com/jackc/pgtype v1.8.1 and Go 1.17. Casting the type to [][]byte makes everything work just fine. I assume this is because [][]byte is handled by .Set specifically, whereas []json.RawMessage falls through to the default reflection-based conversion in .Set.

Would it be possible to add a case for []json.RawMessage in .Set?
Would it be possible to add a case for []json.RawMessage in .Set?
If that is the solution then it should be possible to add it to typed_array_gen.sh and regenerate the array struct.
As mentioned in the PR, your suggestion will fix the test cases for json.RawMessage. However, the fundamental issue remains, in the sense that defining any other named type with underlying type []byte, such as

type somebytealias []byte

and substituting json.RawMessage with this type, will make the test cases fail once again. So it appears there might be a problem in the reflection-based case in .Set?
I am happy to help investigate this, but I do not feel I understand the ramifications of tinkering with the typed_array.go.erb template, as it is used for several types. I have tried to debug these test cases:
type somebytealias []byte

func TestJSONBArraySetAlias(t *testing.T) {
	successfulTests := []struct {
		source interface{}
		result pgtype.JSONBArray
	}{
		{source: []somebytealias{somebytealias(`{"foo":12}`), somebytealias(`{"bar":2}`)}, result: pgtype.JSONBArray{
			Elements:   []pgtype.JSONB{{Bytes: []byte(`{"foo":12}`), Status: pgtype.Present}, {Bytes: []byte(`{"bar":2}`), Status: pgtype.Present}},
			Dimensions: []pgtype.ArrayDimension{{Length: 2, LowerBound: 1}},
			Status:     pgtype.Present,
		}},
		{source: []somebytealias{somebytealias(`{"foo":1}`), somebytealias(`{"bar":{"x":2}}`)}, result: pgtype.JSONBArray{
			Elements:   []pgtype.JSONB{{Bytes: []byte(`{"foo":1}`), Status: pgtype.Present}, {Bytes: []byte(`{"bar":{"x":2}}`), Status: pgtype.Present}},
			Dimensions: []pgtype.ArrayDimension{{Length: 2, LowerBound: 1}},
			Status:     pgtype.Present,
		}},
		{source: []somebytealias{somebytealias(`{"foo":1}`), somebytealias(`{"bar":2}`)}, result: pgtype.JSONBArray{
			Elements:   []pgtype.JSONB{{Bytes: []byte(`{"foo":1}`), Status: pgtype.Present}, {Bytes: []byte(`{"bar":2}`), Status: pgtype.Present}},
			Dimensions: []pgtype.ArrayDimension{{Length: 2, LowerBound: 1}},
			Status:     pgtype.Present,
		}},
	}
	for i, tt := range successfulTests {
		var d pgtype.JSONBArray
		err := d.Set(tt.source)
		if err != nil {
			t.Errorf("%d: %v", i, err)
		}
		if !reflect.DeepEqual(d, tt.result) {
			var xs []string
			for _, e := range d.Elements {
				xs = append(xs, string(e.Bytes))
			}
			t.Errorf("%d: expected %+v to convert to %+v, but it was %+v, raw: %v", i, tt.source, tt.result, d, xs)
		}
	}
}
And at least I can share some observations. The first two cases behave differently from the third.
- In the third case, the two elements {"foo":1} and {"bar":2} both have length 9, and what is set on dst.Elements is [123 34 102 111 111 34 58 49 125 123 34 98 97 114 34 58 50 125], which is the concatenation of the two inputs.
- Changing 1 to 12, which is the first case, gives this output: ["eyJmb28iOjEyfQ==" "eyJiYXIiOjJ9"], which is the base64 encoding of the two inputs as strings. This is probably due to json.Marshal being used in the pgtype.JSON type at some point when dst.Elements[index].Set is called in the setRecursive code, as this is the default behavior for []byte types. The same happens in the second case.
The code that tries to automatically handle unknown types is hard to reason about. The JSON code path is even harder because it automatically calls json.Marshal with anything that isn't explicitly recognized. That conflicts with the attempt to figure out whether an unknown type can be converted to something known. For example:
type mytype []byte

func (s mytype) MarshalJSON() ([]byte, error) {
	// ...
}
Should mytype be marshaled, or should it be treated as a []byte? And how do we handle that without breaking any of the other types convertible to JSON?
JSONBArray gets even more complicated because there is the extra layer of the array.
Personally, I'm inclined to be content with the partial solution in your PR rather than delve into that nest of complexity. But you're welcome to dig into it further if you want.
FWIW, tricky issues like this were one of the primary motivations behind deciding to start development on pgx v5. I'm hoping to simplify the structure of the type conversion system so that this sort of issue isn't so complicated.
I think I would be hard pressed to make much progress here :) Seeing as you are working on pgx v5, I think it is better to look to the future. I am curious to look into the new type conversion code.