pgx icon indicating copy to clipboard operation
pgx copied to clipboard

pgx5 connection CopyFrom failing for JSON field where value is string

Open EfraimDim opened this issue 1 year ago • 1 comments

Hello, I am trying to using CopyFrom with pgx5 connection to copy from a table with a JSON field but the json is inserted in the original insert query as '"json_string"'. When using CopyFrom with rows pgx5.Rows and a JSON value of this string, it throws an error: ERROR: invalid input syntax for type json (SQLSTATE 22P02)

Steps to reproduce the behavior:

package main

import (
	"context"
	"fmt"
	"github.com/golang/glog"
	"github.com/jackc/pgx/v5"
	"log"
	"os"
	"strings"
)

func main() {
	var err error

	conn, err := pgx.Connect(context.Background(), os.Getenv("SOURCE_DATABASE_URL"))
	if err != nil {
		glog.Errorln("Error connecting to db: %v", err)
	}
	defer conn.Close(context.Background())

	_, err = conn.Exec(context.Background(), "CREATE TABLE testtable (id SERIAL PRIMARY KEY, json_column JSON)")
	if err != nil {
		glog.Errorln("Error creating table: %v", err)
	}

	_, err = conn.Exec(context.Background(), `INSERT INTO testtable (json_column) VALUES ('"json_string"')`)
	if err != nil {
		glog.Errorln("Error inserting data: %v", err)
	}

	if rows, err = conn.Query(context.Background(), "SELECT * FROM testtable"); err != nil {
		glog.Errorf("Error querying data from testtable: %v", err)
	}
	defer rows.Close()

	noOfColumns := len(rows.FieldDescriptions())

	columns := make([]string, noOfColumns)
	for i, col := range rows.FieldDescriptions() {
		columns[i] = string(col.Name)
	}

	table := "public.testtable"

	schemaTable := strings.SplitN(table, ".", 2)

	connTarget, _ := pgx.Connect(context.Background(), os.Getenv("TARGET_DATABASE_URL"))
	defer connTarget.Close(context.Background())

	_, err = connTarget.Exec(context.Background(), "CREATE TABLE testtable (id SERIAL PRIMARY KEY, json_column 
        JSON)")
	if err != nil {
		glog.Errorln("Error creating table: %v", err)
	}

	if _, err = connTarget.CopyFrom(context.Background(), schemaTable, columns, rows); err != nil {
		glog.Errorln("Error executing COPY SQL for table", err)
	}
}

Version

  • Go: go version go1.21.3 darwin/arm64
  • PostgreSQL: PostgreSQL 15.5 (Homebrew) on aarch64-apple-darwin23.2.0, compiled by Apple clang version 15.0.0 (clang-1500.1.0.2.5), 64-bit
  • pgx: github.com/jackc/pgx/v5 v5.5.0

**Edited code: 1/03/24

EfraimDim avatar Feb 29 '24 13:02 EfraimDim

The problem is unmarshalling the literal JSON string "foo" into a Go string removes the quotation marks. This is correct behavior.

When encoding a string to a PostgreSQL json type, pgx assumes that it is an encoded JSON document, not a literal string that should be encoded into a JSON value. This is also correct behavior. Or at least the behavior that we would almost always want when using a string as a query parameter.

Unfortunately, these two (correct) behaviors are incompatible.

I don't think there is a general solution that would allow using a pgx.Rows that has JSON documents that consist entirely of a JSON string value as a pgx.CopyFromSource. It might be possible to do a special case, but actually the entire approach of using pgx.Rows as a pgx.CopyFromSource is sub-optimal.

Instead you can directly wire CopyTo and CopyFrom together. See https://github.com/jackc/pgx/issues/867.

jackc avatar Mar 02 '24 21:03 jackc