pgtype icon indicating copy to clipboard operation
pgtype copied to clipboard

Automatically Marshall JSON/JSONB/JSONBArray etc types as strings in Query Values

Open jsnb-devoted opened this issue 2 years ago • 3 comments

I'm writing some code that selects all the data from a table and sends each row to a channel. Each row is going to eventually get put into a CSV so I need the types to eventually be strings. I've looked into defining a custom type and using RegisterDataType or QueryResultFormatsByOID but ultimately the JSON types seem to all return a byte array. Is there a quick way to override that so that the results from JSON fields are returned automatically as strings?

	sql := "SELECT * FROM my_table"
	var err error
	var values []interface{}

	rows, err := conn.Query(ctx, sql)
	if err != nil {
		return err
	}
	defer rows.Close()
	for rows.Next() {
		values, err = rows.Values()
		if err != nil {
			return err
		}

		r := RowType{
			TableName:   tableName,
			SchemaName:  schemaName,
		}
		for ix, v := range values {
			cr.Columns = append(cr.Columns, &r.Column{Value: v, Type: table.ColTypes[ix], Name: table.ColNames[ix]})
		}

		myChannel <- r

Feels like I could probably accomplish this by not using SELECT * and casting all my json type columns to strings in the query itself. I'm sure that is much fast but I'm hoping to understand a little better how to override the types in v4 (apologies if v5 totally changes the way this would be done). I had a similar issue with the uuid type until I ran into the https://github.com/jackc/pgx/wiki/UUID-Support doc

jsnb-devoted avatar Aug 08 '22 20:08 jsnb-devoted

I'm not sure all that you're trying to do, but there might be an easier way.

First, if you know the structure of your result set you could use Scan instead of values. Then you can specify the output type.

If you don't then this code may serve as a model: https://github.com/jackc/pgxutil/blob/37866e09a15bb3122ba96adef35f61129ddf55cb/pgxutil.go#L477 It's from my personal junk drawer of pgx utilities.

Failing that, then registering a data type would be possible. Values() calls Get() on the underlying type. So that is the method you would need to change.

One last idea would be to use pgconn directly. If you want everything in text anyway, then pgx's type support is superfluous.


And yes, the internals of type mapping is changing a lot in pgx v5 🤷

jackc avatar Aug 12 '22 02:08 jackc

Thanks so much for the quick response! I think something like the snippet you posted gets at what I need. Am I reading that right that the QueryResultFormats query option impacts the result format of RawValues and not Values? I think that maybe have been why the QueryResultFormatsByOID didn't do what I expected.

We are writing a change data capture tool so when the app first spins up it needs to export all of the existing data in the database. We are doing that by running SELECT * on every table in the db and exporting the data to CSVs. Since we cannot anticipate the schema of the json types (or really the schema for any arbitrary table) we have to use Values (or RawValues).

We know we could also leverage COPY for this to write the data directly to a csv but we gain a lot by treating each row as an "insert" and processing the same way we would a message coming off of the WAL.

jsnb-devoted avatar Aug 12 '22 15:08 jsnb-devoted

Am I reading that right that the QueryResultFormats query option impacts the result format of RawValues and not Values?

Yes.

We know we could also leverage COPY for this to write the data directly to a csv but we gain a lot by treating each row as an "insert" and processing the same way we would a message coming off of the WAL.

Ah. Yeah, in most cases COPY would be better.

jackc avatar Aug 13 '22 14:08 jackc