
Passing array to Json input causes error in PostgreSQL

Open bradleyayers opened this issue 4 years ago • 4 comments

When passing an array as a query parameter for a json column, PostgreSQL emits an error. Here's a test case: https://github.com/adelsz/pgtyped/commit/7164361bb60e8c19f6be697f09ec944251916e3d.

https://github.com/adelsz/pgtyped/blob/7164361bb60e8c19f6be697f09ec944251916e3d/packages/example/src/index.ts#L110-L119

> cd packages/example && docker-compose run test

Starting example_build_1 ... done
Creating example_test_run ... done
Checking postgres is up...
Comments: [{"id":1,"user_id":1,"book_id":1,"body":"Fantastic read, recommend it!"},{"id":2,"user_id":1,"book_id":2,"body":"Did not like it, expected much more..."}]
Inserted book ID: 7
Book: {"id":2,"rank":13,"name":"Another title","author_id":2}
Error running example code: error: invalid input syntax for type json
    at Parser.parseErrorMessage (/app/packages/example/node_modules/pg-protocol/src/parser.ts:369:69)
    at Parser.handlePacket (/app/packages/example/node_modules/pg-protocol/src/parser.ts:188:21)
    at Parser.parse (/app/packages/example/node_modules/pg-protocol/src/parser.ts:103:30)
    at Socket.<anonymous> (/app/packages/example/node_modules/pg-protocol/src/index.ts:7:48)
    at Socket.emit (node:events:365:28)
    at Socket.emit (node:domain:470:12)
    at addChunk (node:internal/streams/readable:314:12)
    at readableAddChunk (node:internal/streams/readable:289:9)
    at Socket.Readable.push (node:internal/streams/readable:228:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23)

The issue is https://github.com/brianc/node-postgres/issues/2012, specifically that pg encodes array and object parameters differently: arrays are encoded as native Postgres array literals, while objects receive JSON.stringify treatment. Apparently there's a good performance reason for using native arrays, but plenty of people think pg shouldn't make this blanket assumption, and I'm guessing the behavior also pre-dates JSON support in PostgreSQL, so it wasn't originally broken.
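To make the divergence concrete, here is a rough sketch (not pg's actual source, and simplified: it quotes every array element) of how pg treats arrays and plain objects differently when serializing parameters:

```typescript
// Minimal sketch of pg's parameter encoding divergence. Arrays become a
// Postgres array literal like {"…","…"}; plain objects become JSON text.
function sketchPrepareValue(value: unknown): string {
  if (Array.isArray(value)) {
    // Array case: build a Postgres array literal, quoting each element.
    return (
      '{' +
      value.map((v) => JSON.stringify(sketchPrepareValue(v))).join(',') +
      '}'
    );
  }
  if (value !== null && typeof value === 'object') {
    // Object case: serialized as JSON text.
    return JSON.stringify(value);
  }
  return String(value);
}

// An object parameter arrives at the server as valid JSON text:
console.log(sketchPrepareValue({ a: 1 })); // {"a":1}
// An array parameter arrives as a Postgres array literal, which is
// invalid input for a json column — hence the error above:
console.log(sketchPrepareValue([{ a: 1 }])); // {"{\"a\":1}"}
```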

This issue is spread across both pg and PgTyped; however, PgTyped is in a unique position to solve it, as it has parameter type information. Although PgTyped isn't opinionated about which PostgreSQL library to use, with pg being so popular I think it would be great if PgTyped made the experience of using the two libraries together seamless.

There are a few options I can see:

  1. Update the emitted types to not include Json[] for parameters (presumably splitting Json into InputJson and OutputJson, or something like that). This isn't great because it leaks a quirk of pg and degrades the ergonomics.
  2. Apply JSON.stringify to json parameters during preprocessing. This requires type information at runtime, which is straightforward to add to the IR object for queries in SQL, but not for queries in TS. It would be a shame for this behavior to diverge between the SQL and TS flavours of PgTyped. The only way I can see it working in TS would be an IR object, but that would make the ergonomics of queries in TS even worse (having to import another thing, or not using a generic in the sql tag).
  3. Apply JSON.stringify to all array and object parameters, and incur the performance penalty of worse query plans.
  4. Combination of 1 and 2: for SQL-in-TS, remove the Json[] type so that it's not broken; for SQL-in-SQL, add the preprocessing support so it 'just works'.
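In the meantime, option 3 can also be applied by hand on the consumer side: stringify the JSON parameter yourself so pg sees a string and passes it through untouched. A minimal sketch, with a made-up payload and a local approximation of a Json type:

```typescript
// Consumer-side workaround: stringify JSON parameters before handing
// them to a query, so pg never sees a JS array it would mis-encode.
// The Json type here is a local approximation for illustration.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function asJsonParam(value: Json): string {
  // A string parameter reaches Postgres verbatim, so the server parses
  // it as json text instead of receiving a Postgres array literal.
  return JSON.stringify(value);
}

const payload: Json = [{ id: 1, body: 'Fantastic read, recommend it!' }];
const param = asJsonParam(payload);
console.log(param); // [{"id":1,"body":"Fantastic read, recommend it!"}]
```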

@adelsz, interested in your thoughts on this; I'm not sure what direction you want to take the library.

bradleyayers avatar May 22 '21 10:05 bradleyayers

https://github.com/adelsz/pgtyped/compare/master...bradleyayers:json-array-input is where I explored option #2. Here are my thoughts after doing that:

  • A much simpler fix to all this is to make the input type for JSON parameters just string (don't even allow number). This would force the consumer to figure out JSON.stringify in their code. There's already some weirdness: in @pgtyped/query, values passed along to pg are assumed to be Scalar, but that type is defined as:

    export type Scalar = string | number | null;
    

    Given the support of Json as an input already, this type would more accurately be:

    export type Scalar = Json | string | number | null;
    
  • I introduced a prepareValue function in the same spirit as pg's, but it takes a second parameter (the Postgres type). This is information pg doesn't have, and it means the PgTyped prepareValue can do the right encoding without guesswork. It does, however, expand the scope of what PgTyped is responsible for, and gets into the realm of custom parameter encoding. Going down this route, I suspect it would make more sense to be properly pluggable and use io-ts codecs as the API.

  • The IR shape changed to become:

    interface ParamPgTypes {
        ast: QueryAST;
        paramPgTypes: {
            [paramName: string]: 
                | string
                | { [subParamName: string]: string }
        }
    }
    

    I'm not sure what the policy is on changing the IR type, especially given it's cast as any, so there's no TypeScript compiler safety for consumers; they'd end up with a runtime error when prepareValue gets an IR value it doesn't understand.

    I'm surprised the IR is cast to any; I'm guessing it was a performance optimization to save on TS check time? It would be safer to leave it checked by TS, so that PgTyped upgrades that change the IR format are caught at compile time.

  • To make this work for SQL-in-TS the API would have to change to allow type information to be available to TaggedQuery at runtime. This would be a dramatic breaking change so I left it alone pending advice.

  • There were a few nitpicky general code clean-ups, but I'll raise those as separate PRs.
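The type-aware prepareValue idea can be sketched roughly like this. This is an illustration, not the code from the branch above; the PgType names are assumptions:

```typescript
// Sketch of a type-aware prepareValue: because PgTyped knows the
// declared Postgres type of each parameter, it can pick the correct
// encoding instead of guessing from the JS runtime type the way pg does.
type PgType = 'json' | 'jsonb' | 'text' | 'int4';

function prepareValue(value: unknown, pgType: PgType): string | null {
  if (value === null || value === undefined) return null;
  if (pgType === 'json' || pgType === 'jsonb') {
    // Arrays and objects alike become JSON text, so a Json[] parameter
    // no longer turns into a Postgres array literal.
    return JSON.stringify(value);
  }
  return String(value);
}

console.log(prepareValue([1, 2, 3], 'jsonb')); // [1,2,3]
console.log(prepareValue([1, 2, 3], 'text'));  // 1,2,3  (String() on an array)
```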

bradleyayers avatar May 23 '21 22:05 bradleyayers

I'm currently thinking the best option here is to simply change the input type in the mapper for json types to be string rather than Json, leaving Json for return types only. This would hopefully give consumers the clue that they need to call JSON.stringify(…) on the value before giving it to PgTyped.
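Under that proposal, consumer code would look roughly like this. The interface and field names are made up purely to illustrate the ergonomics:

```typescript
// Hypothetical generated parameter interface if json inputs were typed
// as plain string (JSON text): the compiler nudges callers to stringify.
interface IInsertBookParams {
  notes: string; // previously Json; now the caller must stringify
}

const bookParams: IInsertBookParams = {
  notes: JSON.stringify([{ page: 12, comment: 'great metaphor' }]),
};
console.log(bookParams.notes); // [{"page":12,"comment":"great metaphor"}]
```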

What do you think @adelsz?

bradleyayers avatar Jun 05 '21 22:06 bradleyayers

I have a similar issue:

CREATE TABLE example (f JSONB);

/* @name MyQuery */
INSERT INTO example VALUES(:f);
myQuery.run({f: 42}, db) // works
myQuery.run({f: "asdf"}, db) // broken with 22P02 (INVALID TEXT REPRESENTATION) PostgreSQL error

even though both strings and numbers are valid JSON.

I had to call JSON.stringify manually.
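The asymmetry makes sense once you see what text the server receives: pg passes numbers and strings through verbatim, and the server then parses that text as json. A quick check:

```typescript
// Why {f: 42} works but {f: "asdf"} fails: the server parses the raw
// parameter text as json, and bare asdf is not valid JSON text.
function isValidJsonText(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

console.log(isValidJsonText('42'));                   // true  — 42 arrives as the text "42"
console.log(isValidJsonText('asdf'));                 // false — bare asdf triggers 22P02
console.log(isValidJsonText(JSON.stringify('asdf'))); // true  — quoted "asdf" is valid json
```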

m-ronchi avatar Feb 17 '22 17:02 m-ronchi