WIP: Remove js type inference
In v3 we rely on the PostgreSQL ParameterDescription message to decide on input types, but there is still some cases were we rely on types inferred from the js types first (https://github.com/porsager/postgres/blob/218a7d4f37dcbf76d01081413a386c00544ab63e/src/types.js#L213-L223). This causes problems when trying to use some of these JS types as raw values for eg. json PostgreSQL types (and others). Because of the order of the protocol messages we are not able to give "hints" about types so it could be better to remove the "convenience" of implicitly inferring types from the js objects and instead require the users to write explicit casts in sql - eg ::type or cast(...).
I'm still not certain this is the best way forward, and if someone sees an alternative solution that encompasses both scenarios - i am all ears.
For now, this PR can act as a test ground and discussion for the changed behaviour.
Fixes: #386, #471,
require the users to write explicit casts in sql - eg ::type or cast(...).
this is a very correct requirement, in the TS this requirement does not surprise anyone.
It is always easier when there is only one source of truth, postgres data types are needed by postgres, so at the postgres level (SQL) you need to describe types
Here are some more examples of comparing the two ways of describing types and inferring types:
bytea from JS hex string
// describe type
sql`SELECT ${'\\x27'}::bytea`;
// inference type
sql`SELECT ${Buffer.from('\\x27', 'hex')}`;
date from JS string
// describe type
sql`SELECT ${'2020-10-11'}::date`;
// inference type
sql`SELECT ${new Date('2020-10-11')}`;
bigint from JS number
// describe type
sql`SELECT ${123}::bigint`;
// inference type
sql`SELECT ${BigInt(123)}`;
As you can see, if your values have core values for postgres types, but you have to explicitly convert these values into correspondence types of JS, it's easier to just specify these types in the SQL itself
@uasan well that's what I've been trying to get at the whole time. We can only choose one over the other, but I feel like you keep suggesting that we can have both? Can you please confirm that you also don't see any way to have both ? ☺️
For instance if we switch to no inference we can't do this:
sql`
select * from x
where ${ someBoolean } or user_id = ${ user_id }
`
But will have to do this:
sql`
select * from x
where ${ someBoolean }::boolean or user_id = ${ user_id }
`
(I've seen queries like this in the wild, which is also why - if we make the change - it's a breaking change)
We can only choose one over the other, but I feel like you keep suggesting that we can have both? Can you please confirm that you also don't see any way to have both ?
My suggestion is we should first of all determine types from the describe SQL query, but if type is unknown type (as in your first example), for backwards compatibility, postgres.js can inference type on based on the JS data type, but this second source of truth is needed only for backwards compatibility, if unknown types from describe SQL, i.e. in rare cases.
But you'll never get the db to see that as anything but a unknown type. The issue is the order required of the extended query protocol. You can only tell the db about types "before" you receive the ParameterDescription message. As i mentioned in the other thread, the only way to work around that would be to create a new prepared statement with the inferred types in place of the unknowns, and I don't wanna do that.
Still, I'm leaning towards merging this change, but I want to be sure all bases are covered.
Probably because of my bad english, you did not fully understand my idea. If the data type by describe is unknown, you need to pass the data as an unknown type, but serelize to text data according to the JS data type, i.e.
true -> 't'
date -> date.toISOString()
...
You do not need to send a request for parameter descriptions again, you need to send a BIND (unknown type), but with correct text serialization, which is performed based on the JS of the input parameter value types
What about a flag in options? I mean introducing this feature as an opt-in (e.g. a optional inferTypesFrom?: 'postgres' | 'js' property in options, which default to 'js' in v3), then remove the flag in the next major (or set it to 'postgres' by default)?
@uasan - english isn't my first language either - we'll figure it out ;)
I thought you meant something else since I think what you describe is what this PR does :P So sorry if I'm being obtuse, but just want to clarify that when we send correctly serialized value, but unknown type, postgres won't know about the type anyway. Specifically the following true value (someBoolean) will be seen simply as a string.
sql`
select * from x
where ${ someBoolean } or user_id = ${ user_id }
`
@Minigugus yes - pragmatism for the win!! But yeah, now we have a new problem - the name for that option 😝 I'm not sure I think inferTypesFrom and an enum is necessary. How about strict_types: true | false? With true being the future default.
Specifically the following true values will be seen simply as a string
I didn't check your example, but I think postgres will determine the type of someBoolean variable as boolean and give this oid in the response from describe
Thanks a lot for sticking with me @uasan - You are right! In that example it'll know. So would you think it's only for the pass-through selects we'll loose the type? eg: select ${ true }
Yes, unknown type, this is quite rare, most of these query have a design-level error, it's especially fun when using such untyped parameters to call SQL functions with overloading on the types of arguments of this function )
About null value json.
UPDATE table
SET json_column = ${null}::json
now all null values are sent as SQL null, we have only one option how to send 'null' values as JSON values using helper sql.json(null)?
@uasan that's a good point too - could also be quite confusing since sql null and json null would translate to the same js null when read out again.. Got any ideas for solving it cleanly?
Got any ideas for solving it cleanly?
this is a value collision, SQL null is more like JS undefined, null is JSON values, but this breaks backwards compatibility a lot, I think developers should resolve their own value collisions