ppx_pgsql icon indicating copy to clipboard operation
ppx_pgsql copied to clipboard

Empty list as param generates incorrect sql query at runtime.

Open NightBlues opened this issue 5 years ago • 7 comments

If we pass empty list, this code https://github.com/tizoc/ppx_pgsql/blob/master/lib/runtime/ppx_pgsql_runtime.ml#L49 generates () that postgres can't understand. For example:

SELECT COUNT(*) FROM users
WHERE id IN (1,2,3)

is correct, but

SELECT COUNT(*) FROM users
WHERE id IN ()

gives syntax error.

NightBlues avatar Apr 09 '19 15:04 NightBlues

I can make a patch, but I dont see easy way to fix it - if we substitute an empty array here we have to specify types explicitly because we generate prepared statement instead of real query and Postgres can't infer it, but we don't know params types at this point.

NightBlues avatar Apr 09 '19 16:04 NightBlues

I can't think of any good solution. Other than what you mention, the alternative is to replace the whole expression <VALUE-EXPRESSION> IN () for false and <VALUE-EXPRESSION> NOT IN () for true, but doing that requires parsing SQL, which makes everything quite a bit more complicated than it is already. And thats not even a good translation because it doesn't behave the same for NULL values as the original expression.

tizoc avatar Apr 09 '19 16:04 tizoc

Thinking a bit more about it, the IN () case can be replaced by IN (NULL) while retaining the expected behaviour, right? It is NOT IN () that gets weird.

@NightBlues thoughts?

tizoc avatar Apr 09 '19 16:04 tizoc

According to postgres documentation NULL is like a blackhole - everything touched by it becomes NULL too:) 1 IN (NULL) gives NULL, while 1 IN (2,3) gives false, NULL IN (NULL) also gives NULL because NULL is not equal to itself. Seems to be appropriate solution, despite looking like a hack.

NightBlues avatar Apr 10 '19 09:04 NightBlues

@NightBlues I have an untested alternative that may solve every case in a clean way, while at the same time avoiding the need of having to create a separate prepared statement for each list length (something that is needed with the current implementation). I'm not sure how well it will work in practice because PGOCaml doesn't define arrays for every type IIRC.

The basic idea is:

SELECT COUNT(*) FROM users
WHERE id IN $@ids::int[]

gets translated into

SELECT COUNT(*) FROM users
WHERE id IN (SELECT unnest($1::int[]))

This requires the user to be explicit and cast the value to the required type. Anyway, I have to think about this more, just wanted to share the idea I had.

tizoc avatar Apr 15 '19 15:04 tizoc

@NightBlues by now you probably have this solved already, but I found another solution that seems to work fine:

-- Equivalent to IN (...)
SELECT COUNT(*) FROM users
WHERE id = ANY($user_ids_array::int[])

-- Equivalent to NOT IN (...)
SELECT COUNT(*) FROM users
WHERE id <> ALL($user_ids_array::int[])

This will work with empty arrays too, and has the advantage of requiring a single prepared statement for every list length, while the IN/NOT IN version will create a new prepared statement for each list length.

I will update the documentation with a comment mentioning this (along with other gotchas like nullable columns in views, and you COALESCE trick for outer joins)

tizoc avatar Feb 04 '20 17:02 tizoc

Modern postgres has very powerfull features, for example sometimes I use conversions to json and back with json operators. :)

NightBlues avatar Feb 05 '20 16:02 NightBlues