slick-pg icon indicating copy to clipboard operation
slick-pg copied to clipboard

Needed: `unnest WITH ORDINALITY`

Open virusdave opened this issue 7 years ago • 8 comments

Let's say I have 2 sql array literals (in scala as lists): val a = List("a", "b", "c") and val b = List(1, 2, 3)

I want to do a zipJoin on the unnest of these two, to produce:

a, 1
b, 2
c, 3

Trying something naive like: val c = Query(a.unnest).zipJoin(Query(b.unnest)) doesn't do the right thing, due to how zipJoin and unnest are implemented! Instead, it gives the cross product of these two sets.

Specifically, the construct SELECT UNNEST(...) instead of SELECT * FROM (UNNEST(...)), when combined with ROW_NUMBER gives all rows the value 1.

Investigating a bit, it seems like the "correct" method to do this in postgres is by using the syntax SELECT * FROM (UNNEST(...) WITH ORDINALITY), which will then be guaranteed to give me back the items from the input array with correctly applied indices, which could then be joined on.

How is this possible using slick-pg, or do I need to use a sql"" style query?

virusdave avatar Mar 07 '17 20:03 virusdave

@virusdave slick-pg can't resolve this. Maybe you should try sql"".

tminglei avatar Mar 09 '17 08:03 tminglei

the problem there is that sql"...".as[] returns a DBIO, which isn't really helpful, since one could always have just unnested the arrays locally. What is really needed is a way to get them unnested in the middle of a query.

For instance, consider a db table that has an index on 2 columns, A,B. What we're trying to achieve is a way to specify a discrete set of (A,B) values to a query. However, for performance reasons, the slick query needs to be precompiled, so using things like inSet are a nonstarter.

If we could accomplish this unnesting, then the entire query can be precompiled just fine. The final query would look something like:

myTable.join {
  listOfAs.unnestWithOrdinality
    .join(listOfBs.unnestWithOrdinality)
    .on(_._2 === _._2)
    .map { case (as, bs) => (as._1, bs._1)
}.on { case (tab, params) =>
  tab.a === params._1 && tab.b === params._2
}.map { case (tab, _) => tab }

// do other stuff with the now-filtered table

Such a query would be precompilable and still be parameterizable with parameter sets of cardinality unknown at compile time, which is a HUGE problem in slick now.

If this unnesting functionality isn't possible in slick-pg currently, can you add it? It would solve a critical use case with high performance.

virusdave avatar Mar 09 '17 16:03 virusdave

I'll try. But with ordinality is a general feature, I'd better to make it general.

tminglei avatar Mar 11 '17 05:03 tminglei

Well, the generated SELECT UNNEST(...) instead of SELECT * FROM (UNNEST(...)) is a problem. It must be resolved before add WITH ORDINALITY support.

tminglei avatar Mar 19 '17 09:03 tminglei

Yup, for sure! I think i could possibly add the with ordinality support myself at this point, but getting the query restructured like that was a head-scratcher for me, which is why i made this request here 😀

Does it look like it'll be doable?

virusdave avatar Mar 29 '17 05:03 virusdave

No, I can't let it generate SELECT * FROM (UNNEST(...)), then I can't implement with ordinality.

tminglei avatar Mar 29 '17 13:03 tminglei

Hello, @virusdave! Did you manage to come up with a workaround for this?

SergeyZharikhin avatar Dec 15 '20 10:12 SergeyZharikhin

Have you try using Query(a.unnest).subquery.zipJoin(Query(b.unnest)).subquery. I had a somewhat similar issue and resolved it like that: https://github.com/tminglei/slick-pg/issues/501

strokyl avatar Mar 04 '21 14:03 strokyl