persistence Allow an empty collection_valued_input

In section 4.6.9, In Expressions, it should be specified that an empty collection_valued_input_parameter must not result in an exception. More specifically:

the expression o.country IN (:countries) must be treated as false

and the expression o.country NOT IN (:countries) must be treated as true

This should be trivial for JPA implementations to implement, since they can simply replace the expression with an expression like "1 = 0" or "1 = 1".

Currently, JPA implementations throw an exception in this case, which makes it cumbersome to use collection valued input parameters.

Feb 21 '15 15:02 lukasj

Issue Imported From: https://github.com/javaee/jpa-spec/issues/100
Original Issue Raised By:@glassfishrobot
Original Issue Assigned To: @ldemichiel

Aug 31 '18 16:08 lukasj

@glassfishrobot Commented Reported by anthony_ve

Feb 21 '15 15:02 lukasj

@glassfishrobot Commented jgigov said: Having no values between the parentheses causes an SQL syntax error in MySQL and PostgreSQL. I don't have other databases to test it with. A way to simulate no results is to replace the parameter within parentheses with a subquery like this: select 1 where false. This workaround worked for both with integer fields, but PostgreSQL will never accept it if it can't resolve the types and find a conversion function. The fact that there is nothing to convert is irrelevant to it.

Dec 02 '16 18:12 lukasj

@glassfishrobot Commented datanucleus1 said: DataNucleus JPA has always supported checking for empty input lists and generating the SQL as you show. If your JPA provider doesn't then raise a bug on them (though the JPA spec should be explicit about what should happen in that situation as per your description)

Dec 05 '16 15:12 lukasj

@glassfishrobot Commented This issue was imported from java.net JIRA JPA_SPEC-100

May 05 '17 06:05 lukasj

This looks like something that could be easily looked at and resolved. Perhaps it makes sense to triage this for consideration to be worked on during the next release? I would still mark this low priority. Perhaps it can simply be independently/fairly evaluated and closed? This seems like an issue about finer grained SQL generation, which JPA generally tries to stay out of...

Reza Rahman Jakarta EE Ambassador, Author, Blogger, Speaker

Please note views expressed here are my own as an individual community member and do not reflect the views of my employer.

Apr 30 '21 13:04 m-reza-rahman

This looks like something that could be easily looked at and resolved.

I disagree. JPA implementations usually cache the generated SQL and try to leave it alone at query execution time. This is a new feature that requires runtime processing of the cached SQL.

Therefore, this change would impose a surprisingly large burden on implementers. I'm not against it per se, but it doesn't seem like a priority.

Apr 30 '21 14:04 gavinking

Just for clarity, simply closing/triaging some of this (keeping fairness in mind of course) as too low priority, too vague, too far fetched, too forward looking, out-of-date, out-of-scope, etc I think is a perfectly fine resolution. Whatever the factors, that does not seem to have been done for some time. It hopefully leads to focusing on making some tangible progress towards getting Jakarta Persistence moving again and focusing on items that make more practical sense in the near term.

Reza Rahman Jakarta EE Ambassador, Author, Blogger, Speaker

Please note views expressed here are my own as an individual community member and do not reflect the views of my employer.

Apr 30 '21 14:04 m-reza-rahman

I've encountered this issue quite often in actual projects. Collections are passed-in, and people either forget that collections can be empty (resulting in unneeded exceptions), or there are rather ugly if/else statements.

I agree it's a burden on Jakarta Persistence implementations to implement this, but not doing so shifts that burden to user applications which have to handle the empty/not-empty case.

Oct 04 '22 10:10 arjantijms

So I just tried these lines in Hibernate:

em.createQuery("select name from Author where id in :ids", String.class)
        .setParameter("ids", List.of())
        .getResultList()
em.createQuery("select name from Author where id not in :ids", String.class)
        .setParameter("ids", List.of())
        .getResultList()

and both resulted in the correct SQL.

So there's no objection on our side to requiring this. @lukasj does EclipseLink also support it?

Jan 26 '24 12:01 gavinking

To be clear, this would be done as part of #570, since "collection-valued input parameters" are currently underspecified.

Jan 26 '24 12:01 gavinking

Since this issue was closed by the PR, I should clarify that we ended up writing this:

A list of arguments may be assigned to a collection-valued parameter of a JPQL query by packaging the arguments in a non-null instance of java.util.List and passing the list as an argument to the second parameter of setParameter(). The list should contain at least one element. If the list is empty the behavior is undefined. Portable applications should not pass an empty list to a collection-valued parameter.

Which is to say, @lukasj decided that since SQL itself does not allow an empty in list, that JPQL should not require that it be allowed.

This is I guess not the resolution some people were hoping for, but it's a resolution.

[We have also done a lot to clarify the semantics in this area.]

Feb 26 '24 20:02 gavinking

since SQL itself does not allow an empty in list, that JPQL should not require that it be allowed.

Well, actually the idea was that it should be allowed. As JPQL is higher level, shouldn't it be possible to generate the lower level legal SQL?

Mar 04 '24 21:03 arjantijms

since SQL itself does not allow an empty in list, that JPQL should not require that it be allowed.

Well, actually the idea was that it should be allowed. As JPQL is higher level, shouldn't it be possible to generate the lower level legal SQL?

It is not a problem of the generation of valid SQL, it is about which valid SQL to generate in this particular case - should an empty list become an SQL NULL? Should it be a list with a NULL item? Or should it be some magic constant based on the query being evaluated so the NULL returned by the DB for sth like '2' IN (NULL) evaluates to true/false based on the current context?

Mar 04 '24 22:03 lukasj

Yeah so this is something I just really didn't appreciate before, but Lukas is correct.

if you translate foo in () to foo in (null), then you get the wrong behavior when foo is non-null
if you translate foo in () to 1=2, like our product does, then you get the wrong behavior when foo is null
if you translate it to foo in (-66666666) then you get the wrong behavior when foo=-66666666.

I can't think of a completely-correct solution here, can you?

Mar 04 '24 22:03 gavinking

I can't think of a completely-correct solution here, can you?

Actually I this works, I believe:

case when foo is null then null else 1=2 end

Mar 04 '24 22:03 gavinking

Also note that two NULL values are not equal by definition - see https://jakarta.ee/specifications/persistence/3.1/jakarta-persistence-spec-3.1#a5676

Mar 04 '24 23:03 lukasj

Maybe I'm simplifying it too much, but in practice I always write two queries now when there's 1 collection valued input. I then check if that that collection is not empty, and then I invoke query 1, otherwise query 2:

<named-query name="Website.getByUserIdAndCountries1">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId AND
    website.countryId in (:countryIds)
  </query>
 </named-query>

<named-query name="Website.getByUserIdAndCountries2">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId
  </query>
 </named-query>

The naming here is a little silly, but it's just for the example. I'd wish Jakarta Persistence would automatically omit the AND website.countryId in (:countryIds) when countryIds is empty.

It gets complicated and tedious to do manually in bigger queries and when there is more than 1 collection valued input parameter.

Mar 04 '24 23:03 arjantijms

@arjantijms That's completely the opposite of what the semantics should logically be!

Quite clearly 2=2 and 1 in () should evaluate to false. You have it evaluating to true.

Mar 04 '24 23:03 gavinking

(So if we did this, it would do the exact opposite of what you're asking for.)

Mar 04 '24 23:03 gavinking

Maybe I'm simplifying it too much, but in practice I always write two queries now when there's 1 collection valued input. I then check if that that collection is not empty, and then I invoke query 1, otherwise query 2:
<named-query name="Website.getByUserIdAndCountries1">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId AND
    website.countryId in (:countryIds)
  </query>
 </named-query>
<named-query name="Website.getByUserIdAndCountries2">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId
  </query>
 </named-query>
The naming here is a little silly, but it's just for the example. I'd wish Jakarta Persistence would automatically omit the AND website.countryId in (:countryIds) when countryIds is empty.

It gets complicated and tedious to do manually in bigger queries and when there is more than 1 collection valued input parameter.

you should use CriteriaQuery instead to construct dynamic query.

Mar 05 '24 02:03 quaff

I can't think of a completely-correct solution here, can you?

Actually I this works, I believe:
case when foo is null then null else 1=2 end

What I found that works on all major databases is to translate x in () to (1 = case x is not null then 0 end).

Mar 05 '24 07:03 gavinking

Maybe I'm simplifying it too much, but in practice I always write two queries now when there's 1 collection valued input. I then check if that that collection is not empty, and then I invoke query 1, otherwise query 2:
<named-query name="Website.getByUserIdAndCountries1">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId AND
    website.countryId in (:countryIds)
  </query>
 </named-query>
<named-query name="Website.getByUserIdAndCountries2">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId
  </query>
 </named-query>
The naming here is a little silly, but it's just for the example. I'd wish Jakarta Persistence would automatically omit the AND website.countryId in (:countryIds) when countryIds is empty. It gets complicated and tedious to do manually in bigger queries and when there is more than 1 collection valued input parameter.
you should use CriteriaQuery instead to construct dynamic query.

💯

If that predicate depends on a join that is only needed for that one predicate, the JPA provider would still have to do the join, but you can be smarter about this when you construct the query dynamically and avoid the join altogether if the input list is empty.

Mar 05 '24 08:03 beikov

you should use CriteriaQuery instead to construct dynamic query.

Well, yes and no.

Especially for somewhat more complicated queries, JPQL is much easier to read and maintain.

Exactly for this reason I've proposed more than a decade ago to have the ability to transform JPQL into Criteria. That way I can write a base query in JPQL, and then use Criteria to add or remove certain parts to it.

Mar 05 '24 12:03 arjantijms

Exactly for this reason I've proposed more than a decade ago to have the ability to transform JPQL into Criteria. That way I can write a base query in JPQL, and then use Criteria to add or remove certain parts to it.

FTR our implementation now supports this. But I'm not sure about other implementations.

Mar 05 '24 12:03 gavinking

That's completely the opposite of what the semantics should logically be!

I hear you, and I had been thinking about that a lot. It's not completely straightforward what is logical.

If you see the list as constraints and there simply are no constraints, evaluating to true might be logical. This often translates to how a choice from a UI works. If the user doesn't select any explicit country from a list, a table or whatever searches in all countries. If the user specifies one or more countries, the search is only for those specified countries.

However, if the list is seen as requirements, then evaluating to false is more logical. If my memory serves me correctly, around a decade ago exactly this was discussed on the JPA JIRA or list.

Maybe something like this could work?

<named-query name="Website.getByUserIdAndCountries" empty-collection-semantics="[true|false]">

There's maybe alternative ways to specific this, but the idea would be for the user to specify whether an empty collection should evaluate to effectively true or false.

Mar 05 '24 12:03 arjantijms

FTR our implementation now supports this. But I'm not sure about other implementations.

That's extremely cool. Would be awesome to have this in the Jakarta spec. I'm sure there were various issues for that.

Mar 05 '24 12:03 arjantijms

@arjantijms This isn't just some arbitrary thing we get to pick and choose the semantics of based on convenience. The semantics are determined by logical consistency. That is, by math.

Mar 05 '24 12:03 gavinking

Would be awesome to have this in the Jakarta spec.

Sure, but it can only go in the spec if it's implementable by others.

Mar 05 '24 12:03 gavinking

Maybe I'm simplifying it too much, but in practice I always write two queries now when there's 1 collection valued input. I then check if that that collection is not empty, and then I invoke query 1, otherwise query 2:
<named-query name="Website.getByUserIdAndCountries1">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId AND
    website.countryId in (:countryIds)
  </query>
 </named-query>
<named-query name="Website.getByUserIdAndCountries2">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId
  </query>
 </named-query>
The naming here is a little silly, but it's just for the example. I'd wish Jakarta Persistence would automatically omit the AND website.countryId in (:countryIds) when countryIds is empty.

It gets complicated and tedious to do manually in bigger queries and when there is more than 1 collection valued input parameter.

Why exactly are two queries required? Isn't sth like

<named-query name="Website.getByUserIdAndCountries">
  <query>
   SELECT 
    website
   FROM 
    Website website
   WHERE
    website.userId = :userId AND
    case when (:countryIds is not null) then website.countryId in (:countryIds) 
             else true
  </query>
 </named-query>

equivalent? Also be aware that every IN can be rewritten to JOIN (but not vice-versa).

However, if the list is seen as requirements, then evaluating to false is more logical.

Once nullable columns and/or NULLs come to play, what seems logical is not what boolean algebra defines.

Mar 05 '24 13:03 lukasj

persistence
persistence copied to clipboard

Allow an empty collection_valued_input_parameter in an "IN" expression

persistence persistence copied to clipboard

Allow an empty collection_valued_input_parameter in an "IN" expression

persistence
persistence copied to clipboard