persistence
persistence copied to clipboard
Allow an empty collection_valued_input_parameter in an "IN" expression
In section 4.6.9, In Expressions, it should be specified that an empty collection_valued_input_parameter must not result in an exception. More specifically:
the expression o.country IN (:countries) must be treated as false
and the expression o.country NOT IN (:countries) must be treated as true
This should be trivial for JPA implementations to implement, since they can simply replace the expression with an expression like "1 = 0" or "1 = 1".
Currently, JPA implementations throw an exception in this case, which makes it cumbersome to use collection valued input parameters.
- Issue Imported From: https://github.com/javaee/jpa-spec/issues/100
- Original Issue Raised By:@glassfishrobot
- Original Issue Assigned To: @ldemichiel
@glassfishrobot Commented Reported by anthony_ve
@glassfishrobot Commented jgigov said: Having no values between the parentheses causes an SQL syntax error in MySQL and PostgreSQL. I don't have other databases to test it with. A way to simulate no results is to replace the parameter within parentheses with a subquery like this: select 1 where false. This workaround worked for both with integer fields, but PostgreSQL will never accept it if it can't resolve the types and find a conversion function. The fact that there is nothing to convert is irrelevant to it.
@glassfishrobot Commented datanucleus1 said: DataNucleus JPA has always supported checking for empty input lists and generating the SQL as you show. If your JPA provider doesn't then raise a bug on them (though the JPA spec should be explicit about what should happen in that situation as per your description)
@glassfishrobot Commented This issue was imported from java.net JIRA JPA_SPEC-100
This looks like something that could be easily looked at and resolved. Perhaps it makes sense to triage this for consideration to be worked on during the next release? I would still mark this low priority. Perhaps it can simply be independently/fairly evaluated and closed? This seems like an issue about finer grained SQL generation, which JPA generally tries to stay out of...
Reza Rahman Jakarta EE Ambassador, Author, Blogger, Speaker
Please note views expressed here are my own as an individual community member and do not reflect the views of my employer.
This looks like something that could be easily looked at and resolved.
I disagree. JPA implementations usually cache the generated SQL and try to leave it alone at query execution time. This is a new feature that requires runtime processing of the cached SQL.
Therefore, this change would impose a surprisingly large burden on implementers. I'm not against it per se, but it doesn't seem like a priority.
Just for clarity, simply closing/triaging some of this (keeping fairness in mind of course) as too low priority, too vague, too far fetched, too forward looking, out-of-date, out-of-scope, etc I think is a perfectly fine resolution. Whatever the factors, that does not seem to have been done for some time. It hopefully leads to focusing on making some tangible progress towards getting Jakarta Persistence moving again and focusing on items that make more practical sense in the near term.
Reza Rahman Jakarta EE Ambassador, Author, Blogger, Speaker
Please note views expressed here are my own as an individual community member and do not reflect the views of my employer.
I've encountered this issue quite often in actual projects. Collections are passed-in, and people either forget that collections can be empty (resulting in unneeded exceptions), or there are rather ugly if/else statements.
I agree it's a burden on Jakarta Persistence implementations to implement this, but not doing so shifts that burden to user applications which have to handle the empty/not-empty case.
So I just tried these lines in Hibernate:
em.createQuery("select name from Author where id in :ids", String.class)
.setParameter("ids", List.of())
.getResultList()
em.createQuery("select name from Author where id not in :ids", String.class)
.setParameter("ids", List.of())
.getResultList()
and both resulted in the correct SQL.
So there's no objection on our side to requiring this. @lukasj does EclipseLink also support it?
To be clear, this would be done as part of #570, since "collection-valued input parameters" are currently underspecified.
Since this issue was closed by the PR, I should clarify that we ended up writing this:
A list of arguments may be assigned to a collection-valued parameter of a JPQL query by packaging the arguments in a non-null instance of
java.util.List
and passing the list as an argument to the second parameter ofsetParameter()
. The list should contain at least one element. If the list is empty the behavior is undefined. Portable applications should not pass an empty list to a collection-valued parameter.
Which is to say, @lukasj decided that since SQL itself does not allow an empty in
list, that JPQL should not require that it be allowed.
This is I guess not the resolution some people were hoping for, but it's a resolution.
[We have also done a lot to clarify the semantics in this area.]
since SQL itself does not allow an empty in list, that JPQL should not require that it be allowed.
Well, actually the idea was that it should be allowed. As JPQL is higher level, shouldn't it be possible to generate the lower level legal SQL?
since SQL itself does not allow an empty in list, that JPQL should not require that it be allowed.
Well, actually the idea was that it should be allowed. As JPQL is higher level, shouldn't it be possible to generate the lower level legal SQL?
It is not a problem of the generation of valid SQL, it is about which valid SQL to generate in this particular case - should an empty list become an SQL NULL? Should it be a list with a NULL item? Or should it be some magic constant based on the query being evaluated so the NULL returned by the DB for sth like '2' IN (NULL) evaluates to true/false based on the current context?
Yeah so this is something I just really didn't appreciate before, but Lukas is correct.
- if you translate
foo in ()
tofoo in (null)
, then you get the wrong behavior whenfoo
is non-null - if you translate
foo in ()
to1=2
, like our product does, then you get the wrong behavior whenfoo
is null - if you translate it to
foo in (-66666666)
then you get the wrong behavior whenfoo=-66666666
.
I can't think of a completely-correct solution here, can you?
I can't think of a completely-correct solution here, can you?
Actually I this works, I believe:
case when foo is null then null else 1=2 end
Also note that two NULL values are not equal by definition - see https://jakarta.ee/specifications/persistence/3.1/jakarta-persistence-spec-3.1#a5676
Maybe I'm simplifying it too much, but in practice I always write two queries now when there's 1 collection valued input. I then check if that that collection is not empty, and then I invoke query 1, otherwise query 2:
<named-query name="Website.getByUserIdAndCountries1">
<query>
SELECT
website
FROM
Website website
WHERE
website.userId = :userId AND
website.countryId in (:countryIds)
</query>
</named-query>
<named-query name="Website.getByUserIdAndCountries2">
<query>
SELECT
website
FROM
Website website
WHERE
website.userId = :userId
</query>
</named-query>
The naming here is a little silly, but it's just for the example. I'd wish Jakarta Persistence would automatically omit the AND website.countryId in (:countryIds)
when countryIds
is empty.
It gets complicated and tedious to do manually in bigger queries and when there is more than 1 collection valued input parameter.
@arjantijms That's completely the opposite of what the semantics should logically be!
Quite clearly 2=2 and 1 in ()
should evaluate to false. You have it evaluating to true.
(So if we did this, it would do the exact opposite of what you're asking for.)
Maybe I'm simplifying it too much, but in practice I always write two queries now when there's 1 collection valued input. I then check if that that collection is not empty, and then I invoke query 1, otherwise query 2:
<named-query name="Website.getByUserIdAndCountries1"> <query> SELECT website FROM Website website WHERE website.userId = :userId AND website.countryId in (:countryIds) </query> </named-query>
<named-query name="Website.getByUserIdAndCountries2"> <query> SELECT website FROM Website website WHERE website.userId = :userId </query> </named-query>
The naming here is a little silly, but it's just for the example. I'd wish Jakarta Persistence would automatically omit the
AND website.countryId in (:countryIds)
whencountryIds
is empty.It gets complicated and tedious to do manually in bigger queries and when there is more than 1 collection valued input parameter.
you should use CriteriaQuery
instead to construct dynamic query.
I can't think of a completely-correct solution here, can you?
Actually I this works, I believe:
case when foo is null then null else 1=2 end
What I found that works on all major databases is to translate x in ()
to (1 = case x is not null then 0 end)
.
Maybe I'm simplifying it too much, but in practice I always write two queries now when there's 1 collection valued input. I then check if that that collection is not empty, and then I invoke query 1, otherwise query 2:
<named-query name="Website.getByUserIdAndCountries1"> <query> SELECT website FROM Website website WHERE website.userId = :userId AND website.countryId in (:countryIds) </query> </named-query>
<named-query name="Website.getByUserIdAndCountries2"> <query> SELECT website FROM Website website WHERE website.userId = :userId </query> </named-query>
The naming here is a little silly, but it's just for the example. I'd wish Jakarta Persistence would automatically omit the
AND website.countryId in (:countryIds)
whencountryIds
is empty. It gets complicated and tedious to do manually in bigger queries and when there is more than 1 collection valued input parameter.you should use
CriteriaQuery
instead to construct dynamic query.
💯
If that predicate depends on a join that is only needed for that one predicate, the JPA provider would still have to do the join, but you can be smarter about this when you construct the query dynamically and avoid the join altogether if the input list is empty.
you should use
CriteriaQuery
instead to construct dynamic query.
Well, yes and no.
Especially for somewhat more complicated queries, JPQL is much easier to read and maintain.
Exactly for this reason I've proposed more than a decade ago to have the ability to transform JPQL into Criteria. That way I can write a base query in JPQL, and then use Criteria to add or remove certain parts to it.
Exactly for this reason I've proposed more than a decade ago to have the ability to transform JPQL into Criteria. That way I can write a base query in JPQL, and then use Criteria to add or remove certain parts to it.
FTR our implementation now supports this. But I'm not sure about other implementations.
That's completely the opposite of what the semantics should logically be!
I hear you, and I had been thinking about that a lot. It's not completely straightforward what is logical.
If you see the list as constraints and there simply are no constraints, evaluating to true might be logical. This often translates to how a choice from a UI works. If the user doesn't select any explicit country from a list, a table or whatever searches in all countries. If the user specifies one or more countries, the search is only for those specified countries.
However, if the list is seen as requirements, then evaluating to false is more logical. If my memory serves me correctly, around a decade ago exactly this was discussed on the JPA JIRA or list.
Maybe something like this could work?
<named-query name="Website.getByUserIdAndCountries" empty-collection-semantics="[true|false]">
There's maybe alternative ways to specific this, but the idea would be for the user to specify whether an empty collection should evaluate to effectively true or false.
FTR our implementation now supports this. But I'm not sure about other implementations.
That's extremely cool. Would be awesome to have this in the Jakarta spec. I'm sure there were various issues for that.
@arjantijms This isn't just some arbitrary thing we get to pick and choose the semantics of based on convenience. The semantics are determined by logical consistency. That is, by math.
Would be awesome to have this in the Jakarta spec.
Sure, but it can only go in the spec if it's implementable by others.
Maybe I'm simplifying it too much, but in practice I always write two queries now when there's 1 collection valued input. I then check if that that collection is not empty, and then I invoke query 1, otherwise query 2:
<named-query name="Website.getByUserIdAndCountries1"> <query> SELECT website FROM Website website WHERE website.userId = :userId AND website.countryId in (:countryIds) </query> </named-query>
<named-query name="Website.getByUserIdAndCountries2"> <query> SELECT website FROM Website website WHERE website.userId = :userId </query> </named-query>
The naming here is a little silly, but it's just for the example. I'd wish Jakarta Persistence would automatically omit the
AND website.countryId in (:countryIds)
whencountryIds
is empty.It gets complicated and tedious to do manually in bigger queries and when there is more than 1 collection valued input parameter.
Why exactly are two queries required? Isn't sth like
<named-query name="Website.getByUserIdAndCountries">
<query>
SELECT
website
FROM
Website website
WHERE
website.userId = :userId AND
case when (:countryIds is not null) then website.countryId in (:countryIds)
else true
</query>
</named-query>
equivalent? Also be aware that every IN
can be rewritten to JOIN
(but not vice-versa).
However, if the list is seen as requirements, then evaluating to false is more logical.
Once nullable columns and/or NULL
s come to play, what seems logical is not what boolean algebra defines.