spring-data-jpa
                                
                                
                                
                                    spring-data-jpa copied to clipboard
                            
                            
                            
                        Sorting doesn't work when using an alias on two or more functions
As per the comments on this closed issue, I'm opening a new issue, because sorting still doesn't work for aliases in all cases.
The regex patterns in org.springframework.data.jpa.repository.query.QueryUtils are not broad enough.
The FUNCTION_PATTERN in QueryUtils looks like this:
builder = new StringBuilder(); // any function call including parameters within the brackets builder.append("\\w+\\s*\\([\\w\\.,\\s'=]+\\)"); // the potential alias builder.append("\\s+[as|AS]+\\s+(([\\w\\.]+))"); FUNCTION_PATTERN = compile(builder.toString());which only considers a single pair of parenthesis, but not two or more. So, for example, the earlierstBundleStart is not detected in this query:
SELECT DISTINCT(event.id) as id, event.name as name, event.top_event as topEvent, event.ranking as ranking, MIN(bundle.base_price_amount) as cheapestBundlePrice, MIN(DATE(bundle.start)) as earliestBundleStart, ..
There was also a question on SO about this here, where many people state that they are facing the same issue.
There are and have been several issues (at least https://github.com/spring-projects/spring-data-jpa/issues/1404, https://github.com/spring-projects/spring-data-jpa/issues/1724, https://github.com/spring-projects/spring-data-jpa/issues/1919) with the regex patterns in QueryUtils. I'm wondering, isn't there any more reliable way to parse the queries than with a regex (possibly an AST that is used also to execute the query)?
Edit: here's another closely related issue: https://github.com/spring-projects/spring-data-jpa/issues/2079
I drafted this test case to check things out...
@Test
void foo() {
    String query = "SELECT MIN(bundle.base_price_amount) as cheapestBundlePrice, MIN(DATE(bundle.start)) as earliestBundleStart from example_table";
    Sort sort = Sort.by("earliestBundleStart");
    String fullQuery = applySorting(query, sort);
    assertThat(fullQuery).isEqualTo(
            "SELECT MIN(bundle.base_price_amount) as cheapestBundlePrice, MIN(DATE(bundle.start)) as earliestBundleStart from example_table order by earliestBundleStart asc");
}
Funny thing is, it passed with no issue. But I put a breakpoint inside QueryUtils to see when it would pattern match on FUNCTION_PATTERN. As is, it only catches the alias for the single-level function call (cheapestBundlePrice). Not quite sure what a proper test case would be
I tweaked FUNCTION_PATTERN to builder.append("\\w+\\s*\\(+[\\w\\.,\\s'=]+\\)+"); (notice the extra + next to open and closed parens), and the same breakpoint showed it now catching the earliestBundleStart alias.
But of course we know that this pattern is insufficient based on your hint of an AST instead of regex solution. It would indeed capture such patterns, but it won't ensure a proper grammar. Regex isn't qualified to handle stacking of symbols.
And that's where I'm not yet sold on needing an AST with a lexical analyzer and a proper grammar. I have submitted #2399 to properly handle ignoring case on aliased fields. And it doesn't involve digging into the regex patterns.
I'd frankly prefer solving as many of these aliasing issues as possible before undertaking a complete rewrite based on grammar parsing. Not sure how much bang we'd get for spending such a large buck. Let alone the new level of maintenance for anyone having to maintain it in the future.
Thank you for looking into this. I tried your test case and indeed it passes. It seems the problem only appears on a slightly more complicated case:
@Test
void foo() {
    String query = "SELECT DISTINCT(event.id) as id, event.name as name, MIN(bundle.base_price_amount) as cheapestBundlePrice, MIN(DATE(bundle.start)) as earliestBundleStart, FROM event event LEFT JOIN bundle bundle ON event.id = bundle.event_id GROUP BY event.id";
    Sort sort = Sort.by(Sort.Direction.ASC, "cheapestBundlePrice")
            .and(Sort.by(Sort.Direction.ASC, "earliestBundleStart"))
            .and(Sort.by(Sort.Direction.ASC, "name"));
    String fullQuery = QueryUtils.applySorting(query, sort);
    assertThat(fullQuery).isEqualTo(
            query + " order by cheapestBundlePrice asc, earliestBundleStart asc, name asc");
}
In this case, applySorting prepends event. to earliestBundleStart, ending up with " order by cheapestBundlePrice asc, event.earliestBundleStart asc, name asc") although earliestBundleStart is an alias and not a column of the event table.
In applySorting, where the order by is appended,
https://github.com/spring-projects/spring-data-jpa/blob/8c4123f6b2ac40dd2a97a7c267668e888c73e022/src/main/java/org/springframework/data/jpa/repository/query/QueryUtils.java#L265-L271
earliestBundleStart is not in selectionAliases. That in turn seems to mess with the logic in getOrderClause, which decides to prepend the root alias (in this case event) or not (based on some logic that I don't fully understand yet).
You want to revisit this in light of https://github.com/spring-projects/spring-data-jpa/commit/1336809b302eb1c4b4e8963e5465e53694f7497d.
Thanks for the pointer. I tried with the latest QueryUtils from that commit, unfortunately, it did not change anything about the test case posted above.
Hi everyone, Is there any new info on this issue? This is actually blocking us from upgrading to spring boot 2.7 since this breaks a lot of queries in our codebase because spring wants to sort by the wrong column. An example from our codebase (obfuscated):
SELECT 
        t1.*,
	t2join.t2_count AS t2Count,
	t3join.t3_count AS t3Count
	FROM table1 AS t1
		LEFT JOIN (
			SELECT t2.column2, count(*) AS t2_count
			FROM table2 t2
			GROUP BY t2.column2
		) AS t2join ON t1.id = t2join.column2
		LEFT JOIN (
			SELECT t3.column3, count(*) AS t3_count
			FROM table3 t3
			GROUP BY t3.column3
		) AS t3join ON t1.id = t3join.column3
	WHERE t1.column1 = :value
We call this with a Pageable which contains a Sort.by(Direction.DESC, "id"). Previously this sorted by t1.id and everything worked fine. Now it tries to sort by t3.id what failed for obvious reasons.
It would be great if this could be fixed for the next release.
No news that I know of. If you're constructing the Pageable manually, you can try this as a workaround: JpaSort.unsafe(Sort.Direction.DESC, "(id)"). This has been working for us, although, I don't know if it works with a wildcard in select, you might need t1.id AS id for that to work.
@darioseidl thanks for pointing out JpaSort.unsafe... Didn't know that yet.
But I am always hesitating to use something with the name "unsafe". :D
Ah, yes, that name doesn't sound too comforting, does it? What it means is that
the String provided is not necessarily a property but can be an arbitrary expression piped into the query execution.
(as it says here)
As long as you hard-code the string and make sure that it is just a property or alias, it is still safe. It would be unsafe to accept user input and pass it to JpaSort.unsafe, because then a user could do some SQL injection. (That's what I meant by, "if you're constructing the Pageable manually", but I could have been more clear about that).
So JpaSort.unsafe(Sort.Direction.DESC, "(someColumnOrAlias)") is safe, but not something like JpaSort.unsafe(Sort.Direction.DESC, userSortRequestParam).
it looks like there is a missing test case in commit mentioned by @gregturn , in my case i have such SQL query:
 select u  from UserAccountEntity u  join fetch u.lossInspectorLimitConfiguration lil join fetch u.companyTeam ct where exists (select iu from UserAccountEntity  iu  join iu.roles u2r  join u2r.role r  join r.rights r2r  join r2r.right rt  where rt.code = :rightCode  and iu = u   ) and ct.id = :teamId
and if I add sort by id:
Sort.by(ASC, UserAccountEntity_.LAST_NAME, UserAccountEntity_.FIRST_NAME, UserAccountEntity_.ID)
it tries to sort by iu instead of u.
The commit mentioned by @gregturn does not cover a select in where nor a select in join.
The issue happens after updating to spring boot 2.7.0
Ok I see, the QueryUtils.STARTS_WITH_PAREN does not handle newlines in subselects. I'll try to make a MR for that.
private static final Pattern PARENS_TO_REMOVE = Pattern.compile("(\\(.*\\bfrom\\b[^)]+\\))", CASE_INSENSITIVE | DOTALL); 
the DOTALL needs to be added there
We are working on a parsing solution to better handle such situations than QueryUtils accommodates. For the meantime, we are trying to avoid any more changes to QueryParser.
This looks like a candidate for #2863.
If you check https://github.com/spring-projects/spring-data-jpa/commit/39e12eae6d64ba7192835e052a082b93b287d911, you should see that aliasing is now properly handled in sorts, and that the query at the top is included as a test case.
Feel free to check out Spring Data JPA 3.1.0-SNAPSHOT and test it out for yourself.
I'm closing this as being resolved by the commit in the previous message. If this problem persists, feel free to open a new ticket.