Optimize SQL query in findByEPerson to enhance performance
Description
It updates the findByEPerson method in the GroupDAOImpl class to improve SQL query performance.
The previous HQL statement has been rewritten to use a more efficient JOIN query when retrieving groups linked to a specific EPerson.
This issue was discovered recently during a period of high user activity in an instance with a large number of users and groups, where multiple concurrent connections were observed. While monitoring the database, I noticed many active and slow-running queries similar to the following:
select g1_0.uuid, g1_1.uuid, g1_0.eperson_group_id, g1_0.name, g1_0.permanent
from public.epersongroup g1_0
join public.dspaceobject g1_1 on g1_0.uuid = g1_1.uuid
where (
select e1_0.uuid
from public.eperson e1_0
join public.dspaceobject e1_1 on e1_0.uuid = e1_1.uuid
where e1_0.uuid = $1
) in (
select e2_0.eperson_id
from public.epersongroup2eperson e2_0
join (public.eperson e2_1
join public.dspaceobject e2_2 on e2_1.uuid = e2_2.uuid)
on e2_1.uuid = e2_0.eperson_id
where g1_0.uuid = e2_0.eperson_group_id
)
These subquery-based statements caused unnecessary overhead and contributed to poor performance during peak load.
Instructions for Reviewers
List of changes in this PR:
- Replaced the old subquery-based HQL:
from Group where (from EPerson e where e.id = :eperson_id) in elements(epeople)
with a more performant join-based query:
select distinct g from Group g join g.epeople ep where ep.id = :eperson_id
Checklist
- [x] My PR is created against the
mainbranch of code (unless it is a backport or is fixing an issue specific to an older branch). - [x] My PR is small in size (e.g. less than 1,000 lines of code, not including comments & integration tests). Exceptions may be made if previously agreed upon.
- [x] My PR passes Checkstyle validation based on the Code Style Guide.
- [x] My PR includes Javadoc for all new (or modified) public methods and classes. It also includes Javadoc for large or complex private methods.
- [ ] My PR passes all tests and includes new/updated Unit or Integration Tests based on the Code Testing Guide.
- [ ] My PR includes details on how to test it. I've provided clear instructions to reviewers on how to successfully test this fix or feature.
- [x] If my PR includes new libraries/dependencies (in any
pom.xml), I've made sure their licenses align with the DSpace BSD License based on the Licensing of Contributions documentation. - [x] If my PR modifies REST API endpoints, I've opened a separate REST Contract PR related to this change.
- [x] If my PR includes new configurations, I've provided basic technical documentation in the PR itself.
- [x] If my PR fixes an issue ticket, I've linked them together.
Thanks @toniprieto. This sounds reasonable. Is this testable in any way? We also have a large number of users and groups, so it would be interesting if we could quantify the before and after. I don't know if it's possible to run HQL directly...
Great @alanorth ! I was just looking into the best way to reproduce this for testing. By enabling Hibernate’s show_sql option, I can see that the new query being executed is:
select distinct g1_0.uuid, g1_1.uuid, g1_0.eperson_group_id, g1_0.name, g1_0.permanent
from public.epersongroup g1_0
join public.dspaceobject g1_1 on g1_0.uuid = g1_1.uuid
join public.epersongroup2eperson e1_0 on g1_0.uuid = e1_0.eperson_group_id
where e1_0.eperson_id = ?
The current query is the one present in the ticket description. Using the UUID of a user who belongs to multiple groups, you should be able to test it.
This function is usually called before running a Solr search with a non-admin user. In a test instance, it’s hard to notice a big improvement because the query tends to be cached. However, I’ve observed a clearer improvement when multiple users are logging in and the system is under higher load.
@toniprieto thanks. I tested both queries with a user who belongs to 307 groups and the result set was the same, though I don't think it's easy to test the performance difference.