DSpace icon indicating copy to clipboard operation
DSpace copied to clipboard

Optimize SQL query in findByEPerson to enhance performance

Open toniprieto opened this issue 1 month ago • 3 comments

Description

It updates the findByEPerson method in the GroupDAOImpl class to improve SQL query performance. The previous HQL statement has been rewritten to use a more efficient JOIN query when retrieving groups linked to a specific EPerson.

This issue was discovered recently during a period of high user activity in an instance with a large number of users and groups, where multiple concurrent connections were observed. While monitoring the database, I noticed many active and slow-running queries similar to the following:

select g1_0.uuid, g1_1.uuid, g1_0.eperson_group_id, g1_0.name, g1_0.permanent
from public.epersongroup g1_0
join public.dspaceobject g1_1 on g1_0.uuid = g1_1.uuid
where (
    select e1_0.uuid
    from public.eperson e1_0
    join public.dspaceobject e1_1 on e1_0.uuid = e1_1.uuid
    where e1_0.uuid = $1
) in (
    select e2_0.eperson_id
    from public.epersongroup2eperson e2_0
    join (public.eperson e2_1
          join public.dspaceobject e2_2 on e2_1.uuid = e2_2.uuid)
      on e2_1.uuid = e2_0.eperson_id
    where g1_0.uuid = e2_0.eperson_group_id
)

These subquery-based statements caused unnecessary overhead and contributed to poor performance during peak load.

Instructions for Reviewers

List of changes in this PR:

  • Replaced the old subquery-based HQL:
from Group where (from EPerson e where e.id = :eperson_id) in elements(epeople)

with a more performant join-based query:

select distinct g from Group g join g.epeople ep where ep.id = :eperson_id

Checklist

  • [x] My PR is created against the main branch of code (unless it is a backport or is fixing an issue specific to an older branch).
  • [x] My PR is small in size (e.g. less than 1,000 lines of code, not including comments & integration tests). Exceptions may be made if previously agreed upon.
  • [x] My PR passes Checkstyle validation based on the Code Style Guide.
  • [x] My PR includes Javadoc for all new (or modified) public methods and classes. It also includes Javadoc for large or complex private methods.
  • [ ] My PR passes all tests and includes new/updated Unit or Integration Tests based on the Code Testing Guide.
  • [ ] My PR includes details on how to test it. I've provided clear instructions to reviewers on how to successfully test this fix or feature.
  • [x] If my PR includes new libraries/dependencies (in any pom.xml), I've made sure their licenses align with the DSpace BSD License based on the Licensing of Contributions documentation.
  • [x] If my PR modifies REST API endpoints, I've opened a separate REST Contract PR related to this change.
  • [x] If my PR includes new configurations, I've provided basic technical documentation in the PR itself.
  • [x] If my PR fixes an issue ticket, I've linked them together.

toniprieto avatar Oct 29 '25 14:10 toniprieto

Thanks @toniprieto. This sounds reasonable. Is this testable in any way? We also have a large number of users and groups, so it would be interesting if we could quantify the before and after. I don't know if it's possible to run HQL directly...

alanorth avatar Oct 30 '25 06:10 alanorth

Great @alanorth ! I was just looking into the best way to reproduce this for testing. By enabling Hibernate’s show_sql option, I can see that the new query being executed is:

select distinct g1_0.uuid, g1_1.uuid, g1_0.eperson_group_id, g1_0.name, g1_0.permanent 
from public.epersongroup g1_0 
join public.dspaceobject g1_1 on g1_0.uuid = g1_1.uuid 
join public.epersongroup2eperson e1_0 on g1_0.uuid = e1_0.eperson_group_id 
where e1_0.eperson_id = ?

The current query is the one present in the ticket description. Using the UUID of a user who belongs to multiple groups, you should be able to test it.

This function is usually called before running a Solr search with a non-admin user. In a test instance, it’s hard to notice a big improvement because the query tends to be cached. However, I’ve observed a clearer improvement when multiple users are logging in and the system is under higher load.

toniprieto avatar Oct 30 '25 14:10 toniprieto

@toniprieto thanks. I tested both queries with a user who belongs to 307 groups and the result set was the same, though I don't think it's easy to test the performance difference.

alanorth avatar Nov 04 '25 06:11 alanorth