SORMAS-Project
Rewrite database search for similar persons
Problem Description
Findings on PersonService.getSimilarPersonDtos by #8697:
- The first query in PersonService.getSimilarPersonDtos fetches Person entities, with secondary queries for sub-entities (at least Location, PersonContactDetail), and then converts them to SimilarPersonDto.
- The second query in PersonService.getInJurisdictionIDs checks whether the given Person.ids are inJurisdictionOrOwned, using a WHERE clause with IN. This causes two problems: an IN clause with a lot of values tends to be inefficient, and if it receives more than ~32k values, it runs into the parameter limit.
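To make the parameter limit concrete, the following is a minimal sketch of what a caller would have to do if the IN-clause query were kept: split the ID list into batches that each stay under the limit. The class name InClauseBatches, the method partition, and the exact value 32,767 are assumptions for illustration (the value only mirrors the ~32k figure above); none of this is existing SORMAS code.

```java
import java.util.ArrayList;
import java.util.List;

final class InClauseBatches {

    // Assumed limit, chosen to match the "~32k" figure from the findings.
    static final int ASSUMED_PARAMETER_LIMIT = 32_767;

    // Splits the IDs into consecutive batches of at most maxPerQuery elements,
    // so that each batch could be bound to one IN clause.
    static <T> List<List<T>> partition(List<T> ids, int maxPerQuery) {
        if (maxPerQuery <= 0) {
            throw new IllegalArgumentException("maxPerQuery must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += maxPerQuery) {
            int to = Math.min(ids.size(), from + maxPerQuery);
            batches.add(new ArrayList<>(ids.subList(from, to)));
        }
        return batches;
    }
}
```

Even with batching, every batch is another round trip with a large IN list, so this only works around the limit and does not address the inefficiency.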
Proposed Change
- Write new method as PersonFacadeEjb.getSimilarPersons.
- Aside from fixing #8747, run the person query for each Core entity and reduce duplicates in Java afterwards (Set) instead of joining Person x Case x Contact x EventParticipant x Immunization x TravelEntry.
- Use inJurisdictionOrOwned in the initial WHERE clause to avoid the secondary query with the IN clause.
- Do not query for Person entities; directly load data into PersonIndexDto and remove SimilarPersonDto.
- Remove PersonService.getSimilarPersonDtos.
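The "one query per Core entity, then reduce duplicates in a Set" step could look roughly like the sketch below. It is only an illustration under stated assumptions: PersonRow is a hypothetical stand-in for PersonIndexDto, each Supplier stands in for one per-entity query (Case, Contact, and so on), and SimilarPersonMergeSketch and mergeDistinct are made-up names, not part of the SORMAS code base.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

final class SimilarPersonMergeSketch {

    // Hypothetical stand-in for PersonIndexDto; equality is based on the UUID only.
    static final class PersonRow {
        final String uuid;
        final String name;

        PersonRow(String uuid, String name) {
            this.uuid = uuid;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof PersonRow && uuid.equals(((PersonRow) o).uuid);
        }

        @Override
        public int hashCode() {
            return uuid.hashCode();
        }
    }

    // Runs each per-entity query in turn and merges the results.
    // The LinkedHashSet drops persons already seen and keeps first-seen order.
    static List<PersonRow> mergeDistinct(List<Supplier<List<PersonRow>>> perEntityQueries) {
        Set<PersonRow> distinct = new LinkedHashSet<>();
        for (Supplier<List<PersonRow>> query : perEntityQueries) {
            distinct.addAll(query.get());
        }
        return new ArrayList<>(distinct);
    }
}
```

A person referenced by both a Case and a Contact would be returned by two of the queries but appear only once in the merged list. Whether several small queries beat one large join is exactly what would need to be measured.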
Possible Alternatives
- Keep SimilarPersonDto
Additional Information
Objection in the refinement: Running the similarity search several times might be more costly than running it over the cross join of all references. We should do measurements of the existing behaviour.
Similarity search was likely already improved by #8747.
It's not clear whether the solution proposed above would actually improve performance.
From my point of view, what is missing here is an EXPLAIN ANALYZE of the queries involved, so we can see exactly which parts are causing problems.
Maybe the approach taken for #8946 can be applied here as well.