SORMAS-Project icon indicating copy to clipboard operation
SORMAS-Project copied to clipboard

Rewrite database search for similar persons

Open StefanKock opened this issue 2 years ago • 2 comments

Problem Description

Findings on PersonService.getSimilarPersonDtos by #8697:

  1. The first query in PersonService.getSimilarPersonDtos fetches Person entities with secondary queries for sub-entities (at least Location, PersonContactDetail) to then convert to SimilarPersonDto.
  2. The second query in PersonService.getInJurisdictionIDs checks if the given Person.ids are inJurisdictionOrOwned with where clause. Two problems: IN clause with a lot of values tend to be inefficient. If it receives more than ~32k values, it will run into the parameter limit.

Proposed Change

  1. Write new method as PersonFacadeEjb.getSimilarPersons.
  2. Aside from fixing #8747, run person query for each Core entity and reduce duplicates in Java afterwards (Set) instead of joins Person x Case x Contact x EventParticipant x Immunization x TravelEntry.
  3. Use inJurisdictionOrOwned in the initial WHERE clause to avoid secondary query with IN clause.
  4. Do not query for Person entities, directly load data into PersonIndexDto and remove SimilarPersonDto.
  5. Remove PersonService.getSimilarPersonDtos.

Possible Alternatives

  • Keep SimilarPersonDto

Additional Information

StefanKock avatar May 03 '22 12:05 StefanKock

Objection in the refinement: Running the similarity search several times might be more costly than running it over the cross join of all references. We should do measurements of the existing behaviour.

StefanKock avatar May 04 '22 11:05 StefanKock

Similarity search was likely already improved by #8747.

It's not clear whether the solution proposed above would actually improve performance.

From my point of view what is missing here is an explain analyze of the queries done, so we can see which parts exactly are facing problems.

Maybe the approach taken for #8946 can be applied here as-well.

MartinWahnschaffe avatar Aug 10 '22 12:08 MartinWahnschaffe