android-fhir icon indicating copy to clipboard operation
android-fhir copied to clipboard

Bypass join to `TokenIndexEntity` table for SearchParameter `_id`

Open LZRS opened this issue 1 year ago • 8 comments

Is your feature request related to a problem? Please describe. Searching for resources given multiple ids with filter _id, does a join to the TokenIndexEntity, that may seem to incur an extra performance cost

Describe the solution you'd like When search is done using filter in the form

fhirEngine.search<Patient>{
        filter(Patient.RES_ID, { value = of("1234") }, { value = of("23456") })
}

the query generated is of the form of

SELECT a.resourceUuid, a.serializedResource
 FROM ResourceEntity a
WHERE a.resourceType = 'Patient'
         AND a.resourceUuid IN (
SELECT resourceUuid FROM TokenIndexEntity
WHERE resourceType = 'Patient' AND index_name = '_id' AND (index_value = '1234' OR index_value = '23456')
 )

The suggestion we had was if we could bypass joining to the TokenIndexEntity table for the _id , and instead filter directly from the ResourceEntity table. The resulting query would therefore look similar to

SELECT a.resourceUuid, a.serializedResource
 FROM ResourceEntity a
WHERE a.resourceType = 'Patient' 
         AND a.resourceId  IN ('1234', '23456')
 )

Additional context Link to RES_ID constant

Describe alternatives you've considered Enhancing the fhirEngine.get interface to support multiple resourceId to fetch multiple resources, although this wouldn't work with the rest of the Search DSL

Would you like to work on the issue? Please state if this issue should be assigned to you or who you think could help to solve this issue.

LZRS avatar Nov 14 '24 13:11 LZRS

Thanks for raising this @LZRS. This is a valid request. However the SQL generation logic in engine library is generic to deal with all kinds of keys. So implementing the "bypass TokenIndexEntity" above would require to write custom logic for the RES_ID key.

Let us evaluate if this is possible. @jingtang10 @aditya-07

MJ1998 avatar Nov 15 '24 10:11 MJ1998

Thanks for raising this @LZRS. This is a valid request. However the SQL generation logic in engine library is generic to deal with all kinds of keys.

@LZRS Would an overloaded suspend fun get(type: ResourceType, vararg ids: String): Resource be better fit for this usecase?

aditya-07 avatar Nov 18 '24 11:11 aditya-07

@LZRS Would an overloaded suspend fun get(type: ResourceType, vararg ids: String): Resource be better fit for this usecase?

Yeah, that would work although we have some usecases that would require Search DSL, for example getting Patient's Observations and Conditions using revInclude

Example

fhirEngine.search<Patient>{
        filter(Patient.RES_ID, { value = of("1234") }, { value = of("23456") })
        revInclude<Observation>(Observation.SUBJECT)
        revInclude<Condition>(Condition.SUBJECT)
}

LZRS avatar Nov 18 '24 14:11 LZRS

@aditya-07 @dubdabasoduba @google/labs-ai-sdk-contribs we discussed this, the change should happen here: https://github.com/google/android-fhir/blob/b59acf288b38731be1472c0523d7cd1e4828a042/engine/src/main/java/com/google/android/fhir/search/MoreSearch.kt#L368

add a shortcut so that filtering by id goes through a different codepath.

jingtang10 avatar Nov 19 '24 14:11 jingtang10

@LZRS Would an overloaded suspend fun get(type: ResourceType, vararg ids: String): Resource be better fit for this usecase?

Yeah, that would work although we have some usecases that would require Search DSL, for example getting Patient's Observations and Conditions using revInclude

Example

fhirEngine.search<Patient>{
        filter(Patient.RES_ID, { value = of("1234") }, { value = of("23456") })
        revInclude<Observation>(Observation.SUBJECT)
        revInclude<Condition>(Condition.SUBJECT)
}

have you tried not using revinclude and just search observations and conditions with a reference to the known patient?

i guess in the applicaiton you want one query to load all the observations and conditions? is that right?

jingtang10 avatar Nov 20 '24 20:11 jingtang10

@LZRS Would an overloaded suspend fun get(type: ResourceType, vararg ids: String): Resource be better fit for this usecase?

Yeah, that would work although we have some usecases that would require Search DSL, for example getting Patient's Observations and Conditions using revInclude Example

fhirEngine.search<Patient>{
        filter(Patient.RES_ID, { value = of("1234") }, { value = of("23456") })
        revInclude<Observation>(Observation.SUBJECT)
        revInclude<Condition>(Condition.SUBJECT)
}

have you tried not using revinclude and just search observations and conditions with a reference to the known patient?

i guess in the applicaiton you want one query to load all the observations and conditions? is that right?

No, we hadn't tried that. But I believe that would then require multiple queries, although they wouldn't probably be more than 5....yeah, I think we could try that

In that case, implementing this would just be okay I think suspend fun get(type: ResourceType, vararg ids: String): Resource

LZRS avatar Nov 21 '24 07:11 LZRS

@jingtang10 @aditya-07 Please re-assign this to @qiarie

dubdabasoduba avatar Nov 25 '24 07:11 dubdabasoduba

just to confirm, what we're suggesting as a fix is:

val patients = fhirEngine.get<Patient>("1", "2", ...) // this uses Simon's new API
...
val observations = fhirEngine.search<Observations> {
  filter( ... )
}

there shouldn't be a performance penalty because the search api implementation run these queries separately anyway (the joins would have been massive)

jingtang10 avatar Dec 09 '24 13:12 jingtang10