android-fhir icon indicating copy to clipboard operation
android-fhir copied to clipboard

Adds `resourceUuid` to the multi-column index in ReferenceIndexEntity

Open LZRS opened this issue 1 year ago • 3 comments

IMPORTANT: All PRs must be linked to an issue (except for extremely trivial and straightforward changes).

Fixes #[issue number]

Description Adds resourceUuid to the multi-column index in ReferenceIndexEntity table to make it a covering index

Alternative(s) considered Have you considered any alternatives? And if so, why have you chosen the approach in this PR?

Type Choose one: (Bug fix | Feature | Documentation | Testing | Code health | Builds | Releases | Other)

Screenshots (if applicable)

Checklist

  • [ ] I have read and acknowledged the Code of conduct.
  • [ ] I have read the Contributing page.
  • [ ] I have signed the Google Individual CLA, or I am covered by my company's Corporate CLA.
  • [ ] I have discussed my proposed solution with code owners in the linked issue(s) and we have agreed upon the general approach.
  • [ ] I have run ./gradlew spotlessApply and ./gradlew spotlessCheck to check my code follows the style guide of this project.
  • [ ] I have run ./gradlew check and ./gradlew connectedCheck to test my changes locally.
  • [ ] I have built and run the demo app(s) to verify my change fixes the issue and/or does not break the demo app(s).

LZRS avatar Nov 28 '24 08:11 LZRS

Previously, the sql query

SELECT *
FROM (SELECT rie.index_name, rie.index_value, re.serializedResource
      FROM ResourceEntity re
               JOIN ReferenceIndexEntity rie
                    ON re.resourceUuid = rie.resourceUuid
      WHERE rie.resourceType = 'Encounter'
        AND rie.index_name = 'service-provider'
        AND rie.index_value IN ('Organization/2c29c69f-c2d1-463f-a4b2-d90a5c2fd05d')
        AND re.resourceType = 'Encounter')

generated the query plan

QUERY PLAN
|--SEARCH rie USING INDEX index_ReferenceIndexEntity_resourceType_index_name_index_value (resourceType=? AND index_name=? AND index_value=?)
`--SEARCH re USING INDEX index_ResourceEntity_resourceUuid (resourceUuid=?)

Testing with a database that has 166293 resources and 137517 encounters, the above query could take

Run Time: real 6.171 user 0.472104 sys 0.761056

returning 137517 rows

The changes in this PR, would generate query plan

QUERY PLAN
|--SEARCH rie USING COVERING INDEX index_ReferenceIndexEntity_resourceType_index_name_index_value_resourceUuid (resourceType=? AND index_name=? AND index_value=?)
`--SEARCH re USING INDEX index_ResourceEntity_resourceUuid (resourceUuid=?)

using a covering index

Testing with a database that has 166293 resources and 137517 encounters, the above query takes around

Run Time: real 5.623 user 0.455778 sys 0.804132

returning 137517 rows

LZRS avatar Dec 18 '24 11:12 LZRS

I thought adding a non-filter key to the multi-column index would not improve the performance. This is a surprise for me.

I have more questions around why our query is like this

SELECT *
FROM (SELECT rie.index_name, rie.index_value, re.serializedResource
      FROM ResourceEntity re
               JOIN ReferenceIndexEntity rie
                    ON re.resourceUuid = rie.resourceUuid
      WHERE rie.resourceType = 'Encounter'
        AND rie.index_name = 'service-provider'
        AND rie.index_value IN ('Organization/2c29c69f-c2d1-463f-a4b2-d90a5c2fd05d')
        AND re.resourceType = 'Encounter')

Since our resourceUuid is unique across all ResourceEntity do we really need resourceType filter ?

MJ1998 avatar Dec 20 '24 07:12 MJ1998

I thought adding a non-filter key to the multi-column index would not improve the performance. This is a surprise for me.

The non-filter column resourceUuid added to the multi-column index would make the index covering, and would return resouceUuid column directly from the index, and thus avoiding the step of going back to the actual rows to get the resourceUuid which slightly improved the performance

SELECT *
FROM (SELECT rie.index_name, rie.index_value, re.serializedResource
      FROM ResourceEntity re
               JOIN ReferenceIndexEntity rie
                    ON re.resourceUuid = rie.resourceUuid
      WHERE rie.resourceType = 'Encounter'
        AND rie.index_name = 'service-provider'
        AND rie.index_value IN ('Organization/2c29c69f-c2d1-463f-a4b2-d90a5c2fd05d')
        AND re.resourceType = 'Encounter')

Since our resourceUuid is unique across all ResourceEntity do we really need resourceType filter ?

Yeah, I agree. For this type of query, it makes sense to remove the re.resourceType filter

LZRS avatar Dec 20 '24 12:12 LZRS