Support canonical units for quantity search of UCUM values

Open lmsurpre opened this issue 2 years ago • 6 comments

Is your feature request related to a problem? Please describe.

From https://hl7.org/fhir/search.html#quantity:

The search processor may choose to perform a search based on canonical units (e.g. any value where the units can be converted to a value in mg in the case above). For example, an observation may have a value of 23 mm/hr. This is equal to 0.023 m/hr. The search processor can choose to normalise all the values to a canonical unit such as 6.4e-6 m/sec, and convert search terms to the same units (m/sec). Such conversions can be performed based on the semantics defined in UCUM.
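As a quick check of the arithmetic in that example, here is a minimal sketch that converts 23 mm/hr to m/hr and then to m/s. The class name and the hand-written conversion factors are assumptions made for illustration only; a real implementation would derive the factors from a UCUM library rather than hard-code them.

```java
import java.math.BigDecimal;
import java.math.MathContext;

/** Worked example: 23 mm/hr expressed in m/hr and in m/s. */
final class CanonicalSpeedExample {
    // Hand-written factors for this single example (assumption: not taken from any UCUM library).
    static final BigDecimal METERS_PER_MILLIMETER = new BigDecimal("0.001");
    static final BigDecimal SECONDS_PER_HOUR = new BigDecimal("3600");

    /** mm/hr -> m/hr: only the length part of the unit changes. */
    static BigDecimal toMetersPerHour(BigDecimal mmPerHour) {
        return mmPerHour.multiply(METERS_PER_MILLIMETER);
    }

    /** mm/hr -> m/s: convert the length, then divide by the number of seconds in an hour. */
    static BigDecimal toMetersPerSecond(BigDecimal mmPerHour) {
        return toMetersPerHour(mmPerHour).divide(SECONDS_PER_HOUR, MathContext.DECIMAL64);
    }

    public static void main(String[] args) {
        BigDecimal observed = new BigDecimal("23");
        System.out.println(toMetersPerHour(observed) + " m/hr");
        // Rounded to two significant digits for comparison with the figure in the quoted text.
        System.out.println(toMetersPerSecond(observed).round(new MathContext(2)) + " m/s");
    }
}
```

Rounded to two significant digits, the m/s result agrees with the 6.4e-6 m/sec figure in the quoted text.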

But currently, our quantity search is UCUM-naive: when a system or code is provided, it only returns values with those specific units.

Describe the solution you'd like

  1. Use https://github.com/FHIR/Ucum-java to normalize quantities with UCUM units during parameter extraction and always store the values with their canonical units
  2. When a search is performed with units, convert whatever the requested unit was to

Possible issue: the spec says we are still supposed to be able to respond to requests like GET [base]/Observation?value-quantity=5.4, so we'd probably need to store the original value as well as the normalized one. I think we could keep our schema exactly the same and simply write two rows instead of one (one normalized and one "original").
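On the search side, the spec text quoted earlier talks about converting search terms to the same units as the stored values. A rough sketch of that step is below, assuming search terms arrive in the number|system|code form used by FHIR quantity search. The class, the Term type, and the tiny mass-only factor table are all hypothetical. In practice the factors would come from a UCUM library such as Ucum-java, whose API is deliberately not shown here, and search prefixes such as ge or lt would be handled before parsing.

```java
import java.math.BigDecimal;
import java.util.Map;

/** Sketch: normalize a quantity search term to a canonical unit before querying. */
final class QuantitySearchTermSketch {
    static final String UCUM = "http://unitsofmeasure.org";
    static final String CANONICAL_MASS_CODE = "g"; // the gram is UCUM's base unit of mass

    // Hypothetical stand-in for a UCUM lookup: unit code -> factor that converts to grams.
    static final Map<String, BigDecimal> FACTOR_TO_GRAMS = Map.of(
            "g", BigDecimal.ONE,
            "mg", new BigDecimal("0.001"),
            "kg", new BigDecimal("1000"));

    /** A parsed search term; system and code are null when omitted. */
    static final class Term {
        final BigDecimal value;
        final String system;
        final String code;

        Term(BigDecimal value, String system, String code) {
            this.value = value;
            this.system = system;
            this.code = code;
        }
    }

    /** Parses "5.4", "5.4|system|code" and similar (prefixes are not handled in this sketch). */
    static Term parse(String raw) {
        String[] parts = raw.split("\\|", -1);
        String system = parts.length > 1 && !parts[1].isEmpty() ? parts[1] : null;
        String code = parts.length > 2 && !parts[2].isEmpty() ? parts[2] : null;
        return new Term(new BigDecimal(parts[0]), system, code);
    }

    /** Converts UCUM terms with a known unit to grams; anything else is returned unchanged. */
    static Term toCanonical(Term term) {
        if (!UCUM.equals(term.system) || term.code == null) {
            return term; // no unit given, or not UCUM: search on the raw value
        }
        BigDecimal factor = FACTOR_TO_GRAMS.get(term.code);
        if (factor == null) {
            return term; // unit unknown to this sketch
        }
        return new Term(term.value.multiply(factor), UCUM, CANONICAL_MASS_CODE);
    }

    public static void main(String[] args) {
        Term t = toCanonical(parse("5.4|http://unitsofmeasure.org|mg"));
        System.out.println(t.value + " " + t.code);
    }
}
```

A term with no unit, such as value-quantity=5.4, passes through unchanged, which is exactly the case that raises the storage question in the "Possible issue" paragraph.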

Describe alternatives you've considered

Acceptance Criteria

  1. GIVEN [a precondition] AND [another precondition] WHEN [test step] AND [test step] THEN [verification step] AND [verification step]

Additional context

Possibly do at the same time as https://github.com/LinuxForHealth/FHIR/issues/1444

lmsurpre avatar Jul 24 '22 14:07 lmsurpre

What about storing only the normalized number and converting the unit during extraction? Since we know the normalized unit (let's say mg), when a value arrives in g we just convert it to the stored unit. This would also make searching the database easier.

alihbuzaid avatar Jul 25 '22 08:07 alihbuzaid

Sorry, that idea/proposal is not clear to me.

When a resource is created/updated, we store:

  1. the resource payload as a JSON blob. Other than adding server-assigned meta info, the original resource is not modified
  2. a collection of search parameter values that are inserted into tables that assist with implementing FHIR search. This process is what I refer to as "search parameter extraction" (or sometimes just "indexing")

The issue I tried to highlight in the description is that if we store only the normalized value for number 2, then implementing a query for the following case becomes impossible without consulting the original resource blob (which would be incredibly slow):

Search: GET [base]/Observation?value-quantity=5.4
Description: Search for all the observations with a value of 5.4(+/-0.05) irrespective of the unit

This assumes that "irrespective of the unit" means it's supposed to search on the raw value of the Quantity in the actual resource instance (and not some "normalized" variant).
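To make the "5.4(+/-0.05)" part concrete, here is a minimal sketch of how the implicit precision of a search value could be turned into a range that is compared against the raw stored value, ignoring units entirely. The class and method names are hypothetical. Treating the upper bound as exclusive is an assumption of this sketch, and only plain decimal notation (no exponents) is considered.

```java
import java.math.BigDecimal;

/** Sketch: implicit-precision range for a unit-less quantity search value. */
final class ImplicitPrecisionSketch {

    /** Returns {low, high} where the half-width is half of the last decimal place given. */
    static BigDecimal[] implicitRange(String searchValue) {
        BigDecimal value = new BigDecimal(searchValue);
        // "5.4" has scale 1, so one unit in the last place is 0.1 and half of that is 0.05.
        BigDecimal half = BigDecimal.ONE.movePointLeft(value.scale()).divide(BigDecimal.valueOf(2));
        return new BigDecimal[] { value.subtract(half), value.add(half) };
    }

    /** Compares the raw stored value to the range; the stored unit is deliberately ignored. */
    static boolean matchesIgnoringUnit(BigDecimal storedValue, String searchValue) {
        BigDecimal[] range = implicitRange(searchValue);
        return storedValue.compareTo(range[0]) >= 0 && storedValue.compareTo(range[1]) < 0;
    }

    public static void main(String[] args) {
        BigDecimal[] range = implicitRange("5.4");
        System.out.println("[" + range[0] + ", " + range[1] + ")");
        System.out.println(matchesIgnoringUnit(new BigDecimal("5.42"), "5.4"));
    }
}
```

Because the comparison is against the value as written in the resource, a value stored only in normalized form (say 0.0054 g for an original 5.4 mg) would no longer match.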

So I think we'd need to either alter our schema (e.g. to add a column for the "original value" from the resource for each observation) or extract 2 different rows for each UCUM quantity that is in a non-canonical unit. We just need to ensure we only use the canonical ones for the $stats operation. We try to limit schema changes, which is why the latter seems more appealing to me. Will ask @punktilious to weigh in as well.
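The two-row option could look roughly like the sketch below. Everything here is hypothetical: the Row type is not the project's actual schema, and the mass-only factor table stands in for a real UCUM conversion. The original row is always written, and a second, normalized row is added only when the unit is a known non-canonical UCUM unit.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch: extract one or two quantity rows per value without changing the row shape. */
final class QuantityExtractionSketch {
    static final String UCUM = "http://unitsofmeasure.org";
    static final String CANONICAL_MASS_CODE = "g"; // the gram is UCUM's base unit of mass

    // Hypothetical stand-in for a UCUM lookup; lists only non-canonical units.
    static final Map<String, BigDecimal> FACTOR_TO_GRAMS = Map.of(
            "mg", new BigDecimal("0.001"),
            "kg", new BigDecimal("1000"));

    /** Hypothetical row shape: the same three fields for original and normalized rows. */
    static final class Row {
        final BigDecimal value;
        final String system;
        final String code;

        Row(BigDecimal value, String system, String code) {
            this.value = value;
            this.system = system;
            this.code = code;
        }
    }

    static List<Row> extract(BigDecimal value, String system, String code) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(value, system, code)); // the value exactly as it appears in the resource
        if (UCUM.equals(system) && code != null) {
            BigDecimal factor = FACTOR_TO_GRAMS.get(code);
            if (factor != null) {
                // Non-canonical UCUM unit: add a second row in the canonical unit.
                rows.add(new Row(value.multiply(factor), UCUM, CANONICAL_MASS_CODE));
            }
        }
        return rows;
    }

    /** Rows in the canonical unit are the only ones an aggregate such as $stats should read. */
    static boolean isCanonical(Row row) {
        return UCUM.equals(row.system) && CANONICAL_MASS_CODE.equals(row.code);
    }

    public static void main(String[] args) {
        for (Row row : extract(new BigDecimal("5400"), UCUM, "mg")) {
            System.out.println(row.value + " " + row.code + " canonical=" + isCanonical(row));
        }
    }
}
```

With this shape, a quantity already expressed in the canonical unit yields a single row that serves both purposes, so the extra storage is only paid for non-canonical inputs.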

lmsurpre avatar Jul 25 '22 12:07 lmsurpre

This is the definition of the Quantity element in FHIR:

{
  "value" : <decimal>, // Numerical value (with implicit precision)
  "comparator" : "<code>", // < | <= | >= | > - how to understand the value
  "unit" : "<string>", // Unit representation
  "system" : "<uri>", // C? System that defines coded unit form
  "code" : "<code>" // Coded form of the unit
}

I'm not sure whether there is a convention for which units are standard, or whether that is at least defined by the server maintainers, so that whenever an observation is saved it is stored in that unit regardless of the input unit. I believe that would also be difficult to do!

Hemoglobin-normal-values

Take hemoglobin levels as an example: if an input in g/L is posted to the server, should it be stored as-is, or should it be converted from the get-go to the server's standard unit? This is what I was saying: if the stored values all use the same unit, searching would be easier. A search like GET [base]/Observation?value-quantity=5.4 would not be practical in real life. A more realistic example is GET [base]/Observation?component-value-quantity=ge180&component-code=8480-6&patient.identifier=2801

Example output:

<component>
  <code>
    <coding>
      <system value="http://loinc.org"/>
      <code value="8480-6"/>
      <display value="Systolic blood pressure"/>
    </coding>
    <coding>
      <system value="http://snomed.info/sct"/>
      <code value="271649006"/>
      <display value="Systolic blood pressure"/>
    </coding>
  </code>
  <valueQuantity>
    <value value="201.0"/>
    <unit value="mmHg"/>
    <system value="http://unitsofmeasure.org"/>
    <code value="mm[Hg]"/>
  </valueQuantity>
</component>

So my proposal is: when storing a hemoglobin level as [value + unit], the server checks what its current standard unit is, converts the value to that unit, and stores the converted value. The $stats operation would then be faster, and the convention would be clear about which unit a search is performed in and which units to expect in the output.

Sorry for the complicated train of thought.

alihbuzaid avatar Jul 25 '22 13:07 alihbuzaid

I think the question for me is whether GET [base]/Observation?value-quantity=5.4 implies any units. I personally don't like the idea of an ambiguous search (given this is health data). NASA's Mars Climate Orbiter comes to mind. If we can determine the "default" unit for a particular quantity parameter then it's relatively easy to convert and invoke the search with the appropriate values (if we store in normalized form).

I really don't like the idea of storing the value twice (in original and normalized form) because this impacts the size of the database, especially for high cardinality resources like Observation.

punktilious avatar Jul 25 '22 14:07 punktilious

Assuming that we would want a search like GET [base]/Observation?value-quantity=5.4 to search the originally specified value of the resources (regardless of which unit they use), then I agree we'll need to store both the original value AND a new "normalized unit" value. As an optimization, we can omit the second one if the unit in the resource is already the normalized unit.

"I really don't like the idea of storing the value twice (in original and normalized form) because this impacts the size of the database, especially for high cardinality resources like Observation." (quoting @punktilious)

We'll want to introduce a new config parameter so that we can toggle this behavior on/off. That way, users that need the current behavior (or prefer it for the reduced storage) can get it.

lmsurpre avatar Oct 18 '22 12:10 lmsurpre

hint: expand AbstractQuantitySearchTest (or duplicate it to new test class) and add "normalized unit" search tests.

lmsurpre avatar Oct 18 '22 12:10 lmsurpre