The 'µ' character gets garbled after conversion from HL7 to FHIR
Describe the bug
When sending an ORU message to the waters API in staging, we consistently see every µ character in the message garbled into �� right after the HL7 to FHIR conversion step. The character is part of the µmol/L unit in OBX-6.
We're sending the HL7 message with UTF-8 charset encoding and the µ is correctly shown in the message sent.
It's important to note that in our tests running ReportStream locally in a Docker container, the issue doesn't happen, so we're not able to reproduce the bug locally.
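One possible explanation for the difference between environments, offered here as an assumption rather than a confirmed diagnosis, is that the staging JVM decodes the incoming bytes with a non-UTF-8 default charset. The sketch below shows that decoding the UTF-8 bytes of µ (0xC2 0xB5) as US-ASCII yields two replacement characters, which matches the �� we observe. The class and method names are hypothetical and are not part of ReportStream.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class MicroSignDecoding {
    // Decode raw bytes with an explicit charset; malformed input becomes U+FFFD.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Render a string as code points so the output does not depend on the console encoding.
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // UTF-8 bytes for the OBX-6 unit: C2 B5 6D 6F 6C 2F 4C
        byte[] utf8 = "\u00B5mol/L".getBytes(StandardCharsets.UTF_8);

        // Decoded as UTF-8, the first code point is U+00B5 (MICRO SIGN).
        System.out.println(codePoints(decode(utf8, StandardCharsets.UTF_8)));

        // Decoded as US-ASCII, each of the two bytes becomes U+FFFD.
        System.out.println(codePoints(decode(utf8, StandardCharsets.US_ASCII)));
    }
}
```

If the staging JVM had defaulted to ISO-8859-1 instead, we would expect to see Âµ rather than two replacement characters, so the observed output is more consistent with an ASCII-like default.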
Impact on ReportStream
One of our partners, UCSD, who we're preparing to go live with in the coming weeks, is receiving the µmol/L unit with this issue.
Steps to reproduce
To reproduce it, send an ORU like the one below to the waters API in staging. Then check that the snapshot of the converted FHIR message in the ITEM_ACCEPTED event has the garbled characters instead of µ.
MSH|^~\&|Sender Application^sender.test.com^DNS|Sender Facility^0.0.0.0.0.0.0.0^ISO|Receiver Application^0.0.0.0.0.0.0.0^ISO|Receiver Facility^simulated-hospital-id^DNS|20230101010000-0000||ORU^R01^ORU_R01|111111|D|2.5.1
PID|1||80008715^^^&NPI^MR||CDPH-FOUR^GIRL A MOMFOUR^^^^^B|||F||2076-8^Native Hawaiian or Other Pacific Islander||||||||||||2186-5^Not Hispanic or Latino||Y|1
NK1|1|CDPH-FOUR^MOMFOUR|MTH^Mother|123 TOWNE-CENTER DRIVE^^SAN-DIEGO^CA^92126^USA|^^^^^619^1231234|^^^^^858^2493690
ORC|RE|7171232842^FormNumber||189403712^HospOrdNumber||||||||^WILKINSON^LESLEY|||||||||UCSD JACOBS MEDICAL CENTER^^^^^^^^^R7XX| 2961DR YLLUT^^SAN DIEGO^CA^99999-9999
OBR|1|7171232842^FormNumber||53261-4^Amino acid newborn screen panel|||202402081450|||||||||^MILLEN^MARLENE||||||20240212103049|||F
OBX|1|NM|47633-3^Glycine [Moles/volume] in Dried blood spot^LN|1|0.5|µmol/L||N|||F|||20240212103049
Expected behavior
The µ character should be correctly encoded and the ORU should be delivered to our partner without this issue.
Additional context
We're tracking this bug on our board here: https://github.com/CDCgov/trusted-intermediary/issues/1229
Specifically, the character is "MICRO SIGN" (U+00B5), which UTF-8 encodes as the two bytes 0xC2 0xB5. You can see the information about the encoding on Wikipedia.
Here's a screenshot of the specific character and column in question from the aforementioned Wikipedia section.
We've successfully solved the issue by adding the environment variable "JAVA_OPTS" to pdhdemo1 with the value "-Dfile.encoding=UTF-8".
You can't set JAVA_OPTS in the host.json file for Azure Functions. It has to be set in Terraform as an environment variable, or as an environment variable directly in the Azure portal.
We would like to coordinate adding this to the Terraform files.
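To confirm that the setting takes effect in a given environment, a small check like the one below could be run or logged at startup. This is a minimal sketch using only standard JDK calls. The class name is hypothetical, and where such a check would live in ReportStream is left open.

```java
import java.nio.charset.Charset;

class DefaultCharsetCheck {
    // Summarize the encoding-related settings the running JVM actually uses.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // With -Dfile.encoding=UTF-8 we would expect both values to report UTF-8.
        System.out.println(describe());
    }
}
```

Comparing this output between the local Docker container and staging should show whether the two environments start with different default charsets. Note that on JDK 18 and later the default charset is UTF-8 unless overridden, so the result also depends on which JDK version each environment runs.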