FHIR
Consider disabling conditional interactions when remote indexing is enabled
Describe the bug When remote (async) indexing is enabled, the server accepts and stores resources in a transaction prior to writing the corresponding search parameters. This speeds up ingestion at the cost of data consistency. Specifically, there is a window of time between data ingestion and the storing of search parameters such that a resource may exist but not be searchable.
The specification [loosely] defines a fully asynchronous mechanism for ingestion, but unless I'm mistaken it doesn't even mention the possibility of servers being "eventually consistent" in this manner with respect to search. Therefore, this behavior should be considered experimental at this time.
Relatedly, the specification defines (and we implement) a flavor of the normal REST interactions, the conditional interactions, that is based on FHIR search and may assume data consistency.
The LinuxForHealth FHIR server's implementation of these interactions (like most others') already cannot guarantee uniqueness when multiple requests are processed in parallel (https://github.com/LinuxForHealth/FHIR/issues/2051). When remote indexing is enabled, the problem is exacerbated by the additional window of time between 'ingested' and 'searchable'.
Environment main
To Reproduce Steps to reproduce the behavior:
- configure the server for remote indexing but do not run the remote indexing agent
- POST a resource with an identifier value of "xyz123"
- perform a conditional create (e.g. POST with If-None-Exist: identifier=xyz123)
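The steps above can be modeled without a running server. The following is a minimal in-memory sketch, not the server's actual code: `ToyServer`, its fields, and its method names are all hypothetical, and the conditional create is simplified to "create unless the identifier is already searchable".

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical in-memory stand-in for a server with remote (async) indexing."""
    resources: dict = field(default_factory=dict)         # id -> resource
    identifier_index: dict = field(default_factory=dict)  # identifier -> set of ids
    pending: list = field(default_factory=list)           # ids awaiting indexing
    next_id: int = 1

    def create(self, resource):
        """Store the resource now; defer writing its search parameters."""
        rid = str(self.next_id)
        self.next_id += 1
        self.resources[rid] = resource
        self.pending.append(rid)
        return rid

    def conditional_create(self, resource, identifier):
        """Models POST with If-None-Exist: identifier=<value> (simplified)."""
        if self.identifier_index.get(identifier):
            return None  # a match is searchable, so nothing is created
        return self.create(resource)

    def run_indexing_agent(self):
        """Models the remote indexing agent draining its queue."""
        for rid in self.pending:
            ident = self.resources[rid]["identifier"]
            self.identifier_index.setdefault(ident, set()).add(rid)
        self.pending.clear()


server = ToyServer()
patient = {"resourceType": "Patient", "identifier": "xyz123"}
first = server.create(dict(patient))
# The agent is not running, so the first resource is not yet searchable
# and the conditional create does not see it.
second = server.conditional_create(dict(patient), "xyz123")
server.run_indexing_agent()
duplicates = server.identifier_index["xyz123"]
```

In this model the conditional create succeeds and, once the agent runs, two resources share the identifier. A conditional create issued after indexing has caught up is rejected as expected.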
Expected behavior An ideal implementation would be strongly consistent, and the conditional create would not succeed. However, since we've traded that consistency away, it would be nice if we either prevented this from happening or at least provided a warning in the response.
Additional context In the case of an update, when remote indexing is enabled, I believe the current implementation removes the existing search parameters and they are not added back until the remote indexing is completed for this resource. An alternative we considered is leaving the stale search parameters around. However, unless we make a major change to the way those are stored, that would result in certain searches retrieving the new version of the resource when it should not (i.e. version 1 of the resource matched the query, but version 2 does not and yet we return it anyway).
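To make the trade-off between the two update strategies concrete, here is a toy sketch. It assumes index rows point at a logical resource id rather than a specific version, which is my reading of "the way those are stored"; the names `current`, `family_index`, and both update functions are hypothetical.

```python
# Hypothetical model: index rows point at a logical resource id, not a version.
current = {"p1": {"version": 1, "family": "Smith"}}
family_index = {"Smith": {"p1"}}


def search_by_family(name):
    """Resolve index hits to whatever version is current."""
    return [current[rid] for rid in sorted(family_index.get(name, ()))]


def update_keep_stale(rid, resource):
    """Alternative considered: leave old index rows until reindexing runs."""
    current[rid] = resource


def update_drop_params(rid, resource):
    """Behavior described above: remove index rows until reindexing runs."""
    for ids in family_index.values():
        ids.discard(rid)
    current[rid] = resource


update_keep_stale("p1", {"version": 2, "family": "Jones"})
stale_hit = search_by_family("Smith")  # returns version 2, which no longer matches

update_drop_params("p1", {"version": 3, "family": "Brown"})
no_hit = search_by_family("Smith")     # resource is temporarily unsearchable
```

Keeping stale rows yields a false positive; dropping them yields a temporary false negative, which is the window this issue is about.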
Originally I wrote it up for a resource with id "123" and a conditional create with If-None-Exist: _id=123 but Robin pointed out that this WOULD work because _id is a parameter that we serve from the resource tables. We also serve _lastUpdated from those tables and so that would work like it does today as well.
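One way to use this observation would be to check whether a conditional query relies only on parameters served from the resource tables. The sketch below is an illustration only: the function name and the `RESOURCE_TABLE_PARAMS` set are hypothetical, and the set contains just the two parameters named in this thread.

```python
from urllib.parse import parse_qsl

# Per the discussion above, these are served from the resource tables.
RESOURCE_TABLE_PARAMS = {"_id", "_lastUpdated"}


def uses_only_resource_table_params(query):
    """True if every parameter in the search query avoids the async index."""
    pairs = parse_qsl(query, keep_blank_values=True)
    if not pairs:
        return False
    # Strip any modifier suffix (the part after ":") before comparing names.
    return all(name.split(":", 1)[0] in RESOURCE_TABLE_PARAMS for name, _ in pairs)
```

For example, `uses_only_resource_table_params("_id=123")` is true, while a query that includes `identifier=xyz123` is not, so the latter would still be exposed to the indexing window.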
Robin also pointed out that if conditional create by identifier is the most common use-case, then we could make a compromise and store identifiers as part of the transaction but leave the other values for remote processing.
And if we wanted to get really fancy, we could allow the user to configure which parameters are stored as part of the transaction (somewhat similar to search parameter filtering).
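A rough sketch of that compromise, assuming extracted search parameters can be represented as (code, value) pairs: `split_parameters` and `inline_codes` are hypothetical names, and the default set holds only `identifier` as suggested above.

```python
def split_parameters(extracted, inline_codes=frozenset({"identifier"})):
    """Partition extracted (code, value) pairs into inline and deferred lists.

    Inline pairs would be written in the ingestion transaction; deferred
    pairs would be handed to the remote indexing service.
    """
    inline = [p for p in extracted if p[0] in inline_codes]
    deferred = [p for p in extracted if p[0] not in inline_codes]
    return inline, deferred


extracted = [("identifier", "xyz123"), ("family", "Smith"), ("birthdate", "1970-01-01")]
inline, deferred = split_parameters(extracted)
```

Making `inline_codes` configurable corresponds to the "really fancy" option: an operator could widen the set at the cost of slower ingestion.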
That said, many resources can share the same _lastUpdated value, so it is not a good candidate for testing the existence of a resource.
If we get to #3774 then that could alleviate some of these concerns (depending on how it's implemented), at least for conditional interactions that rely on identifiers.
Proposal: introduce a config parameter to control the behavior of the server in this scenario.
We need to remember where async indexing is configured and see if it makes sense to introduce a new parameter there.
Currently, that config is global (not per-tenant), so we'd probably want this new parameter to be global as well.
For example, fhirServer/remoteIndexService/conditionalInteractionBehavior with allowed values allow, warn, or error.
Alternatively, include it under fhirServer/resources, like:
"conditionalInteractionBehavior": "allow | warn | error | errorOnAsync"
where "errorOnAsync" is a special value that means the conditional actions are normally allowed but NOT ALLOWED if asynchronous indexing is configured.
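The second variant could resolve to an effective action roughly as follows. This is a sketch of the proposal, not existing server behavior: the function name is hypothetical, and it assumes allow, warn, and error apply regardless of the indexing mode while errorOnAsync depends on it.

```python
ALLOWED_VALUES = {"allow", "warn", "error", "errorOnAsync"}


def resolve_conditional_behavior(configured, async_indexing_enabled):
    """Return the effective action: 'allow', 'warn', or 'error'."""
    if configured not in ALLOWED_VALUES:
        raise ValueError(f"unsupported conditionalInteractionBehavior: {configured!r}")
    if configured == "errorOnAsync":
        # Conditional interactions are rejected only when async indexing is on.
        return "error" if async_indexing_enabled else "allow"
    return configured
```

Under the first variant (the parameter living under fhirServer/remoteIndexService), the same function would only be consulted when remote indexing is configured, and errorOnAsync would not be needed.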